Why linked open transit data?

This repo explains why we need linked open public transport data.

CC0-licensed | chat on gitter

Note: This document is inspired by Publishing Transport Data for Maximum Reuse by Pieter Colpaert, which offers a very detailed view on the topic. We recommend reading it!

Problem

When travelling through larger regions or several countries by public transportation, finding out how and when to get to the destination is hard:

  1. People often need to use multiple, regionally limited apps to find out which trains/buses/ferries/etc. are available, because these apps often have imprecise data (e.g. regarding accessibility), outdated data (e.g. regarding construction work) or no data whatsoever about other regions. Doing this research across operator boundaries involves a lot of manual work. Essentially, the user needs to do the job that computers should do: routing through sub-networks.
  2. When dealing with large distances (e.g. from Norway to France), this routing work becomes almost impossible for humans to do ad hoc, because there are so many possible connections. Once cancellations and delays come into play, users may never find the optimal connection.
  3. Local, narrowly focused apps are often less accessible. They are frequently developed on a smaller budget, available in only a few languages and on only some platforms, lack screen reader and offline support, and have a poor UX.
  4. Apps built for current mainstream use cases are not future-proof. As mobility becomes more digitised, more diverse and more on-demand, they won't be able to deliver on people's mobility needs. (They barely do that right now.)
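To make "routing through sub-networks" concrete, here is a minimal sketch of an earliest-arrival scan over the combined timetables of two operators. Everything in it is an assumption for illustration: the `Connection` record, the stop IDs (`A1`, `BORDER`, `B1`) and the times are made up, and transfer times are ignored.

```python
from typing import NamedTuple


class Connection(NamedTuple):
    dep_stop: str
    arr_stop: str
    dep_time: int  # minutes since midnight
    arr_time: int  # minutes since midnight


# Hypothetical connection lists from two independent operators.
operator_a = [Connection("A1", "BORDER", 480, 540)]
operator_b = [
    Connection("BORDER", "B1", 530, 600),  # departs before we reach BORDER
    Connection("BORDER", "B1", 560, 630),
]


def earliest_arrival(connections, origin, destination, start_time):
    """Scan connections in departure order, tracking the earliest arrival per stop."""
    earliest = {origin: start_time}
    for c in sorted(connections, key=lambda c: c.dep_time):
        reachable = earliest.get(c.dep_stop)
        # A connection is usable only if we are at its departure stop in time.
        if reachable is not None and reachable <= c.dep_time:
            if c.arr_time < earliest.get(c.arr_stop, float("inf")):
                earliest[c.arr_stop] = c.arr_time
    return earliest.get(destination)  # None if the destination is unreachable


# Neither operator can answer the query alone; the merged data can.
alone = earliest_arrival(operator_a, "A1", "B1", 470)
combined = earliest_arrival(operator_a + operator_b, "A1", "B1", 470)
```

Note that the scan only works because both operators refer to the shared stop by the same ID (`BORDER`); without that, the two lists cannot be joined automatically.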

Data Hubs

An often-proposed (alleged) solution is to build data exchange hubs: They collect individual data sets (of both planned and realtime data), integrate them, often using hand-written matching tables and fuzzy matching, and emit one large merged data set. This doesn't work for the following reasons:

Linked Open Transport Data

Let's solve these problems by designing our public transportation systems from the start with federation, discovery of data sources and caching/offline compatibility in mind!

We must make our data

We must make our APIs

We must develop our data standards in the open (allowing barrier-free participation & collaboration), and make them freely licensed (to enable widespread use). They should cleanly separate semantics, replication/transport, storage & encoding. They should not reinvent the wheel, but rely on existing work (such as GTFS) where applicable.

Stable Identifiers

Because public transportation data reflects strongly interconnected public transportation systems, it contains many links. When data from an author/source "A" refers to data from another author/source "B", it needs a reliable and precise way to identify items in B's data. In federated systems, especially in linked data systems, the need for stable & globally unique IDs is even greater than in traditional, centralized systems.

Note: The aforementioned Publishing Transport Data for Maximum Reuse has a specific section on stable identifiers for interoperable data.
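As a rough illustration of why source-local IDs are not enough, the sketch below prefixes each local ID with a base URI owned by its source. The base URIs, the `Stop` type, the `stop_uri` helper and the stop names are all hypothetical; this is one possible scheme, not one this document prescribes.

```python
from dataclasses import dataclass

# Hypothetical base URIs for two data sources; not real endpoints.
SOURCE_A = "https://a.example/stops/"
SOURCE_B = "https://b.example/stops/"


@dataclass(frozen=True)
class Stop:
    uri: str   # globally unique identifier, meant to stay stable over time
    name: str  # illustrative name only


def stop_uri(base: str, local_id: str) -> str:
    """Turn a source-local ID into a globally unique URI."""
    return base + local_id


# Both sources happen to use the local ID "1234" for different stops.
stops = [
    Stop(stop_uri(SOURCE_A, "1234"), "Oslo S"),
    Stop(stop_uri(SOURCE_B, "1234"), "Paris Gare du Nord"),
]

# Indexing by bare local ID silently drops one stop; indexing by URI keeps both.
by_local_id = {s.uri.rsplit("/", 1)[1]: s for s in stops}
by_uri = {s.uri: s for s in stops}

# A record published by source A can point unambiguously at a stop owned by B.
transfer = {"from": stop_uri(SOURCE_A, "1234"), "to": stop_uri(SOURCE_B, "1234")}
```

The URI only helps if it stays stable: if source B renames or renumbers its stops, every link from A's data breaks.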


Contributing

Contributions are welcome! If you have a question or want to propose changes, go to the Issues page. By participating in this project, you commit to the code of conduct.