transit-feed-quality-calculator

A project that uses the gtfs-realtime-validator to assess the quality of a large number of transit feeds.

This tool:

  1. Fetches the URLs for GTFS-realtime feeds and corresponding GTFS data from either the TransitFeeds.com GetFeeds API or a specified .csv file, and downloads them from each agency's server into a subdirectory
  2. Runs the gtfs-realtime-validator Batch Processor on each of the subdirectories
  3. Produces summary statistics and graphs, such as:

image

Read more in this Medium article.

Running the application

You'll need JDK 7 or higher.

This project was created in IntelliJ. You can also compile it from the command line using Maven.

If you're downloading GTFS or GTFS-rt from secure HTTPS URLs, you may need to install the Java Cryptography Extension (JCE). To do so, replace the US_export_policy.jar and local_policy.jar files in your JVM's security directory (for example, C:\Program Files\Java\jdk1.8.0_73\jre\lib\security) with the JAR files from the JCE Extension download. Alternatively, you can add -Djsse.enableSNIExtension=false to the command line when running the application.
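
If you're unsure whether the unlimited-strength policy files are already active in your JVM, a small check like the following can help. This is a minimal sketch, not part of this project: the class name JcePolicyCheck is hypothetical, and it assumes that a maximum AES key length above 128 bits indicates the unlimited policy is in place.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

// Hypothetical helper class, not part of transit-feed-quality-calculator.
class JcePolicyCheck {

    /** Returns the maximum AES key length (in bits) allowed by the current JVM policy. */
    static int maxAesKeyLength() {
        try {
            return Cipher.getMaxAllowedKeyLength("AES");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("AES is not available in this JVM", e);
        }
    }

    /** Assumption: the restricted policy caps AES at 128 bits, so anything larger means unlimited. */
    static boolean isUnlimited(int maxKeyLengthBits) {
        return maxKeyLengthBits > 128;
    }

    public static void main(String[] args) {
        int max = maxAesKeyLength();
        System.out.println("Max allowed AES key length: " + max);
        if (isUnlimited(max)) {
            System.out.println("Unlimited-strength policy appears to be active.");
        } else {
            System.out.println("Restricted policy appears to be active; consider installing the JCE policy files.");
        }
    }
}
```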

To download feeds, you'll also need a TransitFeeds.com API key or a .csv file that includes feed information (see below).

Command line

  1. mvn package
  2. java -Djsse.enableSNIExtension=false -jar target/transit-feed-quality-calculator-1.0.0-SNAPSHOT.jar -directory output -transitFeedsApiKey 1234567689 -csv feeds.csv

Note that to download feeds, you'll need to provide an API key for TransitFeeds.com or a .csv file that includes feed information.

See the command-line options section below for a description of each option.

IntelliJ

Run the Main.main() method, and provide the command-line options via the "Run configurations->Program arguments" feature.

Command line options

If you want to download feeds, at least one of the -transitFeedsApiKey or -csv parameters must be provided. If both are missing, this tool will proceed to validate and analyze the feeds currently in -directory without downloading any new files.
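
The rule above can be illustrated with a short sketch. This is not the project's actual argument handling: the OptionsSketch class and its methods are hypothetical, and the sketch assumes every option is passed as a "-name value" pair, as in the example command line.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the download decision; not the tool's real parser.
class OptionsSketch {

    /** Parses "-name value" pairs into a map keyed by option name (without the dash). */
    static Map<String, String> parse(String[] args) {
        if (args.length % 2 != 0) {
            throw new IllegalArgumentException("Options must be given as -name value pairs");
        }
        Map<String, String> options = new HashMap<>();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("-")) {
                throw new IllegalArgumentException("Expected an option name but found: " + args[i]);
            }
            options.put(args[i].substring(1), args[i + 1]);
        }
        return options;
    }

    /** Feeds are downloaded only if an API key or a CSV file (or both) was supplied. */
    static boolean willDownload(Map<String, String> options) {
        return options.containsKey("transitFeedsApiKey") || options.containsKey("csv");
    }

    public static void main(String[] args) {
        Map<String, String> options = parse(args);
        if (willDownload(options)) {
            System.out.println("Downloading feeds into " + options.get("directory"));
        } else {
            System.out.println("No download source given; analyzing existing files in " + options.get("directory"));
        }
    }
}
```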

The feeds.csv file should be formatted as follows:

region_id,title,gtfs_url,gtfs_rt_url
"10000-Portland, OR, USA","TriMet Trip Update",https://developer.trimet.org/schedule/gtfs.zip,http://developer.trimet.org/ws/V1/TripUpdate&appID=225D5601E7729B9ED863DCA39
"10000-Portland, OR, USA","TriMet Alerts",https://developer.trimet.org/schedule/gtfs.zip,http://developer.trimet.org/ws/V1/FeedSpecAlerts&appID=225D5601E7729B9ED863DCA39
"20000-Oakland, CA, USA","AC Transit Trip Update",http://www.actransit.org/wp-content/uploads/GTFSWinter17B.zip,http://api.actransit.org/transit/gtfsrt/tripupdates?token=9A6257A021F944E7BE0AD32702DF23CE

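Note that the region_id and title values above are quoted because they may contain commas. As a rough sketch of how such a row can be split into its four columns, consider the following. The FeedCsvSketch class is hypothetical and is not the project's CsvDownloader; the example.com URLs are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for one feeds.csv row; not the project's CsvDownloader.
class FeedCsvSketch {

    /** One row of feeds.csv; fields mirror the header columns. */
    static final class FeedRow {
        final String regionId;
        final String title;
        final String gtfsUrl;
        final String gtfsRtUrl;

        FeedRow(String regionId, String title, String gtfsUrl, String gtfsRtUrl) {
            this.regionId = regionId;
            this.title = title;
            this.gtfsUrl = gtfsUrl;
            this.gtfsRtUrl = gtfsRtUrl;
        }
    }

    /** Splits a CSV line on commas, keeping commas that appear inside double quotes. */
    static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // doubled quote inside a quoted field
                    i++;
                } else {
                    inQuotes = !inQuotes; // opening or closing quote
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    /** Converts one data line (not the header) into a FeedRow. */
    static FeedRow parseRow(String line) {
        List<String> f = splitCsvLine(line);
        if (f.size() != 4) {
            throw new IllegalArgumentException("Expected 4 columns but found " + f.size());
        }
        return new FeedRow(f.get(0), f.get(1), f.get(2), f.get(3));
    }

    public static void main(String[] args) {
        // Placeholder URLs; substitute the real GTFS and GTFS-realtime URLs.
        FeedRow row = parseRow("\"10000-Portland, OR, USA\",\"TriMet Alerts\","
                + "https://example.com/gtfs.zip,https://example.com/alerts");
        System.out.println(row.regionId + " | " + row.title + " | " + row.gtfsUrl + " | " + row.gtfsRtUrl);
    }
}
```
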
Tips:

Sample output

You'll see a lot of folders within the output directory, one for each transit agency:

image

If you look in one of those folders, you'll see the following:

image

This contains the GTFS and GTFS-realtime source files downloaded from the agency:

  1. gtfs-zip - The GTFS data that was downloaded from the agency URL (HART, in this case) provided by the TransitFeeds.com API
  2. HART Trip Updates-xxxx.pb - The TripUpdates binary Protocol Buffer file that was downloaded from the agency URL (HART, in this case) provided by the TransitFeeds.com API, with the UTC time in milliseconds appended
  3. HART Vehicle Positions-xxxx.pb - The VehiclePositions binary Protocol Buffer file that was downloaded from the agency URL (HART, in this case) provided by the TransitFeeds.com API, with the UTC time in milliseconds appended

...as well as plain text versions of the GTFS-realtime files generated by the gtfs-realtime-validator:

  1. HART Trip Updates-xxxx.pb.txt - The plain text version of the above TripUpdates binary
  2. HART Vehicle Positions-xxxx.pb.txt - The plain text version of the above VehiclePositions binary

...and the validation results for each GTFS-realtime file (see gtfs-realtime-validator Batch Processor output examples for details):

  1. HART Trip Updates-xxxx.results.json - The validation results for the above TripUpdates binary
  2. HART Vehicle Positions-xxxx.results.json - The validation results for the above VehiclePositions binary
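
The naming pattern in the listings above can be sketched as follows. This is inferred only from the example file names, not from the project's or the validator's source, and the OutputFileNames class is hypothetical.

```java
// Hypothetical helper that mirrors the file names shown above.
class OutputFileNames {

    /** Feed title plus the download time in UTC milliseconds, e.g. "HART Trip Updates-xxxx.pb". */
    static String feedFileName(String feedTitle, long utcMillis) {
        return feedTitle + "-" + utcMillis + ".pb";
    }

    /** The plain text version keeps the .pb name and appends .txt. */
    static String plainTextName(String pbFileName) {
        return pbFileName + ".txt";
    }

    /** The validation results file replaces the .pb extension with .results.json. */
    static String resultsName(String pbFileName) {
        if (!pbFileName.endsWith(".pb")) {
            throw new IllegalArgumentException("Expected a .pb file name but found: " + pbFileName);
        }
        String base = pbFileName.substring(0, pbFileName.length() - ".pb".length());
        return base + ".results.json";
    }

    public static void main(String[] args) {
        String pb = feedFileName("HART Trip Updates", System.currentTimeMillis());
        System.out.println(pb);
        System.out.println(plainTextName(pb));
        System.out.println(resultsName(pb));
    }
}
```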

An Excel spreadsheet, analysis-graphs.xlsx, is generated in the root folder of the project. It contains graphs that summarize all of the analyzed GTFS-realtime feeds, for example:

image

The analysis results are also output to a JSON file, analysis-summary.json.

Implementation details

Take a look at the Main.main() method.

Here's a simplified version of what it looks like:

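import java.nio.file.Paths;

// Directory where feeds are downloaded and validation results are written
String directoryName = "your-directory";
// Leave a value null to skip that download source
String transitFeedsApiKey = "YOUR_TRANSITFEEDS.COM_API_KEY_HERE";
String csvFile = "feed-file.csv";

TransitFeedQualityCalculator calculator = new TransitFeedQualityCalculator(Paths.get(directoryName));
if (transitFeedsApiKey != null) {
    calculator.setTransitFeedsApiKey(transitFeedsApiKey);
}
if (csvFile != null) {
    calculator.setCsvDownloaderFile(csvFile);
}
// Runs the download, validate, analyze, and export steps
calculator.calculate();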

This demonstrates the usage of the TransitFeedQualityCalculator, which performs the following steps:

  1. Download - Via TransitFeedsDownloader and CsvDownloader
  2. Validate - Via BulkFeedValidator
  3. Analyze - Via ResultsAnalyzer
  4. Export - To an Excel file via ExcelExporter and to a JSON file via Jackson

Dependencies

Managed via Maven: