Validate GTFS files

The General Transit Feed Specification (GTFS) defines a common format for public transportation schedules and associated geographic information.

This project is a validation tool for such files. It can perform checks ranging from simple ones (the archive is not valid, a file is missing) to more complex ones (a vehicle is moving too fast).

Online tool

transport-validator is the tool used by the French National Access Point to validate GTFS files. If you want to use it online, you can validate your own files at this address.

Validation output

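Validation output is twofold: a metadata section describing the content of the validated file, and a validations section listing the issues found.

The output is formatted as JSON by default, but YAML is also available. See Options for more information.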

{
    "metadata": {
        ...
    },
    "validations": {
       ...
    }
}
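As a minimal sketch of how a consumer might read this structure, the Python snippet below parses a JSON report and separates the two top-level keys. The `split_report` helper and the shortened sample string are illustrative assumptions, not part of the validator.

```python
import json

def split_report(raw: str):
    """Parse validator JSON output and return (metadata, validations)."""
    report = json.loads(raw)
    # Fall back to an empty object when a key is absent or null.
    metadata = report.get("metadata") or {}
    validations = report.get("validations") or {}
    return metadata, validations

# Hypothetical, heavily shortened report used only for illustration.
sample = '{"metadata": {"networks": ["carSud"]}, "validations": {"IdNotAscii": []}}'
metadata, validations = split_report(sample)
print(metadata["networks"])  # ['carSud']
print(list(validations))     # ['IdNotAscii']
```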

Metadata

Gives useful information about the content of the validated file:

| Entry | Format | Description |
| --- | --- | --- |
| start_date | "YYYY-MM-DD" | The starting date of the calendar information (both calendar.txt and calendar_dates.txt are taken into account). |
| end_date | "YYYY-MM-DD" | The ending date of the calendar information (both calendar.txt and calendar_dates.txt are taken into account). |
| networks_start_end_dates | map | Gives the starting and ending dates of the calendar information for each network. For example: {"agency name 1":{"start_date":"2022-08-18","end_date":"2022-10-23"}, "agency name 2":{"start_date":"2020-08-18","end_date":"2023-10-23"}} |
| networks | list of strings | A list of unique agency names, found in agency.txt |
| modes | list of strings | A list of the route_types found in routes.txt |
| issues_count | Object | A summary of the validation issues found in the validations section. Keys of the object are the issue names, values are the number of corresponding issues found. |
| has_fares | boolean | True if a fare_attributes.txt file exists and contains information |
| has_shapes | boolean | True if a shapes.txt file exists and contains information |
| has_pathways | boolean | True if a pathways.txt file exists and contains information |
| some_stops_need_phone_agency | boolean | Some stops have a continuous_pickup or a continuous_drop_off field equal to 2. |
| some_stops_need_phone_driver | boolean | Some stops have a continuous_pickup or a continuous_drop_off field equal to 3. |
| validator_version | string | The validator version as found in the Cargo.toml |

There is also a stats object inside with various statistics about the data:

| Entry | Format | Description |
| --- | --- | --- |
| stops_count | integer | Number of stops found in the file stops.txt (for any location_type) |
| stop_areas_count | integer | Number of stop areas (location_type equal to 1) found in the file stops.txt. |
| stop_points_count | integer | Number of stops (location_type equal to 0) found in the file stops.txt. |
| stops_with_wheelchair_info_count | integer or null | Number of stops (with any location_type) with wheelchair_boarding information. |
| lines_count | integer | Number of routes found in routes.txt |
| routes_with_custom_color_count | integer | Number of routes with a custom color |
| routes_with_short_name_count | integer | Number of routes with a short name |
| routes_with_long_name_count | integer | Number of routes with a long name |
| trips_count | integer | Number of trips found in trips.txt |
| trips_with_bike_info_count | integer | Number of trips found in trips.txt with bike information provided (bikes_allowed equal to 1 or 2) |
| trips_with_wheelchair_info_count | integer | Number of trips found in trips.txt with wheelchair information provided (wheelchair_accessible equal to 1 or 2) |
| trips_with_shape_count | integer | Number of trips found in trips.txt with an attached shape |
| trips_with_trip_headsign_count | integer | Number of trips with a headsign |
| transfers_count | integer | Number of transfers |
| fares_attribute_count | integer | Number of fare attributes |
| fares_rules_count | integer | Number of fare rules |

Note: For stops_with_wheelchair_info_count, the information can be specified at the stop level (wheelchair_boarding equal to 1 or 2) or inherited from the stop's parent station. The value can be null if the GTFS contains errors that prevent this field from being computed.

Example

    "metadata": {
        "start_date": "2020-11-02",
        "end_date": "2022-01-31",
        "networks": [
            "carSud"
        ],
        "modes": [
            "bus", "tramway"
        ],
        "issues_count": {
            "ExcessiveSpeed": 5,
            "CloseStops": 10,
            "NullDuration": 10,
            "MissingName": 1,
            "MissingCoordinates": 1215,
            "InvalidCoordinates": 1215,
            "DuplicateStops": 49,
            "IdNotAscii": 171
        },
        "stats": {
            "stops_count": 17,
            "stop_areas_count": 2,
            "stop_points_count": 9,
            "stops_with_wheelchair_info_count": null,
            "lines_count": 5,
            "routes_with_custom_color_count": 0,
            "routes_with_short_name_count": 0,
            "routes_with_long_name_count": 4,
            "trips_count": 11,
            "trips_with_bike_info_count": 3,
            "trips_with_wheelchair_info_count": 3,
            "trips_with_shape_count": 0,
            "trips_with_trip_headsign_count": 9,
            "transfers_count": 0,
            "fares_attribute_count": 2,
            "fares_rules_count": 4
        },
        "has_fares": true,
        "has_shapes": true,
        "has_pathways": false,
        "some_stops_need_phone_agency": false,
        "some_stops_need_phone_driver": false
    }
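The counters in stats lend themselves to simple coverage ratios. The sketch below, using values copied from the example above, computes the share of trips carrying a given piece of information and tolerates the null that stops_with_wheelchair_info_count may hold. The `share` helper is a hypothetical name introduced for illustration.

```python
from typing import Optional

def share(count: Optional[int], total: int) -> Optional[float]:
    """Return count / total as a percentage, or None when it cannot be computed."""
    if count is None or total == 0:
        return None
    return round(100 * count / total, 1)

# Values copied from the example metadata above.
stats = {
    "stops_count": 17,
    "stops_with_wheelchair_info_count": None,  # null in the JSON output
    "trips_count": 11,
    "trips_with_wheelchair_info_count": 3,
    "trips_with_trip_headsign_count": 9,
}

print(share(stats["trips_with_wheelchair_info_count"], stats["trips_count"]))  # 27.3
print(share(stats["trips_with_trip_headsign_count"], stats["trips_count"]))    # 81.8
print(share(stats["stops_with_wheelchair_info_count"], stats["stops_count"]))  # None
```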

Validations

The "validations" key contains the actual validation results.

Severity

Each check is associated with a severity level.

| Severity | Description |
| --- | --- |
| Fatal | Critical error, the GTFS archive couldn't be opened |
| Error | The file does not respect the GTFS specification |
| Warning | Not a specification error, but something is most likely wrong in the data |
| Information | Simple information |
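A consumer of the report may want to know the most severe level present, for instance to decide whether a feed should be rejected. The following sketch assumes that each issue object carries a "severity" string, as in the examples further down in this document. The `SEVERITY_ORDER` list and the `worst_severity` helper are illustrative names, not part of the validator.

```python
# Severity names as listed in the table above, from most to least severe.
SEVERITY_ORDER = ["Fatal", "Error", "Warning", "Information"]

def worst_severity(validations: dict):
    """Return the most severe level found in a `validations` object, or None."""
    found = {
        issue["severity"]
        for issues in validations.values()
        for issue in issues
    }
    for level in SEVERITY_ORDER:
        if level in found:
            return level
    return None

# Hypothetical, shortened validations object.
sample = {
    "IdNotAscii": [{"severity": "Warning"}],
    "CloseStops": [{"severity": "Information"}],
}
print(worst_severity(sample))  # Warning
```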

List of checks

The validator performs a number of checks. The list of checks can be seen in the file issues.rs.

Here is a human-friendly list of them:

| Check name | Severity | Description |
| --- | --- | --- |
| UnusedStop | Information | A stop is not used. |
| Slow | Information | The speed between two stops is too low. |
| ExcessiveSpeed | Information | The speed between two stops is too high. |
| CloseStops | Information | Two stops are very close to each other in the same trip. |
| InvalidRouteType | Information | The type of a route is not valid. |
| DuplicateStops | Information | Two stop points or stop areas look identical. They share the same name, and are geographically very close. This check is not applied to station entrances (location_type equal to 2). |
| DuplicateStopSequence | Error | Several stop times in a trip have the same stop_sequence value. The stop_sequence values within a trip must be unique. |
| ExtraFile | Information | The file does not belong to a GTFS archive. |
| UnusedShapeId | Information | A shape_id defined in shapes.txt is not used elsewhere in the GTFS. |
| NegativeTravelTime | Warning | The travel duration between two stops is negative. |
| MissingName | Warning | An agency, a route or a stop has its name missing. |
| MissingCoordinates | Warning | A shape point or a stop is missing its coordinate(s). |
| NullDuration | Warning | The travel duration between two stops is null. |
| MissingLanguage | Warning | The publisher language code is missing. |
| InvalidLanguage | Warning | The publisher language code is not valid. |
| DuplicateObjectId | Error | At least one other object has the same ID. |
| InvalidStopLocationTypeInTrip | Warning | Only stop points are allowed to be used in a trip. |
| InvalidStopParent | Warning | The parent station of this stop is not a valid one. |
| IdNotAscii | Warning | The identifier does not contain only ASCII characters. |
| MissingId | Error | An agency, a calendar, a route, a shape point, a stop or a trip has its ID missing. |
| MissingUrl | Error | An agency or a feed publisher is missing its URL. |
| InvalidUrl | Error | The URL of an agency or a feed publisher is not valid. |
| InvalidCoordinates | Error | The coordinates of a shape point or a stop are not valid. |
| InvalidTimezone | Error | The time zone of an agency is not valid. |
| MissingPrice | Error | A fare is missing its price. |
| InvalidCurrency | Error | The currency of a fare is not valid. |
| InvalidTransfers | Error | The number of transfers of a fare is not valid. |
| InvalidTransferDuration | Error | The transfer duration of a fare is not valid. |
| ImpossibleToInterpolateStopTimes | Error | It is impossible to interpolate the departure/arrival of some stop times of the trip. |
| InvalidShapeId | Error | A shape_id referenced in trips.txt does not exist in shapes.txt. |
| InvalidReference | Fatal | Reference not valid. For example, a stop time references a stop that does not exist. |
| InvalidArchive | Fatal | The .zip archive is not valid. |
| UnloadableModel | Fatal | A fatal error has occurred while building the links in the model. |
| MissingMandatoryFile | Fatal | A mandatory file is missing. |
| SubFolder | Error | Files were in a subfolder, which is explicitly forbidden by the specification. |

GeoJSON information

When relevant for the check, GeoJSON information is added to the check's output, which makes debugging the GTFS easier.

Example

Here is a validation output containing one warning, triggered by a non-ASCII stop ID:

"validations": {
    "IdNotAscii" : [
        {
            "severity": "Warning",
            "issue_type": "IdNotAscii",
            "object_id": "AllBél",
            "object_type": "Stop",
            "object_name": "",
            "related_objects": [],
            "geojson": {
                "features": [
                    {
                        "geometry": null,
                        "properties": {
                            "id": "AllBél",
                            "name": ""
                        },
                        "type": "Feature"
                    }
                ],
                "type": "FeatureCollection"
            }
        }
    ]
}

Another example shows the GeoJSON output for an Information-level issue about two stops that are too close to each other:

    "validations": {
        "CloseStops": [
            {
                "severity": "Information",
                "issue_type": "CloseStops",
                "object_id": "PH00320P",
                "object_type": "Stop",
                "object_name": "Baril Les Hauts",
                "related_objects": [
                    {
                        "id": "PH00320C",
                        "object_type": "Stop",
                        "name": "Baril Les Hauts"
                    },
                    {
                        "id": "MAGM",
                        "object_type": "Route",
                        "name": "MAGM-MagmaBus Navette Centre Ville St Philippe"
                    }
                ],
                "details": "distance between the stops is 0 meter(s)",
                "geojson": {
                    "features": [
                        {
                            "geometry": {
                                "coordinates": [
                                    55.71866572404381,
                                    -21.356751531407003
                                ],
                                "type": "Point"
                            },
                            "properties": {
                                "id": "PH00320P",
                                "name": "Baril Les Hauts"
                            },
                            "type": "Feature"
                        },
                        {
                            "geometry": {
                                "coordinates": [
                                    55.71866572404381,
                                    -21.356751531407003
                                ],
                                "type": "Point"
                            },
                            "properties": {
                                "id": "PH00320C",
                                "name": "Baril Les Hauts"
                            },
                            "type": "Feature"
                        },
                        {
                            "geometry": null,
                            "properties": {
                                "details": "distance between the stops is 0 meter(s)"
                            },
                            "type": "Feature"
                        }
                    ],
                    "type": "FeatureCollection"
                }
            }
        ]
    }
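When plotting issues on a map, only the features that have a geometry are useful. The sketch below walks the geojson object of a single issue, shaped like the examples above, and keeps the Point features. The `located_features` helper is a hypothetical name, and the sample is a trimmed copy of the CloseStops example.

```python
def located_features(issue: dict) -> list:
    """Return (id, longitude, latitude) for each feature that has a Point geometry."""
    points = []
    geojson = issue.get("geojson") or {}
    for feature in geojson.get("features", []):
        geometry = feature.get("geometry")
        if not geometry or geometry.get("type") != "Point":
            continue  # e.g. the feature that only carries "details"
        lon, lat = geometry["coordinates"][:2]  # GeoJSON order is [longitude, latitude]
        properties = feature.get("properties") or {}
        points.append((properties.get("id"), lon, lat))
    return points

# Trimmed copy of the CloseStops example above.
issue = {
    "issue_type": "CloseStops",
    "geojson": {
        "type": "FeatureCollection",
        "features": [
            {"type": "Feature",
             "geometry": {"type": "Point", "coordinates": [55.71866572404381, -21.356751531407003]},
             "properties": {"id": "PH00320P", "name": "Baril Les Hauts"}},
            {"type": "Feature", "geometry": None,
             "properties": {"details": "distance between the stops is 0 meter(s)"}},
        ],
    },
}
print(located_features(issue))
```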

Installation

  1. This project is written in Rust. You need to first install Rust on your machine.

  2. Clone the project:

git clone https://github.com/etalab/transport-validator/
cd transport-validator

Run the validator

Run from a local directory

The release version can be run as:

cargo run --release -- --input test_data/unused_stop

It is the fastest and recommended version to use when validating files.

If you are developing the validator, you can also run a debug build:

cargo run -- --input test_data/unused_stop

The validator can also read a zip file or a URL:

cargo run --release -- -i some_gtfs.zip
cargo run --release -- -i https://example.com/network.gfts

If you do not intend to run the validator as a dæmon, it can be compiled without dæmon support, saving on compile time and binary size:

cargo run --release --no-default-features -- -i some_gtfs.zip

Run as a dæmon

The validator can run as an HTTP dæmon to validate any file from a URL.

For now the call is synchronous. Be aware that if the file is large, the request might time out because of the time required to download the GTFS zip.

The command to launch the dæmon is:

cargo run --release

You can then ask for a validation:

curl "http://localhost:7878/validate?url=https://example.com/gtfs.zip"
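The same request can be issued from a script. The sketch below builds the request URL with the GTFS address percent-encoded and sets a generous client-side timeout, since the call is synchronous. It assumes the dæmon listens on localhost:7878 as in the curl example and decodes percent-encoded query parameters. The function names and the 300-second default are arbitrary choices made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def validation_url(gtfs_url: str, base: str = "http://localhost:7878") -> str:
    """Build the dæmon request URL, percent-encoding the GTFS address."""
    return f"{base}/validate?{urlencode({'url': gtfs_url})}"

def validate(gtfs_url: str, timeout: float = 300.0) -> dict:
    """Call the dæmon and return the parsed JSON report.

    The timeout is in seconds and is an assumption; large feeds may need more.
    """
    with urlopen(validation_url(gtfs_url), timeout=timeout) as response:
        return json.load(response)

# Only the URL is built here; validate() needs a running dæmon.
print(validation_url("https://example.com/gtfs.zip"))
```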

Options

Custom rules

Some values used during the validations can be customized by using the --custom-rules option and providing a path to a YAML file.

Available customizations are:

| Field | Description |
| --- | --- |
| max_tramway_speed | Maximum speed in km/h on a route_type tramway before triggering an ExcessiveSpeed issue |
| max_subway_speed | Maximum speed in km/h on a route_type subway before triggering an ExcessiveSpeed issue |
| max_rail_speed | Maximum speed in km/h on a route_type rail before triggering an ExcessiveSpeed issue |
| max_bus_speed | Maximum speed in km/h on a route_type bus before triggering an ExcessiveSpeed issue |
| max_ferry_speed | Maximum speed in km/h on a route_type ferry before triggering an ExcessiveSpeed issue |
| max_cable_car_speed | Maximum speed in km/h on a route_type cable car before triggering an ExcessiveSpeed issue |
| max_gondola_speed | Maximum speed in km/h on a route_type gondola before triggering an ExcessiveSpeed issue |
| max_funicular_speed | Maximum speed in km/h on a route_type funicular before triggering an ExcessiveSpeed issue |
| max_coach_speed | Maximum speed in km/h on a route_type coach before triggering an ExcessiveSpeed issue |
| max_air_speed | Maximum speed in km/h on a route_type air before triggering an ExcessiveSpeed issue |
| max_taxi_speed | Maximum speed in km/h on a route_type taxi before triggering an ExcessiveSpeed issue |
| max_other_speed | Maximum speed in km/h on a route_type other before triggering an ExcessiveSpeed issue |

Example

max_bus_speed: 120
max_gondola_speed: 50

If you need to customize other values, please let us know.
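To give an intuition of what such a threshold means, here is a rough sketch of a speed test between two consecutive stops. It is not the validator's implementation: how distance and travel time are actually measured is not described here, and the haversine formula, the helper names and the sample numbers are all assumptions.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def exceeds_max_speed(distance_km, duration_s, max_speed_kmh):
    """True when the average speed over the segment is above the threshold."""
    if duration_s <= 0:
        return False  # null or negative durations are left to other checks
    return distance_km / (duration_s / 3600) > max_speed_kmh

# Same values as the YAML example above.
rules = {"max_bus_speed": 120, "max_gondola_speed": 50}

# Hypothetical segment: 5 km covered in 2 minutes, i.e. 150 km/h on average.
print(exceeds_max_speed(5.0, 120, rules["max_bus_speed"]))  # True
```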

Lint

To lint our code we use rustfmt.

Install it by running:

rustup component add rustfmt

Lint your code by running:

cargo fmt --all -- --check

Contact

Questions? Comments? Get in touch with the transport.data.gouv.fr's technical team: tech@transport.beta.gouv.fr

Alternatives