Strategic Transport Planning Dataset

A graph-based strategic transport planning dataset, aimed at developing the next generation of deep graph neural networks in transfer learning settings. It is based on simulation results of the Four-Step Model in PTV Visum.

Details of the work as well as results can be found in the thesis "Development of a Deep Learning Surrogate for the Four-Step Transportation Model".

<img src="https://raw.githubusercontent.com/nikita68/TransportPlanningDataset/main/1_classification_simple/q1_output_example_11_prediction.PNG" alt="drawing" width="650"/>

Example of prediction of a city's congestion level, using only socioeconomic and network information as input

State of the Art

| Dataset | Model | Metric | Test Performance | Publication |
| --- | --- | --- | --- | --- |
| Classification Simple - Private Transport | GCNII | Average F1 | 0.87 | Makarov, 2021 |
| Classification Simple - Public Transport | GCNII | Average F1 | 0.78 | Makarov, 2021 |
| Classification Hard - Private Transport | GCNII | MAE >=10 | 118.5 | Makarov, 2021 |
| Classification Hard - Public Transport | GCNII | MAE >=10 | 131.8 | Makarov, 2021 |
| Regression - Private Transport | GCNII | MAE >=10 | 135.4* | Makarov, 2021 |
| Regression - Public Transport | GCNII | MAE >=10 | 141.2* | Makarov, 2021 |

If you achieve better performance, please open a pull request with the new values!

* The regression model used only the augmented dataset as training data.

"MAE >=10" is the mean absolute error computed only over test samples whose target is at least 10 units/h. Lower values are excluded because they can be predicted accurately and are less interesting for the domain. An implementation can be found in the examples, and details are given in the thesis above.
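The metric can be sketched in plain Python as follows. This is only an illustration of the definition given here, not the implementation from the repository's examples. The function name `mae_at_least`, the `threshold` parameter, and the sample values are hypothetical.

```python
def mae_at_least(targets, predictions, threshold=10.0):
    """Mean absolute error over samples whose target is >= threshold.

    Hypothetical helper illustrating the "MAE >=10" definition.
    """
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    # Keep only the pairs whose true value reaches the threshold.
    errors = [abs(t - p) for t, p in zip(targets, predictions) if t >= threshold]
    if not errors:
        return float("nan")  # no sample qualifies, so the metric is undefined
    return sum(errors) / len(errors)


# Made-up values: the first sample (target 2.0) is below the threshold
# and therefore does not contribute to the metric.
targets = [2.0, 10.0, 50.0, 200.0]
predictions = [9.0, 14.0, 40.0, 230.0]
result = mae_at_least(targets, predictions)
print(result)
```

Filtering on the target rather than on the prediction means that the set of evaluated samples is the same for every model, which keeps the scores comparable.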

Dataset Overview

| Dataset | Example | Training Samples | Validation Samples | Test Samples | Task | Primary Metric | Graph Sizes | Private Transport | Public Transport | Input & Output Transformed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1_classification_simple | Example | 6398 | 1600* | 2000* | Classification - 3 classes | Average F1 | 15 - 80 original nodes | Yes | Yes | Yes |
| 2_classification_hard | Example | 16393 | 1600* | 2000* | Classification - 51 classes | MAE >=10 | 15 - 80 original nodes | Yes | Yes | Yes |
| 3_regression | Example | 16393 | 1600* | 2000* | Regression | MAE >=10 | 15 - 80 original nodes | Yes | Yes | Yes |

* Identical validation and test datasets

All of the data is pickled and compressed as pbz2, and the datasets are ready to use in PyTorch Geometric. Please see the examples for how to use the data.
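As a rough sketch, and assuming that the pbz2 files are ordinary pickles compressed with bz2, they can be read with the Python standard library alone. The helper name `load_pbz2` and the file name used below are hypothetical; the repository's examples remain the authoritative reference for actual file names and loading code. If the stored objects are PyTorch Geometric data objects, `torch` and `torch_geometric` must be installed for unpickling to succeed.

```python
import bz2
import os
import pickle
import tempfile


def load_pbz2(path):
    """Load a bz2-compressed pickle file and return the stored object.

    Hypothetical helper; assumes the file is a pickle written through bz2.
    """
    # bz2.open decompresses transparently while pickle reads the stream.
    with bz2.open(path, "rb") as handle:
        return pickle.load(handle)


# Round-trip demonstration with a stand-in object, since the real dataset
# file names are not known here.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.pbz2")
with bz2.open(demo_path, "wb") as handle:
    pickle.dump({"nodes": 15, "zones": 3}, handle)

loaded = load_pbz2(demo_path)
print(loaded)
```

Because unpickling can execute arbitrary code, only load files obtained from a source you trust.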

Problem Setting

The aim of strategic transport planning is to make long-term predictions for a given city, based purely on the underlying transport network and socioeconomic data. The socioeconomic data of every household is grouped into so-called zones. To solve the problem, the model needs to understand both how socioeconomic data creates demand and how that demand is applied to the supply side of the network. In practice, this is usually solved with the Four-Step Model, which has a number of downsides, including slow prediction speed and the large amount of manual work required for calibration.

This problem is well suited to deep graph neural networks. However, there is no large public dataset available, so we propose generating data in a surrogate model setup, using the Four-Step Model as the baseline. The task is a transfer learning problem for GNNs. Within this project, the aim is to create a proof of concept, with all generated cities having between 15 and 80 nodes and 3 to 10 zones. Details can be found in the thesis.

Data Generation

Augmented Dataset

Synthetic Dataset

The dataset 1_classification_simple uses exclusively the augmented dataset. Both 2_classification_hard and 3_regression use the augmented dataset for training, validation and test sets, but also add the synthetic dataset for additional training samples. All problems use identical validation and test datasets, with the targets transformed to the respective task.

Data Transformation

All details can be found in the thesis above.

Open Challenges for GNNs

If you need inspiration for what to focus on to improve GNNs, here are some open challenges with details in the thesis above:

Issues & Dataset Requests

If you find any issues with the data or need a specific version of it, please open an issue in the repository.

Citing

If you create any new work based on this dataset, please cite:

@article{Narayanan.2024,
    author = {Narayanan, Santhanakrishnan and Makarov, Nikita and Antoniou, Constantinos},
    title = {Graph neural networks as strategic transport modelling alternative - A proof of concept for a surrogate},
    year = 2024,
    journal = {IET Intelligent Transport Systems},
    volume = {Under revision},
}

@article{Makarov.2024,
    author = {Makarov, Nikita and Narayanan, Santhanakrishnan and Antoniou, Constantinos},
    title = {Graph neural network surrogate for strategic transport planning},
    year = 2024,
    journal = {Under review}
}

@misc{makarov2021dataset,
    author = {Makarov, Nikita and Narayanan, Santhanakrishnan and Antoniou, Constantinos},
    institution = {Transportation Systems Engineering},
    school = {Technical University of Munich},
    title = {Development of a Deep Learning Surrogate for the Four-Step Transportation Model},
    year = 2021,
    url = {https://github.com/nikita68/TransportPlanningDataset}
}


@mastersthesis{makarov2021,
    author = {Makarov, Nikita},
    institution = {Transportation Systems Engineering},
    school = {Technical University of Munich},
    title = {Development of a Deep Learning Surrogate for the Four-Step Transportation Model},
    year = 2021,
    url = {https://mediatum.ub.tum.de/doc/1638691/dwz10x0l0w38xdklv9zkrprqs.pdf}
}

License & Acknowledgements