BDIT Data Sources <!-- omit in toc -->
This is the primary repository for code and documentation for most of the data sources the Data & Analytics Unit uses.
Each folder is for a different data source (or category of related data sources). They contain:
- an explanation of what the data source is,
- how it can be used, and
- the Python and SQL necessary for our Extract, Load, Transform, and Validate processes into our PostgreSQL database.
For those curious about which of the data we manage are released on Open Data, see Open Data Releases.
Table of Contents <!-- omit in toc -->
- Airflow DAGS
- Bluetooth Detectors
- Collisions
- Cycling App (inactive)
- Events (inactive)
- GIS - Geographic Data
- HERE Travel Time Data
- Incidents (inactive)
- INRIX (inactive)
- Parking (inactive)
- Road Closure (inactive)
- TTC (inactive)
- Volume Data
- Watch Your Speed signs
- Weather
- Open Data Releases
Airflow DAGS
This folder contains the DAG Python files for our Airflow orchestration that dictate the logic and schedule for data pipeline tasks.
Bluetooth Detectors
The City collects traffic data from strategically placed sensors at intersections and along highways. These detect Bluetooth MAC addresses of vehicles as they drive by, which are immediately anonymized. When a MAC address is detected at two sensors, the travel time between the two sensors is calculated.
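The matching step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the production pipeline: the salted SHA-256 hash, the tuple layout `(device_id, sensor_id, timestamp)`, and the function names `anonymize` and `travel_times` are all hypothetical.

```python
import hashlib
from datetime import datetime


def anonymize(mac: str, salt: str = "hypothetical-salt") -> str:
    """One-way hash of a MAC address (illustrative scheme, not the real one)."""
    return hashlib.sha256((salt + mac).encode()).hexdigest()


def travel_times(detections, start_sensor, end_sensor):
    """Return travel times in seconds for devices seen at start_sensor
    and later at end_sensor.

    detections: iterable of (device_id, sensor_id, datetime) tuples.
    """
    last_start = {}
    results = []
    for device, sensor, ts in sorted(detections, key=lambda d: d[2]):
        if sensor == start_sensor:
            # Keep the most recent detection at the start sensor.
            last_start[device] = ts
        elif sensor == end_sensor and device in last_start:
            results.append((ts - last_start.pop(device)).total_seconds())
    return results


detections = [
    (anonymize("AA:BB:CC:00:11:22"), "A", datetime(2024, 1, 1, 8, 0, 0)),
    (anonymize("AA:BB:CC:00:11:22"), "B", datetime(2024, 1, 1, 8, 3, 30)),
    (anonymize("DD:EE:FF:33:44:55"), "A", datetime(2024, 1, 1, 8, 1, 0)),
]
print(travel_times(detections, "A", "B"))  # [210.0]
```

Only the first device is seen at both sensors, so only one travel time is produced. A real implementation would also need to filter outliers such as vehicles that stop between sensors.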
Collisions
The collisions dataset consists of data on individuals involved in traffic collisions from approximately 1985 to the present day (though there are some historical collisions from even earlier included).
Cycling App (inactive)
The Cycling App collected origin-destination (OD) and trip data until 2016.
Events (inactive)
How do special events impact traffic in the city? Data sources include the City's Open Data and TicketMaster.
GIS - Geographic Data
Assets
The assets directory stores Airflow processes related to various assets that we help manage, such as datasets related to Vision Zero. Below are the assets that we have automated so far.
Red Light Cameras
Red Light Camera data are obtained from Open Data and are also indicators displayed on the Vision Zero Map and Dashboard. We have developed an Airflow process that automatically connects to Open Data and stores the data in our RDS Postgres database. See the README file in assets/rlc for details about this process.
Traffic Signals
A number of different features of traffic signals (Leading Pedestrian Intervals, Audible Pedestrian Signals, Pedestrian Crossovers, Traffic Signals) are periodically pulled from Open Data. These indicators are used to populate the Vision Zero Map and Dashboard. See the README file in assets/traffic_signals for details about the source datasets and how they are combined into a final table.
School Safety Zones
This dataset comes from Vision Zero, which uses Google Sheets to track progress on the implementation of safety improvements in school zones.
Street Centreline Geocoding
Contains SQL used to transform text descriptions of streets (in bylaws) into centreline geometries.
HERE Travel Time Data
Travel time data provided by HERE Technologies from a mix of vehicle probes. The data are daily extracts of 5-minute aggregated speed data for each link in the city (where data are available).
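Because the extracts contain speeds per link, a travel time has to be derived from each link's length and speed. The sketch below shows that arithmetic only. It assumes lengths in metres and speeds in km/h, and the function names are hypothetical; see the HERE folder for the actual aggregation logic.

```python
def link_travel_time_s(length_m: float, speed_kmh: float) -> float:
    """Travel time in seconds over one link (assumed units: metres, km/h)."""
    if speed_kmh <= 0:
        raise ValueError("speed must be positive")
    # Convert km/h to m/s by dividing by 3.6, then time = distance / speed.
    return length_m / (speed_kmh / 3.6)


def corridor_travel_time_s(links) -> float:
    """Sum link travel times for (length_m, speed_kmh) pairs in one 5-minute bin."""
    return sum(link_travel_time_s(length, speed) for length, speed in links)


# A 500 m link at 30 km/h plus a 250 m link at 45 km/h: roughly 60 s + 20 s.
print(corridor_travel_time_s([(500, 30), (250, 45)]))
```

Links with no data in a given bin would need separate handling, since the extracts only include links where data are available.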
Incidents (inactive)
See CityofToronto/bdit_incidents
INRIX (inactive)
Data collected from a variety of traffic probes from 2007 to 2016 for major streets and arterials.
Parking (inactive)
This contains R and SQL files for pulling parking lots and parking tickets from Open Data. They might be useful but haven't been documented or automated.
Road Closure (inactive)
This directory contains a Python file to pull and parse the XML feed of road closures. This process hasn't been automated (and more recent versions of the API use JSON).
TTC (inactive)
This contains some valiant attempts at transforming CIS vehicle location data provided to us by the TTC on streetcar locations as well as an automated process for pulling in GTFS schedule data.
Volume Data
Miovision - Multi-modal Permanent Video Counters
Miovision currently provides volume counts gathered by cameras installed at specific intersections; there are 32 intersections in total. Miovision processes the video footage and provides volume counts aggregated into 1-minute bins. The 1-minute turning movement count (TMC) data are available in miovision_api.volumes, the 15-minute TMC data in miovision_api.volumes_15min_tmc, and the 15-minute ATR data in miovision_api.volumes_15min.
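As a rough illustration of how 1-minute bins roll up into 15-minute bins, here is a minimal Python sketch. The row layout `(intersection_id, classification, datetime_bin, volume)` and the function names are assumptions made for the example and do not reflect the actual table schemas or the SQL used in the pipeline.

```python
from collections import defaultdict
from datetime import datetime


def floor_to_15min(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 15-minute bin."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def aggregate_15min(rows):
    """Sum 1-minute volumes into 15-minute bins.

    rows: iterable of (intersection_id, classification, datetime_bin, volume).
    Returns a dict keyed by (intersection_id, classification, bin_start).
    """
    totals = defaultdict(int)
    for intersection, classification, ts, volume in rows:
        totals[(intersection, classification, floor_to_15min(ts))] += volume
    return dict(totals)


rows = [
    (1, "Cars", datetime(2024, 1, 1, 8, 0), 5),
    (1, "Cars", datetime(2024, 1, 1, 8, 14), 3),
    (1, "Cars", datetime(2024, 1, 1, 8, 15), 2),
]
for key, volume in sorted(aggregate_15min(rows).items()):
    print(key, volume)
```

The first two rows fall in the 08:00 bin and the third in the 08:15 bin. This sketch does not deal with missing minutes, which a real aggregation would have to account for.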
RESCU - Loop Detectors (inactive)
Deprecated. See Vehicle Detector Station (VDS).
Short-term Counting Program
volumes/short_term_counting_program/
Short-term traffic counts are conducted on an ad-hoc basis as the need arises, and may be done throughout the year both at intersections and mid-block. Much of this dataset is also available through the internal application MOVE and data go as far back as 1994.
Vehicle Detector Station (VDS)
The city operates various permanent Vehicle Detector Stations (VDS), employing different technologies, including RESCU, intersection detectors, Blue City and Smartmicro. The one most frequently used in the D&A context is the RESCU network, which tracks traffic volumes on Toronto expressways; more information can be found on the city's website or here.
Watch Your Speed signs
The city has installed Watch Your Speed signs that display the speed a vehicle is travelling at and flash if the vehicle is travelling over the speed limit. The signs were installed as part of two programs: the mobile Watch Your Speed program, which has signs mounted on existing poles and moved every few weeks, and the school Watch Your Speed program, which has signs installed at high-priority schools. The signs also collect continuous speed data.
Weather
Daily historical weather conditions and predictions from Environment Canada.
Open Data Releases
- Travel Times - Bluetooth contains data for all the Bluetooth segments collected by the city. The travel times are 5-minute average travel times. The real-time feed is currently not operational. See the Bluetooth README for more info.
- Watch Your Speed Signs give feedback to drivers to encourage them to slow down; they also record the speed of vehicles passing the sign. Semi-aggregated and monthly summary data are available for the two programs (Stationary School Safety Zone signs and Mobile Signs) and are updated monthly. See the WYS README for links to these datasets.
For the King St. Transit Pilot, the team has released the following datasets, which are typically a subset of larger datasets specific to the pilot:
- King St. Transit Pilot - Detailed Bluetooth Travel Time contains travel times collected during the King Street Pilot in the same format as the above data set. Data is collected on segments found in the King St. Transit Pilot – Bluetooth Travel Time Segments map layer. See the Bluetooth README for more info.
- King St. Transit Pilot – Bluetooth Travel Time Summary contains monthly averages of corridor-level travel times by time periods. See the Bluetooth README for more info.
- King St. Transit Pilot - 2015 King Street Traffic Counts contains 15-minute aggregated ATR data collected during 2015 at various locations on King Street. See the Volumes Open Data King Street Pilot section for more info.
- King St. Transit Pilot – Detailed Traffic & Pedestrian Volumes contains 15-minute aggregated TMC data collected from Miovision cameras during the King Street Pilot. The counts occurred at 31-32 locations in or around the King Street Pilot Area. See the Miovision Open Data section for more info.
- King St. Transit Pilot - Traffic & Pedestrian Volumes Summary is a monthly summary of the above data, only including peak period and east-west data. The data in this dataset goes into the King Street Pilot Dashboard. See the Miovision Open Data section for more info.