<!-- markdownlint-disable-next-line -->

<img src="docs/tri-logo.png" width="40%">

Dataset Governance Policy (DGP)

To ensure traceability, reproducibility, and standardization of all ML datasets and models generated and consumed within Toyota Research Institute (TRI), we developed the Dataset Governance Policy (DGP), which codifies the schema and maintenance of all of TRI's Autonomous Vehicle (AV) datasets.

<p align="center"> <img src="docs/3d-viz-proj.gif" alt="3d-viz-proj"/> </p>

Components

Getting Started

Please see Getting Started for environment setup.

Getting started is as simple as initializing a dataset class with the relevant dataset JSON, raw data sensor names, annotation types, and split information. Below, we show an example of initializing a PyTorch dataset for multi-modal learning from 2D and 3D bounding boxes.

from dgp.datasets import SynchronizedSceneDataset

# Load synchronized pairs of camera and lidar frames, with 2d and 3d
# bounding box annotations.
dataset = SynchronizedSceneDataset('<dataset_name>_v0.0.json',
    datum_names=('camera_01', 'lidar'),
    requested_annotations=('bounding_box_2d', 'bounding_box_3d'),
    split='train')
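
The same constructor arguments can be reused to build one dataset object per split. The following is a minimal sketch, not part of DGP: `build_split_datasets` and `_StandInDataset` are hypothetical names introduced here for illustration, and the `'val'` split name is an assumption (only `'train'` appears in the example above). The stand-in class only records its arguments so that the sketch runs without DGP installed.

```python
from typing import Any, Callable, Dict, Iterable


def build_split_datasets(
    dataset_cls: Callable[..., Any],
    dataset_json: str,
    splits: Iterable[str] = ("train", "val"),  # 'val' is an assumed split name
    **dataset_kwargs: Any,
) -> Dict[str, Any]:
    """Instantiate one dataset object per split, sharing all other arguments."""
    return {
        split: dataset_cls(dataset_json, split=split, **dataset_kwargs)
        for split in splits
    }


class _StandInDataset:
    """Hypothetical stand-in (not a DGP class) that only records its arguments."""

    def __init__(self, dataset_json, datum_names=(),
                 requested_annotations=(), split="train"):
        self.dataset_json = dataset_json
        self.datum_names = tuple(datum_names)
        self.requested_annotations = tuple(requested_annotations)
        self.split = split


# Mirror the arguments used in the example above, once per split.
datasets = build_split_datasets(
    _StandInDataset,
    "<dataset_name>_v0.0.json",
    datum_names=("camera_01", "lidar"),
    requested_annotations=("bounding_box_2d", "bounding_box_3d"),
)
```

With DGP installed, passing `SynchronizedSceneDataset` in place of the stand-in should work as long as the split names match those defined in your dataset JSON.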

Examples

A set of starter scripts is provided in the examples directory.

Build and run tests

You can build the base Docker image and run the tests within a Docker container via:

make docker-build
make docker-run-tests

Contributing

We appreciate all contributions to DGP! To learn more about making a contribution, please see the Contribution Guidelines.

CI Ecosystem

| Job | CI | Notes |
| --- | --- | --- |
| docker-build | Build Status | Docker build and push to container registry |
| pre-merge | Build Status | Pre-merge testing |
| doc-gen | Build Status | GitHub Pages doc generation |
| coverage | Build Status | Code coverage metrics and badge generation |

💬 Where to file bug reports

| Type | Platforms |
| --- | --- |
| 🚨 Bug Reports | GitHub Issue Tracker |
| 🎁 Feature Requests | GitHub Issue Tracker |

👩‍💻 The Team 👨‍💻

DGP is developed and currently maintained by Quincy Chen, Arjun Bhargava, Chao Fang, Chris Ochoa, and Kuan-Hui Lee from the ML-Engineering team at Toyota Research Institute (TRI), with contributions from the ML-Research team at TRI, Woven Planet, and Parallel Domain.