<div align="center">

Temporal Logic Video (TLV) Dataset


</div> <!-- PROJECT LOGO --> <br /> <div align="center"> <a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset"> <img src="images/logo.png" alt="Logo" width="240" height="240"> </a> <h3 align="center">Temporal Logic Video (TLV) Dataset</h3> <p align="center"> Synthetic and real video dataset with temporal logic annotation <br /> <a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset"><strong>Explore the docs »</strong></a> <br /> <br /> <a href="https://anoymousu1.github.io/nsvs-anonymous.github.io/">NSVS-TL Project Webpage</a> · <a href="https://github.com/UTAustin-SwarmLab/Neuro-Symbolic-Video-Search-Temploral-Logic">NSVS-TL Source Code</a> </p> </div>

Overview

The Temporal Logic Video (TLV) Dataset addresses the scarcity of state-of-the-art video datasets for long-horizon, temporally extended activity and object detection. It comprises two main components:

  1. Synthetic datasets: Generated by concatenating static images from established computer vision datasets (COCO and ImageNet), allowing for the introduction of a wide range of Temporal Logic (TL) specifications.
  2. Real-world datasets: Based on open-source autonomous vehicle (AV) driving datasets, specifically NuScenes and Waymo.
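
The ground-truth TL specifications in both components follow a small set of templates. In standard LTL notation, with prop1, prop2, and prop3 denoting atomic propositions (e.g., the presence of a labeled object class in a frame), these templates read:

- `F prop1` ("Eventually Event A"): prop1 holds at some frame.
- `G prop1` ("Always Event A"): prop1 holds at every frame.
- `prop1 & prop2` ("Event A And Event B"): both propositions hold.
- `prop1 U prop2` ("Event A Until Event B"): prop1 holds until prop2 holds.
- `(prop1 & prop2) U prop3` ("(Event A And Event B) Until Event C"): the conjunction holds until prop3 holds.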

Table of Contents

Dataset

Dataset Structure

Dataset Statistics

Installation

Usage

Data Generation

License

Citation

Dataset

<div align="center"> <a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset"> <img src="images/teaser.png" alt="Logo" width="840" height="440"> </a> </div>

Although we provide source code to generate datasets from multiple types of data sources, we release dataset v1 as a proof of concept.

Dataset Structure

We provide the v1 dataset as a proof of concept. The data is offered as serialized objects, each containing a set of frames with annotations. You can download the dataset from our dataset repository on Hugging Face.

File Naming Convention

`<tlv_data_type>:source:<datasource>-number_of_frames:<number_of_frames>-<uuid>.pkl`
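
The snippet below shows one way to parse this convention; the concrete field values in the example filename are illustrative assumptions, not names taken from the released data.

```python
import re

# Hypothetical filename following the convention above; the field values
# (data type, source, frame count, uuid) are illustrative only.
fname = "tlv_synthetic_dataset:source:coco-number_of_frames:100-1a2b3c4d.pkl"

pattern = (
    r"(?P<tlv_data_type>[^:]+):source:(?P<datasource>[^-]+)"
    r"-number_of_frames:(?P<number_of_frames>\d+)-(?P<uuid>.+)\.pkl$"
)
fields = re.match(pattern, fname).groupdict()
print(fields)
# {'tlv_data_type': 'tlv_synthetic_dataset', 'datasource': 'coco',
#  'number_of_frames': '100', 'uuid': '1a2b3c4d'}
```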

Object Attributes

Each serialized object contains a set of frames with their annotations; the inspection sketch after the directory tree below shows how to list a loaded object's attributes. You can download the dataset from the Hugging Face repository linked above. The dataset is organized as follows:

tlv-dataset-v1/
├── tlv_real_dataset/
│   ├── prop1Uprop2/
│   └── (prop1&prop2)Uprop3/
└── tlv_synthetic_dataset/
    ├── Fprop1/
    ├── Gprop1/
    ├── prop1&prop2/
    ├── prop1Uprop2/
    └── (prop1&prop2)Uprop3/
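
Since the per-object attribute names are not enumerated here, the following is a minimal sketch for loading one serialized object and listing whatever attributes it carries. The file path is a placeholder, and the code assumes standard pickle serialization.

```python
import pickle

# Placeholder path; substitute any .pkl file downloaded from the
# Hugging Face dataset repository.
path = "tlv-dataset-v1/tlv_synthetic_dataset/prop1Uprop2/example.pkl"

with open(path, "rb") as f:
    obj = pickle.load(f)

# List the object's attributes without assuming their names.
attrs = obj if isinstance(obj, dict) else vars(obj)
for name, value in attrs.items():
    print(name, type(value))
```

Note that if the objects were pickled as instances of a class defined in this repository, the package must be installed (see Installation) so that `pickle.load` can resolve that class.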

Dataset Statistics

  1. Total Number of Frames

| Ground Truth TL Specification | Synthetic TLV (COCO) | Synthetic TLV (ImageNet) | Real TLV (Waymo) | Real TLV (NuScenes) |
|---|---|---|---|---|
| Eventually Event A | - | 15,750 | - | - |
| Always Event A | - | 15,750 | - | - |
| Event A And Event B | 31,500 | - | - | - |
| Event A Until Event B | 15,750 | 15,750 | 8,736 | 19,808 |
| (Event A And Event B) Until Event C | 5,789 | - | 7,459 | 7,459 |

  2. Total Number of Datasets

| Ground Truth TL Specification | Synthetic TLV (COCO) | Synthetic TLV (ImageNet) | Real TLV (Waymo) | Real TLV (NuScenes) |
|---|---|---|---|---|
| Eventually Event A | - | 60 | - | - |
| Always Event A | - | 60 | - | - |
| Event A And Event B | 120 | - | - | - |
| Event A Until Event B | 60 | 60 | 45 | 494 |
| (Event A And Event B) Until Event C | 97 | - | 30 | 186 |

Installation

python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip build
python -m pip install --editable ".[dev,test]"

Prerequisites

Download the source datasets below and arrange them in the following directory layouts before running the data generators:

  1. ImageNet (ILSVRC 2017):

    ILSVRC/
    ├── Annotations/
    ├── Data/
    ├── ImageSets/
    └── LOC_synset_mapping.txt
    
  2. COCO (2017):

    COCO/
    └── 2017/
        ├── annotations/
        ├── train2017/
        └── val2017/
    

Usage

This section outlines how to load the dataset and run the synthetic data generators.

Data Loader Configuration

Synthetic Data Generator Configuration

Data Generation

COCO Synthetic Data Generation

python3 run_scripts/run_synthetic_tlv_coco.py --data_root_dir "../COCO/2017" --save_dir "<output_dir>"

ImageNet Synthetic Data Generation

python3 run_synthetic_tlv_imagenet.py --data_root_dir "../ILSVRC" --save_dir "<output_dir>"

Note: the ImageNet generator does not support LTL formulae containing the '&' operator.
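
Conceptually, both generators assemble a synthetic video by concatenating still images so that the resulting frame sequence satisfies the requested TL specification (see Overview). The sketch below illustrates that idea for a `prop1 U prop2` specification; it is a simplified illustration under assumed inputs, not the repository's actual generator API, and the `pool` format and helper function are hypothetical.

```python
import random

def sample_images_with(label, pool, k):
    """Hypothetical helper: choose k images whose annotations include `label`."""
    candidates = [img for img in pool if label in img["labels"]]
    return random.sample(candidates, k)

def build_until_sequence(pool, prop1, prop2, n_pre, n_post):
    """Concatenate stills so the frame sequence satisfies `prop1 U prop2`:
    the first n_pre frames contain prop1 (and not prop2), after which
    prop2 holds for the remaining n_post frames."""
    pre_pool = [img for img in pool if prop2 not in img["labels"]]
    pre = sample_images_with(prop1, pre_pool, n_pre)
    post = sample_images_with(prop2, pool, n_post)
    return pre + post

# Example usage with a toy pool of annotated images:
# pool = [{"image": ..., "labels": {"car"}}, {"image": ..., "labels": {"person"}}, ...]
# frames = build_until_sequence(pool, "car", "person", n_pre=80, n_post=20)
```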

License

This project is licensed under the MIT License. See the LICENSE file for details.

Connect with Me

<p align="center"> <em>Feel free to connect with me through these professional channels:</em> <p align="center"> <a href="https://www.linkedin.com/in/mchoi07/" target="_blank"><img src="https://img.shields.io/badge/-LinkedIn-0077B5?style=flat-square&logo=Linkedin&logoColor=white" alt="LinkedIn"/></a> <a href="mailto:minkyu.choi@utexas.edu"><img src="https://img.shields.io/badge/-Email-D14836?style=flat-square&logo=Gmail&logoColor=white" alt="Email"/></a> <a href="https://scholar.google.com/citations?user=ai4daB8AAAAJ&hl" target="_blank"><img src="https://img.shields.io/badge/-Google%20Scholar-4285F4?style=flat-square&logo=google-scholar&logoColor=white" alt="Google Scholar"/></a> <a href="https://minkyuchoi-07.github.io" target="_blank"><img src="https://img.shields.io/badge/-Website-00C7B7?style=flat-square&logo=Internet-Explorer&logoColor=white" alt="Website"/></a> <a href="https://x.com/MinkyuChoi7" target="_blank"><img src="https://img.shields.io/badge/-Twitter-1DA1F2?style=flat-square&logo=Twitter&logoColor=white" alt="Twitter"/></a> </p>

Citation

If you find this repo useful, please cite our paper:

@inproceedings{Choi_2024_ECCV,
  author={Choi, Minkyu and Goel, Harsh and Omama, Mohammad and Yang, Yunhao and Shah, Sahil and Chinchali, Sandeep},
  title={Towards Neuro-Symbolic Video Understanding},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  month={September},
  year={2024}
}