<!-- PROJECT LOGO -->
<div align="center">
  <a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset">
    <img src="images/logo.png" alt="Logo" width="240" height="240">
  </a>
  <h3 align="center">Temporal Logic Video (TLV) Dataset</h3>
  <p align="center">
    Synthetic and real video dataset with temporal logic annotation
    <br />
    <a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset"><strong>Explore the docs »</strong></a>
    <br />
    <br />
    <a href="https://anoymousu1.github.io/nsvs-anonymous.github.io/">NSVS-TL Project Webpage</a>
    ·
    <a href="https://github.com/UTAustin-SwarmLab/Neuro-Symbolic-Video-Search-Temploral-Logic">NSVS-TL Source Code</a>
  </p>
</div>

Overview
The Temporal Logic Video (TLV) Dataset addresses the scarcity of state-of-the-art video datasets for long-horizon, temporally extended activity and object detection. It comprises two main components:
- Synthetic datasets: Generated by concatenating static images from established computer vision datasets (COCO and ImageNet), allowing for the introduction of a wide range of Temporal Logic (TL) specifications.
- Real-world datasets: Based on open-source autonomous vehicle (AV) driving datasets, specifically NuScenes and Waymo.
Table of Contents
- Dataset Composition
- Dataset
- Installation
- Usage
- Data Generation
- License
- Citation
Dataset Composition
Synthetic Datasets
- Sources: COCO and ImageNet
- Purpose: Introduce artificial Temporal Logic specifications
- Generation Method: Image stitching from static datasets
Real-world Datasets
- Sources: NuScenes and Waymo
- Purpose: Provide real-world autonomous vehicle scenarios
- Annotation: Temporal Logic specifications added to existing data
Dataset
<div align="center">
  <a href="https://github.com/UTAustin-SwarmLab/temporal-logic-video-dataset">
    <img src="images/teaser.png" alt="Teaser" width="840" height="440">
  </a>
</div>

Although we provide the source code to generate datasets from different types of data sources, we release dataset v1 as a proof of concept.
Dataset Structure
The data is provided as serialized objects, each containing a set of frames with annotations. You can download the dataset from our dataset repository on Hugging Face.
File Naming Convention
```
<tlv_data_type>:source:<datasource>-number_of_frames:<number_of_frames>-<uuid>.pkl
```
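As an illustrative sketch (this helper is not part of the released code), a filename that follows the convention above can be split back into its fields like so:

```python
import re
from pathlib import Path

# Hypothetical helper: splits a TLV filename following the convention above
# into its individual fields.
_TLV_NAME = re.compile(
    r"(?P<tlv_data_type>.+?):source:(?P<datasource>.+?)"
    r"-number_of_frames:(?P<number_of_frames>\d+)-(?P<uuid>.+)\.pkl"
)

def parse_tlv_filename(path: str) -> dict:
    """Return the fields encoded in a TLV dataset filename."""
    match = _TLV_NAME.fullmatch(Path(path).name)
    if match is None:
        raise ValueError(f"Filename does not follow the TLV convention: {path}")
    fields = match.groupdict()
    fields["number_of_frames"] = int(fields["number_of_frames"])
    return fields
```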
Object Attributes
Each serialized object contains the following attributes:
- `ground_truth`: Boolean indicating whether the dataset contains ground-truth labels
- `ltl_formula`: Temporal logic formula applied to the dataset
- `proposition`: Set of propositions used in `ltl_formula`
- `number_of_frame`: Total number of frames in the dataset
- `frames_of_interest`: Frames of interest, i.e., the frames that satisfy `ltl_formula`
- `labels_of_frames`: Labels for each frame
- `images_of_frames`: Image data for each frame
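A minimal inspection sketch, assuming the `.pkl` files were written with Python's `pickle` module and expose the attributes listed above (the exact object type inside each file is not specified in this README, so it is accessed defensively here):

```python
import pickle
from pathlib import Path

# Note: if the pickled object is a custom class, the package that defines it
# must be importable before unpickling.
dataset_dir = Path("tlv-dataset-v1/tlv_synthetic_dataset/Fprop1")

for pkl_path in sorted(dataset_dir.glob("*.pkl")):
    with pkl_path.open("rb") as f:
        record = pickle.load(f)

    # Works whether the record is a plain dict or an object with attributes.
    get = record.get if isinstance(record, dict) else (lambda k, d=None: getattr(record, k, d))

    print(pkl_path.name)
    print("  ltl_formula:       ", get("ltl_formula"))
    print("  proposition:       ", get("proposition"))
    print("  number_of_frame:   ", get("number_of_frame"))
    print("  frames_of_interest:", get("frames_of_interest"))
```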
You can download the dataset from the Hugging Face repository mentioned above. The structure of the dataset is as follows:
```
tlv-dataset-v1/
├── tlv_real_dataset/
│   ├── prop1Uprop2/
│   └── (prop1&prop2)Uprop3/
└── tlv_synthetic_dataset/
    ├── Fprop1/
    ├── Gprop1/
    ├── prop1&prop2/
    ├── prop1Uprop2/
    └── (prop1&prop2)Uprop3/
```
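The folder names above encode the ground-truth TL specification of each split using standard linear temporal logic (LTL) operators. As a quick reference (with prop1, prop2, prop3 standing for frame-level propositions, i.e., "Event A", "Event B", "Event C"):

```latex
% Standard LTL readings of the specification names used in this README.
F\,p_1                            % Fprop1  ("Eventually Event A"): p_1 holds in some frame
G\,p_1                            % Gprop1  ("Always Event A"): p_1 holds in every frame
p_1 \land p_2                     % prop1&prop2 ("Event A And Event B"): both hold in the same frame
p_1\,\mathsf{U}\,p_2              % prop1Uprop2 ("Event A Until Event B"): p_1 holds until a frame where p_2 holds
(p_1 \land p_2)\,\mathsf{U}\,p_3  % (prop1&prop2)Uprop3 ("(Event A And Event B) Until Event C")
```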
Dataset Statistics
- Total Number of Frames

| Ground Truth TL Specification | Synthetic TLV (COCO) | Synthetic TLV (ImageNet) | Real TLV (Waymo) | Real TLV (NuScenes) |
| --- | --- | --- | --- | --- |
| Eventually Event A | - | 15,750 | - | - |
| Always Event A | - | 15,750 | - | - |
| Event A And Event B | 31,500 | - | - | - |
| Event A Until Event B | 15,750 | 15,750 | 8,736 | 19,808 |
| (Event A And Event B) Until Event C | 5,789 | - | 7,459 | 7,459 |
- Total Number of Datasets

| Ground Truth TL Specification | Synthetic TLV (COCO) | Synthetic TLV (ImageNet) | Real TLV (Waymo) | Real TLV (NuScenes) |
| --- | --- | --- | --- | --- |
| Eventually Event A | - | 60 | - | - |
| Always Event A | - | 60 | - | - |
| Event A And Event B | 120 | - | - | - |
| Event A Until Event B | 60 | 60 | 45 | 494 |
| (Event A And Event B) Until Event C | 97 | - | 30 | 186 |
Installation
```bash
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip build
python -m pip install --editable ".[dev, test]"
```
Prerequisites
- ImageNet (ILSVRC 2017):

  ```
  ILSVRC/
  ├── Annotations/
  ├── Data/
  ├── ImageSets/
  └── LOC_synset_mapping.txt
  ```

- COCO (2017):

  ```
  COCO/
  └── 2017/
      ├── annotations/
      ├── train2017/
      └── val2017/
  ```
Usage
The following configuration options control data loading and synthetic data generation.
Data Loader Configuration
- `data_root_dir`: Root directory of the dataset
- `mapping_to`: Label mapping scheme (default: `"coco"`)
- `save_dir`: Output directory for processed data
Synthetic Data Generator Configuration
- `initial_number_of_frame`: Starting frame count per video
- `max_number_frame`: Maximum frame count per video
- `number_video_per_set_of_frame`: Number of videos to generate per frame-count setting
- `increase_rate`: Frame-count increment rate
- `ltl_logic`: Temporal logic specification (e.g., `"F prop1"`, `"G prop1"`)
- `save_images`: Boolean flag for saving individual frames
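For illustration only, these options could be collected as plain Python values as in the sketch below. The placeholder values and the exact argument names accepted by each run script are assumptions, so consult the scripts in `run_scripts/` for the authoritative interface.

```python
# Illustrative configuration values only; the actual argument names and defaults
# are defined by the generator scripts in run_scripts/.
data_loader_config = {
    "data_root_dir": "../COCO/2017",  # root directory of the source dataset
    "mapping_to": "coco",             # label mapping scheme
    "save_dir": "./tlv_output",       # output directory for processed data
}

synthetic_generator_config = {
    "initial_number_of_frame": 25,       # starting frame count per video
    "max_number_frame": 200,             # maximum frame count per video
    "number_video_per_set_of_frame": 5,  # videos generated per frame-count setting
    "increase_rate": 25,                 # frame-count increment between settings
    "ltl_logic": "F prop1",              # TL specification to encode
    "save_images": False,                # whether to also dump individual frames
}
```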
Data Generation
COCO Synthetic Data Generation
```bash
python3 run_scripts/run_synthetic_tlv_coco.py --data_root_dir "../COCO/2017" --save_dir "<output_dir>"
```
ImageNet Synthetic Data Generation
```bash
python3 run_synthetic_tlv_imagenet.py --data_root_dir "../ILSVRC" --save_dir "<output_dir>"
```
Note: The ImageNet generator does not support LTL formulae containing the '&' (conjunction) operator.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Connect with Me
<p align="center"><em>Feel free to connect with me through these professional channels:</em></p>
<p align="center">
  <a href="https://www.linkedin.com/in/mchoi07/" target="_blank"><img src="https://img.shields.io/badge/-LinkedIn-0077B5?style=flat-square&logo=Linkedin&logoColor=white" alt="LinkedIn"/></a>
  <a href="mailto:minkyu.choi@utexas.edu"><img src="https://img.shields.io/badge/-Email-D14836?style=flat-square&logo=Gmail&logoColor=white" alt="Email"/></a>
  <a href="https://scholar.google.com/citations?user=ai4daB8AAAAJ&hl" target="_blank"><img src="https://img.shields.io/badge/-Google%20Scholar-4285F4?style=flat-square&logo=google-scholar&logoColor=white" alt="Google Scholar"/></a>
  <a href="https://minkyuchoi-07.github.io" target="_blank"><img src="https://img.shields.io/badge/-Website-00C7B7?style=flat-square&logo=Internet-Explorer&logoColor=white" alt="Website"/></a>
  <a href="https://x.com/MinkyuChoi7" target="_blank"><img src="https://img.shields.io/badge/-Twitter-1DA1F2?style=flat-square&logo=Twitter&logoColor=white" alt="Twitter"/></a>
</p>

Citation
If you find this repo useful, please cite our paper:
```bibtex
@inproceedings{Choi_2024_ECCV,
  author    = {Choi, Minkyu and Goel, Harsh and Omama, Mohammad and Yang, Yunhao and Shah, Sahil and Chinchali, Sandeep},
  title     = {Towards Neuro-Symbolic Video Understanding},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  month     = {September},
  year      = {2024}
}
```