THU<sup>MV-EACT</sup>-50: A Large-Scale Multi-View Event-Based Action Recognition Benchmark

News Update: The dataset is now available! You can download it from THU<sup>MV-EACT</sup>-50.

Introduced in the paper "Hypergraph-Based Multi-View Action Recognition Using Event Cameras" (TPAMI 2024), THU<sup>MV-EACT</sup>-50 is a pioneering large-scale multi-view dataset for event-based action recognition. It aims to fill the gap left by existing datasets, which are often limited in action categories and data scale and lack the complexity needed for practical applications. The dataset extends the single-view THU<sup>E-ACT</sup>-50 into a multi-view benchmark, enhancing its applicability to real-world scenarios.

<img src="figures/dataset-v2.png" alt="Sample-sequences" style="zoom: 33%;" />

Dataset Overview

THU<sup>MV-EACT</sup>-50 stands out as the first multi-view dataset specifically designed for the event-based action recognition task. It incorporates the same 50 action categories as its predecessor, THU<sup>E-ACT</sup>-50, but extends the content to include 31,500 video recordings from 6 distinct viewpoints, offering a resolution of 1280x800.

The dataset was collected in an indoor venue of approximately 100 m² using CeleX-V event cameras arranged to capture 6 viewpoints (4 frontal and 2 backward) of the action performer. This setup ensures comprehensive coverage of each action from multiple angles, enhancing the dataset's utility for multi-view action recognition tasks.

<img src="figures/dataset-v2-env.png" alt="Sample-sequences" style="zoom: 80%;" />

Dataset Highlights

- The first multi-view benchmark dedicated to event-based action recognition
- 50 action categories, identical to the single-view THU<sup>E-ACT</sup>-50
- 31,500 recordings captured from 6 viewpoints (4 frontal and 2 backward) at a resolution of 1280x800
- Supports both cross-subject and cross-view evaluation settings

List of Actions

| ID | Action | ID | Action | ID | Action | ID | Action | ID | Action |
|----|--------|----|--------|----|--------|----|--------|----|--------|
| A0 | Walking | A10 | Cross arms | A20 | Calling with phone | A30 | Fan | A40 | Check time |
| A1 | Running | A11 | Salute | A21 | Reading | A31 | Open umbrella | A41 | Drink water |
| A2 | Jump up | A12 | Squat down | A22 | Tai chi | A32 | Close umbrella | A42 | Wipe face |
| A3 | Running in circles | A13 | Sit down | A23 | Swing objects | A33 | Put on glasses | A43 | Long jump |
| A4 | Falling down | A14 | Stand up | A24 | Throw | A34 | Take off glasses | A44 | Push up |
| A5 | Waving one hand | A15 | Sit and stand | A25 | Staggering | A35 | Pick up | A45 | Sit up |
| A6 | Waving two hands | A16 | Knead face | A26 | Headache | A36 | Put on bag | A46 | Shake hands (two-players) |
| A7 | Clap | A17 | Nod head | A27 | Stomachache | A37 | Take off bag | A47 | Fighting (two-players) |
| A8 | Rub hands | A18 | Shake head | A28 | Back pain | A38 | Put object into bag | A48 | Handing objects (two-players) |
| A9 | Punch | A19 | Thumb up | A29 | Vomit | A39 | Take object out of bag | A49 | Lifting chairs (two-players) |

Evaluation Criteria

The dataset employs Top-1, Top-3, and Top-5 accuracy metrics for evaluating the performance of event-based action recognition methods. It supports both cross-subject and cross-view experimental settings to assess the generalization ability of proposed methods across different subjects and unseen viewpoints.
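
For reference, below is a minimal sketch of how Top-k accuracy can be computed from per-class model scores; the function and variable names are illustrative and not part of the dataset tooling.

```python
import numpy as np

def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label appears among the k highest-scoring classes."""
    # Indices of the k best-scoring classes per sample, shape (N, k)
    top_k = np.argsort(scores, axis=1)[:, -k:]
    # Check whether the ground-truth label falls inside each sample's top-k set
    hits = (top_k == labels[:, None]).any(axis=1)
    return hits.mean()

# Toy example with 50 action classes, as in THU MV-EACT-50
scores = np.random.rand(100, 50)               # (num_samples, num_classes) model scores
labels = np.random.randint(0, 50, size=100)    # ground-truth action indices
for k in (1, 3, 5):
    print(f"Top-{k} accuracy: {top_k_accuracy(scores, labels, k):.3f}")
```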

Dataset Download

The dataset is now available at THU<sup>MV-EACT</sup>-50.

Note: After decompression, the dataset will require about 1.1TB of storage space.

Dataset Format

The event data is provided in .csv format; each recording is stored as a table of events with five columns.
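
As an illustration, a recording could be loaded with pandas as sketched below. The file name and column names are hypothetical placeholders (only the five-column layout is stated above); the actual column order and meaning should be taken from the released files.

```python
import pandas as pd

# Minimal sketch, assuming a headerless .csv with five columns per event.
# The column names below are hypothetical placeholders; check the released
# files (or dataset.py) for the actual column order and meaning.
columns = ["c0", "c1", "c2", "c3", "c4"]
events = pd.read_csv("example_recording.csv", header=None, names=columns)  # hypothetical path

print(events.shape)   # (num_events, 5)
print(events.head())
```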

Cross-Subject

In the Cross-Subject setting, the dataset is divided in a way that ensures the subjects in the training, validation, and test sets are mutually exclusive.

The data for each split is provided in dedicated pickle files, named according to the split they contain; a loading sketch is shown below.
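
For illustration, the split files can be opened with Python's built-in pickle module as sketched here; the file name is a hypothetical placeholder, and the structure of the loaded object should be checked against the preprocessing code in dataset.py.

```python
import pickle

# Minimal sketch: load one split file for the Cross-Subject setting.
# "cross_subject_train.pkl" is a hypothetical file name; use the actual
# names shipped with the dataset.
with open("cross_subject_train.pkl", "rb") as f:
    train_split = pickle.load(f)

# Inspect the loaded object (e.g., sample identifiers and labels)
print(type(train_split))
```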

Cross-View

The Cross-View setting addresses a different aspect of generalizability: the model's ability to recognize actions from unseen viewpoints. In this setting, the dataset is divided by camera viewpoint, so that the viewpoints used for evaluation do not appear during training.

For the Cross-View experimental setup, the corresponding splits are likewise provided as pickle files.

The preprocessing operations for both settings can be found in dataset.py.

Acknowledgements

We would like to express our sincere gratitude to Tsinghua University, partner companies, and organizations for their invaluable support and collaboration in making this dataset possible. Additionally, we extend our thanks to all the volunteers who participated in the data collection process. Their contributions have been instrumental in the development and evaluation of this benchmark.

License

This dataset is licensed under the MIT License.

Citing Our Work

If you find this dataset beneficial for your research, please cite our works:

@article{gao2024hypergraph,
  title={Hypergraph-Based Multi-View Action Recognition Using Event Cameras},
  author={Gao, Yue and Lu, Jiaxuan and Li, Siqi and Li, Yipeng and Du, Shaoyi},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2024},
  volume={46},
  number={10},
  pages={6610-6622},
  publisher={IEEE}
}

@article{gao2023action,
  title={Action Recognition and Benchmark Using Event Cameras},
  author={Gao, Yue and Lu, Jiaxuan and Li, Siqi and Ma, Nan and Du, Shaoyi and Li, Yipeng and Dai, Qionghai},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2023},
  volume={45},
  number={12},
  pages={14081-14097},
  publisher={IEEE}
}