Introduction
Vedatad is a single-stage temporal action detection toolbox based on PyTorch. It includes implementations of temporal action detection algorithms such as DaoTAD.
Features
- Modular Design
  We decompose the detector into four parts: data pipeline, model, postprocessing, and criterion. This decoupling makes it easy to convert a PyTorch model into a TensorRT engine and deploy it on NVIDIA devices such as Tesla V100, Jetson Nano, and Jetson AGX Xavier (see the sketch after this list).
- Support for several popular single-stage detectors
  The toolbox supports several single-stage detectors out of the box, e.g. DaoTAD.
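The snippet below is a minimal, illustrative sketch of how the four decoupled parts could fit together during training and inference; the function and dictionary key names are assumptions for illustration, not the actual vedatad API.

```python
# Illustrative sketch only: the names below are assumptions, not the vedatad API.
import torch

def train_step(pipeline, model, criterion, optimizer, raw_sample):
    data = pipeline(raw_sample)                # data pipeline: decode frames, augment, batch
    preds = model(data["imgs"])                # model: raw temporal predictions
    loss = criterion(preds, data["targets"])   # criterion: loss computation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def infer_step(pipeline, model, postprocess, raw_sample):
    data = pipeline(raw_sample)
    with torch.no_grad():
        preds = model(data["imgs"])            # the bare model is what gets converted to TensorRT
    return postprocess(preds)                  # postprocessing: decode segments, NMS, scoring
```

Keeping the criterion and postprocessing outside the model is what lets the model itself be exported cleanly to a TensorRT engine.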
License
This project is released under the Apache 2.0 license.
Installation
Requirements
- Linux
- Python 3.7+
- PyTorch 1.7.0 or higher
- CUDA 10.2 or higher
- ffmpeg
We have tested the following versions of OS and software:
- OS: Ubuntu 16.04.6 LTS
- CUDA: 10.2
- PyTorch: 1.8.0
- Python: 3.8.5
- ffmpeg: 4.3.11
Install vedatad
a. Create a conda virtual environment and activate it.
conda create -n vedatad python=3.8.5 -y
conda activate vedatad
b. Install PyTorch and torchvision following the official instructions, e.g.,
conda install pytorch torchvision -c pytorch
c. Clone the vedatad repository.
git clone https://github.com/Media-Smart/vedatad.git
cd vedatad
vedatad_root=${PWD}
d. Install vedatad.
pip install -r requirements/build.txt
pip install -v -e .
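As an optional sanity check, you can confirm from Python that the editable install is importable and resolves to this checkout:

```python
# Quick check that the editable install of vedatad is importable
# and points at the cloned repository.
import vedatad
print(vedatad.__file__)
```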
Data preparation
Please follow the data preparation instructions for the specific algorithm under configs/trainval; for example, see the details in configs/trainval/daotad.
Train
a. Config
Modify the configuration as needed in the config file, e.g. configs/trainval/daotad/daotad_i3d_r50_e700_thumos14_rgb.py (see the illustrative excerpt below).
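The excerpt below only illustrates the kind of settings you might adjust; the key names here are assumptions, so check the config file itself for the real ones.

```python
# Illustrative excerpt only -- the real keys and values are defined in
# configs/trainval/daotad/daotad_i3d_r50_e700_thumos14_rgb.py and may differ.
data_root = 'data/thumos14/'   # assumed location of the prepared dataset
num_classes = 20               # THUMOS14 temporal detection uses 20 action classes
```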
b. Train
tools/dist_trainval.sh configs/trainval/daotad/daotad_i3d_r50_e700_thumos14_rgb.py "0,1,2,3"
The second argument is the comma-separated list of GPU ids to train on; adjust it to match your machine.
Test
a. Config
Modify the configuration as needed in the config file, e.g. configs/trainval/daotad/daotad_i3d_r50_e700_thumos14_rgb.py.
b. Test
CUDA_VISIBLE_DEVICES=0 python tools/test.py configs/trainval/daotad/daotad_i3d_r50_e700_thumos14_rgb.py weight_path
Replace weight_path with the path to the trained checkpoint you want to evaluate.
Contact
This repository is currently maintained by Hongxiang Cai (@hxcai), Yichao Xiong (@mileistone), and Chenhao Wang (@C-H-Wong).
Credits
Much of the code is adapted from vedadet; thanks to Media-Smart.