
MAR: Masked Autoencoders for Efficient Action Recognition

Zhiwu Qing, Shiwei Zhang, Ziyuan Huang, Xiang Wang, Yiliang Lv, Changxin Gao, Nong Sang <br/> [Paper].

<br/> <div align="center"> <img src="framework.png" /> </div> <br/>

Latest

[2022-11] Code is available!

This repo is a modification of the TAdaConv repo.

Installation

Requirements:

Optional requirements:

Guidelines

Installation, data preparation and running

The general pipeline for using this repo consists of installation, data preparation, and running. See GUIDELINES.md for details.

Getting Pre-trained Checkpoints

You can download the VideoMAE pre-trained checkpoints from here. Next, use this simple Python script to convert the pre-trained checkpoints so they match our code base. Then set TRAIN.CHECKPOINT_FILE_PATH to the converted checkpoint for fine-tuning.
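For illustration, here is a minimal sketch of what such a conversion step could look like, assuming the VideoMAE checkpoint nests its weights under a `model` key and this code base reads them from a `model_state` key; the actual key mapping is defined by the conversion script linked above.

```python
# Hypothetical conversion sketch -- the repo's own script is authoritative.
import torch

src = torch.load("videomae_pretrain.pth", map_location="cpu")
state = src.get("model", src)  # unwrap the nested state dict if present

converted = {}
for name, tensor in state.items():
    # Illustrative remapping only: strip a hypothetical "encoder." prefix
    # so the keys line up with the fine-tuning model's parameter names.
    converted[name.removeprefix("encoder.")] = tensor

torch.save({"model_state": converted}, "videomae_converted.pth")
```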

Running instructions


For detailed explanations on the approach itself, please refer to the paper.

For an example run, set DATA_ROOT_DIR, ANNO_DIR, TRAIN.CHECKPOINT_FILE_PATH, and OUTPUT_DIR in configs/projects/mar/ft-ssv2/vit_base_50%.yaml, and run the following command to start training:

```shell
python tools/run_net.py --cfg configs/projects/mar/ft-ssv2/vit_base_50%.yaml
```
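For reference, the edited fields in that config might look like the following sketch; all paths are placeholders to replace with your own, and the nesting of TRAIN.CHECKPOINT_FILE_PATH under a TRAIN block is assumed from the option name.

```yaml
# Hypothetical placeholder values -- replace with your own paths.
DATA_ROOT_DIR: /path/to/ssv2/videos        # root directory of the SSv2 videos
ANNO_DIR: /path/to/ssv2/annotations        # directory holding the annotation files
TRAIN:
  CHECKPOINT_FILE_PATH: /path/to/videomae_converted.pth  # converted VideoMAE weights
OUTPUT_DIR: ./output/mar_ft_ssv2           # where logs and checkpoints are written
```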

Citing MAR

If you find MAR useful for your research, please consider citing the paper as follows:

@article{qing2022mar,
  title={{MAR}: Masked Autoencoders for Efficient Action Recognition},
  author={Qing, Zhiwu and Zhang, Shiwei and Huang, Ziyuan and Wang, Xiang and Wang, Yuehuan and Lv, Yiliang and Gao, Changxin and Sang, Nong},
  journal={arXiv preprint arXiv:2207.11660},
  year={2022}
}