
Object State Change Classification

This is the implementation of our submission (TarHeels) to the Ego4D: Object State Change Classification Challenge at the 1st Ego4D Workshop, CVPR 2022. We use a transformer-based video recognition model and leverage the Divided Space-Time Attention mechanism to classify object state changes in egocentric videos. Our submission achieved the second-best performance in the challenge.
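As a rough illustration of why Divided Space-Time Attention is attractive for video, the sketch below counts which tokens a single query patch attends to under joint attention versus the divided scheme (a temporal step over the same patch location, then a spatial step within the same frame). It is not code from this repository. The helper names are hypothetical, the classification token is ignored, and the values of 8 frames, a 224x224 crop, and 16x16 patches are assumptions read off the config name `TimeSformer_divST_8x32_224` and common TimeSformer defaults.

```python
# Illustrative sketch only: which tokens one query token attends to under
# joint vs. divided space-time attention. Not code from this repository.

def joint_neighbors(num_frames, num_patches):
    """Joint attention: every patch in every frame."""
    return [(t, p) for t in range(num_frames) for p in range(num_patches)]

def temporal_neighbors(patch, num_frames):
    """Temporal step: the same patch location across all frames."""
    return [(t, patch) for t in range(num_frames)]

def spatial_neighbors(frame, num_patches):
    """Spatial step: all patches within the same frame."""
    return [(frame, p) for p in range(num_patches)]

# Assumed values: 8 frames, 224x224 crop, 16x16 patches -> 14 * 14 patches.
NUM_FRAMES = 8
NUM_PATCHES = (224 // 16) ** 2

joint = len(joint_neighbors(NUM_FRAMES, NUM_PATCHES))
divided = (len(temporal_neighbors(0, NUM_FRAMES))
           + len(spatial_neighbors(0, NUM_PATCHES)))

print(joint, divided)  # 1568 204
```

Under these assumptions, each query token is compared against 204 tokens instead of 1568, which is the main reason the divided variant scales better to longer clips.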

Technical Report

You can download the technical report of our submission from here.

Steps to run the codebase

Follow the instructions from TimeSformer for setup and installation.
Run create_fho_clips.py to process the videos and create the clips.
Run create_fho_dataset.py to create the dataset.
Use the following command to train the model:

python tools/run_net.py \
 --cfg configs/Ego4dFho/TimeSformer_divST_8x32_224.yaml \
 DATA.PATH_TO_DATA_DIR path_to_your_dataset \
 NUM_GPUS 8 \
 TRAIN.BATCH_SIZE 8

Finally, run generate_submission.py to generate the submission file for the challenge.
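If you train on a different number of GPUs, the training command above can be assembled programmatically. The following is a minimal sketch, not part of this repository: `build_train_command` is a hypothetical helper, and the divisibility check assumes (as is typical for SlowFast-style codebases) that TRAIN.BATCH_SIZE is the total batch size split evenly across GPUs.

```python
import shlex

def build_train_command(data_dir, num_gpus=8, batch_size=8,
                        cfg="configs/Ego4dFho/TimeSformer_divST_8x32_224.yaml"):
    """Return the training command from this README as an argv list."""
    # Assumption: the total batch size is divided evenly across GPUs.
    if batch_size % num_gpus != 0:
        raise ValueError("batch_size must be divisible by num_gpus")
    return [
        "python", "tools/run_net.py",
        "--cfg", cfg,
        "DATA.PATH_TO_DATA_DIR", data_dir,
        "NUM_GPUS", str(num_gpus),
        "TRAIN.BATCH_SIZE", str(batch_size),
    ]

cmd = build_train_command("path_to_your_dataset")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root. With the defaults shown, each GPU would process one clip per step under the assumption above.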