(MI)llion-(M)ax(M)argin (MI-MM)
Code and data of the MIllion-MaxMargin model used as a baseline for the EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge.
The MI-MM model is a modified version of the model used in End-to-End Learning of Visual Representations from Uncurated Instructional Videos. There are three main changes:
- Using the Gated Embedding Unit instead of a simple Linear layer to project the feature space into the embedding space (read Learning a Text-Video Embedding from Incomplete and Heterogeneous Data to learn more about the Gated Embedding Unit; a minimal sketch of this unit is given after this list).
- Increasing the size of the vocabulary due to the lack of some relevant words.
- Using the Multi-instance MaxMargin loss instead of the MIL-NCE loss (read Fine-Grained Action Retrieval Through Multiple Parts-of-Speech Embeddings to know more about the Multi-instance MaxMargin loss).
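For reference, here is a minimal PyTorch sketch of the two components mentioned above. It only illustrates the general idea and is not the code used in this repository: in particular, the Multi-instance MaxMargin loss used here additionally weights pairs with the relevancy matrix, which the simplified loss below omits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedEmbeddingUnit(nn.Module):
    """Projects input features into the joint embedding space with a
    context gate (Miech et al., Learning a Text-Video Embedding from
    Incomplete and Heterogeneous Data)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.fc = nn.Linear(in_dim, out_dim)     # first projection
        self.gate = nn.Linear(out_dim, out_dim)  # gating branch

    def forward(self, x):
        x = self.fc(x)
        x = x * torch.sigmoid(self.gate(x))      # element-wise context gating
        return F.normalize(x, dim=-1)            # L2-normalise the embedding

def max_margin_loss(video_emb, text_emb, margin=0.2):
    """Simplified cross-modal max-margin ranking loss over a batch:
    each (video_i, text_i) pair is a positive, every other pairing in
    the batch is treated as a negative."""
    sim = video_emb @ text_emb.t()                  # (B, B) similarity matrix
    pos = sim.diag().view(-1, 1)                    # positive-pair similarities
    cost_t = (margin + sim - pos).clamp(min=0)      # video vs. wrong captions
    cost_v = (margin + sim - pos.t()).clamp(min=0)  # caption vs. wrong videos
    mask = torch.eye(sim.size(0), device=sim.device).bool()
    return cost_t.masked_fill(mask, 0).mean() + cost_v.masked_fill(mask, 0).mean()
```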
Data
The data directory contains 5 folders:
- dataframes: It contains the train and validation csv files of EPIC-Kitchens.
- features: It contains the training and validation features of EPIC-Kitchens extracted by the S3D model trained on HowTo100M, whose implementation can be found here. The order of the videos is the same as in the csv files you can find in the dataframes folder.
- models: It contains the weights of the MI-MM model.
- relevancy: It contains the training relevancy matrix needed to train the model.
- resources: It contains the weights of the S3D model trained on HowTo100M and the word embeddings.
You can download the data directory from here.
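If you want to sanity-check the downloaded data, a sketch along the following lines can help; note that the file names used below are assumptions for illustration and may differ from the actual files inside data/dataframes and data/relevancy.

```python
import pickle
import pandas as pd

# Assumed file name: adjust to whatever csv is actually in data/dataframes.
train_df = pd.read_csv("data/dataframes/train.csv")
print(train_df.head())

# Assumed file name and format for the training relevancy matrix.
with open("data/relevancy/train_relevancy.pkl", "rb") as f:
    relevancy = pickle.load(f)
print(type(relevancy))
```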
How to use it
Requirements
You can install a conda environment using conda env create -f environment.yml.
Training
You can train the model using the default values defined in src/utils/utils.py by running the following command: python training.py. You can run python training.py --help to see the full list of arguments you can pass to the script.
During training you can check the loss value of each epoch in data/models/logs, where you can find train.txt and val.txt. Moreover, you can inspect the training and validation loss curves by running tensorboard --logdir=data/models/logs/runs.
After training the model, you can find 2 different weight files in data/models:
- model_best.pth.tar: This file contains the weights of the model from the epoch with the lowest validation loss.
- checkpoint.pth.tar: This file contains the weights of the model from the last epoch.
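If you want to inspect one of these checkpoints manually, something like the sketch below should work; the exact keys stored in the .pth.tar file depend on how training.py saves it, so "state_dict" here is an assumption.

```python
import torch

# Load the best checkpoint on CPU and list what it contains.
ckpt = torch.load("data/models/model_best.pth.tar", map_location="cpu")
print(ckpt.keys() if isinstance(ckpt, dict) else type(ckpt))

# If the weights are stored under "state_dict", they can be loaded into a model:
# model.load_state_dict(ckpt["state_dict"])
```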
Testing
You can evaluate the model and create the submission file by running python testing.py. This will evaluate the model using the default weights file model_best.pth.tar, but you can select checkpoint.pth.tar by running python testing.py --best-model=checkpoint.pth.tar.
After testing, you can find the submission file in output/test.pkl.
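To take a quick look at the submission file before uploading it, a small pickle read is enough; the internal structure of test.pkl is defined by the challenge's submission format, so the sketch below only prints what it finds.

```python
import pickle

with open("output/test.pkl", "rb") as f:
    submission = pickle.load(f)

# Print the top-level structure without assuming a specific layout.
print(type(submission))
if isinstance(submission, dict):
    print(list(submission.keys()))
```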
Other details on the submission can be found here.