recipe1m.bootstrap.pytorch

We are a machine learning research team from Sorbonne University. Our goal for this project was to build a cross-modal retrieval system trained on the largest dataset of cooking recipes available. This kind of system can retrieve the recipe corresponding to a given image (e.g., a food selfie), and conversely the image corresponding to a given recipe.

It was also an occasion to compare several state-of-the-art metric learning loss functions in a new context. This first analysis gave us some ideas on how to improve the generalization of our model. Following this, we wrote two research papers on a new model, called AdaMine (after adaptive mining), which adds structure to the retrieval space (see the Citation section below).

Introduction

Recipe-to-Image retrieval task

<p align="center"> <img src="https://github.com/Cadene/recipe1m.bootstrap.pytorch/raw/master/images/task.png" width="800"/> </p>

Given a list of ingredients and a sequence of cooking instructions, the goal is to train a statistical model to retrieve the associated image. For each recipe, the top row indicates the top 5 images retrieved by our AdaMine model, and the bottom row those retrieved by a strong baseline.
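
In practice, both modalities are embedded into a shared space, and retrieval then reduces to a nearest-neighbor search in that space. As a minimal sketch (the embedding dimension and function names here are ours, not the repository's API):

import torch
import torch.nn.functional as F

def retrieve_top_k(recipe_emb, image_embs, k=5):
    """Rank candidate images against one recipe embedding.

    recipe_emb: (d,) embedding of the query recipe.
    image_embs: (n, d) embeddings of the n candidate images.
    Returns the indices of the k most similar images.
    """
    # Cosine similarity = dot product of L2-normalized vectors.
    recipe_emb = F.normalize(recipe_emb, dim=0)
    image_embs = F.normalize(image_embs, dim=1)
    scores = image_embs @ recipe_emb  # (n,)
    return scores.topk(k).indices

# Toy usage with random 1024-d embeddings (the dimension is an assumption).
print(retrieve_top_k(torch.randn(1024), torch.randn(500, 1024)))

The same function works in the other direction (image-to-recipe) by swapping the roles of the two modalities.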

Quick insight about AdaMine

<p align="center"> <img src="https://github.com/Cadene/recipe1m.bootstrap.pytorch/raw/master/images/model.png" width="500"/> </p>

Feature embedding

Metric learning

Negative sampling strategy (see the loss sketch below)
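
The core of AdaMine's adaptive sampling is to back-propagate only through informative triplets: the batch loss is averaged over the triplets that violate the margin, rather than over all triplets (which dilutes the gradient with zero terms) or only the hardest one (which is sensitive to outliers). A minimal PyTorch sketch of this idea (the batch layout and margin value are assumptions, not the repository's exact code):

import torch

def adaptive_triplet_loss(anchor, positive, negative, margin=0.3):
    """Triplet loss averaged over informative (margin-violating) triplets.

    anchor, positive, negative: (batch, dim) embeddings; each row of
    anchor/positive is a matched cross-modal pair, negative is mismatched.
    """
    d_pos = (anchor - positive).pow(2).sum(dim=1)  # squared distances, (batch,)
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    losses = torch.clamp(margin + d_pos - d_neg, min=0.0)
    informative = losses > 0
    # Average over violating triplets only, so the gradient is not
    # diluted by the (many) triplets that already satisfy the margin.
    return losses[informative].mean() if informative.any() else losses.mean()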

Installation

1. Install Python 3

We do not provide support for Python 2. We advise you to install Python 3 with Anaconda, then create a dedicated environment:

conda create --name recipe1m python=3.7
source activate recipe1m

2. Clone & requirements

We use bootstrap.pytorch, a high-level framework, so that we can focus on the model instead of boilerplate code.

cd $HOME
git clone https://github.com/Cadene/recipe1m.bootstrap.pytorch.git
cd recipe1m.bootstrap.pytorch
pip install -r requirements.txt

3. Download dataset

Please create an account on http://im2recipe.csail.mit.edu/ and agree to the terms of use; this dataset was made for research, not for commercial use. Then download the archives into the data/recip1m directory created below and extract them:

mkdir -p data/recip1m
cd data/recip1m
tar -xvf data_lmdb.tar
rm data_lmdb.tar
tar -xzvf recipe1M.tar.gz
rm recipe1M.tar.gz
tar -xzvf text.tar.gz
rm text.tar.gz
cd text

Note: features extracted with a ResNet-50 are included in data_lmdb.
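
If you want to inspect these precomputed features, a minimal sketch with the lmdb Python package follows; the sub-directory name, key format, and feature layout are assumptions here, so check the dataset code in the recipe1m package for the exact ones:

import lmdb
import numpy as np

# Hypothetical path and key; adapt them to the actual LMDB layout.
env = lmdb.open('data/recip1m/data_lmdb/train', readonly=True, lock=False)
with env.begin(write=False) as txn:
    raw = txn.get(b'some_image_id')  # keys are assumed to be image identifiers
    if raw is not None:
        feats = np.frombuffer(raw, dtype=np.float32)  # e.g. 2048-d ResNet-50 features
        print(feats.shape)
env.close()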

Quick start

Train a model on the train/val sets

The bootstrap/run.py file loads the options contained in a YAML file, creates the corresponding experiment directory (in logs/recipe1m), and starts the training procedure.

For instance, you can train our best model by running:

python -m bootstrap.run -o recipe1m/options/adamine.yaml

Several files are then created in the experiment directory (logs/recipe1m/adamine), including a copy of the options (options.yaml), training logs, and model checkpoints.

Option files for many other loss functions are available in the recipe1m/options directory.

Evaluate a model on the test set

At the end of the training procedure, you can evaluate your model on the test set. In this example, bootstrap/run.py loads the options from your experiment directory, resumes the checkpoint that performed best on the validation set, and runs an evaluation on the test set instead of the validation set, skipping the training set (train_split is empty).

python -m bootstrap.run \
-o logs/recipe1m/adamine/options.yaml \
--exp.resume best_eval_epoch.metric.recall_at_1_im2recipe_mean \
--dataset.train_split \
--dataset.eval_split test

Note: by default, the model is evaluated on the 1k setup; for the 10k setup, see the "Evaluate with the 10k setup" section below.

Available (pretrained) models

PWC

Pairwise loss [paper]

python -m bootstrap.run -o recipe1m/options/pairwise.yaml

PWC++ (Ours)

Pairwise loss with positive and negative margins

python -m bootstrap.run -o recipe1m/options/pairwise_plus.yaml

VSE

Triplet loss (VSE) [paper]

python -m bootstrap.run -o recipe1m/options/avg_nosem.yaml

VSE++

Triplet loss with hard negative mining [paper]

python -m bootstrap.run -o recipe1m/options/max.yaml

AdaMine_avg (Ours)

Triplet loss with semantic loss

python -m bootstrap.run -o recipe1m/options/avg.yaml

AdaMine (Ours)

Triplet loss with semantic loss and adaptive sampling

python -m bootstrap.run -o recipe1m/options/adamine.yaml

Extracted features from the test set are available for download:

cd logs/recipe1m
wget http://data.lip6.fr/cadene/im2recipe/logs/adamine.tar.gz
tar -xzvf adamine.tar.gz
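
As a rough sketch of AdaMine's training objective (our notation; see the paper in the Citation section for the exact formulation), the total loss combines an instance-level triplet term with a class-level semantic term, both of the usual triplet form:

$$
\mathcal{L} = \mathcal{L}_{\mathrm{ins}} + \lambda\,\mathcal{L}_{\mathrm{sem}},
\qquad
\ell(a, p, n) = \max\bigl(0,\ \alpha + d(a, p) - d(a, n)\bigr),
$$

where $d$ is the distance in the joint embedding space, $\alpha$ a margin, and the adaptive sampling described above controls which triplets $(a, p, n)$ contribute to each term.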

Lifted structure

Lifted structure loss [paper]

python -m bootstrap.run -o recipe1m/options/lifted_struct.yaml

Documentation

TODO

Useful commands

Compare experiments

python -m bootstrap.compare -d \
logs/recipe1m/adamine \
logs/recipe1m/avg \
-k eval_epoch.metric.recall_at_1_im2recipe_mean max

Results:

## eval_epoch.metric.recall_at_1_im2recipe_mean

  Place  Method      Score    Epoch
-------  --------  -------  -------
      1  adamine    0.3827       76
      2  avg        0.3201       51

Use a specific GPU

CUDA_VISIBLE_DEVICES=0 python -m bootstrap.run -o recipe1m/options/adamine.yaml

Overwrite an option

The bootstrap.pytorch framework makes it easy to overwrite a hyperparameter. In this example, we run an experiment with a non-default learning rate, and therefore also overwrite the experiment directory path:

python -m bootstrap.run -o recipe1m/options/adamine.yaml \
--optimizer.lr 0.0003 \
--exp.dir logs/recipe1m/adamine_lr,0.0003

Resume training

If a problem occurs, it is easy to resume the last epoch by specifying the options file from the experiment directory and overwriting the exp.resume option (default is None):

python -m bootstrap.run -o logs/recipe1m/adamine/options.yaml \
--exp.resume last

Evaluate with the 10k setup

Just as with the 1k setup, we load the best checkpoint. This time we also overwrite some options. The metrics will be displayed on your terminal at the end of the evaluation.

python -m bootstrap.run \
-o logs/recipe1m/adamine/options.yaml \
--exp.resume best_eval_epoch.metric.recall_at_1_im2recipe_mean \
--dataset.train_split \
--dataset.eval_split test \
--model.metric.nb_bags 5 \
--model.metric.nb_matchs_per_bag 10000

Note: metrics can be stored in a JSON file by adding the --misc.logs_name eval,test10k option. This will create logs_eval,test10k.json in your experiment directory.
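
The nb_bags and nb_matchs_per_bag options above define the protocol: recall is computed inside nb_bags random bags of nb_matchs_per_bag matched recipe-image pairs, then averaged over the bags. An illustrative sketch of that protocol (not the repository's implementation):

import torch
import torch.nn.functional as F

def bagged_recall_at_k(recipe_embs, image_embs, k=1, nb_bags=5, bag_size=10000):
    """Average recall@k over nb_bags random bags of matched pairs.

    recipe_embs, image_embs: (n, d) tensors; row i of each is a matched pair.
    """
    n = recipe_embs.size(0)
    recipe_embs = F.normalize(recipe_embs, dim=1)
    image_embs = F.normalize(image_embs, dim=1)
    recalls = []
    for _ in range(nb_bags):
        idx = torch.randperm(n)[:bag_size]
        sims = recipe_embs[idx] @ image_embs[idx].t()  # (bag, bag) similarities
        top = sims.topk(k, dim=1).indices              # top-k images per recipe
        target = torch.arange(idx.numel()).unsqueeze(1)
        recalls.append((top == target).any(dim=1).float().mean().item())
    return sum(recalls) / len(recalls)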

API

TODO

Extract your own image features

TODO
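
This section is still a TODO; in the meantime, here is a generic sketch (not the repository's pipeline) of extracting 2048-d pooled ResNet-50 features with torchvision, matching the kind of features shipped in data_lmdb:

import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Standard ImageNet preprocessing (assumed; check the dataset code for the real one).
preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# ResNet-50 with its classification head replaced by the identity,
# so the forward pass returns the 2048-d pooled features.
resnet = models.resnet50(pretrained=True)
resnet.fc = torch.nn.Identity()
resnet.eval()

img = preprocess(Image.open('food.jpg').convert('RGB')).unsqueeze(0)
with torch.no_grad():
    feats = resnet(img)  # shape (1, 2048)
print(feats.shape)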

Citation

@inproceedings{carvalho2018cross,
  title={Cross-Modal Retrieval in the Cooking Context: Learning Semantic Text-Image Embeddings},
  author={Carvalho, Micael and Cad{\`e}ne, R{\'e}mi and Picard, David and Soulier, Laure and Thome, Nicolas and Cord, Matthieu},
  booktitle={The ACM conference on Research and Development in Information Retrieval (SIGIR)},
  year={2018},
  url={https://arxiv.org/abs/1804.11146}
}

Acknowledgment

Special thanks to the authors of im2recipe, who developed Recipe1M, the dataset used in this research project.