Enhancing Recipe Retrieval with Foundation Models
Official implementation of our ECCV 2024 paper:
Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective
This paper proposes a new data augmentation perspective that uses foundation models (i.e., Llama 2 and SAM) to better learn multimodal representations in a common embedding space for the task of cross-modal recipe retrieval.
Installation
To install the required packages, please follow these steps:
# Clone the repository
git clone https://github.com/Noah888/DAR.git
# Create a conda environment (Python 3.8 or above)
conda create --name your_env_name python=3.9
# Activate the conda environment
conda activate your_env_name
# Install dependencies
pip install -r requirements.txt
cd src
Dataset
To reproduce the results, download the Recipe1M dataset and generate the enhanced data: traindata (with added visual imagination data) and segment data. Place the data in the DATASET_PATH directory with the following structure:
DATASET_PATH/
├── traindata/
│   ├── train/...
│   ├── val/...
│   └── test/...
├── segment/
│   ├── train/...
│   ├── val/...
│   └── test/...
├── layer1.json
└── layer2.json
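Before training, it can save time to verify that DATASET_PATH matches the layout above. The helper below is a small illustrative sketch (not part of this repository); the split names and JSON filenames are taken from the tree shown here.

```python
from pathlib import Path

def check_dataset(root):
    """Return a list of paths missing from the expected Recipe1M layout.

    Illustrative helper, not part of the repo: it only checks that the
    directories and files shown in the tree above exist under `root`.
    """
    root = Path(root)
    missing = []
    for sub in ("traindata", "segment"):
        for split in ("train", "val", "test"):
            if not (root / sub / split).is_dir():
                missing.append(f"{sub}/{split}")
    for fname in ("layer1.json", "layer2.json"):
        if not (root / fname).is_file():
            missing.append(fname)
    return missing

# Usage: an empty list means the layout looks complete.
# missing = check_dataset("DATASET_PATH")
```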
Training
- Launch training with:
python train.py --model_name model --root DATASET_PATH --save_dir /path/to/saved/model/checkpoints
Run python train.py --help for the full list of available arguments.
Evaluation
- Extract features from the trained model for the test set samples of Recipe1M:
python test.py --model_name model --eval_split test --root DATASET_PATH --save_dir /path/to/saved/model/checkpoints
- Compute MedR and recall metrics on the extracted features. Evaluation with image and recipe features only (DAR):
python eval.py --embeddings_file /path/to/saved/model/checkpoints/model/feats_test.pkl --medr_N 10000
Evaluation with raw image-recipe features plus augmented segment description features (DAR++):
python eval_add_augment.py --embeddings_file /path/to/saved/model/checkpoints/model/feats_test.pkl --medr_N 10000
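For intuition on what these scripts report: MedR is the median rank of the matching recipe when each image query is ranked against all recipe embeddings, and Recall@K is the fraction of queries whose match appears in the top K. The sketch below is a generic cosine-similarity implementation of these metrics under that standard definition, not the code in eval.py; the feature arrays and their pairing are assumptions for illustration.

```python
import numpy as np

def retrieval_metrics(img_feats, rec_feats, ks=(1, 5, 10)):
    """MedR and Recall@K for image-to-recipe retrieval.

    Assumes row i of `img_feats` matches row i of `rec_feats`;
    similarity is cosine. Illustrative only.
    """
    img = img_feats / np.linalg.norm(img_feats, axis=1, keepdims=True)
    rec = rec_feats / np.linalg.norm(rec_feats, axis=1, keepdims=True)
    sims = img @ rec.T                       # (N, N) similarity matrix
    order = np.argsort(-sims, axis=1)        # recipes sorted by similarity
    # 0-based rank of the ground-truth recipe for each image query
    ranks = np.argmax(order == np.arange(len(img))[:, None], axis=1)
    medr = float(np.median(ranks) + 1)       # MedR is reported 1-based
    recalls = {k: float(np.mean(ranks < k)) for k in ks}
    return medr, recalls

# Toy usage: identical features give perfect retrieval
feats = np.eye(4)
medr, recalls = retrieval_metrics(feats, feats)
```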
Pretrained models
- We provide pretrained model weights (DAR_model). Extract features with:
python test.py --model_name DAR_model --eval_split test --root DATASET_PATH --save_dir ../checkpoints
- A file with the extracted features will be saved under ../checkpoints/DAR_model.
This code is based on image-to-recipe-transformers; we thank the authors for their work.