<div align="center"> <h1><a href="https://arxiv.org/abs/2407.07582">TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data (ECCV 2024)</a></h1>Siyi Du, Shaoming Zheng, Yinsong Wang, Wenjia Bai, Declan P. O'Regan, and Chen Qin
</div>
<p align="center">Model architecture and algorithm of TIP: (a) Model overview with its image encoder, tabular encoder, and multimodal interaction module, which are pre-trained using three self-supervised learning (SSL) losses: $\mathcal{L}_{itc}$, $\mathcal{L}_{itm}$, and $\mathcal{L}_{mtr}$. (b) Model details for (b-1) $\mathcal{L}_{itm}$ and $\mathcal{L}_{mtr}$ calculation and (b-2) tabular embedding with missing data. (c) Pre-training algorithm.</p>
This is the official PyTorch implementation of TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data, ECCV 2024. The code is built on paulhager/MMCL-Tabular-Imaging.
Contact: s.du23@imperial.ac.uk (Siyi Du)
Give us a :star: if you find this repository helpful.
Updates
[11/07/2024] The arXiv paper is released.
[08/07/2024] The code is released.
[23/10/2024] The preprocessing code for UKBB is released.
Requirements
This code is implemented using Python 3.9.15, PyTorch 1.11.0, PyTorch Lightning 1.6.4, CUDA 11.3.1, and cuDNN 8.
cd TIP/
conda env create --file environment.yaml
conda activate tip
Data
Download DVM data from here
Apply for the UKBB data here
Preparation
DVM
- Execute data/create_dvm_dataset.ipynb to get train, val, test datasets.
- Execute data/image2numpy.ipynb to convert jpg images to numpy format for faster reading during training.
- Execute data/create_missing_mask.ipynb to create missing masks (RVM, RFM, MIFM, LIFM) for incomplete data fine-tuning experiments.
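The four mask types above can be illustrated with a minimal NumPy sketch. It assumes the acronyms stand for random value, random feature, most important feature, and least important feature missingness, and it is not the code in `data/create_missing_mask.ipynb`. The function names, the convention that `True` marks a missing entry, and the per-feature `importance` scores are assumptions made for illustration.

```python
import numpy as np

def random_value_mask(n_samples, n_features, missing_rate, seed=0):
    """RVM-style mask: each cell is dropped independently with probability missing_rate."""
    rng = np.random.default_rng(seed)
    return rng.random((n_samples, n_features)) < missing_rate

def random_feature_mask(n_samples, n_features, missing_rate, seed=0):
    """RFM-style mask: the same randomly chosen columns are missing for every sample."""
    rng = np.random.default_rng(seed)
    n_missing = int(round(missing_rate * n_features))
    cols = rng.choice(n_features, size=n_missing, replace=False)
    mask = np.zeros((n_samples, n_features), dtype=bool)
    mask[:, cols] = True
    return mask

def importance_feature_mask(n_samples, importance, missing_rate, most_important=True):
    """MIFM/LIFM-style mask: drop the most (or least) important columns first.

    `importance` is a hypothetical per-feature score, e.g. from a tree model.
    """
    importance = np.asarray(importance)
    n_features = importance.shape[0]
    n_missing = int(round(missing_rate * n_features))
    order = np.argsort(importance)  # ascending: least important first
    if most_important:
        order = order[::-1]         # descending: most important first
    mask = np.zeros((n_samples, n_features), dtype=bool)
    mask[:, order[:n_missing]] = True
    return mask
```

In this sketch the random value mask differs per sample, while the three feature-level masks remove whole columns, so every sample loses the same features.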
UKBB
- Execute data/preprocess_ukbb/filter_cardiac_tabular_feature.py to get cardiac disease related tabular features.
- Execute data/preprocess_ukbb/preprocess_cardiac_table.ipynb to preprocess filtered tabular features and generate labels.
- Execute data/preprocess_ukbb/create_image_tabular_split.ipynb to get train, val, test datasets.
- Execute data/preprocess_ukbb/preprocess_cardiac_image.py to prepare NumPy images for training.
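As a rough illustration of why both datasets are converted to NumPy format, the sketch below stores a stack of already-decoded images in one `.npy` file and reads a single image back through a memory map, so no image decoding happens at load time. The file layout (one stacked array), the helper names, and the synthetic data are assumptions; the repository's scripts may organize the files differently.

```python
import os
import tempfile

import numpy as np

def save_image_stack(images, path):
    """Write an (N, H, W, C) uint8 array to a single .npy file."""
    np.save(path, np.asarray(images, dtype=np.uint8))

def load_image(path, index):
    """Memory-map the file and copy out one image without reading the rest."""
    stack = np.load(path, mmap_mode="r")
    return np.array(stack[index])

# Synthetic stand-in for decoded images (hypothetical shapes).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)

npy_path = os.path.join(tempfile.mkdtemp(), "images.npy")
save_image_stack(images, npy_path)
first = load_image(npy_path, 0)
```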
Training
Pre-training & Fine-tuning
CUDA_VISIBLE_DEVICES=0 python -u run.py --config-name config_dvm_TIP exp_name=pretrain
Fine-tuning
CUDA_VISIBLE_DEVICES=0 python -u run.py --config-name config_dvm_TIP exp_name=finetune pretrain=False evaluate=True checkpoint={YOUR_PRETRAINED_CKPT_PATH}
Fine-tuning with incomplete data
CUDA_VISIBLE_DEVICES=0 python -u run.py --config-name config_dvm_TIP exp_name=missing pretrain=False evaluate=True checkpoint={YOUR_PRETRAINED_CKPT_PATH} missing_tabular=True missing_strategy=value missing_rate=0.3
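To repeat the incomplete-data experiment at several missing rates, the command above can be generated programmatically. The following is a minimal sketch that only builds the command strings. The helper name, the checkpoint path, and the rates other than 0.3 are hypothetical, and only the overrides shown in the command above are used.

```python
import shlex

def missing_data_command(missing_rate, checkpoint, gpu=0):
    """Return the incomplete-data fine-tuning command for one missing rate."""
    args = [
        "python", "-u", "run.py",
        "--config-name", "config_dvm_TIP",
        "exp_name=missing",
        "pretrain=False",
        "evaluate=True",
        f"checkpoint={checkpoint}",
        "missing_tabular=True",
        "missing_strategy=value",
        f"missing_rate={missing_rate}",
    ]
    # shlex.join quotes any argument that the shell would otherwise split.
    return f"CUDA_VISIBLE_DEVICES={gpu} " + shlex.join(args)

# Hypothetical checkpoint path and sweep values, for illustration only.
for rate in (0.1, 0.3, 0.5):
    print(missing_data_command(rate, "checkpoints/pretrained.ckpt"))
```

Each printed line can be pasted into a shell or passed to a job scheduler. Generating the strings rather than running them keeps the sketch free of side effects.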
Checkpoints
Pre-trained Checkpoints
| Dataset | DVM | Cardiac |
|---|---|---|
| Checkpoint | Download | Download |
Fine-tuned Checkpoints
| Task | Linear probing | Full fine-tuning |
|---|---|---|
| Car model prediction (DVM) | Download | Download |
| CAD classification (Cardiac) | Download | Download |
| Infarction classification (Cardiac) | Download | Download |
License & Citation
This repository is licensed under the Apache License, Version 2.0.
If you use this code in your research, please consider citing:
@inproceedings{du2024tip,
title={{TIP}: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data},
author={Du, Siyi and Zheng, Shaoming and Wang, Yinsong and Bai, Wenjia and O'Regan, Declan P. and Qin, Chen},
booktitle={18th European Conference on Computer Vision (ECCV 2024)},
year={2024}
}
Acknowledgements
We would like to thank the following repositories for their great work: