
ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization

Project Page | Paper (ArXiv) | Supplemental Material | Code (Github)


This repository is the official PyTorch implementation of our paper, ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization.

Yixin Yang, Jiangxin Dong, Jinhui Tang, Jinshan Pan

Nanjing University of Science and Technology


Requirements

:briefcase: Dependencies and Installation

# clone this repository
git clone https://github.com/yyang181/colormnet.git
cd colormnet

# create and activate the conda environment
conda create -n colormnet python=3.8
conda activate colormnet

pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 --index-url https://download.pytorch.org/whl/cu118
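# Note: the cu118 wheels above assume a driver that supports CUDA 11.8;
# if yours differs, choose the matching --index-url from pytorch.org.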

# install py-thin-plate-spline
git clone https://github.com/cheind/py-thin-plate-spline.git
cd py-thin-plate-spline && pip install -e . && cd ..

# install Pytorch-Correlation-extension
git clone https://github.com/ClementPinard/Pytorch-Correlation-extension.git 
cd Pytorch-Correlation-extension && python setup.py install && cd ..

pip install -r requirements.txt
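After installation, a quick import check helps confirm that the two source-built dependencies are usable. This is a minimal sketch, not part of the repository; thinplate and spatial_correlation_sampler are the module names shipped by py-thin-plate-spline and Pytorch-Correlation-extension, respectively.

# sanity_check.py: a sketch to verify the environment after installation
import torch
import thinplate  # provided by py-thin-plate-spline
from spatial_correlation_sampler import SpatialCorrelationSampler

print(torch.__version__, "CUDA available:", torch.cuda.is_available())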

:gift: Checkpoints

Download the pretrained models manually and put them in ./saves (create the folder if it doesn't exist).

| Name | URL |
| --- | --- |
| ColorMNet | model |
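Before running inference, it is worth checking that the downloaded checkpoint deserializes cleanly. A minimal sketch, assuming the weights were saved as ./saves/colormnet.pth (a placeholder; use the actual file name of the downloaded model):

# a sketch: verify the checkpoint loads; the path below is a placeholder
import torch

ckpt = torch.load("./saves/colormnet.pth", map_location="cpu")
print(type(ckpt))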

:zap: Quick Inference

CUDA_VISIBLE_DEVICES=0 python test.py 
# Add --FirstFrameIsNotExemplar if the reference frame is not the first input frame.
# The reference frame and the input frames must have the same size.
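Since --FirstFrameIsNotExemplar requires the reference frame and the input frames to share one resolution, a small pre-flight check can catch mismatches early. The helper below is hypothetical (not part of this repo) and the paths are placeholders:

# check_sizes.py: compare every input frame's size against the exemplar
from pathlib import Path
from PIL import Image

def check_sizes(ref_path, frames_dir):
    ref_size = Image.open(ref_path).size  # (width, height)
    for p in sorted(Path(frames_dir).glob("*.png")):
        size = Image.open(p).size
        if size != ref_size:
            raise ValueError(f"{p.name} is {size}, expected {ref_size}")

check_sizes("ref.png", "./input_frames")  # placeholder paths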

Train

Dataset structure for both the training set and the validation set

# Specify --davis_root and --validation_root
data_root/
├── 001/
│   ├── 00000.png
│   ├── 00001.png
│   ├── 00002.png
│   └── ...
├── 002/
│   ├── 00000.png
│   ├── 00001.png
│   ├── 00002.png
│   └── ...
└── ...
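A short script can verify that a dataset follows this layout before training starts. The sketch below is not part of the repo; it checks that every clip folder contains gap-free, zero-padded frame names:

# validate_layout.py: a sketch that checks the clip/frame structure above
from pathlib import Path

def validate_dataset(data_root):
    for clip in sorted(Path(data_root).iterdir()):
        if not clip.is_dir():
            continue
        frames = sorted(clip.glob("*.png"))
        for i, frame in enumerate(frames):
            expected = f"{i:05d}.png"
            if frame.name != expected:
                raise ValueError(f"{clip.name}: found {frame.name}, expected {expected}")
        print(f"{clip.name}: {len(frames)} frames")

validate_dataset("/path/to/your/training/data")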

Training script

CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.run \
    --master_port 25205 \
    --nproc_per_node=1 \
    train.py \
    --exp_id DINOv2FeatureV6_LocalAtten_DAVISVidevo \
    --davis_root /path/to/your/training/data/ \
    --validation_root /path/to/your/validation/data \
    --savepath ./wandb_save_dir
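# Note: torch.distributed.run starts one process per GPU; to train on more
# GPUs, list them in CUDA_VISIBLE_DEVICES and set --nproc_per_node to match.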

To Do

Citation

If our work is useful for your research, please consider citing:

@inproceedings{yang2024colormnet,
    author = {Yang, Yixin and Dong, Jiangxin and Tang, Jinhui and Pan, Jinshan},
    title = {ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization},
    booktitle = {ECCV},
    year = {2024}
}

License

This project is licensed under [CC BY-NC-SA 4.0](https://github.com/yyang181/colormnet/blob/main/LICENSE), while some methods adopted in this project are under other licenses; please refer to LICENSES.md for details. Redistribution and use should follow this license.

Acknowledgement

This project is based on XMem. Some code is borrowed from DINOv2. Thanks for their awesome work.

Contact

This repo is currently maintained by Yixin Yang (@yyang181) and is for academic research use only.