Shape-aware Text-driven Layered Video Editing [CVPR 2023]

Yao-Chih Lee, Ji-Ze G. Jang, Yi-Ting Chen, Elizabeth Qiu, Jia-Bin Huang

[Webpage] [Paper]

Environment

git clone --recursive https://github.com/text-video-edit/shape-aware-text-driven-layered-video-editing-release.git
cd shape-aware-text-driven-layered-video-editing-release
pip install -r requirements.txt
./scripts/setup_esrgan.sh

Data structure

For an input video, the required data and directory structure are listed below. The NLA checkpoint and configuration files are obtained from Layered Neural Atlases.

DATA_DIR/
├── images/
│   └── *.png or *.jpg
├── masks/
│   └── *.png
└── pretrained_nla_models/
    ├── checkpoint
    └── config.json

Each edit case is saved in an EDIT_DIR, which is placed under DATA_DIR. We provide some examples in the data directory.

For instance, DATA_DIR=data/car-turn and EDIT_DIR=data/car-turn/edit_sports_car.
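Before running, it can help to verify that DATA_DIR matches the layout above. Below is a minimal sketch of such a check; the helper name `check_data_dir` and the specific messages are illustrative and not part of this release.

```python
import json
from pathlib import Path

def check_data_dir(data_dir: str) -> list[str]:
    """Return a list of problems with the expected DATA_DIR layout (empty if OK)."""
    root = Path(data_dir)
    problems = []

    # images/ must contain the video frames as *.png or *.jpg
    images = root / "images"
    if not (list(images.glob("*.png")) + list(images.glob("*.jpg"))):
        problems.append("images/: no *.png or *.jpg frames found")

    # masks/ must contain per-frame *.png masks
    if not list((root / "masks").glob("*.png")):
        problems.append("masks/: no *.png masks found")

    # pretrained_nla_models/ must hold the NLA checkpoint and its config
    nla = root / "pretrained_nla_models"
    if not (nla / "checkpoint").exists():
        problems.append("pretrained_nla_models/checkpoint missing")
    cfg = nla / "config.json"
    if not cfg.exists():
        problems.append("pretrained_nla_models/config.json missing")
    else:
        try:
            json.loads(cfg.read_text())
        except json.JSONDecodeError:
            problems.append("pretrained_nla_models/config.json is not valid JSON")

    return problems
```

For example, `check_data_dir("data/car-turn")` should return an empty list for a correctly prepared case.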

Running

Acknowledgements

We thank the authors of Layered Neural Atlases, Text2LIVE, Stable-DreamFusion, and Real-ESRGAN for releasing their code.