FlashAvatar

Paper | Project Page

Given a monocular video sequence, FlashAvatar reconstructs a high-fidelity digital avatar in minutes. The avatar can be animated and rendered at over 300 FPS at a resolution of 512×512 on an Nvidia RTX 3090.

Setup

This code has been tested on an Nvidia RTX 3090.

Create the environment:

conda env create --file environment.yml
conda activate FlashAvatar

Install PyTorch3D:

conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install -c bottler nvidiacub
conda install pytorch3d -c pytorch3d
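
After installation, a quick check that the packages can be found in the active environment may save a failed run later. The following is a minimal sketch, not part of this repository: the helper name `missing_packages` is hypothetical, and the import names `pytorch3d`, `fvcore`, and `iopath` are assumed to match the conda package names above.

```python
import importlib.util

# Import names assumed to correspond to the conda packages installed above.
REQUIRED = ("pytorch3d", "fvcore", "iopath")


def missing_packages(names=REQUIRED):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check only confirms that the modules are discoverable; it does not verify CUDA support or version compatibility.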

Data Convention

The data is organized in the following form:

dataset
├── <id1_name>
│   ├── alpha # raw alpha prediction
│   ├── imgs # extracted video frames
│   ├── parsing # semantic segmentation
├── <id2_name>
...
metrical-tracker
├── output
│   ├── <id1_name>
│   │   ├── checkpoint
│   ├── <id2_name>
...
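
Before training or testing, it can help to confirm that an identity follows this layout. The sketch below is an illustration only, not a script shipped with the repository: `missing_dirs` is a hypothetical helper, and it assumes that `checkpoint` is a directory and that both `dataset` and `metrical-tracker` sit under the same root.

```python
from pathlib import Path

# Sub-folders listed in the data convention above.
DATASET_SUBDIRS = ("alpha", "imgs", "parsing")


def missing_dirs(idname, root="."):
    """Return the expected directories for `idname` that do not exist under `root`."""
    root = Path(root)
    expected = [root / "dataset" / idname / sub for sub in DATASET_SUBDIRS]
    # Tracking results are assumed to live in a `checkpoint` directory.
    expected.append(root / "metrical-tracker" / "output" / idname / "checkpoint")
    return [path for path in expected if not path.is_dir()]
```

An empty list means every expected folder was found; otherwise the returned paths show what still needs to be prepared.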

Running

Evaluate a pre-trained model:

python test.py --idname <id_name> --checkpoint dataset/<id_name>/log/ckpt/chkpnt.pth

Train on your own data:

python train.py --idname <id_name>
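
When processing several identities, the two commands above can be assembled programmatically. This is a rough sketch under stated assumptions: the helper names `checkpoint_path`, `train_command`, and `eval_command` are hypothetical, and the checkpoint location simply mirrors the path shown in the test command.

```python
import sys


def checkpoint_path(idname):
    """Checkpoint location, following the path used in the test command above."""
    return f"dataset/{idname}/log/ckpt/chkpnt.pth"


def train_command(idname):
    """Argument list equivalent to: python train.py --idname <id_name>"""
    return [sys.executable, "train.py", "--idname", idname]


def eval_command(idname):
    """Argument list equivalent to the test.py invocation above."""
    return [sys.executable, "test.py", "--idname", idname,
            "--checkpoint", checkpoint_path(idname)]
```

Each list can be passed to `subprocess.run(..., check=True)` from the repository root, for example `subprocess.run(train_command("my_id"), check=True)`, where `my_id` is a placeholder identity name.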

Download the example, which includes pre-processed data and a pre-trained model, to try it out.

Citation

@inproceedings{xiang2024flashavatar,
      author    = {Jun Xiang and Xuan Gao and Yudong Guo and Juyong Zhang},
      title     = {FlashAvatar: High-fidelity Head Avatar with Efficient Gaussian Embedding},
      booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      year      = {2024},
  }