
<p align="center"> <h2 align="center">Animate-X: Universal Character Image Animation with Enhanced Motion Representation</h2> <p align="center"> <a href=""><strong>Shuai Tan</strong></a> · <a href="https://scholar.google.com/citations?user=BwdpTiQAAAAJ"><strong>Biao Gong</strong></a><sup>†</sup> · <a href="https://scholar.google.com/citations?user=cQbXvkcAAAAJ"><strong>Xiang Wang</strong></a> · <a href="https://scholar.google.com/citations?user=ZO3OQ-8AAAAJ"><strong>Shiwei Zhang</strong></a> <br> <a href="https://openreview.net/profile?id=~DanDan_Zheng1"><strong>Dandan Zheng</strong></a> · <a href="https://scholar.google.com.hk/citations?user=S8FmqTUAAAAJ"><strong>Ruobing Zheng</strong></a> · <a href="https://scholar.google.com/citations?user=hMDQifQAAAAJ"><strong>Kecheng Zheng</strong></a> · <a href="https://openreview.net/profile?id=~Jingdong_Chen1"><strong>Jingdong Chen</strong></a> · <a href="https://openreview.net/profile?id=~Ming_Yang2"><strong>Ming Yang</strong></a> <br> <br> <a href="https://arxiv.org/abs/2410.10306"><img src='https://img.shields.io/badge/arXiv-Animate--X-red' alt='Paper PDF'></a> <a href='https://lucaria-academy.github.io/Animate-X/'><img src='https://img.shields.io/badge/Project_Page-Animate--X-blue' alt='Project Page'></a> <a href='https://mp.weixin.qq.com/s/vDR4kPLqnCUwfPiBNKKV9A'><img src='https://badges.aleen42.com/src/wechat.svg'></a> <a href='https://huggingface.co/Shuaishuai0219/Animate-X'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-Model-yellow'></a> <br> <b>Ant Group &nbsp; | &nbsp; Tongyi Lab</b> <br> </p> </p>

This repository is the official implementation of the paper "Animate-X: Universal Character Image Animation with Enhanced Motion Representation". Animate-X is a universal animation framework based on latent diffusion models that supports a wide range of character types (collectively named X), including anthropomorphic characters.

<table align="center"> <tr> <td> <img src="https://github.com/user-attachments/assets/fb2f4396-341f-4206-8d70-44d8b034f810"> </td> </tr> </table>

📌 Updates

<!-- <video controls loop src="https://cloud.video.taobao.com/vod/vs4L24EAm6IQ5zM3SbN5AyHCSqZIXwmuobrzqNztMRM.mp4" muted="false"></video> -->

🌄 Gallery

Introduction

<table class="center"> <tr> <td width=47% style="border: none"> <video controls loop src="https://github.com/user-attachments/assets/085b70c4-cb68-4ac1-b45f-ed7f1c75bd5c" muted="false"></video> </td> <td width=53% style="border: none"> <video controls loop src="https://github.com/user-attachments/assets/f6275c0d-fbca-43b4-b6d6-cf095723729e" muted="false"></video> </td> </tr> </table>

Animations produced by Animate-X

<table class="center"> <tr> <td width=50% style="border: none"> <video controls loop src="https://github.com/user-attachments/assets/732a3445-2054-4e7b-9c2d-9db21c39771e" muted="false"></video> </td> <td width=50% style="border: none"> <video controls loop src="https://github.com/user-attachments/assets/f25af02c-e5be-4cab-ae64-c9e0b392643a" muted="false"></video> </td> </tr> </table>

🚀 Installation

Install with conda:

```bash
conda env create -f environment.yaml
conda activate animate-x
```

🚀 Download Checkpoints

Download the Animate-X checkpoints and put all files in the `./checkpoints` directory, which should look like this:

```
./checkpoints/
|---- animate-x.pth
|---- dw-ll_ucoco_384.onnx
|---- open_clip_pytorch_model.bin
|---- v2-1_512-ema-pruned.ckpt
└---- yolox_l.onnx
```
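If you prefer to script the download, the minimal sketch below pulls everything from the Hugging Face repository linked in the header badge with `huggingface_hub`; this is an optional convenience, not part of the official setup, and it assumes all the files listed above live in that repo.

```python
# Optional sketch: download the Animate-X checkpoints from Hugging Face.
# Assumes the files listed above are hosted in the Shuaishuai0219/Animate-X
# repo (the one linked in the header badge).
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Shuaishuai0219/Animate-X",
    local_dir="checkpoints",  # matches the ./checkpoints layout shown above
)
```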

💡 Inference

The default inputs are an image (`.jpg`) and a dance video (`.mp4`). The default output is a 32-frame video (`.mp4`) at 768x512 resolution, which will be saved in the `./results` directory.

1. Pre-process the driving video:

   ```bash
   python process_data.py \
       --source_video_paths data/videos \
       --saved_pose_dir data/saved_pkl \
       --saved_pose data/saved_pose \
       --saved_frame_dir data/saved_frames
   ```

2. Run Animate-X:

   ```bash
   python inference.py --cfg configs/Animate_X_infer.yaml
   ```

   Both steps can also be chained in a single script; see the optional sketch below.

Several key parameters in the `.yaml` configuration file control the output; for example, users can adjust `max_frames` or the sampling interval of the dance video to generate videos of varying durations or speeds.
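As a rough illustration, the sketch below loads the inference config, shortens the clip, and writes a variant file to pass to `inference.py`. Only `max_frames` is mentioned in this README; any other key names would be assumptions about the config layout, so nothing else is touched.

```python
# Minimal sketch: tweak one documented config key and save a variant config.
# "max_frames" is named in this README; other keys are left alone because
# their names would be assumptions about the config layout.
import yaml

with open("configs/Animate_X_infer.yaml") as f:
    cfg = yaml.safe_load(f)

cfg["max_frames"] = 16  # fewer frames -> a shorter output clip

with open("configs/Animate_X_infer_short.yaml", "w") as f:
    yaml.safe_dump(cfg, f)

# Then run: python inference.py --cfg configs/Animate_X_infer_short.yaml
```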

✔ Some tips:

Although Animate-X does not rely on strict pose alignment and no manual alignment was performed for any of the results in the paper, we cannot guarantee that every case is perfect. Users can therefore apply manual pose alignment themselves, e.g., an overall x/y translation and scaling of the pose skeleton in each frame so that it matches the position of the subject in the reference image, and place the result in `data/saved_pose`. A rough sketch of such a transform follows.
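The snippet below is only a template for that kind of global translation/scaling: the pickle layout it assumes (a list of `(N, 2)` arrays of x/y pixel coordinates per frame, and the file name `example.pkl`) is hypothetical, so adapt the loading and saving code to the actual format of the files in `data/saved_pkl` / `data/saved_pose`.

```python
# Rough sketch of manual pose alignment: scale and translate 2D keypoints so the
# skeleton overlaps the subject in the reference image. The pickle layout
# (a list of (N, 2) x/y arrays per frame) and the file name are assumptions;
# adapt them to the real format produced by process_data.py.
import pickle
import numpy as np

def align_pose(keypoints, scale=1.0, dx=0.0, dy=0.0):
    """Scale keypoints about the image origin, then shift by (dx, dy) pixels."""
    kps = np.asarray(keypoints, dtype=np.float32)
    return kps * scale + np.array([dx, dy], dtype=np.float32)

with open("data/saved_pkl/example.pkl", "rb") as f:  # hypothetical file name
    frames = pickle.load(f)                           # assumed: list of (N, 2) arrays

aligned = [align_pose(kps, scale=0.9, dx=15, dy=-10) for kps in frames]
```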

📧 Acknowledgement

Our implementation is based on UniAnimate, MimicMotion, and MusePose. Thanks for their remarkable contributions and released code! If we have missed any open-source projects or related articles, we will gladly add them to the acknowledgements right away.

⚖ License

This repository is released under the Apache-2.0 license as found in the LICENSE file.

📚 Citation

If you find this codebase useful for your research, please cite the following entries.

```bibtex
@article{AnimateX2025,
  title={Animate-X: Universal Character Image Animation with Enhanced Motion Representation},
  author={Tan, Shuai and Gong, Biao and Wang, Xiang and Zhang, Shiwei and Zheng, Dandan and Zheng, Ruobing and Zheng, Kecheng and Chen, Jingdong and Yang, Ming},
  journal={arXiv preprint arXiv:2410.10306},
  year={2025}
}

@article{Mimir2025,
  title={Mimir: Improving Video Diffusion Models for Precise Text Understanding},
  author={Tan, Shuai and Gong, Biao and Feng, Yutong and Zheng, Kecheng and Zheng, Dandan and Shi, Shuwei and Shen, Yujun and Chen, Jingdong and Yang, Ming},
  journal={arXiv preprint arXiv:2412.03085},
  year={2025}
}
```