<p align="center"> <img width="55%" alt="VideoSys" src="./assets/figures/logo.png?raw=true"> </p> <h3 align="center"> An easy and efficient system for video generation </h3> <p align="center">| <a href="https://github.com/NUS-HPC-AI-Lab/VideoSys?tab=readme-ov-file#installation">Quick Start</a> | <a href="https://github.com/NUS-HPC-AI-Lab/VideoSys?tab=readme-ov-file#usage">Supported Models</a> | <a href="https://github.com/NUS-HPC-AI-Lab/VideoSys?tab=readme-ov-file#acceleration-techniques">Accelerations</a> | <a href="https://discord.gg/WhPmYm9FeG">Discord</a> | <a href="https://oahzxl.notion.site/VideoSys-News-42391db7e0a44f96a1f0c341450ae472?pvs=4">Media</a> | <a href="https://huggingface.co/VideoSys">HuggingFace Space</a> | </p>
Allegro
We release an Allegro multi-card inference demo here. VideoSys supports context-parallel inference and Pyramid Attention Broadcast (PAB), both of which substantially reduce inference time. With 8xH800, Allegro can generate a 100-step, 88-frame 720p (720x1280) video in 3 minutes. With PAB enabled as well, the generation time drops to 2 minutes.
Note that Allegro only supports context-parallel inference with 8, 4, or 2 cards. The context-parallel number must be a factor of both the attention head dim (24) and the context length (79,200). You just need to pass the context-parallel number via --num-gpus. For more details, please refer to examples/allegro/sample.py.
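The card-count constraint above can be checked before launching a job. The following is a minimal sketch, not part of VideoSys: the function name and constants are hypothetical, and the values (24, 79,200, and the supported counts 8, 4, 2) are taken from the note above.

```python
# Hypothetical helper, not part of VideoSys: checks a GPU count against
# the constraints stated in this README.
ATTENTION_HEAD_DIM = 24          # value quoted in the note above
CONTEXT_LENGTH = 79_200          # value quoted in the note above
SUPPORTED_CARD_COUNTS = (8, 4, 2)  # card counts the note lists as supported


def is_valid_context_parallel_size(num_gpus: int) -> bool:
    """Return True if num_gpus is a supported context-parallel size."""
    if num_gpus not in SUPPORTED_CARD_COUNTS:
        return False
    # The size must divide both quantities evenly.
    return ATTENTION_HEAD_DIM % num_gpus == 0 and CONTEXT_LENGTH % num_gpus == 0


if __name__ == "__main__":
    for candidate in (1, 2, 3, 4, 8, 16):
        print(candidate, is_valid_context_parallel_size(candidate))
```

Note that a value such as 3 divides both numbers but is still rejected here, because the note above only lists 8, 4, and 2 as supported.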
See the model weights and the original GitHub repository here.
Quick start
- Install the requirements following the guidelines of VideoSys, the original repo.
- Run inference with
python examples/allegro/sample.py
For consistency with the original repo, I hard-coded all the parameters. You can customize the prompt, steps, and other parameters in examples/allegro/sample.py.
Thanks again to the original authors of VideoSys!
Latest News 🔥
- [2024/09] Support CogVideoX, Vchitect-2.0 and Open-Sora-Plan v1.2.0.
- [2024/08] 🔥 Evolve from OpenDiT to <b>VideoSys: An easy and efficient system for video generation</b>.
- [2024/08] 🔥 Release PAB paper: <b>Real-Time Video Generation with Pyramid Attention Broadcast</b>.
- [2024/06] 🔥 Propose Pyramid Attention Broadcast (PAB)[paper][blog][doc], the first approach to achieve <b>real-time</b> DiT-based video generation, delivering <b>negligible quality loss</b> without <b>requiring any training</b>.
- [2024/06] Support Open-Sora-Plan and Latte.
- [2024/03] 🔥 Propose Dynamic Sequence Parallel (DSP)[paper][doc], which achieves 3x speed for training and 2x speed for inference in Open-Sora compared with state-of-the-art sequence parallelism.
- [2024/03] Support Open-Sora.
- [2024/02] 🎉 Release OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference.
About
VideoSys is an open-source project that provides a user-friendly and high-performance infrastructure for video generation. This comprehensive toolkit will support the entire pipeline from training and inference to serving and compression.
We are committed to continually integrating cutting-edge open-source video models and techniques. Stay tuned for exciting enhancements and new features on the horizon!
Installation
Prerequisites:
- Python >= 3.10
- PyTorch >= 1.13 (we recommend a version above 2.0)
- CUDA >= 11.6
We strongly recommend using Anaconda to create a new environment (Python >= 3.10) to run our examples:
conda create -n videosys python=3.10 -y
conda activate videosys
Install VideoSys:
git clone https://github.com/NUS-HPC-AI-Lab/VideoSys
cd VideoSys
pip install -e .
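After installing, it can help to confirm that the environment meets the prerequisites listed above. The script below is a minimal sketch under stated assumptions: the helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical and not part of VideoSys, and the thresholds are the ones from the prerequisites list. It relies only on `torch.__version__` and `torch.version.cuda`, and reports a failure instead of crashing if PyTorch is not installed.

```python
import re
import sys


def parse_version(text: str) -> tuple:
    """Turn a string such as '2.1.0+cu118' into (2, 1, 0)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(found: str, minimum: str) -> bool:
    """Compare two dotted version strings numerically."""
    return parse_version(found) >= parse_version(minimum)


def check_environment() -> dict:
    """Report which of the README prerequisites are satisfied."""
    report = {"python": sys.version_info[:2] >= (3, 10)}
    try:
        import torch
    except ImportError:
        report["torch"] = False
        report["cuda"] = False
        return report
    report["torch"] = meets_minimum(torch.__version__, "1.13")
    cuda = torch.version.cuda  # None for CPU-only builds
    report["cuda"] = cuda is not None and meets_minimum(cuda, "11.6")
    return report


if __name__ == "__main__":
    print(check_environment())
```

Each entry in the printed dictionary is `True` when the corresponding prerequisite is met.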
Usage
VideoSys supports many diffusion models with our various acceleration techniques, enabling these models to run faster and consume less memory.
<b>You can find all available models and their supported acceleration techniques in the following table. Click Code to see how to use them.</b>
You can also find easy demos on HuggingFace Space <a href="https://huggingface.co/VideoSys">[link]</a> and Gradio <a href="./gradio">[link]</a>.
Acceleration Techniques
Pyramid Attention Broadcast (PAB) [paper][blog][doc]
Real-Time Video Generation with Pyramid Attention Broadcast
Authors: Xuanlei Zhao<sup>1*</sup>, Xiaolong Jin<sup>2*</sup>, Kai Wang<sup>1*</sup>, and Yang You<sup>1</sup> (* indicates equal contribution)
<sup>1</sup>National University of Singapore, <sup>2</sup>Purdue University
PAB is the first approach to achieve <b>real-time</b> DiT-based video generation, delivering <b>lossless quality</b> without <b>requiring any training</b>. By mitigating redundant attention computation, PAB achieves up to 21.6 FPS with 10.6x acceleration, without sacrificing quality across popular DiT-based video generation models including Open-Sora, Latte and Open-Sora-Plan.
See its details here.
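To make the idea of skipping redundant attention computation concrete, here is a toy sketch. It is not the VideoSys implementation or API: the class name, the attention-type labels, and the broadcast ranges are all hypothetical, and it assumes, as the name "pyramid" suggests, that different attention types reuse a cached output for different numbers of diffusion steps. The real schedule and ranges are described in the paper and docs.

```python
from typing import Callable, Dict


class BroadcastCache:
    """Toy cache: recompute an attention output only every `broadcast_range` steps."""

    def __init__(self, broadcast_range: int):
        self.broadcast_range = broadcast_range
        self._cached = None
        self._last_computed_step = None
        self.compute_count = 0

    def __call__(self, step: int, compute: Callable[[], object]) -> object:
        stale = (
            self._last_computed_step is None
            or step - self._last_computed_step >= self.broadcast_range
        )
        if stale:
            # Run the real computation and remember its result.
            self._cached = compute()
            self._last_computed_step = step
            self.compute_count += 1
        # Otherwise the previous output is reused ("broadcast") for this step.
        return self._cached


def count_attention_calls(num_steps: int) -> Dict[str, int]:
    # Hypothetical ranges, chosen only for illustration.
    caches = {
        "spatial": BroadcastCache(2),
        "temporal": BroadcastCache(4),
        "cross": BroadcastCache(6),
    }
    for step in range(num_steps):
        for name, cache in caches.items():
            # The lambda stands in for an actual attention forward pass.
            cache(step, lambda: f"{name}-attention@{step}")
    return {name: cache.compute_count for name, cache in caches.items()}


if __name__ == "__main__":
    print(count_attention_calls(12))
    # {'spatial': 6, 'temporal': 3, 'cross': 2}
```

Without caching, each attention type would be computed 12 times over 12 steps. In this toy setup the counts drop to 6, 3, and 2.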
Dynamic Sequence Parallelism (DSP) [paper][doc]
DSP is a novel, elegant, and highly efficient sequence parallelism method for Open-Sora, Latte, and other multi-dimensional transformer architectures.
It achieves 3x speed for training and 2x speed for inference in Open-Sora compared with state-of-the-art sequence parallelism (DeepSpeed Ulysses). For a 10s (80 frames) 512x512 video, the inference latency of Open-Sora is:
| Method | 1xH800 | 8xH800 (DS Ulysses) | 8xH800 (DSP) |
|---|---|---|---|
| Latency (s) | 106 | 45 | 22 |
See its details here.
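The speedups implied by the latency table above can be worked out directly. This is a small arithmetic sketch using only the three latencies from the table. The dictionary keys and the `speedup` helper are illustrative names.

```python
# Inference latencies in seconds, copied from the table above.
latency_s = {
    "1xH800": 106,
    "8xH800 (DS Ulysses)": 45,
    "8xH800 (DSP)": 22,
}


def speedup(baseline: str, candidate: str) -> float:
    """How many times faster `candidate` is than `baseline`."""
    return latency_s[baseline] / latency_s[candidate]


if __name__ == "__main__":
    print(round(speedup("8xH800 (DS Ulysses)", "8xH800 (DSP)"), 2))  # 2.05
    print(round(speedup("1xH800", "8xH800 (DSP)"), 2))               # 4.82
    print(round(speedup("1xH800", "8xH800 (DS Ulysses)"), 2))        # 2.36
```

The first figure, about 2.05x, corresponds to the "2x speed for inference" claim relative to DeepSpeed Ulysses.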
Contributing
We welcome and value any contributions and collaborations. Please check out CONTRIBUTING.md for how to get involved.
Contributors
<a href="https://github.com/NUS-HPC-AI-Lab/VideoSys/graphs/contributors"> <img src="https://contrib.rocks/image?repo=NUS-HPC-AI-Lab/VideoSys"/> </a>
Star History
Citation
@misc{videosys2024,
author={VideoSys Team},
title={VideoSys: An Easy and Efficient System for Video Generation},
year={2024},
publisher={GitHub},
url={https://github.com/NUS-HPC-AI-Lab/VideoSys},
}
@misc{zhao2024pab,
title={Real-Time Video Generation with Pyramid Attention Broadcast},
author={Xuanlei Zhao and Xiaolong Jin and Kai Wang and Yang You},
year={2024},
eprint={2408.12588},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2408.12588},
}
@misc{zhao2024dsp,
title={DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers},
author={Xuanlei Zhao and Shenggan Cheng and Chang Chen and Zangwei Zheng and Ziming Liu and Zheming Yang and Yang You},
year={2024},
eprint={2403.10266},
archivePrefix={arXiv},
primaryClass={cs.DC},
url={https://arxiv.org/abs/2403.10266},
}
@misc{zhao2024opendit,
author={Xuanlei Zhao and Zhongkai Zhao and Ziming Liu and Haotian Zhou and Qianli Ma and Yang You},
title={OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference},
year={2024},
publisher={GitHub},
url={https://github.com/NUS-HPC-AI-Lab/VideoSys/tree/v1.0.0},
}