VideoScore

This is the official repo for our EMNLP 2024 paper "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation".

<a target="_blank" href="https://arxiv.org/abs/2406.15252"> <img style="height:22pt" src="https://img.shields.io/badge/-Paper-red?style=flat&logo=arxiv"></a> <a target="_blank" href="https://github.com/TIGER-AI-Lab/VideoScore"> <img style="height:22pt" src="https://img.shields.io/badge/-Code-green?style=flat&logo=github"></a> <a target="_blank" href="https://tiger-ai-lab.github.io/VideoScore/"> <img style="height:22pt" src="https://img.shields.io/badge/-🌐%20Website-blue?style=flat"></a> <a target="_blank" href="https://huggingface.co/datasets/TIGER-Lab/VideoFeedback"> <img style="height:22pt" src="https://img.shields.io/badge/-🤗%20Dataset-red?style=flat"></a> <a target="_blank" href="https://huggingface.co/spaces/TIGER-Lab/VideoScore"> <img style="height:22pt" src="https://img.shields.io/badge/-🤗%20Demo-red?style=flat"></a> <a target="_blank" href="https://huggingface.co/TIGER-Lab/VideoScore"> <img style="height:22pt" src="https://img.shields.io/badge/-🤗%20Models-red?style=flat"></a> <a target="_blank" href="https://twitter.com/DongfuJiang/status/1805438506137010326"> <img style="height:22pt" src="https://img.shields.io/badge/-Tweet-blue?style=flat&logo=twitter"></a> <br>

News

[2024-11-28] Try our new version, VideoScore-v1.1, which performs better on the "text-to-video alignment" subscore and now supports 48 frames at inference!

[2024-08-05] We released the Wandb training curves of VideoScore and VideoScore-anno-only to help reproduce the training results.

Introduction

<video src="https://user-images.githubusercontent.com/105091430/90adfb70-fdff-4101-9207-9bd4f43aae4c.mp4"></video>

🚀 Recent years have witnessed great advances in video generation. However, the development of automatic video metrics lags significantly behind. None of the existing metrics can provide reliable scores for generated videos. 🤔 The main barrier is the lack of a large-scale human-annotated dataset.

Installation

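# base installation (run from the repository root)
pip install -e .
# extra dependencies for evaluation
pip install -e .[eval]
# for training, install Mantis as well
git clone https://github.com/TIGER-AI-Lab/Mantis
cd Mantis
pip install -e .[train,eval]
pip install flash-attn --no-build-isolation
# then training scripts are in Mantis/train/scripts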

Dataset

Model

Inference examples

cd examples
python run_videoscore.py
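
When preparing your own videos, it can help to cap the number of frames passed to the model, since VideoScore-v1.1 supports up to 48 frames at inference. The sketch below shows one possible way to pick evenly spaced frame indices. It is an illustration only: `sample_frame_indices` is a hypothetical helper, not part of this repo, and the uniform-sampling strategy is an assumption rather than a description of what `run_videoscore.py` does.

```python
def sample_frame_indices(total_frames: int, max_frames: int = 48) -> list[int]:
    """Return up to `max_frames` evenly spaced frame indices.

    Hypothetical helper for illustration; the default of 48 mirrors the
    frame limit mentioned for VideoScore-v1.1, and uniform sampling is an
    assumption, not the repo's documented behavior.
    """
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    if max_frames <= 0:
        raise ValueError("max_frames must be positive")

    # Short videos: keep every frame.
    if total_frames <= max_frames:
        return list(range(total_frames))

    # Long videos: take one frame per equal-width slice of the video.
    step = total_frames / max_frames
    return [int(i * step) for i in range(max_frames)]


if __name__ == "__main__":
    indices = sample_frame_indices(total_frames=480)
    print(len(indices), indices[:5])
```

The returned indices can then be used with whatever video reader you prefer to extract the frames before scoring.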

Evaluation

For details, please check benchmark/README.md

Training

For details, please check training/README.md

Acknowledgement

Citation

@article{he2024videoscore,
  title = {VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation},
  author = {He, Xuan and Jiang, Dongfu and Zhang, Ge and Ku, Max and Soni, Achint and Siu, Sherman and Chen, Haonan and Chandra, Abhranil and Jiang, Ziyan and Arulraj, Aaran and Wang, Kai and Do, Quy Duc and Ni, Yuansheng and Lyu, Bohan and Narsupalli, Yaswanth and Fan, Rongqi and Lyu, Zhiheng and Lin, Yuchen and Chen, Wenhu},
  journal = {ArXiv},
  year = {2024},
  volume={abs/2406.15252},
  url = {https://arxiv.org/abs/2406.15252},
}