Single-Shot Motion Completion with Transformer

:point_right:[Preprint](https://arxiv.org/abs/2103.00776):point_left:

<img src="images/pipeline.png" width="600px" alt="pipline" />

<u>Abstract</u>

Motion completion is a challenging and long-standing problem of great significance in film and game applications. For different motion completion scenarios (in-betweening, in-filling, and blending), most previous methods handle the completion problem with case-by-case designs. In this work, we propose a simple but effective method that solves multiple motion completion problems under a unified framework and achieves new state-of-the-art accuracy under multiple evaluation settings. Inspired by the recent success of attention-based models, we cast completion as a sequence-to-sequence prediction problem. Our method consists of two modules: a standard transformer encoder with self-attention that learns long-range dependencies of the input motion, and a trainable mixture embedding module that models temporal information and discriminates key-frames. Our method runs in a non-autoregressive manner and predicts multiple missing frames within a single forward propagation in real time. We also show the effectiveness of our method in music-dance applications.
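To make the idea above concrete, here is a minimal PyTorch sketch (not the released implementation) of a single-shot completion model: key-frames are projected into the encoder, missing frames are represented by a learnable placeholder token, a trainable embedding injects temporal position plus key-frame/missing identity, and a standard transformer encoder predicts every frame in one forward pass. All module names, dimensions, and hyperparameters below are assumptions for illustration only.

```python
# Minimal sketch (not the official implementation) of single-shot motion completion
# with a standard transformer encoder. Missing frames are filled from a learnable
# "unknown" token; a learned temporal embedding and a key-frame/missing type
# embedding play the role of the mixture embedding described above.
import torch
import torch.nn as nn

class MotionCompletionTransformer(nn.Module):
    def __init__(self, pose_dim=132, d_model=256, n_heads=8, n_layers=6, max_len=128):
        super().__init__()
        self.input_proj = nn.Linear(pose_dim, d_model)
        self.unknown_token = nn.Parameter(torch.zeros(1, 1, d_model))    # placeholder for missing frames
        self.pos_embed = nn.Parameter(torch.zeros(1, max_len, d_model))  # learned temporal embedding
        self.type_embed = nn.Embedding(2, d_model)                       # 0 = missing, 1 = key-frame
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=1024, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.output_proj = nn.Linear(d_model, pose_dim)

    def forward(self, poses, keyframe_mask):
        # poses: (B, T, pose_dim); keyframe_mask: (B, T) bool, True at known key-frames
        B, T, _ = poses.shape
        x = self.input_proj(poses)
        x = torch.where(keyframe_mask.unsqueeze(-1), x, self.unknown_token.expand(B, T, -1))
        x = x + self.pos_embed[:, :T] + self.type_embed(keyframe_mask.long())
        h = self.encoder(x)            # self-attention over the whole sequence
        return self.output_proj(h)     # all frames predicted in a single forward pass


# Usage: in-betweening a 30-frame gap given 10 past frames and 1 future key-frame.
model = MotionCompletionTransformer()
poses = torch.randn(1, 41, 132)
mask = torch.zeros(1, 41, dtype=torch.bool)
mask[:, :10] = True
mask[:, -1] = True
completed = model(poses, mask)         # (1, 41, 132)
```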

<u>State of the Art on the LaFAN1 Dataset</u>

With the help of the Transformer, we achieve a new state-of-the-art result on the LaFAN1 dataset.

| Lengths = 30 | L2Q  | L2P  | NPSS   |
|--------------|------|------|--------|
| Zero-Vel     | 1.51 | 6.60 | 0.2318 |
| Interp.      | 0.98 | 2.32 | 0.2013 |
| ERD-QV       | 0.69 | 1.28 | 0.1328 |
| Ours         | 0.61 | 1.10 | 0.1222 |
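For reference, L2Q and L2P are the average L2 errors of the global joint quaternions and global joint positions over the transition frames, and NPSS is the normalized power spectrum similarity. The sketch below shows how L2Q and L2P could be computed; it is not the official benchmark script and omits the dataset-specific normalization statistics.

```python
# Hedged sketch of the L2Q / L2P metrics: per-frame L2 norm of the error over all
# joints, averaged over frames and sequences. Array shapes are assumptions.
import numpy as np

def l2q(pred_quats, gt_quats):
    # pred_quats, gt_quats: (N, T, J, 4) global joint rotations as unit quaternions
    return np.mean(np.sqrt(np.sum((pred_quats - gt_quats) ** 2, axis=(2, 3))))

def l2p(pred_pos, gt_pos):
    # pred_pos, gt_pos: (N, T, J, 3) global joint positions (normalized as in the benchmark)
    return np.mean(np.sqrt(np.sum((pred_pos - gt_pos) ** 2, axis=(2, 3))))
```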

<img src="images/lafan1_demo_000.gif" width="300px" alt="demo0" /> <img src="images/lafan1_demo_001.gif" width="300px" alt="demo1" />

<img src="images/lafan1_demo_002.gif" width="300px" alt="demo2" /> <img src="images/lafan1_demo_003.gif" width="300px" alt="demo3" />

<img src="images/lafan1_demo_004.gif" width="300px" alt="demo4" /> <img src="images/lafan1_demo_005.gif" width="300px" alt="demo5" />

<u>Dance In-filling on the Anidance Dataset</u>

We also evaluate our method on the Anidance dataset:

(Test-set keyframes. From left to right: Ours, Interp., and Ground Truth)

<img src="images/anidance_demo_test_00.gif" width="500px" alt="demo6" /> <img src="images/anidance_demo_test_01.gif" width="500px" alt="demo7" />

(Randomly sampled keyframes. From left to right: Ours, Interp., and Ground Truth)

<img src="images/anidance_demo_random_00.gif" width="500px" alt="demo8" /> <img src="images/anidance_demo_random_01.gif" width="500px" alt="demo9" />

<u>Dance Blending</u>

Our method also works on complex dance movement completion:

<img src="images/blending_00.gif" width="600px" alt="demo10" /> <img src="images/blending_01.gif" width="600px" alt="demo11" />

<u>Code</u>

Coming soon

<u>Citation</u>

    @misc{duan2021singleshot,
          title={Single-Shot Motion Completion with Transformer},
          author={Yinglin Duan and Tianyang Shi and Zhengxia Zou and Yenan Lin and Zhehui Qian and Bohan Zhang and Yi Yuan},
          year={2021},
          eprint={2103.00776},
          archivePrefix={arXiv},
          primaryClass={cs.CV}
    }