Home

Awesome

MoDi: Unconditional Motion Synthesis from Diverse Data

<!-- ![alt text](coming_soon.png) -->

Python Pytorch

This repository provides a library for unconditional motion synthesis from diverse data, as well as applications including interpolation and semantic editing in the latent space, inversion, and code for quantitative evaluation. It is based on our work MoDi: Unconditional Motion Synthesis from Diverse Data.

<div> <span class="center"> <img src="https://github.com/sigal-raab/sigal-raab.github.io/tree/main/images/jump_teaser.gif" style="width: 20vw;"> <img src="images/walk_teaser.gif" style="width: 20vw;"> <img src="images/dance_teaser.gif" style="width: 20vw;"> <img src="images/salto_teaser.gif" style="width: 20vw;"> </span> </div>

The library is still under development.

Prerequisites

This code has been tested under Ubuntu 16.04 and Cuda 10.2. Before starting, please configure your Conda environment by

conda env create --name MoDi --file environment.yaml
conda activate MoDi

or by

conda create -n MoDi python=3.6.10
conda activate MoDi
conda install -y pytorch==1.5.0 torchvision==0.6.0 -c pytorch
conda install -y -c conda-forge tqdm
conda install -y -c conda-forge opencv
conda install -y -c conda-forge matplotlib
conda install -y -c anaconda pandas
conda install -y -c conda-forge scipy
conda install -y -c anaconda scikit-learn
pip install clearml

Data

Pretrained Model

We provide a pretrained model for motion synthesis.

Download the pretrained model.

<!-- Download the [pretrained model](https://drive.google.com/file/d/1Ezd6fW33B4GjmycKPVO-s0yIBfnqu9T5/view?usp=sharing). -->

Create a folder by the name data and place the downloaded model in it.

Data for a Quick Start

We use the Mixamo dataset to train our model. You can download our preprocessed data from Google Drive into the data folder. Unzip it using gunzip.

In order to know which Mixamo motions are held in the motions file, you may download the naming data from Google Drive as well. The naming data is a text file containing the character name, motion name, and path related info. In the given data, all motions are related to the character Jasper.

<!--In order to use motions for other characters in Mixamo, or for other datasets, please refer to our instructions in the [More Data]() section. <span style="color: red"> TBC </span>-->

The action recognition model used for evaluation is trained using the joint location data. You can download the Mixamo preprocessed joint location data from
Google Drive. You may also download the corresponding naming data from Google Drive.

Novel motion synthesis

Here is an example for the creation of 18 random samples, to be placed in <result path>.

python generate.py --type sample --motions 18 --ckpt ./data/ckpt.pt --out_path <results path> --path ./data/edge_rot_data.npy

Train from scratch

Following is a training example with the command line arguments that were used for training our best performing model.

python train.py --path ./data/edge_rot_data.npy --skeleton --conv3fast --glob_pos --v2_contact_loss --normalize --use_velocity --foot --name <experiment name>

Training on a new character.

After downloading a new character dataset, you can add it to the file utils/config.yaml with your new character name and the joints you wish to use, and add the flag --character <your character> to the train command.

Interpolation in Latent Space

Here is an example for the creation of 3 pairs of interpolated motions, with 5 motions in each interpolation sequence, to be placed in <result path>.

python generate.py --type interp --motions 5 --interp_seeds 12-3330,777,294-3 --ckpt ./data/ckpt.pt --out_path <results path> --path ./data/edge_rot_data.npy

The parameter interp_seeds is of the frame <from_1[-to_1],from_2[-to_2],...>. It is a list of comma separated from-to numbers, where each from/to is a number representing the seed of the random z that creates a motion. This seed is part of the file name of synthesised motions. See Novel Motion Synthesis. The -to is optional, and if it is not given, then our code interpolates the latent value of from to the average latent space value, aka truncation. In the example above, the first given pair of seeds will induce an interpolation between the latent values related to the seeds 12 and 3330, and the second will induce an interpolation between the latent value related to the seeds 777 and to the mean latent value.

Semantic Editing in Latent Space

Following is an example for editing the gradual right arm lifting and right arm elbow angle attributes.

python latent_space_edit.py --model_path ./data/ckpt.pt --attr r_hand_lift_up r_elbow_angle --path ./data/edge_rot_data.npy 

Note that editing takes a long time, but once it is done, the data that was already produced can be reused, which significantly shortens the running time. See inline documentation for more details.

Inversion

Following is and example for inversion of a motion using an optimizer:

python inverse_optim.py --ckpt ./data/ckpt.pt --out_path <results path> --target_idx 32 --path ./data/edge_rot_data.npy

Use --target_idx to chose the index of a motion to invert from edge_rot_data.npy file.

Encoder

The encoder is able to invert motions into MoDi's latent space, enabling many applications.

Pretrained Encoder

Download the pretrained encoder.

Applications

The encoder enables the following applications: Inversion, Motion Fusion, Spatial Editing, Denoising, Prediction from Prefix.

Download the encoder test data and place it under the data folder. Then, run the following:

 python generate_encoder.py --path ./data/test_edge_rot_data.npy --application inversion --ckpt_existing <pretrained_model_path> --ckpt <pretrained_encoder_path> --model_name <model name> --eval_id 34,54 --out_path <save_path>

Arguments:

Train the Encoder

To train the encoder from scratch download the train and test data, place them under the data folder. Then, run the following command:

 python train_encoder.py --ckpt_existing=<pretrained_model_path> --name <experiment name> --path=./data/train_edge_rot_data.npy --n_latent_predict=2 --action_recog_model=evaluation/checkpoint_0300_globpos_acc_0.99.pth.tar --n_frames=0

Arguments:

Quntitative evaluation

The following metrics are computed during the evaluation: FID, KID, diversity, precision and recall. you can use the --fast argument to skip the precision and recall calculation which may take a few minutes.

Here is an example of running evaluation for a model saved in <model_ckpt>, where <dataset> is the dataset the model was trained with and <gt_data> is the path to the data the action recognition model was trained with. --motions is the number of motions that will be generated by the model for the evaluation.

python evaluate.py --ckpt <model_ckpt> --path <dataset> --act_rec_gt_path <gt_data>

Visualisation

Figures

We use figures for fast and easy visualization. Since they are static, they cannot reflect smoothness and naturalness of motion, hence we recommend using bvh visualization, detailed in the next paragraph. The following figures are generated during the different runs and can be displayed with any suitable app:

Using Blender to visualise bvh files

Basic acquaintance of Blender is expected from the reader.

Edit the file blender/import_multiple_bvh.py, and set the values of the variables base_path and cur_path:

There are two alternatives: run the script from commandline or interactively in Blender.

Running the script from commandline

blender -P blender/import_multiple_bvh.py

Interactively running the script in Blender

Acknowledgements

Part of the code is adapted from stylegan2-pytorch.

Part of the code in models is adapted from Ganimator.

Part of the code in Motion is adapted from A Deep Learning Framework For Character Motion Synthesis and Editing.

Part of the code in evaluation is adapted from ACTOR, Action2Motion, Metrics for Evaluating GANs, and Assessing Generative Models via Precision and Recall.

Part of the training examples are taken from Mixamo.

Part of the evaluation examples are taken from HumanAct12.

Citation

If you use this code for your research, please cite our paper:

@inproceedings{raab2023modi,
  title={Modi: Unconditional motion synthesis from diverse data},
  author={Raab, Sigal and Leibovitch, Inbal and Li, Peizhuo and Aberman, Kfir and Sorkine-Hornung, Olga and Cohen-Or, Daniel},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={13873--13883},
  year={2023}
}