This repository provides a library for unconditional motion synthesis from diverse data, as well as applications including interpolation and semantic editing in the latent space, inversion, and code for quantitative evaluation. It is based on our work MoDi: Unconditional Motion Synthesis from Diverse Data.
The library is still under development.
This code has been tested under Ubuntu 16.04 and CUDA 10.2. Before starting, please configure your Conda environment by running
conda env create --name MoDi --file environment.yaml
conda activate MoDi
or by
conda create -n MoDi python=3.6.10
conda activate MoDi
conda install -y pytorch==1.5.0 torchvision==0.6.0 -c pytorch
conda install -y -c conda-forge tqdm
conda install -y -c conda-forge opencv
conda install -y -c conda-forge matplotlib
conda install -y -c anaconda pandas
conda install -y -c conda-forge scipy
conda install -y -c anaconda scikit-learn
pip install clearml
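After either installation route, a quick check that the packages are importable can save a failed run later. The following is a minimal sketch, not part of the repository; the helper name missing_packages is hypothetical, and the import names (cv2 for opencv, sklearn for scikit-learn) are the usual ones for the packages listed above.

```python
import importlib.util

# Import names for the packages installed above (cv2 = opencv, sklearn = scikit-learn).
REQUIRED = ["torch", "torchvision", "tqdm", "cv2", "matplotlib",
            "pandas", "scipy", "sklearn", "clearml"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
print("Missing packages:", ", ".join(missing) if missing else "none")
```

Run it inside the activated MoDi environment; an empty result means every listed package was found.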
We provide a pretrained model for motion synthesis.
Download the pretrained model.
Create a folder named data and place the downloaded model in it.
We use the Mixamo dataset to train our model. You can download our preprocessed data from Google Drive into the data folder.
Unzip it using gunzip.
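Where gunzip is not available (for example on Windows), the same step can be done with Python's standard library. This is a minimal sketch; the helper name gunzip_file is hypothetical, and unlike gunzip it keeps the original archive in place.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(src):
    """Decompress `src` (a .gz file) next to itself and return the output path."""
    src = Path(src)
    if src.suffix != ".gz":
        raise ValueError("expected a .gz file, got: {}".format(src))
    dst = src.with_suffix("")  # drops only the trailing ".gz"
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst
```

Call it with the path of the downloaded archive inside the data folder.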
To find out which Mixamo motions the motions file contains, you can also download the naming data from Google Drive. The naming data is a text file containing the character name, motion name, and path-related information. In the given data, all motions belong to the character Jasper.
The action recognition model used for evaluation is trained using the joint location data. You can download the Mixamo preprocessed joint location data from Google Drive.
You may also download the corresponding naming data from Google Drive.
Here is an example that creates 18 random samples and places them in <results path>.
python generate.py --type sample --motions 18 --ckpt ./data/ckpt.pt --out_path <results path> --path ./data/edge_rot_data.npy
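When sampling is launched from a script rather than typed by hand, building the argument list in Python avoids quoting mistakes. This is a minimal sketch that only reuses the flags shown in the command above; the helper name build_sample_command is hypothetical and not part of the repository.

```python
import shlex


def build_sample_command(out_path, motions=18,
                         ckpt="./data/ckpt.pt",
                         data="./data/edge_rot_data.npy"):
    """Return the argument list for the sampling command shown above."""
    return ["python", "generate.py",
            "--type", "sample",
            "--motions", str(motions),
            "--ckpt", ckpt,
            "--out_path", out_path,
            "--path", data]


cmd = build_sample_command("./results")
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
# To actually run it from the repository root:
#   import subprocess; subprocess.run(cmd, check=True)
```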
Following is a training example with the command-line arguments that were used for training our best-performing model.
python train.py --path ./data/edge_rot_data.npy --skeleton --conv3fast --glob_pos --v2_contact_loss --normalize --use_velocity --foot --name <experiment name>
After downloading a new character dataset, you can add it to the file utils/config.yaml with your new character name and the joints you wish to use, and then add the flag --character <your character> to the train command.
Here is an example that creates 3 pairs of interpolated motions, with 5 motions in each interpolation sequence, and places them in <results path>.
python generate.py --type interp --motions 5 --interp_seeds 12-3330,777,294-3 --ckpt ./data/ckpt.pt --out_path <results path> --path ./data/edge_rot_data.npy
The parameter interp_seeds has the form <from_1[-to_1],from_2[-to_2],...>.
It is a comma-separated list of from-to entries, where each from/to is a number representing the seed of the random z that creates a motion.
This seed is part of the file name of synthesised motions. See Novel Motion Synthesis.
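To make the format concrete, here is a minimal sketch of how such a string could be split into seed pairs. It is an illustration, not the parser used in generate.py; the function name parse_interp_seeds is hypothetical, and a missing to value is represented as None.

```python
def parse_interp_seeds(spec):
    """Split 'from[-to],from[-to],...' into a list of (from, to) tuples.

    `to` is None when the '-to' part is omitted.
    """
    pairs = []
    for item in spec.split(","):
        start, sep, end = item.strip().partition("-")
        pairs.append((int(start), int(end) if sep else None))
    return pairs


print(parse_interp_seeds("12-3330,777,294-3"))
# [(12, 3330), (777, None), (294, 3)]
```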
The -to part is optional. If it is not given, our code interpolates from the latent value of from to the average latent value, i.e., truncation.
In the example above, the first pair of seeds induces an interpolation between the latent values of seeds 12 and 3330, and the second entry induces an interpolation between the latent value of seed 777 and the mean latent value.
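The idea of an interpolation sequence can be sketched with plain linear blending between two latent vectors. This is an illustration under assumptions: that the blend is linear, that both endpoints are part of the sequence, and that latents can be treated as flat lists of floats. The actual code may differ on all three, and the function name interpolate_latents is hypothetical.

```python
def interpolate_latents(w_from, w_to, steps):
    """Return `steps` latent vectors evenly spaced from w_from to w_to (both included)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    sequence = []
    for i in range(steps):
        t = i / (steps - 1)  # goes from 0.0 at the first step to 1.0 at the last
        sequence.append([(1.0 - t) * a + t * b for a, b in zip(w_from, w_to)])
    return sequence


# Five steps between two toy 2-D latents; the middle one is the midpoint.
for w in interpolate_latents([0.0, 2.0], [4.0, 0.0], 5):
    print(w)
```

For an entry without a to seed, w_to would be the mean latent value instead of a second sampled latent.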
Following is an example for editing the gradual right arm lifting and right arm elbow angle attributes.
python latent_space_edit.py --model_path ./data/ckpt.pt --attr r_hand_lift_up r_elbow_angle --path ./data/edge_rot_data.npy
Note that editing takes a long time, but once it is done, the data that was already produced can be reused, which significantly shortens the running time. See inline documentation for more details.
Following is an example of inverting a motion using an optimizer:
python inverse_optim.py --ckpt ./data/ckpt.pt --out_path <results path> --target_idx 32 --path ./data/edge_rot_data.npy
Use --target_idx to choose the index of the motion to invert from the edge_rot_data.npy file.
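Optimization-based inversion searches for a latent code whose generated output matches a given target. The toy sketch below shows the principle with a fixed linear map standing in for the generator and plain gradient descent on the squared reconstruction error. Everything in it (the matrix A, toy_generator, invert, the step size) is an assumption made for illustration; inverse_optim.py works with the real pretrained generator and its own loss and optimizer.

```python
import numpy as np

# Toy stand-in for a generator: a fixed linear map from a 2-D latent to a 3-D "motion".
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])


def toy_generator(w):
    return A @ w


def invert(target, steps=200, lr=0.1):
    """Find a latent w that minimizes ||toy_generator(w) - target||^2 by gradient descent."""
    w = np.zeros(A.shape[1])
    for _ in range(steps):
        residual = toy_generator(w) - target
        grad = 2.0 * A.T @ residual  # analytic gradient of the squared error
        w -= lr * grad
    return w


true_w = np.array([0.5, -1.0])
recovered = invert(toy_generator(true_w))
print(recovered)  # close to true_w
```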
The encoder is able to invert motions into MoDi's latent space, enabling many applications.
Download the pretrained encoder.
The encoder enables the following applications: Inversion, Motion Fusion, Spatial Editing, Denoising, Prediction from Prefix.
Download the encoder test data and place it under the data folder. Then, run the following:
python generate_encoder.py --path ./data/test_edge_rot_data.npy --application inversion --ckpt_existing <pretrained_model_path> --ckpt <pretrained_encoder_path> --model_name <model name> --eval_id 34,54 --out_path <save_path>
Arguments:
- --application can be one of the following: [inversion, fusion, editing, editing_seed, denoising, auto_regressive].
- --ckpt_existing is given a path to the pretrained model.
- --ckpt is given a path to the pretrained encoder.
- --eval_id is a comma-separated list of indices from the test set that the encoder will be applied to. In the case of the editing_seed application, eval_id is used as a seed for a generated motion.
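The argument rules above can be mirrored in a small argparse sketch, which is handy for validating inputs in a wrapper script before launching a long run. This is a hypothetical mirror of the documented flags, not the parser defined in generate_encoder.py; the names build_parser and parse_eval_ids are assumptions.

```python
import argparse

APPLICATIONS = ["inversion", "fusion", "editing", "editing_seed",
                "denoising", "auto_regressive"]


def parse_eval_ids(text):
    """Turn a comma-separated string such as '34,54' into a list of ints."""
    return [int(token) for token in text.split(",") if token.strip()]


def build_parser():
    parser = argparse.ArgumentParser(
        description="Hypothetical mirror of the documented encoder flags")
    parser.add_argument("--application", choices=APPLICATIONS, required=True)
    parser.add_argument("--ckpt_existing", help="path to the pretrained model")
    parser.add_argument("--ckpt", help="path to the pretrained encoder")
    parser.add_argument("--eval_id", type=parse_eval_ids, default=[])
    return parser


args = build_parser().parse_args(["--application", "inversion", "--eval_id", "34,54"])
print(args.application, args.eval_id)
```

An unknown --application value makes argparse exit with an error message listing the valid choices.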
To train the encoder from scratch, download the train and test data and place them under the data folder. Then, run the following command:
python train_encoder.py --ckpt_existing=<pretrained_model_path> --name <experiment name> --path=./data/train_edge_rot_data.npy --n_latent_predict=2 --action_recog_model=evaluation/checkpoint_0300_globpos_acc_0.99.pth.tar --n_frames=0
Arguments:
- --ckpt_existing is given a path to the pretrained model.
The following metrics are computed during the evaluation: FID, KID, diversity, precision, and recall.
You can use the --fast argument to skip the precision and recall calculation, which may take a few minutes.
Here is an example of running evaluation for a model saved in <model_ckpt>, where <dataset> is the dataset the model was trained with and <gt_data> is the path to the data the action recognition model was trained with.
The --motions argument is the number of motions that will be generated by the model for the evaluation.
python evaluate.py --ckpt <model_ckpt> --path <dataset> --act_rec_gt_path <gt_data>
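To give a feel for what one of these metrics measures, the sketch below computes diversity as the mean Euclidean distance between randomly paired feature vectors, a definition commonly used in action-conditioned motion evaluation. Whether evaluate.py uses exactly this definition, and how many pairs it draws, is an assumption; the function name diversity and its parameters are hypothetical, and the features here are random stand-ins for the activations of the action recognition model.

```python
import numpy as np


def diversity(features, num_pairs=200, seed=0):
    """Mean Euclidean distance between `num_pairs` randomly chosen pairs of rows."""
    features = np.asarray(features, dtype=np.float64)
    if len(features) < 2:
        raise ValueError("need at least two feature vectors")
    rng = np.random.default_rng(seed)
    first = rng.integers(0, len(features), size=num_pairs)
    second = rng.integers(0, len(features), size=num_pairs)
    distances = np.linalg.norm(features[first] - features[second], axis=1)
    return float(distances.mean())


# Random stand-in features: 50 motions, 8 feature dimensions each.
fake_features = np.random.default_rng(1).normal(size=(50, 8))
print(diversity(fake_features))
```

A set of identical motions gives a diversity of zero, and a more varied set gives a larger value.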
We use figures for fast and easy visualization. Since they are static, they cannot convey the smoothness and naturalness of motion, so we recommend using the bvh visualization described below. The following figures are generated during the different runs and can be displayed with any suitable app:
- Motion synthesis and interpolation: the file generated.png is generated in the folder given by the argument --output_path.
- Training: the files real_motion_<iteration number>.png and fake_motion_{}.png are generated in the folder images under the folder given by the argument --output_path.
Basic familiarity with Blender is expected from the reader.
Edit the file blender/import_multiple_bvh.py, and set the values of the variables base_path and cur_path:
- Their concatenation should be a valid path in your file system.
- Any path containing bvh files would work. In particular, you may want to specify paths that were given as the --output_path arguments during motion synthesis, motion interpolation, inversion or latent space editing. cur_path will be displayed in Blender.
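Before opening Blender, it can help to confirm that the two variables point at a folder that really contains bvh files. The following is a minimal sketch run with regular Python, outside Blender. It assumes the two parts are combined with os.path.join, while the Blender script may combine them differently, and the helper name list_bvh_files is hypothetical.

```python
import glob
import os


def list_bvh_files(base_path, cur_path):
    """Return the sorted .bvh files found in base_path/cur_path."""
    folder = os.path.join(base_path, cur_path)
    if not os.path.isdir(folder):
        raise FileNotFoundError("not a valid folder: {}".format(folder))
    return sorted(glob.glob(os.path.join(folder, "*.bvh")))
```

An empty list means the folder exists but holds no bvh files, so there would be nothing to import.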
There are two alternatives: run the script from the command line or interactively in Blender.
blender -P blender/import_multiple_bvh.py
- Start the Blender application.
- Split one of the areas and turn it into a text editor.
- Upload blender/import_multiple_bvh.py. Make sure the variables base_path and cur_path are set according to your path, or set them now.
- Run blender/import_multiple_bvh.py.
- You can interactively drag the uploaded animations and upload animations from other paths now.
Part of the code is adapted from stylegan2-pytorch.
Part of the code in models is adapted from Ganimator.
Part of the code in Motion is adapted from A Deep Learning Framework For Character Motion Synthesis and Editing.
Part of the code in evaluation is adapted from ACTOR, Action2Motion, Metrics for Evaluating GANs, and Assessing Generative Models via Precision and Recall.
Part of the training examples are taken from Mixamo.
Part of the evaluation examples are taken from HumanAct12.
If you use this code for your research, please cite our paper:
@inproceedings{raab2023modi,
title={Modi: Unconditional motion synthesis from diverse data},
author={Raab, Sigal and Leibovitch, Inbal and Li, Peizhuo and Aberman, Kfir and Sorkine-Hornung, Olga and Cohen-Or, Daniel},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
pages={13873--13883},
year={2023}
}