Elucidating the Hierarchical Nature of Behavior with Masked Autoencoders
Lucas Stoffl, Andy Bonnetto, Stéphane d'Ascoli, Alexander Mathis
École Polytechnique Fédérale de Lausanne (EPFL)
[2024.10] We released the code and datasets for h/BehaveMAE, Shot7M2, and hBABEL 🎈
[2024.07] This work is accepted to ECCV 2024 🎉 -- see you in Milano!
[2024.06] h/BehaveMAE and Shot7M2 were presented at FENS Forum 2024
Recognizing the scarcity of large-scale hierarchical behavioral benchmarks, we created a novel synthetic basketball-playing benchmark (Shot7M2). Beyond synthetic data, we extended BABEL into a hierarchical action segmentation benchmark (hBABEL).
We developed a masked autoencoder framework (hBehaveMAE) to elucidate the hierarchical nature of motion capture data in an unsupervised fashion. We find that hBehaveMAE learns interpretable latents, where lower encoder levels show a superior ability to represent fine-grained movements, while higher encoder levels capture complex actions and activities.
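To make the masked-autoencoding idea concrete, here is a minimal, self-contained numpy sketch of the pretraining objective on a toy pose sequence. This is an illustration only, not the hBehaveMAE architecture: the real model uses a hierarchical transformer encoder, whereas here mean imputation stands in for the learned decoder, and all shapes and the mask ratio are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "motion capture" sequence: 100 frames x 12 keypoint coordinates.
frames, dims = 100, 12
poses = rng.normal(size=(frames, dims))

# Mask 75% of the frames, a ratio typical for masked-autoencoder pretraining.
mask_ratio = 0.75
n_masked = int(frames * mask_ratio)
masked_idx = rng.choice(frames, size=n_masked, replace=False)
mask = np.zeros(frames, dtype=bool)
mask[masked_idx] = True

# A real model predicts the masked frames from the visible context;
# here, mean imputation from the visible frames is a trivial stand-in.
visible_mean = poses[~mask].mean(axis=0)
reconstruction = poses.copy()
reconstruction[mask] = visible_mean

# The reconstruction loss is computed only on the masked positions.
loss = np.mean((reconstruction[mask] - poses[mask]) ** 2)
print(f"masked frames: {mask.sum()}, reconstruction MSE: {loss:.3f}")
```

During pretraining, minimizing this masked-reconstruction loss forces the encoder to build representations of the visible context that are predictive of the hidden motion.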
We developed and tested our models with `python=3.9.15`, `pytorch=2.0.1`, and `cuda=11.7`. Other versions may also be suitable.
The easiest way to set up the environment is with the provided `environment.yml` file:

```shell
conda env create -f environment.yml
conda activate behavemae
```
We compiled detailed instructions for downloading and preparing the three benchmarks, Shot7M2 (download here 🏀), hBABEL, and MABe22, in the datasets README.
To pre-train a model on 2 GPUs:

```shell
bash scripts/shot7m2/train_hBehaveMAE.sh 2
```
To extract hierarchical embeddings after training and evaluate them:

```shell
bash scripts/shot7m2/test_hBehaveMAE.sh
```
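Once extracted, the embeddings from different encoder levels live at different temporal resolutions (finer levels per frame, coarser levels per chunk). The sketch below shows one common way to combine them into per-frame features by upsampling the coarser levels. The shapes, chunk sizes, and feature dimensions here are hypothetical placeholders; consult the evaluation script for the actual output format of your run.

```python
import numpy as np

# Hypothetical example shapes; real shapes depend on the model config.
n_frames = 120
level1 = np.random.rand(n_frames, 64)         # fine-grained, per-frame
level2 = np.random.rand(n_frames // 4, 128)   # mid-level, 4-frame chunks
level3 = np.random.rand(n_frames // 12, 256)  # high-level, 12-frame chunks

# Upsample the coarser levels to frame rate and concatenate,
# yielding one feature vector per frame that mixes all scales.
up2 = np.repeat(level2, 4, axis=0)
up3 = np.repeat(level3, 12, axis=0)
features = np.concatenate([level1, up2, up3], axis=1)
print(features.shape)  # (120, 448)
```

Per-frame multi-scale features like these can then be fed to a linear probe or clustering step to assess what each hierarchy level captures.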
We provide a collection of pre-trained models on Zenodo that were reported in our paper, allowing you to reproduce our results:
Method | Dataset | Checkpoint |
---|---|---|
hBehaveMAE | Shot7M2 | checkpoint |
hBehaveMAE | hBABEL | checkpoint |
hBehaveMAE | MABe22 | checkpoint |
If you find this project helpful, please feel free to leave a star ⭐️ and cite our paper:
@article{stoffl2024elucidating,
title={Elucidating the Hierarchical Nature of Behavior with Masked Autoencoders},
author={Stoffl, Lucas and Bonnetto, Andy and d'Ascoli, Stephane and Mathis, Alexander},
journal={bioRxiv},
pages={2024--08},
year={2024},
publisher={Cold Spring Harbor Laboratory}
}
@inproceedings{stoffl2025elucidating,
title={Elucidating the hierarchical nature of behavior with masked autoencoders},
author={Stoffl, Lucas and Bonnetto, Andy and d'Ascoli, St{\'e}phane and Mathis, Alexander},
booktitle={European Conference on Computer Vision},
pages={106--125},
year={2025},
organization={Springer}
}
We thank the authors of the following repositories for their amazing work, on which parts of our code are based:
- Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
- Evaluator code for MABe 2022 Challenge
This repository is licensed under two different licenses depending on the codebase:
- Apache 2.0 License: The majority of the project, including all original code and modifications.
- CC BY-NC 4.0 License: The code inside the `hierAS-eval/` directory is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License, meaning it cannot be used for commercial purposes.
Please refer to the respective LICENSE files in the root of the repository and in `hierAS-eval/` for more details.