
Semi-supervised Few-shot Atomic Action Recognition

This repo contains the code for our paper "Semi-supervised Few-shot Atomic Action Recognition". Please check our paper and project page for more details.

[Figure: FSAA architecture]

Our training strategy is divided into two parts: 1) train an encoder with unsupervised learning; 2) train the action classification module with supervised learning. The encoder provides fine-grained spatial and temporal video processing with high flexibility in input length: it embeds the video features and combines them temporally with a TCN. The classification module applies attention pooling and compares multi-head relations. Finally, the CTC and MSE losses enable time-invariant few-shot classification training.
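To make the attention-pooling idea concrete, here is a minimal PyTorch sketch (not the repo's actual module; all names and dimensions are illustrative) of pooling variable-length temporal features into a fixed-size clip vector:

import torch
import torch.nn as nn

# Illustrative sketch only: attention pooling collapses a variable-length
# sequence of TCN features into a single fixed-size vector per clip.
class AttentionPooling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # one attention score per time step

    def forward(self, feats):           # feats: (batch, time, dim)
        weights = torch.softmax(self.score(feats), dim=1)  # (batch, time, 1)
        return (weights * feats).sum(dim=1)                # (batch, dim)

# Clips of different temporal lengths map to vectors of the same size.
pool = AttentionPooling(dim=256)
clip_a = torch.randn(1, 40, 256)   # 40 time steps
clip_b = torch.randn(1, 17, 256)   # 17 time steps
print(pool(clip_a).shape, pool(clip_b).shape)  # both torch.Size([1, 256])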

Requirements

pytorch >= 1.5.0
torchvision >= 0.6.0
numpy >= 1.18.1
scipy >= 1.4.1
vidaug >= 0.1

Usage

Installation

  1. Clone the repo
  2. Install required packages
  3. Download trained models to <REPO_DIR>/models (Optional)
  4. Download the datasets (Optional)

Training

As mentioned in the intro, our model training has two parts.

1. Train the encoder with unsupervised learning.

Here we use MoCo, but this step can in fact be carried out with any unsupervised representation learning method.

First clone the MoCo repository, then copy and replace the following files:

cp '<REPO_DIR>/moco/builder.py' '<MOCO_DIR>/moco/'
cp '<REPO_DIR>/moco/{dataset.py,encoder.py,main_moco.py,moco_encoder.py,rename.py,tcn.py}' '<MOCO_DIR>/'

We recommend first reading the MoCo instructions to understand how it works, then setting the relevant paths in main_moco.py and starting your training. Afterwards, use rename.py to split the trained model (a .tar file) into a c3d.pkl and a tcn.pkl for the next step, roughly as sketched below.
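Conceptually, the split filters the checkpoint's state dict by module prefix. A rough sketch of the idea (the checkpoint filename and key prefixes here are assumptions; rename.py is the authoritative script):

import torch

# Hypothetical sketch of the checkpoint split; the real key prefixes
# depend on how the MoCo encoder is defined in this repo.
ckpt = torch.load('checkpoint_0199.pth.tar', map_location='cpu')
state = ckpt['state_dict']

c3d_state = {k.split('c3d.')[-1]: v for k, v in state.items() if 'c3d' in k}
tcn_state = {k.split('tcn.')[-1]: v for k, v in state.items() if 'tcn' in k}

torch.save(c3d_state, 'c3d.pkl')
torch.save(tcn_state, 'tcn.pkl')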

2. Train the whole model with supervised learning.

Load your pretrained C3D and TCN models and continue:
python3 train.py -d='./splits/<YOUR_DATASET>.json' -n='<EXP_NAME>'
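For reference, loading the two pickles produced by rename.py is plain PyTorch; a minimal sketch (the C3D/TCN class names are placeholders for the repo's actual modules):

import torch
import torch.nn as nn

def load_pretrained(module: nn.Module, path: str) -> nn.Module:
    """Load a state dict saved by rename.py into a model submodule."""
    module.load_state_dict(torch.load(path, map_location='cpu'))
    return module

# Hypothetical usage; substitute the repo's actual C3D/TCN classes.
# c3d = load_pretrained(C3D(), 'models/c3d.pkl')
# tcn = load_pretrained(TCN(), 'models/tcn.pkl')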

Testing

python3 test.py -d='./splits/<YOUR_DATASET>.json' -c='<CHECKPOINT_DIR>'

Trained Models

TODO

Datasets

We use three atomic action datasets.

  1. HAA
  2. FineGym
  3. MIT

Dataset splits and JSON files can be found under <REPO_DIR>/splits; see the example dataset JSONs there or use the scripts to generate your own. If you want to use another dataset, make sure it follows a <DATASET>/<SPLIT>/<CLASS>/<VIDEO>/<FRAME> directory structure.
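To sanity-check a custom dataset against that layout, a small illustrative script (the root path is a placeholder; the split JSONs remain the authoritative format):

from pathlib import Path

# Walk a <DATASET>/<SPLIT>/<CLASS>/<VIDEO>/<FRAME> tree and count the
# frames per video. 'my_dataset' is a placeholder root directory.
root = Path('my_dataset')
for split_dir in sorted(root.iterdir()):
    for class_dir in sorted(split_dir.iterdir()):
        for video_dir in sorted(class_dir.iterdir()):
            n_frames = len(list(video_dir.glob('*')))
            print(split_dir.name, class_dir.name, video_dir.name, n_frames)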

Acknowledgements

This repo makes use of several great works. Our appreciation goes to:

  1. locuslab / TCN
  2. fujenchu / relationNet
  3. facebookresearch / moco
  4. parlance / ctcdecode
  5. okankop / vidaug

Reference

Please cite this paper as

@Article{fsaa,
  author  = {Sizhe Song and Xiaoyuan Ni and Yu-Wing Tai and Chi-Keung Tang},
  title   = {Semi-supervised Few-shot Atomic Action Recognition},
  journal = {arXiv preprint arXiv:2011.08410},
  year    = {2020},
}
