This repository holds the codebase and dataset for the project:
Spatial Temporal Graph Convolutional Networks for the Recognition of Quick Human Actions
Prerequisites:
- Python3 (>3.5)
- PyTorch
We experimented on the 3D skeletal data of NTU-RGB+D.
The pre-processed data can be downloaded from GoogleDrive.
After downloading the data, extract the "NTU-RGB-D" folder into the desired path.
To create a dataset of fast actions, we downsample the NTU-RGB+D dataset.
The downsampling is done by keeping one frame and dropping the next, which halves the number of frames.
Run "downsample.py" to downsample the desired data.
We provide "create_small_data.py" that creates a smaller data from the original data by selecting a number of actions out of all 60 actions. The desired actions can be selected in the code based on their labels on the NTU-RGB+D website.
We provide visualization of the 3D skeletal data of NTU-RGB+D in MATLAB.
More details can be found in the "visualize" folder.
A model can be trained by running "main.py". The results will be saved in the "results" folder.
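For example, from the repository root (illustrative; check "main.py" for any arguments, such as data paths, that your setup may require):

```
python main.py
```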
When using a smaller dataset, some modifications to the code are needed; they are detailed in the code.
Some results of different experiments are shown here:
| Model | Temporal Kernel Size | Downsampled NTU-RGB+D (60 actions) | Downsampled NTU-RGB+D (10 actions) |
|---|---|---|---|
| Model I (ST-GCN) [1] | 9 | 86.02% | 93.39% |
| Model II (Proposed) | 9 | 85.59% | 94.01% |
| Model I (ST-GCN) [1] | 13 | 86.53% | 94% |
| Model II (Proposed) | 13 | 84.7% | 93.29% |
[1] Sijie Yan, Yuanjun Xiong, and Dahua Lin. 2018. Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2018).