Model Zoo

Pretrained Models

All pretrained models can be downloaded from Google Drive. After downloading, put them into ckpt/.

Main Results

We report results on Kinetics-400 and on Something-Something V1 and V2. All numbers, for both the baselines and TPN, are obtained via fully-convolutional testing.

Kinetics-400

Because the number of available Kinetics-400 videos differs slightly from copy to copy (which might lead to a performance drop), we report all results on our own copy of the dataset. It contains 240403 training videos and 19769 validation videos, all rescaled to 240*320 resolution. Note that the trimming time of the Non-Local data and the resolution of the MMAction data differ from ours, but the improvements from TPN are consistent. To ensure reproducibility, we will find a proper way to release our validation set. All the following Kinetics-400 results also use flip augmentation at test time (~0.1% fluctuation). We sample F frames with a stride of S frames, denoted FxS.
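To make the FxS notation concrete, here is a minimal sketch of which frame indices one clip would cover. The helper names `clip_frame_indices` and `clip_span` are hypothetical and are not taken from this repository; the actual data loader may choose the start frame differently.

```python
def clip_frame_indices(num_frames, stride, start=0):
    """Indices of one F x S clip: F frames, S raw frames apart, from `start`."""
    return [start + i * stride for i in range(num_frames)]


def clip_span(num_frames, stride):
    """Raw frames covered from the first to the last sampled frame, inclusive."""
    return (num_frames - 1) * stride + 1


# The three settings used in the Kinetics-400 table.
for f, s in [(8, 8), (16, 4), (32, 2)]:
    print(f"{f}x{s}: span = {clip_span(f, s)} raw frames, "
          f"first indices = {clip_frame_indices(f, s)[:4]}")
```

Under this reading, 8x8, 16x4 and 32x2 all cover a similar temporal window (57, 61 and 63 raw frames) and differ mainly in how densely that window is sampled.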

| Model | Frames | TPN | Top-1 | Weights | Config |
| --- | --- | --- | --- | --- | --- |
| R50 | 8 x 8 | - | 74.9 | link | config_files/kinetics400/baseline/r50f8s8.py |
| R50 | 8 x 8 | Yes | 76.1 | link | config_files/kinetics400/tpn/r50f8s8.py |
| R50 | 16 x 4 | - | 76.1 | link | config_files/kinetics400/baseline/r50f16s4.py |
| R50 | 16 x 4 | Yes | 77.3 | link | config_files/kinetics400/tpn/r50f16s4.py |
| R50 | 32 x 2 | - | 75.7 | link | config_files/kinetics400/baseline/r50f32s2.py |
| R50 | 32 x 2 | Yes | 77.7 | link | config_files/kinetics400/tpn/r50f32s2.py |
| R101 | 8 x 8 | - | 76.0 | link | config_files/kinetics400/baseline/r101f8s8.py |
| R101 | 8 x 8 | Yes | 77.2 | link | config_files/kinetics400/tpn/r101f8s8.py |
| R101 | 16 x 4 | - | 77.0 | link | config_files/kinetics400/baseline/r101f16s4.py |
| R101 | 16 x 4 | Yes | 78.1 | link | config_files/kinetics400/tpn/r101f16s4.py |
| R101 | 32 x 2 | - | 77.4 | link | config_files/kinetics400/baseline/r101f32s2.py |
| R101 | 32 x 2 | Yes | 78.9 | link | config_files/kinetics400/tpn/r101f32s2.py |
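The per-setting gain from TPN can be read off the table above by subtracting each baseline from its TPN counterpart. The short sketch below does that arithmetic; the Top-1 numbers are copied from the table, while the dictionary layout and variable names are only illustrative.

```python
# (backbone, frames) -> (baseline Top-1, TPN Top-1), copied from the table above.
kinetics_top1 = {
    ("R50", "8x8"): (74.9, 76.1),
    ("R50", "16x4"): (76.1, 77.3),
    ("R50", "32x2"): (75.7, 77.7),
    ("R101", "8x8"): (76.0, 77.2),
    ("R101", "16x4"): (77.0, 78.1),
    ("R101", "32x2"): (77.4, 78.9),
}

# Round to one decimal to match the precision reported in the table.
gains = {key: round(tpn - base, 1) for key, (base, tpn) in kinetics_top1.items()}

for (backbone, frames), gain in gains.items():
    print(f"{backbone} {frames}: +{gain} Top-1")

print(f"smallest gain: +{min(gains.values())}, largest gain: +{max(gains.values())}")
```

The gains range from +1.1 (R101, 16x4) to +2.0 (R50, 32x2).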

We also train TPN on the MMAction data, where performance increases because of the raw resolution and aspect ratio.

| Model | Frames | TPN | Top-1 | Weights | Config |
| --- | --- | --- | --- | --- | --- |
| R50 | 8 x 8 | Yes | 76.7 | link | config_files/kinetics400/baseline/r50f8s8.py |
| R101 | 8 x 8 | Yes | 78.2 | link | config_files/kinetics400/baseline/r101f8s8.py |

All models are trained on 32 GPUs for 150 epochs. More details can be found in config_files.

Something-Something

Something-Something is a more stable benchmark, and the whole dataset can be downloaded from its website. We report results on both V1 and V2. All numbers are obtained following the standard protocol, i.e., 3 crops * 2 clips. TSM serves as our backbone network. Unlike the original TSM repo, which uses Kinetics pretraining, our implementation is initialized from ImageNet-pretrained weights and trained with a longer schedule. We use the same training hyper-parameters for both the baseline and TPN, so the improvements come from the TPN design rather than from other training tricks. We use uniform sampling for training and validation.
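As a rough illustration of the evaluation protocol described above, the sketch below shows one common reading of uniform sampling (split the video into equal segments and take the center frame of each, as in TSN-style loaders) and of combining the 3 crops * 2 clips = 6 views by averaging their class scores. Both function names are hypothetical, and the repository's own sampler and aggregation may differ, for example by using a random offset inside each segment during training.

```python
def uniform_segment_indices(total_frames, num_segments):
    """Pick the center frame of each of `num_segments` equal-length segments."""
    seg_len = total_frames / num_segments
    return [int(seg_len * (i + 0.5)) for i in range(num_segments)]


def average_views(view_scores):
    """Average per-class scores over all test views (crops x clips)."""
    num_views = len(view_scores)
    num_classes = len(view_scores[0])
    return [sum(view[c] for view in view_scores) / num_views
            for c in range(num_classes)]


# 8 frames sampled uniformly from an 80-frame video.
print(uniform_segment_indices(80, 8))

# Toy scores for 3 crops * 2 clips = 6 views over 2 classes (made-up values).
toy_scores = [[0.6, 0.4], [0.7, 0.3], [0.5, 0.5],
              [0.8, 0.2], [0.6, 0.4], [0.4, 0.6]]
print(average_views(toy_scores))
```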

| Model | Dataset Version | Frames | TPN | Top-1 | Weights | Config |
| --- | --- | --- | --- | --- | --- | --- |
| TSM50 | V1 | 8 | - | 48.2 | link | config_files/sthv1/tsm_baseline.py |
| TSM50 | V1 | 8 | Yes | 50.7 | link | config_files/sthv1/tsm_tpn.py |
| TSM50 | V2 | 8 | - | 62.3 | link | config_files/sthv2/tsm_baseline.py |
| TSM50 | V2 | 8 | Yes | 64.7 | link | config_files/sthv2/tsm_tpn.py |

If you have any problems reproducing our results, please contact Ceyuan Yang (yc019@ie.cuhk.edu.hk) or Yinghao Xu (xy119@ie.cuhk.edu.hk).