The C3D model is implemented in PyTorch 1.12.1 and PyTorch Lightning 2.0.8. Currently, the code supports training on UCF101.
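For reference, a minimal C3D backbone can be sketched in plain PyTorch. The layer configuration below follows the original Tran et al. (2015) paper and may differ in detail from this repository's implementation:

```python
import torch
import torch.nn as nn

class C3D(nn.Module):
    """Minimal C3D sketch (Tran et al., 2015): 3x3x3 convolutions throughout,
    expecting clips of shape (batch, 3, 16, 112, 112)."""

    def __init__(self, num_classes=101):
        super().__init__()

        def block(cin, cout, n_convs, pool):
            # n_convs conv+ReLU pairs followed by one pooling layer
            layers = []
            for i in range(n_convs):
                layers += [nn.Conv3d(cin if i == 0 else cout, cout,
                                     kernel_size=3, padding=1),
                           nn.ReLU(inplace=True)]
            layers.append(pool)
            return layers

        self.features = nn.Sequential(
            # pool1 keeps the temporal dimension intact
            *block(3, 64, 1, nn.MaxPool3d((1, 2, 2))),
            *block(64, 128, 1, nn.MaxPool3d(2)),
            *block(128, 256, 2, nn.MaxPool3d(2)),
            *block(256, 512, 2, nn.MaxPool3d(2)),
            # spatial padding so 16-frame 112x112 clips reduce to 1x4x4
            *block(512, 512, 2, nn.MaxPool3d(2, padding=(0, 1, 1))),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(512 * 4 * 4, 4096), nn.ReLU(inplace=True), nn.Dropout(0.5),
            nn.Linear(4096, 4096), nn.ReLU(inplace=True), nn.Dropout(0.5),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```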
The code was tested with Python 3.9.17 and Anaconda. To install the dependencies, run:

```shell
conda env create -f environment.yml
conda activate c3d
```
Create a directory called `dataset` and download the UCF101 dataset into it:

```shell
mkdir dataset
cd dataset
wget https://www.crcv.ucf.edu/data/UCF101/UCF101.rar --no-check-certificate
```

Then extract the archive (e.g. with `unrar x UCF101.rar`).
Make sure that the `dataset` directory has the following structure:

```
UCF-101
├── ApplyEyeMakeup
│   ├── v_ApplyEyeMakeup_g01_c01.avi
│   └── ...
├── ApplyLipstick
│   ├── v_ApplyLipstick_g01_c01.avi
│   └── ...
└── Archery
    ├── v_Archery_g01_c01.avi
    └── ...
```
Afterwards, run the tests to make sure everything is working:

```shell
pytest -q test/test.py
```

The tests take a while to run because they preprocess the dataset first.
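Preprocessing relies on UCF101 filenames like `v_ApplyEyeMakeup_g01_c01.avi`, which encode the class name, group id, and clip index. A common convention is to split train/val/test by group id so clips cut from the same source video never leak across splits. A stdlib sketch (the parsing helper and the group ranges are illustrative assumptions, not necessarily what this repository does):

```python
import re

# UCF101 filenames look like v_<ClassName>_g<group>_c<clip>.avi
_PATTERN = re.compile(r"v_(?P<cls>\w+)_g(?P<group>\d+)_c(?P<clip>\d+)\.avi")

def parse_ucf101_name(filename):
    """Extract (class name, group id, clip id) from a UCF101 filename."""
    m = _PATTERN.fullmatch(filename)
    if m is None:
        raise ValueError(f"unexpected UCF101 filename: {filename}")
    return m.group("cls"), int(m.group("group")), int(m.group("clip"))

def split_by_group(filenames, val_groups=range(19, 22), test_groups=range(22, 26)):
    """Assign each clip to train/val/test by group id, keeping all clips
    from the same source video in the same split."""
    splits = {"train": [], "val": [], "test": []}
    for name in filenames:
        _, group, _ = parse_ucf101_name(name)
        if group in test_groups:
            splits["test"].append(name)
        elif group in val_groups:
            splits["val"].append(name)
        else:
            splits["train"].append(name)
    return splits
```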
Create a directory called `models`, download the pre-trained model, and put it inside the `models` directory.
To train the model, run:

```shell
python train.py
```

```
Usage: train.py [OPTIONS]

Options:
  --dataset TEXT               Dataset name.
  --epochs INTEGER             Number of training epochs.
  --test                       Run in test mode.
  --snapshot_interval INTEGER  Snapshot interval.
  --batch_size INTEGER         Batch size.
  --lr FLOAT                   Learning rate.
  --num_workers INTEGER        Number of data-loading workers.
  --clip_len INTEGER           Number of frames per clip.
  --preprocess BOOLEAN         Whether to preprocess the dataset.
  --pretrained TEXT            Path to a pretrained model.
  --root_dir TEXT              Root directory of the dataset.
  --output_dir TEXT            Output directory.
  --device TEXT                Device to use.
  --seed INTEGER               Random seed.
  --wandb_log                  Enable WandB logging.
  --checkpoint TEXT            Checkpoint path.
  --help                       Show this message and exit.
```
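For example, a run with explicit hyperparameters might look like this (the flag values below are illustrative, not the repository's defaults):

```shell
python train.py \
    --dataset ucf101 \
    --epochs 100 \
    --batch_size 16 \
    --lr 1e-3 \
    --clip_len 16 \
    --root_dir dataset/UCF-101 \
    --output_dir runs/c3d \
    --device cuda \
    --seed 42
```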
Training supports logging with WandB. To enable logging, run:

```shell
python train.py --wandb_log
```
To test the model, run:

```shell
python train.py --test --epochs 0 --pretrained <path_to_pretrained_model>
```
To run inference, run:

```shell
python inference.py
```

```
Usage: inference.py [OPTIONS]

Options:
  --video TEXT    Path to the input video.
  --output TEXT   Output folder.
  --device TEXT   Device to use.
  -m TEXT         Model path.
  --classes TEXT  Path to the class-names file.
  --help          Show this message and exit.
```
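For example (the video path is taken from the dataset layout above; the bracketed paths are placeholders):

```shell
python inference.py \
    --video dataset/UCF-101/Archery/v_Archery_g01_c01.avi \
    --output outputs \
    --device cuda \
    -m models/<pretrained_model> \
    --classes <path_to_classes_file>
```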
The model was trained for
```bibtex
@inproceedings{tran2015learning,
  title={Learning spatiotemporal features with 3d convolutional networks},
  author={Tran, Du and Bourdev, Lubomir and Fergus, Rob and Torresani, Lorenzo and Paluri, Manohar},
  booktitle={Proceedings of the IEEE international conference on computer vision},
  pages={4489--4497},
  year={2015}
}
```