Official Code of ICCV 2021 Paper: Learning to Cut by Watching Movies
[ ArXiv | Project Website | ICCV2021 ]
Learning to Cut by Watching Movies. Alejandro Pardo*, Fabian Caba Heilbron, Juan León Alcázar, Ali Thabet, Bernard Ghanem. In ICCV, 2021.
Clone the repository and move into its folder:
git clone https://github.com/PardoAlejo/LearningToCut.git
cd LearningToCut
Install the environment:
conda env create -f ltc-env.yml
Download the following resources and extract their contents into the destination folder listed in the table:
Resource | Drive File | Destination Folder |
---|---|---|
Train Annotations | link | ./data/ |
Val Annotations | link | ./data/ |
Video Durations | link | ./data/ |
Video Features | link | ./data/ |
Audio Features | link | ./data/ |
Best Model | link | ./checkpoints/ |
If you want to extract the features yourself, or you need the original videos instead, please refer to data/DATA.md.
The folder structure should be as follows:
README.md
ltc-env.yml
│
├── data
│ ├── ResNexT-101_3D_video_features.h5
│ ├── ResNet-18_audio_features.h5
│ ├── subset_moviescenes_shotcuts_train.csv
│ ├── subset_moviescenes_shotcuts_val.csv
│ └── durations.csv
│
├── checkpoints
│ └── best_state.ckpt
│
└── scripts
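
Before running anything, it can help to confirm that the downloaded resources ended up where the tree above expects them. The following is a minimal sketch, not part of the repository: the script name `check_layout.py` and the helper `missing_files` are hypothetical, and the file list is copied from the folder structure above. It assumes you run it from the repository root.

```python
# check_layout.py (hypothetical helper, not shipped with the repository)
from pathlib import Path

# Paths copied from the folder structure in this README,
# relative to the repository root.
EXPECTED_FILES = [
    "data/ResNexT-101_3D_video_features.h5",
    "data/ResNet-18_audio_features.h5",
    "data/subset_moviescenes_shotcuts_train.csv",
    "data/subset_moviescenes_shotcuts_val.csv",
    "data/durations.csv",
    "checkpoints/best_state.ckpt",
]


def missing_files(root="."):
    """Return the expected files that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected files are in place.")
```

Run it with `python check_layout.py` from the `LearningToCut` folder. It only checks that the files exist; it does not validate their contents.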
Copy and paste the following commands into the terminal.
Load environment:
conda activate ltc
cd scripts/
Run inference on the validation set:
sh inference.sh
Method | AR1-D1 | AR3-D1 | AR5-D1 | AR10-D1 | AR1-D2 | AR3-D2 | AR5-D2 | AR10-D2 | AR1-D3 | AR3-D3 | AR5-D3 | AR10-D3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Random | 0.64% | 1.91% | 3.15% | 6.28% | 1.85% | 5.65% | 9.32% | 18.52% | 3.67% | 10.67% | 17.62% | 33.91% |
Raw | 1.16% | 3.97% | 6.36% | 11.72% | 2.51% | 8.32% | 13.15% | 24.25% | 3.73% | 12.19% | 19.33% | 34.97% |
LTC | 8.18% | 17.95% | 24.44% | 30.35% | 15.30% | 35.11% | 48.26% | 59.42% | 19.18% | 46.32% | 64.30% | 79.35% |
@InProceedings{Pardo_2021_ICCV,
author = {Pardo, Alejandro and Caba, Fabian and Alcazar, Juan Leon and Thabet, Ali K. and Ghanem, Bernard},
title = {Learning To Cut by Watching Movies},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2021},
pages = {6858-6868}
}