[Paper]
This is the official repository of "Investigating Tradeoffs in Real-World Video Super-Resolution" (arXiv). It contains the code, a Colab demo, and video demos of our work.
Authors: Kelvin C.K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy, Nanyang Technological University
Acknowledgement: Our work is built upon MMEditing. The code will also appear in MMEditing soon. Please follow and star this repository and MMEditing!
Feel free to ask questions. I am currently working on other projects but will try my best to reply. If you are also interested in BasicVSR++, which was also accepted to CVPR 2022, please don't hesitate to star it!
- 11 Mar 2022: Integrated into Huggingface Spaces 🤗 using Gradio. Try out the Web Demo:
- 3 Mar 2022: Our paper has been accepted to CVPR 2022
- 4 Jan 2022: Training code released
- 2 Dec 2021: Colab demo released
- 29 Nov 2021: Test code released
- 25 Nov 2021: Repository initialized with video demos
The videos have been compressed. Therefore, the results shown here are inferior to the actual outputs.
output.mp4
- Install PyTorch and torchvision following the official instructions, e.g.,
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch
- Install mim and mmcv-full
pip install openmim
mim install mmcv-full
- Install mmedit
pip install mmedit
- Download the pre-trained weights to checkpoints/. (Dropbox / Google Drive / OneDrive)
- Run the following command:
python inference_realbasicvsr.py ${CONFIG_FILE} ${CHECKPOINT_FILE} ${INPUT_DIR} ${OUTPUT_DIR} --max-seq-len=${MAX_SEQ_LEN} --is_save_as_png=${IS_SAVE_AS_PNG} --fps=${FPS}
This script supports both images and videos as inputs and outputs. To use videos, set ${INPUT_DIR} and ${OUTPUT_DIR} to the paths of the video files. Note, however, that saving to video may introduce additional compression, which reduces output quality.
For example:
- Images as inputs and outputs
python inference_realbasicvsr.py configs/realbasicvsr_x4.py checkpoints/RealBasicVSR_x4.pth data/demo_000 results/demo_000
- Video as input and output
python inference_realbasicvsr.py configs/realbasicvsr_x4.py checkpoints/RealBasicVSR_x4.pth data/demo_001.mp4 results/demo_001.mp4 --fps=12.5
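If you want to launch inference from another Python script, the command line above can be assembled programmatically. The following is only a minimal sketch: the helper name `build_inference_command` is hypothetical and not part of this repository, and the optional values are passed through unchanged because the formats the script accepts are not specified here.

```python
import shlex


def build_inference_command(config, checkpoint, input_path, output_path,
                            max_seq_len=None, is_save_as_png=None, fps=None):
    """Return the argument list for inference_realbasicvsr.py.

    Hypothetical helper: it only mirrors the positional arguments and the
    three optional flags shown in the command template above.
    """
    cmd = ["python", "inference_realbasicvsr.py",
           config, checkpoint, input_path, output_path]
    # Optional flags are appended only when a value is given.
    if max_seq_len is not None:
        cmd.append(f"--max-seq-len={max_seq_len}")
    if is_save_as_png is not None:
        cmd.append(f"--is_save_as_png={is_save_as_png}")
    if fps is not None:
        cmd.append(f"--fps={fps}")
    return cmd


# Reproduce the "video as input and output" example from above.
video_cmd = build_inference_command(
    "configs/realbasicvsr_x4.py",
    "checkpoints/RealBasicVSR_x4.pth",
    "data/demo_001.mp4",
    "results/demo_001.mp4",
    fps=12.5,
)
print(shlex.join(video_cmd))
```

The returned list can be passed to `subprocess.run(video_cmd, check=True)` from the repository root.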
We crop the REDS dataset into sub-images for faster I/O. Please follow the instructions below:
- Put the original REDS dataset in ./data
- Run the following command:
python crop_sub_images.py --data-root ./data/REDS --scales 4
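To illustrate the general idea of cropping frames into sub-images, here is a minimal sketch of how overlapping crop windows can be laid out over a frame. It is not taken from crop_sub_images.py: the function names, the 480-pixel crop size, and the 240-pixel step are assumptions chosen for illustration, and the actual script may use different values and logic.

```python
def crop_origins(length, crop_size, step):
    """Start offsets of crops along one axis, covering the full length."""
    if crop_size > length:
        raise ValueError("crop_size must not exceed the image dimension")
    origins = list(range(0, length - crop_size + 1, step))
    # Add a final crop flush with the border if the regular grid misses it.
    if origins[-1] + crop_size < length:
        origins.append(length - crop_size)
    return origins


def sub_image_boxes(height, width, crop_size=480, step=240):
    """Return (top, left, bottom, right) boxes for every sub-image.

    crop_size and step are assumed values, not the script's defaults.
    """
    return [
        (top, left, top + crop_size, left + crop_size)
        for top in crop_origins(height, crop_size, step)
        for left in crop_origins(width, crop_size, step)
    ]


# Example with a 720x1280 frame.
boxes = sub_image_boxes(720, 1280)
print(len(boxes), boxes[0], boxes[-1])
```

Smaller sub-images mean each training sample can be read from disk without decoding a full frame, which is the I/O benefit mentioned above.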
The training is divided into two stages:
- Train a model without perceptual loss and adversarial loss using realbasicvsr_wogan_c64b20_2x30x8_lr1e-4_300k_reds.py.
mim train mmedit configs/realbasicvsr_wogan_c64b20_2x30x8_lr1e-4_300k_reds.py --gpus 8 --launcher pytorch
- Finetune the model with perceptual loss and adversarial loss using realbasicvsr_c64b20_1x30x8_lr5e-5_150k_reds.py. (You may want to replace load_from in the configuration file with your checkpoint pre-trained at the first stage.)
mim train mmedit configs/realbasicvsr_c64b20_1x30x8_lr5e-5_150k_reds.py --gpus 8 --launcher pytorch
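For the second stage, the load_from change mentioned above amounts to a one-line edit in the configuration file. The path below is a placeholder, not a real file in this repository; point it at wherever your first-stage checkpoint was actually saved.

```python
# In configs/realbasicvsr_c64b20_1x30x8_lr5e-5_150k_reds.py
# Placeholder path (assumption): replace it with your own first-stage checkpoint.
load_from = 'path/to/your/first_stage_checkpoint.pth'
```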
Note: We use UDM10 with bicubic downsampling for validation. You can download it from here.
Assuming you have created two sets of images (e.g. input vs output), you can use generate_video_demo.py to generate a video demo. Note that the two sets of images must be of the same resolution. An example has been provided in the code.
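Before generating a demo, it can help to confirm that the two image sets line up frame by frame. The sketch below is a hypothetical pre-check and is not part of generate_video_demo.py: the function names and the accepted file suffixes are assumptions, and it only pairs files by sorted name. Checking that the resolutions match would additionally require an image library.

```python
from pathlib import Path

# Assumed set of frame file types; adjust to match your data.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_frames(folder):
    """Return image files in a folder, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )


def pair_frames(input_dir, output_dir):
    """Pair frames from two folders by sorted order.

    Raises ValueError if the folders hold different numbers of frames.
    """
    inputs = list_frames(input_dir)
    outputs = list_frames(output_dir)
    if len(inputs) != len(outputs):
        raise ValueError(
            f"frame count mismatch: {len(inputs)} vs {len(outputs)}"
        )
    return list(zip(inputs, outputs))
```

For example, `pair_frames("data/demo_000", "results/demo_000")` would return one (input, output) path pair per frame, assuming both folders exist and contain the same number of images.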
You can download the dataset using Dropbox / Google Drive / OneDrive.
@inproceedings{chan2022investigating,
author = {Chan, Kelvin C.K. and Zhou, Shangchen and Xu, Xiangyu and Loy, Chen Change},
title = {Investigating Tradeoffs in Real-World Video Super-Resolution},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
year = {2022}
}