Introduction

MindEditing is an open-source toolkit based on MindSpore. It contains state-of-the-art image and video models from the open-source community and from Huawei Technologies Co., such as IPT, FSRCNN, and BasicVSR. These models mainly target low-level vision tasks such as super-resolution, denoising, deraining, and inpainting. MindEditing supports multiple platforms, including CPU, GPU, and Ascend, and offers the best experience on Ascend.

Some Demos:

  • Video super-resolution demo
Video_SR_Demo-1.-.Trim.mp4
  • Video frame Interpolation demo
Video_frame_Interpolation_Demo.mp4
Main features
  • Easy to use

    MindEditing uses a unified entry point: specify a supported model name and configure the parameters in the YAML config file to start your task.

  • Support multiple tasks

    MindEditing supports a variety of popular and contemporary tasks such as deblurring, denoising, super-resolution, and inpainting.

  • SOTA

    MindEditing provides state-of-the-art algorithms in deblurring, denoising, super-resolution, and inpainting tasks.

Multi-Task

With so many tasks, is there a model that can handle several of them? Yes: the pre-trained image processing transformer (IPT). IPT is a pre-trained model trained on image data with multi-heads and multi-tails. In addition, contrastive learning is introduced so that the model adapts well to different image processing tasks. The pre-trained model can therefore be efficiently employed on the desired task after fine-tuning. With only one pre-trained model, IPT outperforms the current state-of-the-art methods on various low-level benchmarks.
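
The multi-head, multi-tail arrangement described above can be pictured as a routing pattern: each task has its own head and tail, and all tasks share one body. The following is only a toy sketch of that routing, not the IPT implementation. The class name, the task names `denoise` and `sr_x2`, and the placeholder layers are all hypothetical, and plain lists of floats stand in for image tensors.

```python
from typing import Callable, Dict, List

Signal = List[float]                  # stand-in for an image tensor
Layer = Callable[[Signal], Signal]


class MultiHeadMultiTail:
    """Toy router: task-specific head -> shared body -> task-specific tail."""

    def __init__(self, heads: Dict[str, Layer], body: Layer, tails: Dict[str, Layer]):
        if heads.keys() != tails.keys():
            raise ValueError("every task needs both a head and a tail")
        self.heads, self.body, self.tails = heads, body, tails

    def __call__(self, x: Signal, task: str) -> Signal:
        if task not in self.heads:
            raise KeyError(f"unsupported task: {task}")
        features = self.heads[task](x)     # task-specific feature extraction
        features = self.body(features)     # shared by every task
        return self.tails[task](features)  # task-specific reconstruction


def identity(x: Signal) -> Signal:
    # Placeholder layer that returns a copy of its input.
    return list(x)


def upsample_x2(x: Signal) -> Signal:
    # Nearest-neighbour repeat; a placeholder for a real upsampling tail.
    return [v for v in x for _ in range(2)]


# Hypothetical task names; the body would be the shared Transformer in a real model.
model = MultiHeadMultiTail(
    heads={"denoise": identity, "sr_x2": identity},
    body=identity,
    tails={"denoise": identity, "sr_x2": upsample_x2},
)
```

Calling `model(x, "sr_x2")` and `model(x, "denoise")` sends the same input through different heads and tails while reusing the same body, which is the property that lets one pre-trained model serve several tasks.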

Excellent performance
  • Compared with state-of-the-art image processing models on different tasks, the IPT model performs better.
  • Leading results on multiple low-level vision tasks

    Compared with the state-of-the-art methods, the IPT model achieves the best performance.

  • Generalization Ability

    Generalization ability (Table 4) of the IPT model on color image denoising with different noise levels.

  • The performance of CNN and IPT models using different percentages of data

    When the pre-training data is limited, the CNN model can obtain better performance. As the data volume increases, the Transformer-based IPT model improves significantly, and the trend of the curve (Table 5) also shows the promising potential of the IPT model.

  • Amazing actual image inference results

    • Image Super-resolution task

    The figure below shows super-resolution results with bicubic downsampling (×4) from Urban100. The proposed IPT model recovers more details.

    • Image Denoising task

    Notably, IPT won first place in the CVPR 2023 NTIRE Image Denoising track.

    The figure below shows color image denoising results with noise level σ = 50.

    • Image Deraining task

    The figure below shows image deraining results on the Rain100L dataset.
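
For the denoising result above, a noise level of σ = 50 is conventionally read as additive Gaussian noise with a standard deviation of 50 on the 0 to 255 intensity scale. The README does not spell this out, so treat that reading as an assumption. Under it, a noisy test input could be synthesized roughly as follows; the function name `add_gaussian_noise` is hypothetical and a flat list of floats stands in for an image.

```python
import random
from typing import List, Optional


def add_gaussian_noise(pixels: List[float], sigma: float,
                       seed: Optional[int] = None) -> List[float]:
    """Add zero-mean Gaussian noise and clip the result to [0, 255]."""
    rng = random.Random(seed)  # local generator so results are repeatable per seed
    noisy = []
    for p in pixels:
        value = p + rng.gauss(0.0, sigma)
        noisy.append(min(255.0, max(0.0, value)))  # keep a valid intensity
    return noisy


clean = [0.0, 64.0, 128.0, 192.0, 255.0]
noisy = add_gaussian_noise(clean, sigma=50.0, seed=0)
```

With σ = 50 the perturbation is about a fifth of the full intensity range, which is why this setting is a demanding test for a denoiser.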

Dependency

  • mindspore >=1.9
  • numpy =1.19.5
  • scikit-image =0.19.3
  • pyyaml =5.1
  • pillow =9.3.0
  • lmdb =1.3.0
  • h5py =3.7.0
  • imageio =2.25.1
  • munch =2.5.0

Python can be installed with Conda.

Install Miniconda:

cd /tmp
curl -O https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-py37_4.10.3-Linux-$(arch).sh
bash Miniconda3-py37_4.10.3-Linux-$(arch).sh -b
cd -
. ~/miniconda3/etc/profile.d/conda.sh
conda init bash

Create a virtual environment, taking Python 3.7.5 as an example:

conda create -n mindspore_py37 python=3.7.5 -y
conda activate mindspore_py37

Check the Python version.

python --version

To install the dependencies, run:

pip install -r requirements.txt

MindSpore (>=1.9) can be installed by following the official instructions, where you can select the package that best fits your hardware platform. To run in distributed mode, OpenMPI must also be installed.
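
After installation, it can help to confirm that the environment matches the versions listed under Dependency. The sketch below reports, for each pinned package, the wanted and installed versions. It relies on `importlib.metadata`, which is part of the standard library from Python 3.8; on Python 3.7 it falls back to the `importlib_metadata` backport, which is assumed to be installed. The helper names are hypothetical, and MindSpore is left out because it is specified as a minimum version rather than an exact pin.

```python
try:
    from importlib import metadata            # Python 3.8+
except ImportError:                            # Python 3.7
    import importlib_metadata as metadata      # backport, assumed to be installed

# Exact pins copied from the Dependency list above.
PINNED = {
    "numpy": "1.19.5",
    "scikit-image": "0.19.3",
    "pyyaml": "5.1",
    "pillow": "9.3.0",
    "lmdb": "1.3.0",
    "h5py": "3.7.0",
    "imageio": "2.25.1",
    "munch": "2.5.0",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pinned):
    """Map each package to its wanted version, found version, and match flag."""
    report = {}
    for name, wanted in pinned.items():
        found = installed_version(name)
        report[name] = {"wanted": wanted, "found": found, "ok": found == wanted}
    return report


if __name__ == "__main__":
    for name, row in check_pins(PINNED).items():
        status = "ok" if row["ok"] else "MISMATCH"
        print(f"{name:14s} wanted {row['wanted']:8s} found {row['found']}  {status}")
```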

Get Started

We provide entry scripts for training and validation; choose a model config to start. Please see the documentation for more basic usage of MindEditing.

python3 train.py --config_path ./configs/basicvsr/train.yaml
# or
python3 val.py --config_path ./configs/basicvsr/val.yaml
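
When launching several runs from a script, the two commands above can be assembled programmatically. The sketch below assumes that every model follows the same `./configs/<model>/<stage>.yaml` layout as the `basicvsr` example; that layout is an assumption for other models, and `build_command` is a hypothetical helper, not part of MindEditing.

```python
from pathlib import PurePosixPath
from typing import List

# Entry scripts taken from the commands above.
SCRIPTS = {"train": "train.py", "val": "val.py"}


def build_command(model: str, stage: str) -> List[str]:
    """Build the argument list for a training or validation run."""
    if stage not in SCRIPTS:
        raise ValueError(f"stage must be one of {sorted(SCRIPTS)}, got {stage!r}")
    # Assumed layout: configs/<model>/<stage>.yaml
    config = PurePosixPath("configs") / model / f"{stage}.yaml"
    return ["python3", SCRIPTS[stage], "--config_path", f"./{config}"]


command = build_command("basicvsr", "train")
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the repository root to start the run.
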
  • Graph Mode and Pynative Mode

    Graph mode is optimized for efficiency and parallel computing with a compiled static graph. In contrast, pynative mode is optimized for flexibility and easy development. You can change the system.context_mode parameter in the model config file to switch to pure pynative mode for development.
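
The dotted name `system.context_mode` points at a nested key in the config. As a minimal sketch of such an override, the helper below sets a dotted key on a plain dictionary that stands in for a loaded YAML config. The helper name `with_override` and the values `"graph_mode"` and `"pynative_mode"` are assumptions for illustration; check the config files of the model for the accepted values.

```python
import copy
from typing import Any, Dict


def with_override(config: Dict[str, Any], dotted_key: str, value: Any) -> Dict[str, Any]:
    """Return a copy of config with the nested key named by dotted_key set to value."""
    updated = copy.deepcopy(config)       # leave the caller's config untouched
    node = updated
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        node = node.setdefault(key, {})   # create intermediate sections if missing
    node[leaf] = value
    return updated


# Hypothetical excerpt of a loaded model config.
base_config = {"system": {"context_mode": "graph_mode"}}
dev_config = with_override(base_config, "system.context_mode", "pynative_mode")
```

Returning a copy keeps the original settings available, so the same base config can be reused for a graph-mode run after debugging in pynative mode.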

News

MindEditing currently has a 0.x branch, and a 1.x branch is planned. You'll find more features in the 1.x branch, so stay tuned.


  • April 6, 2023

    MPFER, the model from "Efficient View Synthesis and 3D-based Multi-Frame Denoising with Multiplane Feature Representations", is coming soon. Stay tuned.

  • March 15, 2023

    The inference code and demos for Tunable Conv have been added as test cases; you can find them in ./tests/. The training code is coming soon. Tunable Conv provides four demo models: NAFNet for modulated image denoising, SwinIR for modulated image denoising and perceptual super-resolution, EDSR for modulated joint image denoising and deblurring, and StyleNet for modulated style transfer.

Parallel Performance

Increasing the number of parallel workers can speed up training. The following logs are from an experiment with an example model on a 16-core CPU with 2× P100 GPUs:

num_parallel_workers: 8
epoch 1/100 step 1/133, loss = 0.045729052, duration_time = 00:01:07, step_time_avg = 0.00 secs, eta = 00:00:00
epoch 1/100 step 2/133, loss = 0.027709303, duration_time = 00:01:20, step_time_avg = 6.66 secs, eta = 1 day(s) 00:36:02
epoch 1/100 step 3/133, loss = 0.027135072, duration_time = 00:01:33, step_time_avg = 8.74 secs, eta = 1 day(s) 08:17:56

num_parallel_workers: 16
epoch 1/100 step 1/133, loss = 0.04535071, duration_time = 00:00:47, step_time_avg = 0.00 secs, eta = 00:00:00
epoch 1/100 step 2/133, loss = 0.032363698, duration_time = 00:01:00, step_time_avg = 6.74 secs, eta = 1 day(s) 00:54:38
epoch 1/100 step 3/133, loss = 0.02718924, duration_time = 00:01:13, step_time_avg = 8.83 secs, eta = 1 day(s) 08:36:07
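
To see where the two runs differ, the `duration_time` fields in the logs above can be converted to seconds and differenced. The sketch below does this with a regular expression written against the log format shown here; the helper names are hypothetical.

```python
import re
from typing import List

# Matches the cumulative wall-clock field, e.g. "duration_time = 00:01:07".
DURATION = re.compile(r"duration_time = (\d+):(\d+):(\d+)")


def elapsed_seconds(log: str) -> List[int]:
    """Cumulative elapsed time in seconds for every step line in the log."""
    return [int(h) * 3600 + int(m) * 60 + int(s) for h, m, s in DURATION.findall(log)]


def step_deltas(elapsed: List[int]) -> List[int]:
    """Seconds spent between consecutive logged steps."""
    return [later - earlier for earlier, later in zip(elapsed, elapsed[1:])]


LOG_8_WORKERS = """\
epoch 1/100 step 1/133, loss = 0.045729052, duration_time = 00:01:07, step_time_avg = 0.00 secs, eta = 00:00:00
epoch 1/100 step 2/133, loss = 0.027709303, duration_time = 00:01:20, step_time_avg = 6.66 secs, eta = 1 day(s) 00:36:02
epoch 1/100 step 3/133, loss = 0.027135072, duration_time = 00:01:33, step_time_avg = 8.74 secs, eta = 1 day(s) 08:17:56
"""

LOG_16_WORKERS = """\
epoch 1/100 step 1/133, loss = 0.04535071, duration_time = 00:00:47, step_time_avg = 0.00 secs, eta = 00:00:00
epoch 1/100 step 2/133, loss = 0.032363698, duration_time = 00:01:00, step_time_avg = 6.74 secs, eta = 1 day(s) 00:54:38
epoch 1/100 step 3/133, loss = 0.02718924, duration_time = 00:01:13, step_time_avg = 8.83 secs, eta = 1 day(s) 08:36:07
"""

for label, log in (("8 workers", LOG_8_WORKERS), ("16 workers", LOG_16_WORKERS)):
    elapsed = elapsed_seconds(log)
    print(label, "first step at", elapsed[0], "s, step deltas", step_deltas(elapsed))
```

In this short excerpt the 16-worker run reaches its first step 20 seconds earlier (47 s versus 67 s), while the interval between the following steps is 13 s in both runs. Three steps are a small sample, so a longer log is needed to judge steady-state throughput.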

Tutorials

The following tutorials are provided to help users learn to use MindEditing.

Model List

| model_name | task | Conference | Support platform | Download |
| --- | --- | --- | --- | --- |
| IPT | Multi-Task | CVPR 2021 | Ascend/GPU | ckpt |
| BasicVSR | Video Super Resolution | CVPR 2021 | Ascend/GPU | ckpt |
| BasicVSR++Light | Video Super Resolution | CVPR 2022 | Ascend/GPU | ckpt |
| NOAHTCV | Image DeNoise | CVPR 2021 (MAI Challenge) | Ascend/GPU | ckpt |
| RRDB | Image Super Resolution | ECCVW 2018 | Ascend/GPU | ckpt |
| FSRCNN | Image Super Resolution | ECCV 2016 | Ascend/GPU | ckpt |
| SRDiff | Image Super Resolution | Neurocomputing 2022 | Ascend/GPU | ckpt |
| VRT | Multi-Task | arXiv (2022.01) | Ascend/GPU | ckpt |
| RVRT | Multi-Task | arXiv (2022.06) | Ascend/GPU | ckpt |
| TTVSR | Video Super Resolution | CVPR 2022 | Ascend/GPU | ckpt |
| MIMO-Unet | Image DeBlur | ICCV 2021 | Ascend/GPU | ckpt |
| NAFNet | Image DeBlur | arXiv (2022.04) | Ascend/GPU | ckpt |
| CTSDG | Image InPainting | ICCV 2021 | Ascend/GPU | ckpt |
| EMVD | Video Denoise | CVPR 2021 | Ascend/GPU | ckpt |
| Tunable_Conv | tunable task (image process) | arXiv (2023.04) | Ascend/GPU | ckpt |
| IFR+ | Video Frame Interpolation | CVPR 2022 | Ascend/GPU | ckpt |
| MPFER | 3D-based Multi-Frame Denoising (coming soon) | arXiv (2023.04) | GPU | ckpt |

Download: The model files are available in .ckpt and .om formats, and you can download the corresponding files to carry out your research work.

  • The .ckpt file can be downloaded by clicking the corresponding link in the Download column of the table above.
  • The .om file can be found here. For details about how to use the .om file, see deploy.
  • For multi-task models, the model files are split by task. To choose the right file, refer to the YAML files of each model in the configs folder.
  • For models that require SPyNet or VGG pretrained weights, the weights can also be downloaded from the corresponding model's link.

Please refer to the ModelZoo Homepage or the documentation under the docs folder for more details on each model.

License

This project follows the Apache License 2.0 open-source license.

Feedback and Contact

The dynamic version is still under development. If you find any issues or have ideas for new features, please don't hesitate to contact us via issue.

Acknowledgement

MindSpore is an open-source project that welcomes any contribution and feedback. We hope that the toolbox and benchmark can serve the growing research community by providing a flexible and standardized toolkit for reimplementing existing methods and developing new computer vision methods.

If you find MindEditing useful in your research, please consider citing the following:

@misc{MindEditing 2022,
    title={{MindEditing}:MindEditing for low-level vision task},
    author={MindEditing},
    howpublished = {\url{https://github.com/mindspore-lab/mindediting}},
    year={2022}
}

Projects in MindSpore-Lab

  • MindCV: A toolbox of vision models and algorithms based on MindSpore.
  • MindNLP: An open-source NLP library based on MindSpore.
  • MindDiffusion: A collection of diffusion models based on MindSpore.
  • MindFace: An open-source toolkit based on MindSpore, containing the most advanced face recognition and detection models, such as ArcFace, RetinaFace and other models.
  • MindAudio: An open-source all-in-one toolkit for the voice field based on MindSpore.
  • MindOCR: A toolbox of OCR models, algorithms, and pipelines based on MindSpore.
  • MindRL: A high-performance, scalable MindSpore reinforcement learning framework.
  • MindREC: MindSpore large-scale recommender system library.
  • MindPose: An open-source toolbox for pose estimation based on MindSpore.