This is the PyTorch implementation for the MICCAI 2022 paper:
Rethinking Surgical Instrument Segmentation: A Background Image Can Be All You Need
An Wang*, Mobarakol Islam*, Mengya Xu, and Hongliang Ren**
*: First author; **: Corresponding author
News: Demo1 and Demo2, along with the corresponding Google Colab notebooks demo1 and demo2, have been uploaded for a better understanding of the pipeline!
In this paper, we rethink the surgical instrument segmentation task and propose a one-to-many data generation solution that avoids the complicated and expensive process of data collection and annotation in robotic surgery. We use only a single surgical background tissue image and a few open-source instrument images as seed images, and apply multiple augmentation and blending techniques to synthesize a large number of image variations. In addition, we introduce chained augmentation mixing during training to further enhance data diversity. The proposed approach achieves decent surgical instrument segmentation performance compared with training on the real dataset. Moreover, we observe that our method can handle novel instrument prediction in the deployment domain. We hope our inspiring results will encourage researchers to emphasize data-centric methods that address deep learning limitations besides data shortage, such as class imbalance, domain adaptation, and incremental learning.
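The chained augmentation mixing mentioned above can be pictured with an AugMix-style sketch. The operation set, chain depth, and mixing weights below are illustrative assumptions, not the paper's exact recipe:

```python
import numpy as np

def hflip(img):
    # horizontal flip
    return img[:, ::-1]

def brightness(img):
    # random brightness scaling, clipped to the valid pixel range
    return np.clip(img * np.random.uniform(0.7, 1.3), 0, 255)

def identity(img):
    return img

OPS = [hflip, brightness, identity]  # toy op set for illustration

def chained_aug_mix(img, num_chains=3, chain_depth=2, alpha=1.0):
    """Mix several randomly chained augmentations of `img` back together."""
    img = img.astype(np.float32)
    weights = np.random.dirichlet([alpha] * num_chains)
    mixed = np.zeros_like(img)
    for w in weights:
        aug = img.copy()
        # apply a random chain of 1..chain_depth ops
        for _ in range(np.random.randint(1, chain_depth + 1)):
            aug = np.random.choice(OPS)(aug)
        mixed += w * aug
    # blend the mixture with the original image
    m = np.random.beta(alpha, alpha)
    return (m * img + (1 - m) * mixed).astype(np.uint8)
```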
## Environment

- Python=3.8
- PyTorch=1.10
- torchvision=0.11.2
- cuda=11.3
- imgaug=0.4.0
- albumentations=1.1.0
- comet_ml=3.2.0 (used for experiment logging; remove the related code if you don't need it)
- Other common dependencies can be installed via pip/conda.
## Data Generation

In ./data_gen/background/raw/, the source background tissue image is provided. Adapt aug_bg.ipynb to generate augmented background images.
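As a rough illustration of what such background augmentation does (the notebook relies on imgaug/albumentations; the numpy-only ops below are a simplified stand-in):

```python
import numpy as np

def augment_background(bg, n_variants=4, seed=0):
    """Generate simple variants of a single background tissue image
    via flips, 90-degree rotations, and brightness jitter (sketch only)."""
    rng = np.random.default_rng(seed)
    variants = []
    for _ in range(n_variants):
        img = bg.copy()
        if rng.random() < 0.5:
            img = img[:, ::-1]                     # horizontal flip
        if rng.random() < 0.5:
            img = img[::-1, :]                     # vertical flip
        img = np.rot90(img, k=int(rng.integers(0, 4)))  # random 90-degree rotation
        # photometric jitter, clipped to the valid pixel range
        img = np.clip(img.astype(np.float32) * rng.uniform(0.8, 1.2), 0, 255)
        variants.append(img.astype(np.uint8))
    return variants
```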
In ./data_gen/foreground/, different types of instruments, each with 3 image-mask pairs, are provided (note: there are 2 versions of Bipolar Forceps in the EndoVis-2018 dataset). Adapt aug_fg.ipynb to generate augmented foreground images.
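The key point when augmenting the foreground image-mask pairs is that geometric transforms must be applied identically to the image and its mask so they stay aligned. A minimal numpy sketch (the actual aug_fg.ipynb pipeline may use richer imgaug/albumentations transforms):

```python
import numpy as np

def augment_instrument(img, mask, rng=None):
    """Apply the SAME geometric transform to an instrument image and its
    binary mask; photometric jitter touches only the image, never the mask."""
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < 0.5:                 # horizontal flip both
        img, mask = img[:, ::-1], mask[:, ::-1]
    k = int(rng.integers(0, 4))            # same 90-degree rotation for both
    img, mask = np.rot90(img, k), np.rot90(mask, k)
    # brightness jitter on the image only, clipped to the valid range
    img = np.clip(img.astype(np.float32) * rng.uniform(0.8, 1.2), 0, 255)
    return img.astype(np.uint8), mask
```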
In ./data_gen/blended/, adapt blend_multi.ipynb to generate the blended images used for training.
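Conceptually, blending pastes augmented instruments onto augmented backgrounds, and the segmentation label comes for free from the paste mask. A simplified numpy sketch (hard mask pasting; blend_multi.ipynb may use smoother blending techniques):

```python
import numpy as np

def blend(bg, fg, mask, top, left):
    """Paste an instrument crop `fg` onto background `bg` where `mask` > 0,
    returning the composite image and its binary segmentation label."""
    out = bg.copy()
    label = np.zeros(bg.shape[:2], dtype=np.uint8)
    h, w = mask.shape
    m = mask.astype(bool)
    region = out[top:top + h, left:left + w]
    region[m] = fg[m]                          # hard paste inside the mask
    label[top:top + h, left:left + w][m] = 1   # label derived from the mask
    return out, label
```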
## Training

To evaluate the quality of the generated synthetic dataset, binary instrument segmentation is adopted.
Example commands:
- Train with Synthetic-A
python3 train.py --train_dataset Blend --blend_mode paste_1bg_2base_1tool --val_dataset Endo18_test
- Train with Synthetic-B
python3 train.py --train_dataset Blend --blend_mode paste_1bg_2base_12tool_2k --val_dataset Endo18_test
- Train with Synthetic-C
python3 train.py --train_dataset Blend --blend_mode paste_1bg_3base_12tool_2k --val_dataset Endo18_test
## Acknowledgement

Part of the code is adapted from robot-surgery-segmentation.
## Citation

@inproceedings{wang2022rethinking,
  title={Rethinking Surgical Instrument Segmentation: A Background Image Can Be All You Need},
  author={Wang, An and Islam, Mobarakol and Xu, Mengya and Ren, Hongliang},
  booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
  pages={355--364},
  year={2022},
  organization={Springer}
}
@misc{wang2022rethinking_arxiv,
  title={Rethinking Surgical Instrument Segmentation: A Background Image Can Be All You Need},
  author={An Wang and Mobarakol Islam and Mengya Xu and Hongliang Ren},
  year={2022},
  eprint={2206.11804},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}