This is an implementation of Mask R-CNN on Python 3, Keras, and TensorFlow. The model generates bounding boxes and segmentation masks for each instance of an object in the image. It is based on a Feature Pyramid Network (FPN) and a ResNet101 backbone.
The repository includes:
- Source code of Mask R-CNN built on FPN and ResNet101/ResNet50
- Training code for MS COCO
- Training code for custom animals dataset
- Pre-trained weights for MS COCO
- Jupyter notebooks to visualize the detection pipeline at every step
- ParallelModel class for multi-GPU training
- Evaluation on MS COCO metrics (AP) and DICE
The custom animals dataset covers these classes:
- Tiger
- Bear
- Elephant
- Sika deer
- Sea turtle
- Otter
- Rhesus monkey
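For readers unfamiliar with the Dice score listed among the evaluation metrics, here is a minimal sketch of how it can be computed for a pair of binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the repository's own evaluation code may differ.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A & B| / (|A| + |B|) for two binary masks.

    Both masks are 2D lists of 0/1 values with the same shape.
    """
    intersection = 0
    pred_area = 0
    true_area = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p:
                pred_area += 1
            if t:
                true_area += 1
            if p and t:
                intersection += 1
    total = pred_area + true_area
    # Assumption: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total
```

A score of 1.0 means the predicted and ground-truth masks overlap exactly, and 0.0 means they share no pixels.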
ResNet101: https://drive.google.com/file/d/1mZXuvDaB8p-CIXcoh8F2dB2CKUTcTuGB/view?usp=share_link
ResNet50: https://drive.google.com/file/d/1k99kLo28DP3wfHptgPXtqevL8-4QrRhh/view?usp=share_link
Visualizes every step of the first-stage Region Proposal Network, displaying positive and negative anchors along with anchor box refinement.
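To make "positive" and "negative" anchors concrete, the sketch below labels anchors by their IoU with the ground-truth boxes. The 0.7 and 0.3 thresholds follow the usual Faster R-CNN convention and are an assumption here, as are the function names and the (y1, x1, y2, x2) box layout; the matching code in this repository may differ.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (y1, x1, y2, x2)."""
    y1 = max(box_a[0], box_b[0])
    x1 = max(box_a[1], box_b[1])
    y2 = min(box_a[2], box_b[2])
    x2 = min(box_a[3], box_b[3])
    inter = max(0.0, y2 - y1) * max(0.0, x2 - x1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchors(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return 1 (positive), -1 (negative), or 0 (neutral) for each anchor.

    Thresholds are assumed defaults, not values read from this repository.
    """
    labels = []
    for anchor in anchors:
        # Best overlap of this anchor with any ground-truth box.
        best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
        if best >= pos_thresh:
            labels.append(1)
        elif best < neg_thresh:
            labels.append(-1)
        else:
            labels.append(0)
    return labels
```

Neutral anchors (label 0) fall between the two thresholds and would typically be ignored during training.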
This is an example of final detection boxes (dotted lines) and the refinement applied to them (solid lines) in the second stage.
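The refinement step can be sketched as applying a small set of deltas to each box. The example assumes boxes are stored as (y1, x1, y2, x2) and deltas as (dy, dx, log(dh), log(dw)), a common parameterization in R-CNN-style detectors; the function name is hypothetical and the layout used in this repository is not confirmed here.

```python
import math


def apply_box_deltas(box, deltas):
    """Shift and rescale one box.

    box: (y1, x1, y2, x2)
    deltas: (dy, dx, log(dh), log(dw)), an assumed parameterization.
    """
    y1, x1, y2, x2 = box
    dy, dx, dlog_h, dlog_w = deltas
    height = y2 - y1
    width = x2 - x1
    center_y = y1 + 0.5 * height
    center_x = x1 + 0.5 * width
    # Move the center by a fraction of the box size.
    center_y += dy * height
    center_x += dx * width
    # Rescale height and width; the deltas are in log space.
    height *= math.exp(dlog_h)
    width *= math.exp(dlog_w)
    # Convert back to corner coordinates.
    new_y1 = center_y - 0.5 * height
    new_x1 = center_x - 0.5 * width
    return (new_y1, new_x1, new_y1 + height, new_x1 + width)
```

With all-zero deltas the box is returned unchanged, which is a quick sanity check when debugging the second stage.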
Examples of generated masks. These are then scaled and placed at the right location on the image.
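The scale-and-place step might look roughly like the following. This is a dependency-free sketch that uses nearest-neighbor resizing and a 0.5 threshold, both of which are assumptions chosen for brevity; a real pipeline would more likely use an array library with smoother interpolation. The function name is hypothetical, and the box is assumed to lie fully inside the image.

```python
def place_mask(mask, box, image_shape, threshold=0.5):
    """Resize a small soft mask to its box and paste it into a full-size mask.

    mask: 2D list of floats in [0, 1] produced for one detection.
    box: (y1, x1, y2, x2) in integer pixels, assumed inside the image.
    image_shape: (height, width) of the target image.
    """
    y1, x1, y2, x2 = box
    box_h, box_w = y2 - y1, x2 - x1
    mask_h, mask_w = len(mask), len(mask[0])
    full = [[0] * image_shape[1] for _ in range(image_shape[0])]
    for row in range(box_h):
        # Nearest-neighbor lookup of the source row in the small mask.
        src_row = min(mask_h - 1, row * mask_h // box_h)
        for col in range(box_w):
            src_col = min(mask_w - 1, col * mask_w // box_w)
            if mask[src_row][src_col] >= threshold:
                full[y1 + row][x1 + col] = 1
    return full
```

Pixels outside the detection box stay zero, so the masks of several instances can be kept as separate full-size layers.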
Often it's useful to inspect the activations at different layers to look for signs of trouble (all zeros or random noise).
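A quick numeric check can complement the visual inspection. The sketch below summarizes a flattened list of activation values and flags a layer whose outputs are almost entirely zero. The function names and the 0.99 limit are illustrative assumptions, and the check only covers the all-zeros case, not random noise.

```python
import statistics


def activation_summary(values):
    """Summarize a flat list of activation values from one layer."""
    zeros = sum(1 for v in values if v == 0)
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "std": statistics.pstdev(values),
        "zero_fraction": zeros / len(values),
    }


def looks_dead(values, zero_fraction_limit=0.99):
    """Flag a layer whose activations are (almost) all zero.

    The limit is an assumed default, not a value from this repository.
    """
    return activation_summary(values)["zero_fraction"] >= zero_fraction_limit
```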
Another useful debugging tool is to inspect the weight histograms. These are included in the inspect_weights.ipynb notebook.
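For reference, a weight histogram is just a count of weight values per bin. The following is a minimal sketch with equal-width bins; it is not the code used in the notebook, and the function name and default bin count are assumptions.

```python
def histogram(values, bins=10):
    """Count values in equal-width bins spanning [min(values), max(values)].

    Returns (counts, (low, high)). The top edge is included in the last bin.
    """
    low, high = min(values), max(values)
    if high == low:
        # All weights identical: a single bin holds everything.
        return [len(values)], (low, high)
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        index = min(bins - 1, int((v - low) / width))
        counts[index] += 1
    return counts, (low, high)
```

A histogram with all counts in one bin, or with a very wide range, is often a hint that a layer was not initialized or trained as expected.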
TensorBoard is another great debugging and visualization tool. The model is configured to log losses and save weights at the end of every epoch.
Python 3.7, TensorFlow 1.1.4, Keras 2.0.8 and other common packages listed in requirements.txt.
To train or test on MS COCO, you'll also need:
- pycocotools (installation instructions below)
- MS COCO Dataset
- Download the 5K minival and the 35K validation-minus-minival subsets. More details in the original Faster R-CNN implementation.