This repository covers detection and classification tasks in computer vision; semantic segmentation will be added later.
-
For classification, I reproduced LeNet5, VGG, AlexNet, ResNet (ResNeXt), GoogLeNet, MobileNet and ShuffleNet. EfficientNet and others will be reproduced next.
-
For object detection, I reproduced RetinaNet and SSD. The code is split into modules such as backbone, neck, head and loss, which makes it easier to modify and extend (see the sketch below). Other object detection algorithms (CenterNet, FCOS, the YOLO series, Faster RCNN) will be added later.
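Here is a minimal sketch of how such a modular detector fits together; the class below and the names it assumes are illustrative stand-ins, not the repository's actual classes:

```python
import torch.nn as nn

# Illustrative only: the real modules live under models/detection/
# (backbones/, neck/, head/, loss/); the class below is a placeholder.
class Detector(nn.Module):
    def __init__(self, backbone, neck, head):
        super().__init__()
        self.backbone = backbone  # e.g. a ResNet that returns multi-scale feature maps
        self.neck = neck          # e.g. an FPN that fuses those feature maps
        self.head = head          # e.g. classification + box regression branches

    def forward(self, x):
        features = self.backbone(x)   # image -> feature maps
        features = self.neck(features)
        return self.head(features)    # per-anchor class scores and box offsets
```

Keeping backbone, neck, head and loss as separate modules means a new detector can often be built by swapping a single piece (for example, the backbone) without touching the rest.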
-
For semantic segmentation, I'm going to reproduce FCN, Mask RCNN, DeepLab, UNet later.
-
Detailed explanations have been published on CSDN and Zhihu (the Chinese Quora).
In this project, you need to create the checkpoints (model saves), log, results and tensorboard (loss visualization) directories.
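A minimal sketch of creating those directories (the root path is the one used below; adjust it to your own setup):

```python
import os

# Create the working directories used by this project.
root = "/data/PycharmProject/Simple-CV-Pytorch-master"
for d in ("checkpoints", "log", "results", "tensorboard"):
    os.makedirs(os.path.join(root, d), exist_ok=True)
```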
To be added:
1. object detection (CenterNet, FCOS, YOLO series, Faster RCNN)
2. semantic segmentation (FCN, Mask RCNN, DeepLab, UNet)
Install the requirements with pip (you can put requirements.txt into venv/Scripts/ if you need it):
pip install -r requirements.txt
python == 3.8.5
torch == 1.11.0+cu113
torchaudio == 0.11.0+cu113
torchvision == 0.12.0+cu113
pycocotools == 2.0.4
numpy
Cython
matplotlib
opencv-python (you may prefer skimage or PIL, etc.)
scikit-image
tensorboard
tqdm
Please see FolderOrganization.txt for more details.
OS: Ubuntu 20.04
project path: /data/PycharmProject
Simple-CV-master path: /data/PycharmProject/Simple-CV-Pytorch-master
|
|----checkpoints(resnet50-19c8e357.pth or retinanet_resnet50_coco.pth)
|
|----configs----|----detection----|----retinanet_coco.yaml
| |----retinanet_voc.yaml
| |----ssd300_coco.yaml
| |----ssd300_voc.yaml
|
|----data----|----classification----|----CIAR_labels.txt(cifar.py is null, this is because I just use torchvision.datasets.CIFAR10)
| | |----ImageNet_labels.txt(imagenet.py is null, this is because I just use torchvision.datasets.ImageFolder)
| |----detection----|----RetinaNet----|----coco.py
| | |----voc.py
| |----SSD----|----coco.py(/data/coco/coco2017/coco_labels.txt)
| |----voc0712.py
|
| |----automobile.png
| |----classification----|----crash_helmet.png
| | |----photocopier.png
| | |----sunflower.png
| |----detection----|----000001.jpg
| | |----000001.xml
| | |----000002.jpg
| | |----000002.xml
| | |----000003.jpg
| | |----000003.xml
|----images----|----icon----|----alexnet.png
| |----darknet19.png
| |----darknet53.png
| |----darknettiny.png
| |----googlenet.png
| |----lenet5.png
| |----mobilenet_v2.png
| |----mobilenet_v3_large.png
| |----mobilenet_v3_small.png
| |----resnet.png
| |----resnext.png
| |----retinanet.png
| |----shufflenet_v2.png
| |----ssd.png
| |----vgg.png
|
|----log(XXX[ detection or classification ]_XXX[ train or test or eval ].info.log)
|
| |----classification----|----utils----|----accuracy.py
| | | |----AverageMeter.py
| | |----lenet5.py
| | |----alexnet.py
| | |----vgg.py
| | |----resnet.py(include: resenext)
| | |----googlenet.py
| | |----mobilenet_v2.py
| | |----mobilenet_v3.py
| | |----shufflenet.py
| |----detection----|----RetinaNet----|----anchor----|----__init__.py
| | | | |----Anchor.py
| | | |----backbones----|----__init__.py
| | | | |----DarkNet.py
| | | | |----ResNet.py
| | | |----head----|----__init__.py
| | | | |----Head.py
| | | |
| | | |----loss----|----__init__.py
| | | | |----Loss.py
| | | |
| | | |----neck----|----__init__.py
|----models----| | | |----FPN.py
| | | | |----FPN.txt
| | | |----utils----|----augmentations.py
| | | | |----BBoxTransform.py
| | | | |----ClipBoxes.py
| | | | |----collate.py
| | | | |----iou.py
| | | |----RetinaNet.py
| | |
| | |----SSD----|----anchor----|----prior_box.py
| | |----backbone----|----vgg.py
| | |----box_head----|----box_predictor.py
| | | |----inference.py
| | | |----loss.py
| | |----utils----|----augmentations.py
| | | |----box_utils.py
| | | |----collate.py
| | | |----l2norm.py
| | |----ssd.py
|----options----|----detection----|----RetinaNet----|----eval_options.py
| | |----test_options.py
| | |----train_options.py
| |----SSD----|----eval_options.py
| |----test_options.py
| |----train_options.py
|----results----|----SSD----|----COCO----|----coco_bbox_results.json
| | |----VOC----|----annot_cache----|----XXX_pr.pkl
| | | |----det----|----det_test_xxx.txt(eg: car AP)
| | | |----annots.pkl
| | | |----detections.pkl
| | | |----visualize.txt
| | |----XX(name: 000478)_XX(coco or voc).jpg
| |----RetinaNet----|----COCO----|----coco_bbox_results.json
| | |----VOC----|----annot_cache----|----XXX_pr.pkl
| | |----det----|----det_test_xxx.txt(eg: car)
| | |----annots.pkl
| | |----detections.pkl
| |----XX(name:000478)_XX(coco or voc).jpg
|----tensorboard(Loss Visualization)
|----tools----|----classification----|----eval.py
| | |----train.py
| | |----test.py
| |----detection----|----RetinaNet----|----eval_coco.py
| | |----eval_voc.py
| | |----train.py
| | |----visualize.py
| |----SSD----|----eval_coco.py
| |----eval_voc.py
| |----train.py
| |----visualize.py
| |----get_logger.py
|----utils----|----optimizer.py
| |----path.py
| |----scheduler.py
|----FolderOrganization.txt
|----main.py
|----README.md
|----requirements.txt
Since there was not much time for tuning, the accuracy of every model here is lower than the accuracy reported in its paper. If you want to use these models, you will need to readjust the hyperparameters and re-tune for accuracy.
-
Network architectures to reproduce:
1).EfficientNet
Finished:
1).LeNet5(models/classification/lenet5.py)[1]
I added nn.BatchNorm2d() because the accuracy was poor without it.
basenet: lenet5 (image size: 32 * 32 * 3)
dataset: cifar
batch_size: 32
optim: SGD
lr: 0.01
momentum: 0.9
weight_decay: 1e-4
scheduler: MultiStepLR
milestones: [15, 20, 30]
gamma: 0.1
epoch: 30
epochs | times | avg top1 acc (%) | avg top5 acc (%) |
---|---|---|---|
30 | 0h11m44s | 62.21 | 95.97 |
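A minimal sketch of the optimizer/scheduler setup listed above (SGD + MultiStepLR); `model` here is a stand-in for the LeNet5 in models/classification/lenet5.py, and the CIFAR-10 training loop itself is only indicated by a comment:

```python
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import MultiStepLR

model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))  # stand-in for LeNet5
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01,
                      momentum=0.9, weight_decay=1e-4)
scheduler = MultiStepLR(optimizer, milestones=[15, 20, 30], gamma=0.1)

for epoch in range(30):
    # ... forward/backward over the CIFAR-10 train loader, optimizer.step() ...
    scheduler.step()  # decay the learning rate by gamma at the milestone epochs
```

The same SGD + MultiStepLR recipe is used for the AlexNet and VGG runs below; only the network changes.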
2).AlexNet(models/classification/alexnet.py)[2]
I added nn.BatchNorm2d() because the accuracy was poor without it.
basenet: AlexNet (image size: 224 * 224 * 3)
dataset: cifar
batch_size: 32
optim: SGD
lr: 0.01
momentum: 0.9
weight_decay: 1e-4
scheduler: MultiStepLR
milestones: [15, 20, 30]
gamma: 0.1
epoch: 30
epochs | times | avg top1 acc (%) | avg top5 acc (%) |
---|---|---|---|
30 | 0h22m44s | 86.27 | 99.00 |
3).VGG(models/classification/vgg.py)[3]
I added nn.BatchNorm2d() and used transfer learning because the accuracy was poor without them.
basenet: vgg16 (image size: 224 * 224 * 3)
dataset: cifar
batch_size: 32
optim: SGD
lr: 0.01
momentum: 0.9
weight_decay: 1e-4
scheduler: MultiStepLR
milestones: [15, 20, 30]
gamma: 0.1
epoch: 30
epochs | times | avg top1 acc (%) | avg top5 acc (%) |
---|---|---|---|
30 | 1h23m43s | 76.56 | 96.44 |
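The transfer learning mentioned above amounts to starting from ImageNet-pretrained weights and replacing the final classifier layer for CIFAR's 10 classes. A minimal sketch using torchvision (the repository's models/classification/vgg.py may implement this differently):

```python
import torch.nn as nn
from torchvision import models

# Load ImageNet-pretrained VGG16 with batch norm, then replace the last
# classifier layer with a fresh 10-class output for CIFAR.
model = models.vgg16_bn(pretrained=True)
model.classifier[6] = nn.Linear(4096, 10)
```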
4).ResNet(models/classification/resnet.py)[4]
basenet: resnet18
dataset: ImageNet
batch_size: 32
optim: SGD
lr: 0.001
momentum: 0.9
weight_decay: 1e-4
scheduler: MultiStepLR
milestones: [15, 20, 30]
gamma: 0.1
epoch: 30
No.epoch | times/epoch | top1 acc (%) | top5 acc (%) |
---|---|---|---|
5 | 3h49min35s | 50.21 | 75.59 |
5).ResNeXt(models/classification/resnet.py, includes: resnext50_32x4d, resnext101_32x8d)[5]
basenet: resnext50_32x4d
dataset: ImageNet
batch_size: 32
optim: SGD
lr: 0.001
momentum: 0.9
weight_decay: 1e-4
scheduler: ReduceLROnPlateau
patience: 2
epoch: 30
pretrained: True
No.epoch | times/epoch | top1 acc (%) | top5 acc (%) |
---|---|---|---|
7 | 4h5min16s | 72.28 | 91.56 |
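ResNeXt and the remaining ImageNet runs below switch from MultiStepLR to ReduceLROnPlateau. A minimal sketch of that setup; the monitored metric (validation top-1 accuracy here) is an assumption, and the training/evaluation loop is only indicated by comments:

```python
import torch.optim as optim
from torch.optim.lr_scheduler import ReduceLROnPlateau
from torchvision import models

model = models.resnext50_32x4d(pretrained=True)
optimizer = optim.SGD(model.parameters(), lr=0.001,
                      momentum=0.9, weight_decay=1e-4)
scheduler = ReduceLROnPlateau(optimizer, mode='max', patience=2)

for epoch in range(30):
    # ... train one epoch, then evaluate on the validation set ...
    val_top1 = 0.0  # placeholder for the measured validation top-1 accuracy
    scheduler.step(val_top1)  # reduce LR after `patience` epochs without improvement
```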
6).GoogLeNet(models/classification/googlenet.py)[6]
basenet: GoogLeNet
dataset: ImageNet
batch_size: 32
optim: SGD
lr: 0.01
momentum: 0.9
weight_decay: 1e-4
scheduler: ReduceLROnPlateau
patience: 2
epoch: 30
pretrained: True
No.epoch | times/epoch | top1 acc (%) | top5 acc (%) |
---|---|---|---|
5 | 3h54min31s | 42.70 | 69.34 |
7).MobileNet(models/classification/mobilenet_v2.py or mobilenet_v3.py)
a).MobileNet_v2[7]
basenet: MobileNet_v2
dataset: ImageNet
batch_size: 32
optim: SGD
lr: 0.001
momentum: 0.9
weight_decay: 1e-4
scheduler: ReduceLROnPlateau
patience: 2
epoch: 30
pretrained: True
No.epoch | times/epoch | top1 acc (%) | top5 acc (%) |
---|---|---|---|
5 | 3h58min3s | 66.90 | 88.19 |
b).MobileNet_v3[8]
basenet: MobileNet_v3_large
dataset: ImageNet
batch_size: 32
optim: SGD
lr: 0.001
momentum: 0.9
weight_decay: 1e-4
scheduler: ReduceLROnPlateau
patience: 2
epoch: 30
pretrained: True
No.epoch | times/epoch | top1 acc (%) | top5 acc (%) |
---|---|---|---|
5 | 3h58min13s | 71.15 | 90.32 |
basenet: MobileNet_v3_small
dataset: ImageNet
batch_size: 32
optim: SGD
lr: 0.001
momentum: 0.9
weight_decay: 1e-4
scheduler: ReduceLROnPlateau
patience: 2
epoch: 30
pretrained: True
No.epoch | times/epoch | top1 acc (%) | top5 acc (%) |
---|---|---|---|
5 | 3h54min38s | 68.89 | 88.92 |
8).ShuffleNet v2(models/classification/shufflenet.py)[9]
basenet: ShuffleNet_v2_x0_5
dataset: ImageNet
batch_size: 32
optim: SGD
lr: 0.001
momentum: 0.9
weight_decay: 1e-4
scheduler: ReduceLROnPlateau
patience: 2
epoch: 30
pretrained: True
No.epoch | times/epoch | top1 acc (%) | top5 acc (%) |
---|---|---|---|
5 | 3h52min0s | 55.61 | 78.84 |
#!/bin/bash
conda activate base
# Replace XXX.py with train.py, eval.py or test.py
python /data/PycharmProject/Simple-CV-Pytorch-master/tools/classification/XXX.py
Although all detection models use the COCO and VOC datasets, each model processes them differently, so each model has its own data loader, train, test and eval scripts.
-
Network architectures to reproduce:
- CenterNet
- FCOS
- YOLO series
- Faster RCNN
Finished:
1.SSD(models/detection/SSD/ssd.py)[10]
Network: ssd
backbone: vgg+add_extras
loss: cls(cross_entropy_loss)+reg(smooth_l1_loss)
dataset: voc
batch_size: 16
optim: SGD
lr: 0.001
scheduler: adjust_learning_rate
epoch: 115
epochs | batch norm | Mean AP (%) | Download Baidu yun | Key |
---|---|---|---|---|
115 | False | 75.4 | Link | xwaw |
115 | True | 76.2 | Link | 2xzk |
Network: ssd
backbone: vgg+add_extras
loss: cls(cross_entropy_loss)+reg(smooth_l1_loss)
dataset: coco
batch_size: 16
optim: SGD
lr: 0.001
scheduler: adjust_learning_rate
epoch: 55
epochs | batch norm | IoU=0.5 AP(%) | Download Baidu yun | Key |
---|---|---|---|---|
55 | False | 38.0 | Link | j6wn |
55 | True | 37.7 | Link | 7i64 |
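The SSD loss listed above combines cross-entropy classification over matched anchors with smooth-L1 box regression. A minimal sketch of that combination; the standard SSD loss also applies hard negative mining, which is omitted here, and the tensor layout is an assumption rather than the repository's exact interface:

```python
import torch
import torch.nn.functional as F

def ssd_loss(cls_logits, box_regression, labels, reg_targets):
    """Sketch of the cls (cross entropy) + reg (smooth L1) loss.

    cls_logits:     (N, num_anchors, num_classes) class scores
    box_regression: (N, num_anchors, 4) predicted box offsets
    labels:         (N, num_anchors) target class per anchor (long), 0 = background
    reg_targets:    (N, num_anchors, 4) encoded regression targets
    """
    pos = labels > 0                                   # anchors matched to a ground-truth box
    num_pos = pos.sum().clamp(min=1).float()
    cls_loss = F.cross_entropy(cls_logits.flatten(0, 1), labels.flatten(),
                               reduction='sum')
    reg_loss = F.smooth_l1_loss(box_regression[pos], reg_targets[pos],
                                reduction='sum')
    return (cls_loss + reg_loss) / num_pos             # normalize by the number of positives
```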
2.RetinaNet(models/detection/RetinaNet.py)[11]
Network: RetinaNet
backbone: ResNet50
neck: FPN
loss: Focal Loss
dataset: voc
batch_size: 4
optim: Adam
lr: 0.0001
scheduler: WarmupCosineSchedule
epoch: 80
epochs | AP(%) | Download Baidu yun | Key |
---|---|---|---|
80 | 70.1 | Link | dww8 |
Network: RetinaNet
backbone: ResNet50
neck: FPN
loss: Focal Loss
dataset: coco
batch_size: 4
optim: Adam
lr: 0.0001
scheduler: ReduceLROnPlateau
patience: 3
epoch: 30
pretrained: True
epochs | AP(%) | Download Baidu yun | Key |
---|---|---|---|
30 | 29.3 | Link | 5vak |
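The focal loss used by RetinaNet down-weights easy background anchors so they do not dominate training. A minimal sketch of the binary focal loss with the common defaults alpha = 0.25, gamma = 2; the repository's models/detection/RetinaNet/loss/Loss.py may differ in details:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss: targets are float 1.0 for the positive class, else 0.0."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction='none')
    p_t = p * targets + (1 - p) * (1 - targets)        # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()  # easy examples get (1 - p_t)^gamma near 0
```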
#!/bin/bash
conda activate base
# Replace XXX with SSD or RetinaNet, and XXX.py with train.py, eval_coco.py, eval_voc.py or visualize.py
python /data/PycharmProject/Simple-CV-Pytorch-master/tools/detection/XXX/XXX.py
Network architectures to reproduce:
- FCN
- DeepLab
- U-Net