GitHub - thisisi3/libtorch-faster-rcnn: Implement the classic Faster RCNN using libtorch library in C++.

Introduction

When it comes to building deep learning models in C++, libtorch is a good choice. Libtorch and PyTorch are essentially the same library: libtorch provides the C++ interface, while PyTorch provides the Python interface. Libtorch's interface is designed to mirror the Python one closely, so PyTorch code can often be translated to libtorch almost line by line. What's more, a model can be trained with PyTorch and then used in a C++ environment.

In this project, I use libtorch to implement the classic object detection model Faster RCNN. I hope it gives a sense of how to convert a PyTorch model to C++, covering both training and inference. The overall structure and configuration closely follow mmdetection (v2.3.0)'s implementation of Faster RCNN. MMDetection is a well-known object detection framework that implements many popular object detection models.

Compile and Use

requirements

opencv v4.1.4
torchvision 0.7.0
libtorch v1.6.0 release (most likely compatible with other libtorch versions but not tested)

torchvision

PyTorch itself does not come with CV operators such as nms and roi_align, so we need torchvision's C++ implementation. I took only the code for nms, roi_align and roi_pool and put it in cvops/ so that it compiles together with the project.
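For readers unfamiliar with nms, here is a minimal pure-Python sketch of what the operator computes. It is not the code in cvops/ (that is torchvision's C++ implementation); the function names `iou` and `nms` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration only.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold):
    """Greedy non-maximum suppression; returns kept indices, best score first."""
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, 0.5))  # prints [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed. This version is quadratic in the number of boxes and only meant to show the logic.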

compile with cmake(v3.19.2)

mkdir build
cd build
cmake .. -DCMAKE_PREFIX_PATH="path_to_libtorch;path_to_opencv"
cmake --build . --config Release --parallel 8

train

./build/train configs/faster_rcnn_r50_fpn_1x_voc.json --work-dir work_dir --gpu 0

inference

./build/test faster_rcnn_r50_fpn_1x_voc.json epoch_12.pt --out epoch_12.bbox.json --gpu 0

ImageNet pretrained backbones

PyTorch users usually get ImageNet pretrained weights from torchvision, but those weights cannot be loaded directly in libtorch. One way around this is to export the model with TorchScript and load it in C++; check out the official tutorial for more details. Here is what I did:

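import torch
import torchvision

# Dummy input used only to drive the trace.
img_tsr = torch.rand(1, 3, 1000, 600)
model = torchvision.models.resnet50(pretrained=True)
# Switch to eval mode before tracing so the BN running statistics are not updated.
model.eval()
traced = torch.jit.trace(model, img_tsr)
traced.save('resnet50.pt')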

IMPORTANT: If you use the trace method, remember to call eval() first. Tracing runs a forward pass, and in training mode that pass updates the running means and variances tracked by the BN layers.
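To see why this matters, here is a simplified scalar sketch of the running-statistics update that a batch-norm layer performs on each forward pass in training mode. It is plain Python, not PyTorch or libtorch code; the function name `bn_forward_stats` is hypothetical, and the momentum of 0.1 is PyTorch's default for BN layers.

```python
def bn_forward_stats(running_mean, running_var, batch, momentum=0.1, training=True):
    """Return the running (mean, var) a BN layer would hold after one forward pass."""
    if not training:
        # eval mode: the stored statistics are used as-is and left untouched.
        return running_mean, running_var
    n = len(batch)
    batch_mean = sum(batch) / n
    # The running variance is updated with the unbiased batch variance.
    batch_var = sum((x - batch_mean) ** 2 for x in batch) / (n - 1)
    new_mean = (1 - momentum) * running_mean + momentum * batch_mean
    new_var = (1 - momentum) * running_var + momentum * batch_var
    return new_mean, new_var


pretrained = (0.0, 1.0)      # stand-in for statistics learned on ImageNet
random_input = [2.0, 4.0]    # stand-in for the random tensor fed to the tracer

print(bn_forward_stats(*pretrained, random_input, training=False))
print(bn_forward_stats(*pretrained, random_input, training=True))
```

In eval mode the pretrained statistics come back unchanged, while in training mode they drift towards the statistics of the random input, which is exactly what should not be saved into the traced file.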

Benchmark

Train: voc2007-trainval

Test: voc2007-test

implementation   backbone   mAP     AP50
mmdet            Resnet50   0.437   0.769
this             Resnet50   0.438   0.768

The VOC dataset is used as the main dataset because of limited GPU resources, but all metrics are COCO-style. Note that mAP varies by around 0.02 between training runs. VOC's XML annotations are first converted to COCO format, and only non-difficult bboxes are used for training and testing.
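The conversion step can be sketched roughly as follows. This is an illustrative sketch using only the Python standard library, not the script used for this project: the function name `voc_xml_to_coco_anns`, the category mapping, and the choice to keep VOC pixel coordinates unshifted are all assumptions.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """
<annotation>
  <filename>000001.jpg</filename>
  <object>
    <name>dog</name><difficult>0</difficult>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>cat</name><difficult>1</difficult>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""


def voc_xml_to_coco_anns(xml_text, image_id, cat_ids, first_ann_id=1):
    """Turn one VOC annotation file into a list of COCO-style annotation dicts."""
    root = ET.fromstring(xml_text.strip())
    anns = []
    for obj in root.iter("object"):
        # Skip objects flagged as difficult, as described above.
        if int(obj.findtext("difficult", default="0")) == 1:
            continue
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax")
        )
        w, h = xmax - xmin, ymax - ymin
        anns.append({
            "id": first_ann_id + len(anns),
            "image_id": image_id,
            "category_id": cat_ids[obj.findtext("name")],
            "bbox": [xmin, ymin, w, h],  # COCO stores [x, y, width, height]
            "area": w * h,
            "iscrowd": 0,
        })
    return anns


anns = voc_xml_to_coco_anns(SAMPLE_XML, image_id=1, cat_ids={"dog": 1, "cat": 2})
print(len(anns), anns[0]["bbox"])  # prints 1 [48.0, 240.0, 147.0, 131.0]
```

A full converter would also write the "images" and "categories" sections of the COCO json; only the per-object part is shown here.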
