This project aims to convert mmdetection models to TensorRT models end to end.
The focus is on object detection for now.
Mask support is experimental.
Supported features:

- fp16
- int8 (experimental; see the sketch after this list)
- batched input
- dynamic input shape
- combination of different modules
- DeepStream support
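For example, both precision modes are toggled when converting (a minimal sketch; `fp16_mode` is used in the usage section below, while `int8_mode` and `int8_calib_dataset` are assumed keyword names for the experimental int8 path; check `mmdet2trt -h` and the API docstring for the exact interface):

```python
from mmdet2trt import mmdet2trt

# a sketch, not the definitive interface: int8_mode/int8_calib_dataset are
# assumed names for the experimental int8 path
trt_model = mmdet2trt(
    cfg_path, weight_path,
    fp16_mode=True,                    # fp16
    int8_mode=True,                    # int8 (experimental)
    int8_calib_dataset=calib_dataset,  # calibration data for int8
)
```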
Any advice, bug reports, and stars are welcome.
This project is released under the Apache 2.0 license.
Requirements:

- mmdet>=2.3.0
- https://github.com/grimoire/torch2trt_dynamic
- https://github.com/grimoire/amirstan_plugin

Warning: please install an NVIDIA driver <= 450.36 and CUDA <= 11.
Set the environment variable (in ~/.bashrc):

```shell
export AMIRSTAN_LIBRARY_PATH=${amirstan_plugin_root}/build/lib
```
```shell
git clone https://github.com/grimoire/mmdetection-to-tensorrt.git
cd mmdetection-to-tensorrt
python setup.py develop
```
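A quick sanity check of the installation (a minimal sketch; it only verifies that the package imports and that the plugin path is set):

```python
import os

import mmdet2trt

# the engine plugins are loaded from this path at conversion/inference time
assert os.environ.get('AMIRSTAN_LIBRARY_PATH'), 'AMIRSTAN_LIBRARY_PATH is not set'
print('mmdet2trt imported from:', mmdet2trt.__file__)
```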
Build the docker image:

```shell
# cuda10.2 tensorrt7.0 pytorch1.6
sudo docker build -t mmdet2trt_docker:v1.0 docker/
```
Run the image (this will show the help for the CLI entrypoint):

```shell
sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} mmdet2trt_docker:v1.0
```
Or, if you want to open a terminal inside the container:

```shell
sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} --entrypoint bash mmdet2trt_docker:v1.0
```
Example conversion:

```shell
sudo docker run --gpus all -it --rm -v ${your_data_path}:${bind_path} mmdet2trt_docker:v1.0 ${bind_path}/config.py ${bind_path}/checkpoint.pth ${bind_path}/output.trt
```
How to create a TensorRT model from an mmdet model (converting might take a few minutes and may print some warnings):
Details can be found in getting_started.md.

```shell
mmdet2trt ${CONFIG_PATH} ${CHECKPOINT_PATH} ${OUTPUT_PATH}
```

Run `mmdet2trt -h` for help on optional arguments.
Or with the Python API:

```python
from mmdet2trt import mmdet2trt
import torch

opt_shape_param = [
    [
        [1, 3, 320, 320],    # min shape
        [1, 3, 800, 1344],   # optimize shape
        [1, 3, 1344, 1344],  # max shape
    ]
]
max_workspace_size = 1 << 30  # some modules and tactics need a large workspace
trt_model = mmdet2trt(cfg_path,
                      weight_path,
                      opt_shape_param=opt_shape_param,
                      fp16_mode=True,
                      max_workspace_size=max_workspace_size)
torch.save(trt_model.state_dict(), save_path)
```
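Batched and dynamic-shape conversion uses the same `opt_shape_param`; a sketch, assuming the first dimension of each shape is the batch size as above:

```python
# batch size may vary between 1 and 4, resolution between 320x320 and
# 1344x1344; TensorRT tunes its tactics for the middle ("optimize") shape
opt_shape_param = [
    [
        [1, 3, 320, 320],    # min shape, batch 1
        [2, 3, 800, 1344],   # optimize shape, batch 2
        [4, 3, 1344, 1344],  # max shape, batch 4
    ]
]
trt_model = mmdet2trt(cfg_path,
                      weight_path,
                      opt_shape_param=opt_shape_param,
                      fp16_mode=True,
                      max_workspace_size=1 << 30)
```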
How to use the converted model:

```python
from mmdet2trt.apis import inference_detector, init_detector

trt_model = init_detector(save_path)
num_detections, trt_bbox, trt_score, trt_cls = inference_detector(trt_model, image_path, cfg_path, "cuda:0")
```
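The outputs are batched tensors. A sketch of consuming them, assuming a `(batch, num_boxes, ...)` layout; verify against your own outputs:

```python
# iterate over the valid detections of the first image in the batch
batch_id = 0
for i in range(int(num_detections[batch_id])):
    x1, y1, x2, y2 = trt_bbox[batch_id, i].tolist()
    score = float(trt_score[batch_id, i])
    label = int(trt_cls[batch_id, i])
    print(f'class {label}: score {score:.3f}, box ({x1:.1f}, {y1:.1f}, {x2:.1f}, {y2:.1f})')
```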
How to save the TensorRT engine:

```python
with open(engine_path, mode='wb') as f:
    f.write(trt_model.state_dict()['engine'])
```
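The saved engine can be loaded back with the plain TensorRT Python API. A sketch, assuming the amirstan plugin was built as `libamirstan_plugin.so` under `AMIRSTAN_LIBRARY_PATH` (the engine's custom layers need it loaded first):

```python
import ctypes
import os

import tensorrt as trt

# load the custom plugin library before deserializing; the .so name is an
# assumption, check your amirstan_plugin build output
ctypes.CDLL(os.path.join(os.environ['AMIRSTAN_LIBRARY_PATH'],
                         'libamirstan_plugin.so'))

logger = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(logger, '')
with open(engine_path, 'rb') as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
```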
Note that the bbox results are not divided by the scale factor; divide them yourself if needed.
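For example (a sketch; `scale_factor` is assumed to be the `[w_scale, h_scale, w_scale, h_scale]` array that the mmdet data pipeline records for the image):

```python
# map boxes from the network input resolution back to the original image
trt_bbox = trt_bbox / trt_bbox.new_tensor(scale_factor)
```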
Try the demo in demo/inference.py.
See getting_started.md for more details.
Most other projects use the PyTorch => ONNX => TensorRT route; this repo converts PyTorch => TensorRT directly, avoiding the unnecessary ONNX IR. Read https://github.com/NVIDIA-AI-IOT/torch2trt#how-does-it-work for details.
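The core mechanism, adapted from the torch2trt README linked above (torch2trt_dynamic follows the same converter-registration pattern): every hooked PyTorch op is mirrored by a TensorRT layer while a sample input is traced through the model. A ReLU converter, for illustration:

```python
import tensorrt as trt
from torch2trt import tensorrt_converter

@tensorrt_converter('torch.nn.ReLU.forward')
def convert_relu(ctx):
    # ctx.method_args holds the arguments of the hooked call: (module, input);
    # ctx.method_return is the PyTorch output tensor
    input = ctx.method_args[1]
    output = ctx.method_return
    # add the matching layer to the TensorRT network under construction
    layer = ctx.network.add_activation(input=input._trt,
                                       type=trt.ActivationType.RELU)
    # attach the TensorRT tensor so downstream converters can consume it
    output._trt = layer.get_output(0)
```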
Supported models/modules:

- Faster R-CNN
- Cascade R-CNN
- Double-Head R-CNN
- Group Normalization
- Weight Standardization
- DCN
- SSD
- RetinaNet
- Libra R-CNN
- FCOS
- Fovea
- CARAFE
- FreeAnchor
- RepPoints
- NAS-FPN
- ATSS
- PAFPN
- FSAF
- GCNet
- Guided Anchoring
- Generalized Attention
- Dynamic R-CNN
- Hybrid Task Cascade (object detection only)
- DetectoRS
- Side-Aware Boundary Localization
- YOLOv3
- PAA
- CornerNet (WIP)
- Generalized Focal Loss
- Grid R-CNN
- VFNet
- GROIE
- Mask R-CNN (experimental)
- Cascade Mask R-CNN (experimental)
Tested on:
- torch=1.6.0
- tensorrt=7.1.3.4
- mmdetection=2.5.0
- cuda=10.2
- cudnn=8.0.2.39
If you find any errors, please report them in the issues.
Read this page if you run into any problems.
This repo is maintained by @grimoire
Discussion group: QQ 1107959378