Simplified Chinese | English
nndeploy is a cross-platform, high-performance, and easy-to-use AI model deployment framework. We strive to deliver a consistent and user-friendly experience across various inference framework backends in complex deployment environments, with a strong focus on performance.
As long as the environment is supported, code that deploys a model through nndeploy can be reused across platforms without modification, regardless of operating system or inference framework (see the sketch after the table below).
The currently supported environments are listed below, and the list will continue to be updated:
Inference/OS | Linux | Windows | Android | macOS | iOS | Developer | Remarks |
---|---|---|---|---|---|---|---|
TensorRT | √ | - | - | - | - | Always | |
OpenVINO | √ | √ | - | - | - | Always | |
ONNXRuntime | √ | √ | - | - | - | Always | |
MNN | √ | √ | √ | - | - | Always | |
TNN | √ | √ | √ | - | - | 02200059Z | |
ncnn | - | - | √ | - | - | Always | |
coreML | - | - | - | √ | - | JoDio-zd | |
paddle-lite | - | - | - | - | - | qixuxiang | |
AscendCL | √ | - | - | - | - | CYYAI | |
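To make the cross-platform claim concrete, here is a minimal C++ sketch of the idea. The names below (`InferenceType`, `Inference`, `createInference`, `StubInference`) are illustrative stand-ins rather than the exact nndeploy API: the deployment code stays the same across platforms, and only the backend selector changes.

```cpp
// A minimal sketch, not the real nndeploy API: InferenceType, Inference and
// createInference are illustrative stand-ins for a unified inference
// abstraction. The point: deployment code stays identical across platforms,
// only the backend selector changes.
#include <cstdio>
#include <memory>
#include <string>

enum class InferenceType { kTensorRT, kOpenVINO, kOnnxRuntime, kMnn };

class Inference {
 public:
  virtual ~Inference() = default;
  virtual bool init(const std::string& model_path) = 0;
  virtual bool run() = 0;
};

// Stub backend standing in for a real TensorRT/OpenVINO/MNN/... wrapper.
class StubInference : public Inference {
 public:
  bool init(const std::string& model_path) override {
    std::printf("loading %s\n", model_path.c_str());
    return true;
  }
  bool run() override { return true; }
};

// Factory hides backend-specific construction behind a single call.
std::unique_ptr<Inference> createInference(InferenceType /*type*/) {
  return std::make_unique<StubInference>();
}

int main() {
  // Moving from TensorRT on a Linux server to MNN on Android would only
  // change this enum value; the surrounding code stays the same.
  auto inference = createInference(InferenceType::kTensorRT);
  inference->init("yolov5s.onnx");
  inference->run();
  return 0;
}
```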
Notice: TFLite, TVM, OpenPPL, Tengine, AITemplate, RKNN, sophgo, MindSpore-lite, and Horizon are also on the agenda as we work to cover the mainstream inference frameworks.
Differences in model structure, inference framework, and hardware resources lead to different inference performance. nndeploy deeply understands and preserves, as far as possible, the features of each back-end inference framework, providing a consistent coding experience without compromising the computational efficiency of the native inference framework. In addition, carefully designed zero-copy memory handling connects pre/post-processing with the model inference stage efficiently, which effectively safeguards the end-to-end latency of model inference; a conceptual sketch follows.
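This is a conceptual sketch only, not nndeploy code: `TensorView` is a hypothetical non-owning view, and the point is simply that the pre-processing output buffer is handed to inference as a pointer plus shape rather than being duplicated between pipeline stages.

```cpp
// Conceptual zero-copy hand-off between pipeline stages (illustrative,
// not nndeploy code): inference consumes the same memory that
// pre-processing wrote into, through a non-owning view.
#include <vector>

struct TensorView {
  float* data;             // borrowed pointer: no ownership, no copy
  std::vector<int> shape;  // e.g. {1, 3, 640, 640}
};

// Pre-processing writes directly into a buffer owned by the pipeline.
void preprocess(std::vector<float>& buffer) {
  for (auto& v : buffer) v *= 1.0f / 255.0f;  // e.g. normalization in place
}

// Inference consumes the same memory; a real backend would bind
// input.data as its input blob without copying.
void infer(const TensorView& input) {
  (void)input;
}

int main() {
  std::vector<float> buffer(1 * 3 * 640 * 640, 128.0f);
  preprocess(buffer);
  TensorView input{buffer.data(), {1, 3, 640, 640}};  // zero-copy hand-off
  infer(input);
  return 0;
}
```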
What's more, we are developing and refining the following:
- Thread Pool
- Memory Pool: more efficient memory allocation and release
- HPC Operators: optimize pre/post-processing efficiency
Out-of-the-box AI models are our goal, but we are currently focusing on developing the framework itself. Nevertheless, YOLOv5, YOLOv6, and YOLOv8 are already supported, and we believe the list will expand soon.
Model | Inference | Developer | Remarks |
---|---|---|---|
YOLOv5 | TensorRT/OpenVINO/ONNXRuntime/MNN | 02200059Z, Always | |
YOLOv6 | TensorRT/OpenVINO/ONNXRuntime | 02200059Z, Always | |
YOLOv8 | TensorRT/OpenVINO/ONNXRuntime/MNN | 02200059Z, Always | |
nndeploy's primary goals are user friendliness and high performance. We have built-in support for the major inference frameworks and provide a unified interface abstraction on top of them, so you can write platform- and framework-independent inference code without worrying about performance loss. We also provide templates for the pre/post-processing of AI algorithms, which simplify the end-to-end deployment of a model; the built-in algorithms listed above are part of this ease of use as well. A hedged end-to-end sketch follows.
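The following shows how pre-processing templates, backend-agnostic inference, and detection post-processing compose end to end. `preprocess`, `infer`, `postprocess`, and `BBox` are illustrative placeholders under our own assumptions, not the actual nndeploy templates.

```cpp
// A hedged end-to-end sketch: the functions below are placeholders for
// nndeploy-style pre/post-processing templates around a backend-agnostic
// inference stage, not the real API.
#include <cstddef>
#include <vector>

struct BBox { float x, y, w, h, score; int label; };

// Template-style pre-processing: normalize an image into an input tensor.
std::vector<float> preprocess(const std::vector<unsigned char>& image) {
  std::vector<float> tensor(image.size());
  for (std::size_t i = 0; i < image.size(); ++i) tensor[i] = image[i] / 255.0f;
  return tensor;
}

// Backend-agnostic inference stand-in (would dispatch to TensorRT/MNN/...).
std::vector<float> infer(const std::vector<float>& input) {
  (void)input;
  return std::vector<float>(6, 0.5f);  // fake raw output for illustration
}

// Template-style post-processing: decode raw output into detections.
std::vector<BBox> postprocess(const std::vector<float>& raw) {
  return {BBox{raw[0], raw[1], raw[2], raw[3], raw[4], 0}};
}

int main() {
  std::vector<unsigned char> image(3 * 640 * 640, 128);
  auto detections = postprocess(infer(preprocess(image)));
  return detections.empty() ? 1 : 0;
}
```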
If you have any related questions, feel free to contact us. 😁
Next on our agenda:
- Parallel
  - Task parallel
  - Pipeline parallel
- More models
- More inference backends
- Operators (OP)

For more information, please visit the nndeploy documentation.
nndeploy is still in its infancy; you are welcome to join us.