Using Triton Inference Server as a shared library for execution on Jetson

Overview

This project demonstrates how to run applications that use the C API with Triton Inference Server as a shared library. It also shows how to build and execute such applications on Jetson.

Prerequisites

JetPack >= 4.6
OpenCV >= 4.1.1
TensorRT >= 8.0.1.6

Installation

Follow the installation instructions from the GitHub release page (https://github.com/triton-inference-server/server/releases/).

In our example, we placed the contents of the downloaded release directory under /opt/tritonserver.
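As a rough sketch of that step, the shell function below unpacks a release archive into a target directory. The helper name install_triton_release and the archive file name in the usage comment are hypothetical; substitute the actual file name from the release page. It assumes a gzip-compressed tar archive whose contents should land directly in the destination. If your archive wraps everything in a top-level directory, adding --strip-components=1 to the tar call may be needed.

```shell
#!/bin/sh
# Sketch: unpack a Triton release archive into a target directory.
# install_triton_release is a hypothetical helper, not part of Triton.
install_triton_release() {
    archive="$1"                      # path to the downloaded .tgz / .tar.gz
    dest="${2:-/opt/tritonserver}"    # install location used in this README

    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1
    # -x extract, -z gunzip, -f archive file, -C switch to the destination first
    tar -xzf "$archive" -C "$dest"
}

# Example (hypothetical file name; writing to /opt usually requires root):
#   install_triton_release ./tritonserver-release.tgz /opt/tritonserver
```

Keeping the destination as a parameter makes it easy to point later commands, such as the --triton-server-directory option of perf_analyzer, at the same path.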

Part 1. Concurrent inference and dynamic batching

The sample located under concurrency_and_dynamic_batching demonstrates two important features of Triton Inference Server: concurrent model execution and dynamic batching. To do so, we implemented a people detection application that uses the C API with Triton Inference Server as a shared library.
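Both features are controlled through the model configuration. The sketch below writes an illustrative config.pbtxt fragment to a file so the two settings can be seen side by side. The helper name write_batching_fragment is hypothetical, and the values (instance count, batch sizes, queue delay) are placeholders chosen for illustration; they are not copied from the sample's own configuration. The fragment is not a complete model configuration.

```shell
#!/bin/sh
# Sketch: write an illustrative model configuration fragment to a file.
# write_batching_fragment is a hypothetical helper; all values are examples.
write_batching_fragment() {
    out="$1"    # file to write the fragment to
    cat > "$out" <<'EOF'
# Illustrative values only; not copied from the sample's config.pbtxt.
max_batch_size: 8

# Concurrent model execution: run two instances of the model on the GPU.
instance_group [
  {
    count: 2
    kind: KIND_GPU
  }
]

# Dynamic batching: let the server combine queued requests into one batch.
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
EOF
}

# Example:
#   write_batching_fragment ./config_fragment.pbtxt
```

A larger instance count allows more requests to execute in parallel, while the queue delay bounds how long a request may wait for a batch to fill. Suitable values depend on the model and on the memory available on the Jetson device.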

Part 2. Analyzing model performance with perf_analyzer

To analyze model performance on Jetson, use the perf_analyzer tool. It is included in the release tar file and can also be compiled from source.

From this directory of the repository, execute the following to evaluate model performance:

./perf_analyzer -m peoplenet -b 2 --service-kind=triton_c_api --model-repo=$(pwd)/concurrency_and_dynamic_batching/trtis_model_repo_sample_1 --triton-server-directory=/opt/tritonserver --concurrency-range 1:6 -f perf_c_api.csv

In the example above, we saved the results to a .csv file (perf_c_api.csv). To visualize these results, follow the steps described here.
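For a quick look at the numbers before plotting, the sketch below prints throughput per concurrency level from the CSV. The helper name summarize_perf_csv is hypothetical. It assumes the header row contains columns named "Concurrency" and "Inferences/Second"; check the first line of your file and adjust the names if they differ. Columns are located by header name rather than by position, so extra columns do not matter.

```shell
#!/bin/sh
# Sketch: print "concurrency <TAB> throughput" rows from a perf_analyzer CSV.
# summarize_perf_csv is a hypothetical helper. The column names below are
# an assumption about the CSV header; adjust them to match your file.
summarize_perf_csv() {
    awk -F, '
        NR == 1 {
            # Find the two columns of interest by their header names.
            for (i = 1; i <= NF; i++) {
                if ($i == "Concurrency") c = i
                if ($i == "Inferences/Second") t = i
            }
            if (!c || !t) {
                print "expected columns not found" > "/dev/stderr"
                exit 1
            }
            next
        }
        { printf "%s\t%s\n", $c, $t }
    ' "$1" | sort -n    # order rows by concurrency level
}

# Example:
#   summarize_perf_csv perf_c_api.csv
```

With --concurrency-range 1:6, this should yield one row per concurrency level, which makes it easy to see where throughput stops improving.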
