Implementation:Triton inference server Server JetsonPeopleDetection
| Knowledge Sources | |
|---|---|
| Domains | Edge_Inference, Computer_Vision |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
An example application that demonstrates in-process use of the Triton C API on NVIDIA Jetson for real-time people detection, with dynamic batching and concurrent inference.
Description
people_detection.cc is a complete working example that demonstrates the Triton C API (in-process server mode) on Jetson hardware. It reads video frames using OpenCV, preprocesses them on the GPU using CUDA kernels, runs PeopleNet inference through an in-process Triton server with dynamic batching and concurrent execution, and renders bounding box results on the output video. The example showcases GPU-accelerated preprocessing, asynchronous inference, and integration with the Triton shared library.
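Asynchronous inference of this kind is commonly bridged back to the calling thread with a promise/future pair: the completion callback receives an opaque `void*` and fulfils a promise that the caller is waiting on. The sketch below shows only that pattern. `FakeResponse`, `FakeInferAsync`, `OnResponseComplete`, and `InferAndWait` are hypothetical stand-ins, not part of the Triton API, and the example's own bookkeeping may differ.

```cpp
#include <cstdint>
#include <future>
#include <thread>

// Stand-in for an inference response object (hypothetical, illustration only).
struct FakeResponse {
  int request_id;
};

// Callback type with the same shape as a response-complete callback:
// (response, flags, opaque user pointer).
using CompleteFn = void (*)(FakeResponse*, const uint32_t, void*);

// The opaque user pointer carries a promise owned by the waiting caller.
void OnResponseComplete(FakeResponse* response, const uint32_t /*flags*/, void* userp)
{
  auto* promise = static_cast<std::promise<FakeResponse*>*>(userp);
  promise->set_value(response);
}

// Hypothetical asynchronous "server": handles the request on another thread
// and invokes the callback when it is done.
std::thread FakeInferAsync(int request_id, CompleteFn callback, void* userp)
{
  return std::thread(
      [=]() { callback(new FakeResponse{request_id}, 0, userp); });
}

// Submit one request and block until its completion callback has fired.
int InferAndWait(int request_id)
{
  std::promise<FakeResponse*> promise;
  std::future<FakeResponse*> future = promise.get_future();
  std::thread worker = FakeInferAsync(request_id, OnResponseComplete, &promise);
  FakeResponse* response = future.get();  // blocks until set_value() is called
  worker.join();                          // promise must outlive the worker
  const int id = response->request_id;
  delete response;
  return id;
}
```

Calling `InferAndWait(7)` submits a request and returns the ID carried by the response once the callback has run. Running several such calls from separate threads is one way to keep multiple requests in flight at once.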
Usage
Use this as a reference application when building edge inference pipelines on Jetson platforms that call the Triton C API directly, without HTTP/gRPC. It also demonstrates dynamic batching as a throughput optimization on embedded devices.
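Dynamic batching groups individually submitted requests into larger batches on the server side. The following is a toy illustration of the grouping idea only, not Triton's scheduler: `FormBatches` and `max_batch_size` are hypothetical names, and a real scheduler also has to weigh factors such as how long requests may wait in the queue.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Greedily split pending request IDs into batches of at most max_batch_size,
// preserving arrival order. Illustration only; not Triton's algorithm.
std::vector<std::vector<int>> FormBatches(
    const std::vector<int>& pending, std::size_t max_batch_size)
{
  std::vector<std::vector<int>> batches;
  if (max_batch_size == 0) {
    return batches;  // nothing can be scheduled with a zero-sized batch
  }
  for (std::size_t i = 0; i < pending.size(); i += max_batch_size) {
    const std::size_t end = std::min(i + max_batch_size, pending.size());
    batches.emplace_back(pending.begin() + i, pending.begin() + end);
  }
  return batches;
}
```

With seven pending requests and a maximum batch size of four, this yields one batch of four and one batch of three, so the model executes twice instead of seven times.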
Code Reference
Source Location
- Repository: Triton Inference Server
- File: docs/examples/jetson/concurrency_and_dynamic_batching/people_detection.cc
- Lines: 1-1158
Signature
// Main detection pipeline
int main(int argc, char** argv);
// Custom response allocator for GPU memory
TRITONSERVER_Error* ResponseAlloc(...);
TRITONSERVER_Error* ResponseRelease(...);
// Inference completion callback
void InferResponseComplete(
    TRITONSERVER_InferenceResponse* response,
    const uint32_t flags, void* userp);
// CUDA preprocessing kernel (declaration)
void preprocess(
    const uint8_t* input, float* output,
    int batch_size, int height, int width,
    int channels, cudaStream_t stream);
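The `preprocess` declaration takes 8-bit input and produces float output for a batch of images. As a point of reference, a CPU version of such a conversion might look like the sketch below. It assumes interleaved HWC input, planar CHW output, and scaling by 1/255; these layout and scaling choices are assumptions made for illustration and are not taken from the example's kernel. `PreprocessCpu` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference: convert a batch of interleaved (HWC) 8-bit images into
// planar (CHW) floats in [0, 1]. Layout and scaling are assumptions.
std::vector<float> PreprocessCpu(
    const std::vector<uint8_t>& input, int batch_size, int height, int width,
    int channels)
{
  const std::size_t plane = static_cast<std::size_t>(height) * width;
  const std::size_t image = plane * channels;
  std::vector<float> output(image * batch_size);
  for (int n = 0; n < batch_size; ++n) {
    for (int y = 0; y < height; ++y) {
      for (int x = 0; x < width; ++x) {
        const std::size_t pixel = static_cast<std::size_t>(y) * width + x;
        for (int c = 0; c < channels; ++c) {
          const std::size_t src = n * image + pixel * channels + c;  // HWC
          const std::size_t dst = n * image + c * plane + pixel;     // CHW
          output[dst] = input[src] / 255.0f;
        }
      }
    }
  }
  return output;
}
```

A GPU kernel would typically assign one thread per output element and compute the same source and destination indices, which makes a CPU reference like this useful for checking kernel output on small inputs.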
Import
// Standalone example - link against libtritonserver.so
#include <tritonserver.h>
#include <opencv2/opencv.hpp>
#include <cuda_runtime_api.h>
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model_repository | directory | Yes | Path to Triton model repository with PeopleNet |
| input_video | file | Yes | Input video file for detection |
| concurrency | int | No | Number of concurrent inference threads |
Outputs
| Name | Type | Description |
|---|---|---|
| output_video | file | Video with bounding box overlays |
| stdout | text | FPS and detection statistics |
Usage Examples
Run People Detection on Jetson
# Build the example
mkdir build && cd build
cmake .. -DTRITON_ENABLE_GPU=ON
make people_detection
# Run detection
./people_detection \
    --model-repository=/models \
    --input-video=traffic.mp4 \
    --concurrency=4
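Flags of the `--name=value` form shown in the run command can be parsed with a few lines of standard C++. The sketch below is a hypothetical parser: `Options` and `ParseArgs` are illustrative names, the default concurrency of 1 is an assumption, and none of this is taken from the example's actual option handling.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Parsed command-line options (hypothetical structure).
struct Options {
  std::string model_repository;
  std::string input_video;
  int concurrency = 1;  // assumed default, not taken from the example
};

// Parse "--name=value" arguments. Returns true when both required options
// (model repository and input video) are present. Unknown flags are ignored.
bool ParseArgs(const std::vector<std::string>& args, Options* opts)
{
  for (const std::string& arg : args) {
    const std::size_t eq = arg.find('=');
    if (arg.rfind("--", 0) != 0 || eq == std::string::npos) {
      continue;  // not of the form --name=value
    }
    const std::string key = arg.substr(2, eq - 2);
    const std::string value = arg.substr(eq + 1);
    if (key == "model-repository") {
      opts->model_repository = value;
    } else if (key == "input-video") {
      opts->input_video = value;
    } else if (key == "concurrency") {
      opts->concurrency = std::atoi(value.c_str());
    }
  }
  return !opts->model_repository.empty() && !opts->input_video.empty();
}
```

From `main`, the arguments can be passed as `std::vector<std::string>(argv + 1, argv + argc)`, and a `false` return can be used to print a usage message and exit.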