Principle:Triton inference server Server Source Build

Field	Value
Page Type	Principle
Title	Source_Build
Namespace	Triton_inference_server_Server
Workflow	Custom_Container_Build
Domains	Container_Build, Build_Systems
Knowledge Sources	Triton Server, Triton Build Guide
Last Updated	2026-02-13 17:00 GMT

Overview

A source build compiles Triton Inference Server from its source code, giving full control over features, optimizations, and dependencies.

Description

A source build compiles Triton Inference Server from its C++ sources using CMake, giving maximum control over features, debug symbols, custom backends, and optimizations. This path is required for modifying server behavior, adding custom backends, or building for unsupported platforms. The build orchestrator, build.py, manages Docker multi-stage builds, CMake configuration, and dependency resolution.

The source build process involves multiple stages:

Configuration stage: CMake processes the top-level CMakeLists.txt and generates build files based on selected features. Each enabled backend, endpoint, and filesystem triggers inclusion of corresponding CMake targets.
Dependency resolution stage: CMake's FetchContent and ExternalProject modules download and build required third-party dependencies (Protobuf, gRPC, RapidJSON, re2, etc.).
Core compilation stage: The Triton core server library and executable are compiled from the C++ sources in src/.
Backend compilation stage: Each enabled backend is cloned from its respective repository, compiled against the Triton backend API, and installed as a shared library.
Packaging stage: The compiled binaries, libraries, headers, and configuration files are assembled into either a Docker image or a local install directory.
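The configuration stage can be pictured as a mapping from selected features to CMake cache definitions. The following is a minimal sketch under stated assumptions: the helper cmake_configure_args is hypothetical, and the TRITON_ENABLE_<NAME> option names are assumed from the naming convention in Triton's CMake files rather than copied from what build.py actually emits.

```python
from typing import Iterable, List


def cmake_configure_args(
    source_dir: str,
    build_type: str = "Release",
    features: Iterable[str] = (),
    endpoints: Iterable[str] = (),
    filesystems: Iterable[str] = (),
) -> List[str]:
    """Return a CMake configure command line for the selected features.

    Hypothetical helper. Option names are ASSUMED to follow the
    TRITON_ENABLE_<NAME> convention; check the real CMakeLists.txt.
    """
    args = ["cmake", f"-DCMAKE_BUILD_TYPE={build_type}"]
    # Every enabled feature, endpoint, or filesystem switches one option ON.
    for name in [*features, *endpoints, *filesystems]:
        args.append(f"-DTRITON_ENABLE_{name.upper()}=ON")
    args.append(source_dir)
    return args


if __name__ == "__main__":
    # Illustrative path and feature selection only.
    print(" ".join(cmake_configure_args(
        "/workspace/server",
        build_type="Debug",
        features=["logging", "metrics"],
        endpoints=["http", "grpc"],
        filesystems=["s3"],
    )))
```

Each selected item adds one definition, which is how enabling a backend, endpoint, or filesystem pulls the corresponding CMake targets into the build.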

Source builds are required when:

Custom C++ or Python backends need to be compiled into the server
Server source code modifications are needed (e.g., custom logging, metrics, or endpoint behavior)
Debug symbols are required for profiling or debugging
The target platform is not covered by NGC pre-built images (e.g., ARM, non-standard Linux distributions)
Specific compiler flags or optimizations are needed
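The criteria above amount to a simple checklist: if any one applies, a source build is needed. A small sketch of that decision follows; the BuildRequirements type and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BuildRequirements:
    """Hypothetical description of what a deployment needs."""
    custom_backends: bool = False
    modifies_server_source: bool = False
    needs_debug_symbols: bool = False
    platform_has_prebuilt_image: bool = True
    needs_custom_compiler_flags: bool = False


def source_build_reasons(req: BuildRequirements) -> List[str]:
    """List every criterion that forces a source build."""
    reasons = []
    if req.custom_backends:
        reasons.append("custom backends")
    if req.modifies_server_source:
        reasons.append("server source modifications")
    if req.needs_debug_symbols:
        reasons.append("debug symbols")
    if not req.platform_has_prebuilt_image:
        reasons.append("platform without pre-built image")
    if req.needs_custom_compiler_flags:
        reasons.append("custom compiler flags")
    return reasons


def needs_source_build(req: BuildRequirements) -> bool:
    """A source build is needed as soon as one criterion applies."""
    return bool(source_build_reasons(req))


if __name__ == "__main__":
    req = BuildRequirements(needs_debug_symbols=True,
                            platform_has_prebuilt_image=False)
    print(needs_source_build(req), source_build_reasons(req))
```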

Usage

The source build path is used when the compose approach, which assembles a container from pre-built binaries, is insufficient. It requires a build environment with Docker (for containerized builds) or a full C++ toolchain (for non-containerized builds).

Typical scenarios:

Custom backend development: Build with a locally modified backend alongside standard backends
Debug build: Compile with --build-type Debug to include debug symbols for GDB or profiling
Platform porting: Build for ARM64 or other architectures not available as NGC images
Feature experimentation: Modify server core behavior and rebuild to test changes
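The scenarios above differ mainly in the arguments passed to build.py. The sketch below assembles such a command line as an argument list. Only --build-type is taken from this page; the --endpoint and --backend flag spellings are assumptions to verify against ./build.py --help for your release, and build_py_command is a hypothetical helper.

```python
import shlex
from typing import List, Sequence


def build_py_command(
    build_type: str = "Release",
    backends: Sequence[str] = (),
    endpoints: Sequence[str] = ("http", "grpc"),
    extra_flags: Sequence[str] = (),
) -> List[str]:
    """Assemble an argument list for build.py (hypothetical helper).

    ASSUMPTION: --endpoint=<name> and --backend=<name> are spelled as the
    author understands them; confirm with ./build.py --help.
    """
    cmd = ["./build.py", "--build-type", build_type]
    cmd += [f"--endpoint={e}" for e in endpoints]
    cmd += [f"--backend={b}" for b in backends]
    cmd += list(extra_flags)  # anything else is passed through unchanged
    return cmd


if __name__ == "__main__":
    # Debug build scenario with one backend; print instead of executing.
    debug_cmd = build_py_command(build_type="Debug", backends=["python"])
    print(shlex.join(debug_cmd))
```

Returning a list rather than a single string keeps the arguments safe to hand to subprocess.run without shell quoting concerns.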

Theoretical Basis

The principle follows a multi-stage compilation pattern:

Configure (CMake) -- Define build targets, detect platform capabilities, resolve feature flags
Fetch dependencies (FetchContent/ExternalProject) -- Download and build required third-party libraries
Compile core -- Build the Triton server core library (libtritonserver.so) and executable (tritonserver)
Compile backends -- Build each enabled backend as a shared library plugin
Package (Docker image or local install) -- Assemble all artifacts into a deployable unit

The key tradeoffs compared to compose builds:

Factor	Source Build	Compose Build
Build time	Hours (full compilation)	Minutes (binary extraction)
Customization	Full (any source change)	None (pre-built binaries only)
Debug support	Yes (--build-type Debug)	No
Custom backends	Yes	No
Platform flexibility	Any supported by toolchain	NGC platforms only
Build toolchain	Required (CMake, GCC/Clang, CUDA)	Docker only

The source build implements the orchestrated compilation pattern: a high-level script (build.py) coordinates multiple lower-level build systems (CMake, Make/Ninja, Docker) to produce the final artifact.
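The orchestrated compilation pattern can be sketched in a few lines: a driver holds an ordered list of stages, each delegating to a lower-level tool, and stops at the first failure. This is not how build.py is implemented. The Stage type, run_stages, and the specific commands (including the tritonserver-custom image tag) are illustrative assumptions.

```python
import subprocess
from typing import List, NamedTuple


class Stage(NamedTuple):
    """One step of the pipeline, delegated to a lower-level tool."""
    name: str
    command: List[str]


def run_stages(stages: List[Stage], dry_run: bool = True) -> List[str]:
    """Run stages in order; return the names of the stages processed.

    With dry_run=True the commands are only printed. Otherwise each is
    executed, and check=True raises on the first non-zero exit status.
    """
    done = []
    for stage in stages:
        print(f"[{stage.name}] {' '.join(stage.command)}")
        if not dry_run:
            subprocess.run(stage.command, check=True)
        done.append(stage.name)
    return done


# Illustrative commands only; the real orchestrator generates its own.
STAGES = [
    Stage("configure", ["cmake", "-S", ".", "-B", "build"]),
    Stage("compile", ["cmake", "--build", "build"]),
    Stage("install", ["cmake", "--install", "build"]),
    Stage("package", ["docker", "build", "-t", "tritonserver-custom", "."]),
]

if __name__ == "__main__":
    run_stages(STAGES, dry_run=True)
```

Keeping the stage list as data separates what to run from how it is run, so a dry run can show the plan before any long compilation starts.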

Related Pages

Implementation:Triton_inference_server_Server_Build_Py