Implementation:Triton inference server Server Build Py
| Field | Value |
|---|---|
| Page Type | Implementation |
| Title | Build_Py |
| Namespace | Triton_inference_server_Server |
| Workflow | Custom_Container_Build |
| Domains | Container_Build, Build_Systems |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
Concrete Python build orchestrator for compiling Triton Inference Server from source with Docker and CMake.
Description
The build.py script is the primary entry point for source builds of Triton Inference Server. It orchestrates the entire build pipeline including argument parsing, Docker multi-stage build generation, CMake configuration, compilation of the server core and backends, and final image assembly.
The script operates in two primary modes:
- Containerized build (default): Generates and executes Docker multi-stage builds. Each build stage (base, core, backends) runs in its own Docker container, with artifacts copied between stages. The final output is a Docker image.
- Non-containerized build (--no-container-build): Runs CMake directly on the host system, producing a local installation directory. This mode requires all build dependencies to be pre-installed on the host.
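The difference between the two modes shows up only in the flags passed to build.py. The sketch below composes an invocation for either mode. `build_command` is a hypothetical helper written for this page, not a function in build.py, and it uses only flags documented here.

```python
from typing import List, Optional, Sequence


def build_command(
    backends: Sequence[str],
    endpoints: Sequence[str],
    build_dir: Optional[str] = None,
    enable_gpu: bool = False,
) -> List[str]:
    """Compose a build.py command line (hypothetical helper, not part of build.py).

    Passing build_dir selects the non-containerized mode; omitting it leaves
    the default containerized mode in effect.
    """
    cmd = ["python3", "build.py"]
    if build_dir is not None:
        # Non-containerized mode: CMake runs on the host and installs locally.
        cmd += ["--no-container-build", "--build-dir", build_dir]
    for name in backends:
        cmd += ["--backend", name]
    for name in endpoints:
        cmd += ["--endpoint", name]
    if enable_gpu:
        cmd.append("--enable-gpu")
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the root of the cloned server repository.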
The internal execution flow for a containerized build:
- Argument parsing (L2434-3198) -- Process all CLI flags and populate the build configuration
- Version resolution -- Determine the Triton version from the TRITON_VERSION file and resolve backend versions
- Base image setup -- Generate the Docker base stage with OS packages and build tools
- Core build (L1951-2047) -- Compile libtritonserver.so and the tritonserver executable using CMake
- Backend builds (L2075-2138) -- For each enabled backend, clone the repository, configure with CMake, and compile
- Final image assembly -- Copy all build artifacts into a minimal runtime image
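The ordering of the steps above can be sketched as a simple plan with one backend stage per enabled backend. The stage labels and the `plan_stages` function are illustrative assumptions and do not correspond to names inside build.py.

```python
from typing import List, Sequence


def plan_stages(backends: Sequence[str]) -> List[str]:
    """Return illustrative stage labels in the order listed above."""
    stages = ["parse-arguments", "resolve-versions", "base-image", "core-build"]
    # Each enabled backend gets its own clone/configure/compile stage.
    stages.extend(f"backend-build:{name}" for name in backends)
    stages.append("final-image")
    return stages
```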
Usage
Full-Featured Source Build
# Build everything from source
python3 build.py --enable-all
Selective Source Build
# Build with specific backends and endpoints
python3 build.py \
--backend tensorrt \
--backend python \
--backend onnxruntime \
--endpoint http \
--endpoint grpc \
--enable-gpu
Debug Build
# Build with debug symbols for profiling
python3 build.py \
--backend python \
--endpoint http \
--endpoint grpc \
--build-type Debug \
--enable-gpu
Non-Containerized Local Build
# Build locally without Docker
python3 build.py \
--no-container-build \
--build-dir /tmp/triton-build \
--backend python \
--endpoint http \
--endpoint grpc \
--enable-gpu
Dry Run to Inspect Build Plan
# Show what would be built without executing
python3 build.py --enable-all --dryrun
Code Reference
Source Location
| File | Lines | Description |
|---|---|---|
| build.py | L2434-3198 | Main entry point, argument parsing, and build orchestration |
| build.py | L1951-2047 | core_build() -- Compiles the Triton server core library and executable |
| build.py | L2075-2138 | backend_build() -- Compiles each enabled backend from source |
| CMakeLists.txt | L27-269 | Top-level CMake configuration defining project options, dependencies, and targets |
| src/CMakeLists.txt | L27-300 | Server executable CMake configuration, linking core library and backends |
Signature
python3 build.py \
[-v] \
[--enable-all | --backend <name>[:<tag>] --endpoint <name> --filesystem <name>] \
[--no-container-build --build-dir <dir>] \
[--dryrun] \
[--build-type Release|Debug] \
[-j <N>] \
[--enable-gpu] \
[--repo-tag <component>:<tag>] \
[--extra-core-cmake-arg <arg>]
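To make the shape of the signature concrete, the following is a minimal argparse sketch that mirrors only the flags listed above. It is an approximation for illustration: the real parser in build.py defines more options, and the defaults used here are taken from this page rather than from the script.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented flags (illustrative, not build.py's own)."""
    p = argparse.ArgumentParser(prog="build.py")
    p.add_argument("-v", action="store_true", help="verbose build output")
    p.add_argument("--enable-all", action="store_true")
    # Repeatable flags accumulate into lists.
    p.add_argument("--backend", action="append", default=[], metavar="NAME[:TAG]")
    p.add_argument("--endpoint", action="append", default=[])
    p.add_argument("--filesystem", action="append", default=[])
    p.add_argument("--no-container-build", action="store_true")
    p.add_argument("--build-dir")
    p.add_argument("--dryrun", action="store_true")
    p.add_argument("--build-type", default="Release", choices=["Release", "Debug"])
    p.add_argument("-j", type=int, default=None, help="parallel jobs")
    p.add_argument("--enable-gpu", action="store_true")
    p.add_argument("--repo-tag", action="append", default=[], metavar="COMPONENT:TAG")
    p.add_argument("--extra-core-cmake-arg", action="append", default=[])
    return p
```

Parsing `["--backend", "python:main", "--endpoint", "http", "--dryrun"]` with this parser yields `backend == ["python:main"]`, `endpoint == ["http"]`, and `dryrun == True`, with `build_type` left at `Release`.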
Import
# build.py is a standalone script; key internal constants:
DEFAULT_TRITON_VERSION_MAP # Maps components to default version tags (build.py:L73-83)
CORE_BACKENDS = ["ensemble"] # Always included
# Uses standard library: argparse, os, subprocess, sys, pathlib, shutil
Key Parameters
| Parameter | Default | Description |
|---|---|---|
| -v | False | Enable verbose build output |
| --enable-all | False | Enable all backends, endpoints, and filesystems |
| --backend | (none, repeatable) | Backend name with optional version tag (name:tag) |
| --endpoint | (none, repeatable) | Network endpoint to enable: http, grpc, sagemaker, vertex-ai |
| --filesystem | (none, repeatable) | Cloud filesystem to enable: gcs, s3, azure_storage |
| --enable-gpu | False | Enable CUDA/TensorRT GPU support |
| --no-container-build | False | Build locally instead of in Docker containers |
| --build-dir | (auto) | Directory for build artifacts (used with --no-container-build) |
| --build-type | Release | CMake build type: Release or Debug |
| --dryrun | False | Show build plan without executing |
| --repo-tag | (none, repeatable) | Override repository tag for a specific component |
| -j | (auto) | Number of parallel compilation jobs |
| --extra-core-cmake-arg | (none, repeatable) | Additional CMake arguments passed to core build |
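The name:tag form accepted by --backend can be split as sketched below. `split_backend_spec` is a hypothetical helper shown only to illustrate the format; how build.py itself parses and validates the value is not covered on this page.

```python
from typing import Optional, Tuple


def split_backend_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split 'name[:tag]' into (name, tag); tag is None when omitted."""
    name, _, tag = spec.partition(":")
    if not name:
        raise ValueError(f"backend spec has no name: {spec!r}")
    # An empty tag (for example 'python:') is treated the same as no tag.
    return name, (tag or None)
```

For example, `split_backend_spec("python:main")` returns `("python", "main")`, while `split_backend_spec("onnxruntime")` returns `("onnxruntime", None)`.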
I/O Contract
Inputs
| Input | Type | Description |
|---|---|---|
| Cloned repository | Directory | The cloned Triton server repo containing build.py, CMakeLists.txt, and src/ |
| Docker daemon | Service | Running Docker daemon (required for containerized builds) |
| Build dependencies | System packages | CMake, GCC/Clang, and (for GPU builds) the CUDA toolkit; required on the host for non-containerized builds |
| Feature selections | CLI flags | Backend, endpoint, filesystem, and GPU selections |
| Build configuration | CLI flags | Build type, parallelism, extra CMake args |
Outputs
| Output | Type | Description |
|---|---|---|
| Docker image (containerized) | Container image | Image containing /opt/tritonserver/bin/tritonserver and all enabled backends under /opt/tritonserver/backends/ |
| Local build directory (non-containerized) | Directory | Build artifacts at <build-dir>/opt/tritonserver/ with bin/, lib/, and backends/ |
| Build log | stdout/stderr | Detailed build output including CMake configuration, compilation progress, and any errors |
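After a non-containerized build, the output layout described in the table can be checked with a few lines of Python. This is a minimal sketch that assumes the directory layout stated above; `missing_install_dirs` is a hypothetical helper, not something build.py provides.

```python
from pathlib import Path
from typing import List

# Subdirectories expected under <build-dir>/opt/tritonserver/ per the table above.
EXPECTED_SUBDIRS = ("bin", "lib", "backends")


def missing_install_dirs(build_dir: str) -> List[str]:
    """Return the expected subdirectories that are absent from the install tree."""
    root = Path(build_dir) / "opt" / "tritonserver"
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]
```

An empty list means all three directories exist; it does not verify that the build itself succeeded or that the binaries are usable.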
Usage Examples
Example 1: Production GPU build with selected backends
cd server
python3 build.py \
--backend tensorrt \
--backend python \
--endpoint http \
--endpoint grpc \
--filesystem s3 \
--enable-gpu \
--build-type Release \
-j 16
# Produces a Docker image with TensorRT + Python backends, HTTP/gRPC endpoints, S3 support
Example 2: Debug build for backend development
cd server
python3 build.py \
--backend python:main \
--endpoint http \
--endpoint grpc \
--enable-gpu \
--build-type Debug \
--extra-core-cmake-arg TRITON_ENABLE_LOGGING=ON
# Produces a Docker image with debug symbols for GDB/profiling
Example 3: CPU-only local build
cd server
python3 build.py \
--no-container-build \
--build-dir /opt/triton-build \
--backend python \
--backend onnxruntime \
--endpoint http \
--endpoint grpc \
--build-type Release
# Produces local install at /opt/triton-build/opt/tritonserver/