Implementation:Triton inference server Server Build Py

Field	Value
Page Type	Implementation
Title	Build_Py
Namespace	Triton_inference_server_Server
Workflow	Custom_Container_Build
Domains	Container_Build, Build_Systems
Last Updated	2026-02-13 17:00 GMT

Overview

build.py is the Python build orchestrator that compiles Triton Inference Server from source using Docker and CMake.

Description

The build.py script is the primary entry point for source builds of Triton Inference Server. It orchestrates the entire build pipeline including argument parsing, Docker multi-stage build generation, CMake configuration, compilation of the server core and backends, and final image assembly.

The script operates in two primary modes:

Containerized build (default): Generates and executes Docker multi-stage builds. Each build stage (base, core, backends) runs in its own Docker container, with artifacts copied between stages. The final output is a Docker image.
Non-containerized build (--no-container-build): Runs CMake directly on the host system, producing a local installation directory. This mode requires all build dependencies to be pre-installed on the host.

The internal execution flow for a containerized build:

Argument parsing (build.py:L2434-3198) -- Process all CLI flags and populate the build configuration
Version resolution -- Determine the Triton version from the TRITON_VERSION file and resolve backend versions
Base image setup -- Generate the Docker base stage with OS packages and build tools
Core build (build.py:L1951-2047) -- Compile libtritonserver.so and the tritonserver executable using CMake
Backend builds (build.py:L2075-2138) -- For each enabled backend, clone the repository, configure with CMake, and compile
Final image assembly -- Copy all build artifacts into a minimal runtime image
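
The flow above can be pictured as an ordered list of stages. The following is a minimal sketch, not code taken from build.py: the function name `plan_build` and the stage labels are hypothetical, and the assumption that a non-containerized build skips the base-image and final-image stages is inferred from the mode descriptions rather than confirmed against the script.

```python
from typing import List


def plan_build(backends: List[str], container_build: bool = True) -> List[str]:
    """Return illustrative stage labels in the order described above."""
    stages = ["parse-arguments", "resolve-versions"]
    if container_build:
        # Assumption: only containerized builds generate a Docker base stage.
        stages.append("base-image")
    stages.append("core")
    # One stage per enabled backend, in the order given on the command line.
    stages.extend(f"backend:{name}" for name in backends)
    if container_build:
        # Assumption: only containerized builds assemble a final image.
        stages.append("final-image")
    return stages
```

Calling `plan_build(["python", "onnxruntime"])` lists the core stage before the two backend stages, which mirrors the ordering of the steps above.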

Usage

Full-Featured Source Build

# Build everything from source
python3 build.py --enable-all

Selective Source Build

# Build with specific backends and endpoints
python3 build.py \
  --backend tensorrt \
  --backend python \
  --backend onnxruntime \
  --endpoint http \
  --endpoint grpc \
  --enable-gpu

Debug Build

# Build with debug symbols for profiling
python3 build.py \
  --backend python \
  --endpoint http \
  --endpoint grpc \
  --build-type Debug \
  --enable-gpu

Non-Containerized Local Build

# Build locally without Docker
python3 build.py \
  --no-container-build \
  --build-dir /tmp/triton-build \
  --backend python \
  --endpoint http \
  --endpoint grpc \
  --enable-gpu

Dry Run to Inspect Build Plan

# Show what would be built without executing
python3 build.py --enable-all --dryrun
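
When builds are driven from automation, the command lines above can be assembled programmatically. The sketch below is one possible wrapper, assuming it is run with the cloned repository as the working directory; `build_command` and `run_build` are hypothetical helper names, and only flags that appear on this page are emitted.

```python
import subprocess
from typing import List, Sequence


def build_command(backends: Sequence[str], endpoints: Sequence[str],
                  enable_gpu: bool = False, dryrun: bool = False) -> List[str]:
    """Assemble an argv list for build.py from the selected features."""
    cmd = ["python3", "build.py"]
    for name in backends:
        cmd += ["--backend", name]      # repeatable flag, one per backend
    for name in endpoints:
        cmd += ["--endpoint", name]     # repeatable flag, one per endpoint
    if enable_gpu:
        cmd.append("--enable-gpu")
    if dryrun:
        cmd.append("--dryrun")          # inspect the plan without building
    return cmd


def run_build(cmd: Sequence[str], repo_dir: str) -> int:
    """Run the command from the repository root and return its exit code."""
    return subprocess.run(list(cmd), cwd=repo_dir).returncode
```

Passing `dryrun=True` first and inspecting the result before a real run keeps an expensive build from starting with an unintended configuration.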

Code Reference

Source Location

File	Lines	Description
`build.py`	L2434-3198	Main entry point, argument parsing, and build orchestration
`build.py`	L1951-2047	`core_build()` -- Compiles the Triton server core library and executable
`build.py`	L2075-2138	`backend_build()` -- Compiles each enabled backend from source
`CMakeLists.txt`	L27-269	Top-level CMake configuration defining project options, dependencies, and targets
`src/CMakeLists.txt`	L27-300	Server executable CMake configuration, linking core library and backends

Signature

python3 build.py \
  [-v] \
  [--enable-all | --backend <name>[:<tag>] --endpoint <name> --filesystem <name>] \
  [--no-container-build --build-dir <dir>] \
  [--dryrun] \
  [--build-type Release|Debug] \
  [-j <N>] \
  [--enable-gpu] \
  [--repo-tag <component>:<tag>] \
  [--extra-core-cmake-arg <name>=<value>]
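
To illustrate how repeatable flags such as `--backend` accumulate, here is a small argparse sketch that mirrors the signature above. It is not the parser defined in build.py: the `dest` names (`verbose`, `jobs`) and the restriction of `--build-type` to two choices are assumptions based only on this page.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented flags (illustrative only)."""
    p = argparse.ArgumentParser(prog="build.py")
    p.add_argument("-v", dest="verbose", action="store_true")
    p.add_argument("--enable-all", action="store_true")
    # action="append" collects every occurrence of a repeatable flag.
    p.add_argument("--backend", action="append", default=[])
    p.add_argument("--endpoint", action="append", default=[])
    p.add_argument("--filesystem", action="append", default=[])
    p.add_argument("--repo-tag", action="append", default=[])
    p.add_argument("--extra-core-cmake-arg", action="append", default=[])
    p.add_argument("--no-container-build", action="store_true")
    p.add_argument("--build-dir")
    p.add_argument("--dryrun", action="store_true")
    p.add_argument("--build-type", choices=["Release", "Debug"],
                   default="Release")
    p.add_argument("-j", dest="jobs", type=int)
    p.add_argument("--enable-gpu", action="store_true")
    return p
```

With this sketch, `make_parser().parse_args(["--backend", "python:main", "--backend", "onnxruntime"]).backend` is a two-element list, one entry per occurrence of the flag.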

Import

# build.py is a standalone script; key internal constants:
DEFAULT_TRITON_VERSION_MAP  # Maps components to default version tags (build.py:L73-83)
CORE_BACKENDS = ["ensemble"]  # Always included
# Uses standard library: argparse, os, subprocess, sys, pathlib, shutil

Key Parameters

Parameter	Default	Description
`-v`	False	Enable verbose build output
`--enable-all`	False	Enable all backends, endpoints, and filesystems
`--backend`	(none, repeatable)	Backend name with optional version tag (`name:tag`)
`--endpoint`	(none, repeatable)	Network endpoint to enable: `http`, `grpc`, `sagemaker`, `vertex-ai`
`--filesystem`	(none, repeatable)	Cloud filesystem to enable: `gcs`, `s3`, `azure_storage`
`--enable-gpu`	False	Enable CUDA/TensorRT GPU support
`--no-container-build`	False	Build locally instead of in Docker containers
`--build-dir`	(auto)	Directory for build artifacts (used with `--no-container-build`)
`--build-type`	Release	CMake build type: `Release` or `Debug`
`--dryrun`	False	Show build plan without executing
`--repo-tag`	(none, repeatable)	Override repository tag for a specific component
`-j`	(auto)	Number of parallel compilation jobs
`--extra-core-cmake-arg`	(none, repeatable)	Additional CMake argument for the core build, given as `<name>=<value>` and passed to CMake as `-D<name>=<value>`
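
The `name:tag` form accepted by `--backend` can be split as sketched below. This is a hedged illustration rather than the parsing code in build.py: `split_backend_spec` is a hypothetical helper, and the fallback to a caller-supplied default tag stands in for whatever default the script derives from its version map.

```python
from typing import Optional, Tuple


def split_backend_spec(spec: str,
                       default_tag: Optional[str] = None
                       ) -> Tuple[str, Optional[str]]:
    """Split 'name' or 'name:tag' into (name, tag)."""
    # partition() splits on the first colon only, so the tag is kept whole.
    name, sep, tag = spec.partition(":")
    if not name:
        raise ValueError(f"backend spec has no name: {spec!r}")
    if not sep or not tag:
        # No explicit tag given: fall back to the caller's default.
        return name, default_tag
    return name, tag
```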

I/O Contract

Inputs

Input	Type	Description
Cloned repository	Directory	The cloned Triton server repo containing `build.py`, `CMakeLists.txt`, and `src/`
Docker daemon	Service	Running Docker daemon (required for containerized builds)
Build dependencies	System packages	CMake, a C++ compiler (GCC/Clang), and the CUDA toolkit for GPU builds (required for non-containerized builds)
Feature selections	CLI flags	Backend, endpoint, filesystem, and GPU selections
Build configuration	CLI flags	Build type, parallelism, extra CMake args

Outputs

Output	Type	Description
Docker image (containerized)	Container image	Image containing `/opt/tritonserver/bin/tritonserver` and all enabled backends under `/opt/tritonserver/backends/`
Local build directory (non-containerized)	Directory	Build artifacts at `<build-dir>/opt/tritonserver/` with `bin/`, `lib/`, and `backends/`
Build log	stdout/stderr	Detailed build output including CMake configuration, compilation progress, and any errors
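
After a non-containerized build, the directory layout described above can be checked with a few lines of Python. This is a minimal sketch assuming the `<build-dir>/opt/tritonserver/` layout stated in the table; `missing_artifacts` is a hypothetical helper and only checks that the directories exist, not that their contents are complete.

```python
from pathlib import Path
from typing import List

# Subdirectories the Outputs table lists under <build-dir>/opt/tritonserver/.
EXPECTED_SUBDIRS = ("bin", "lib", "backends")


def missing_artifacts(build_dir: str) -> List[str]:
    """Return the expected subdirectories that are absent from the install."""
    root = Path(build_dir) / "opt" / "tritonserver"
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]
```

An empty result means all three directories are present; any names returned point to a stage that did not produce output.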

Usage Examples

Example 1: Production GPU build with selected backends

cd server
python3 build.py \
  --backend tensorrt \
  --backend python \
  --endpoint http \
  --endpoint grpc \
  --filesystem s3 \
  --enable-gpu \
  --build-type Release \
  -j 16
# Produces a Docker image with TensorRT + Python backends, HTTP/gRPC endpoints, S3 support

Example 2: Debug build for backend development

cd server
python3 build.py \
  --backend python:main \
  --endpoint http \
  --endpoint grpc \
  --enable-gpu \
  --build-type Debug \
  --extra-core-cmake-arg TRITON_ENABLE_LOGGING=ON
# Produces a Docker image with debug symbols for GDB/profiling

Example 3: CPU-only local build

cd server
python3 build.py \
  --no-container-build \
  --build-dir /opt/triton-build \
  --backend python \
  --backend onnxruntime \
  --endpoint http \
  --endpoint grpc \
  --build-type Release
# Produces local install at /opt/triton-build/opt/tritonserver/
