Implementation:Triton inference server Server Build Py

From Leeroopedia
Field | Value
Page Type | Implementation
Title | Build_Py
Namespace | Triton_inference_server_Server
Workflow | Custom_Container_Build
Domains | Container_Build, Build_Systems
Last Updated | 2026-02-13 17:00 GMT

Overview

Concrete Python build orchestrator for compiling Triton Inference Server from source with Docker and CMake.

Description

The build.py script is the primary entry point for source builds of Triton Inference Server. It orchestrates the entire build pipeline including argument parsing, Docker multi-stage build generation, CMake configuration, compilation of the server core and backends, and final image assembly.

The script operates in two primary modes:

  • Containerized build (default): Generates and executes Docker multi-stage builds. Each build stage (base, core, backends) runs in its own Docker container, with artifacts copied between stages. The final output is a Docker image.
  • Non-containerized build (--no-container-build): Runs CMake directly on the host system, producing a local installation directory. This mode requires all build dependencies to be pre-installed on the host.

The internal execution flow for a containerized build:

  1. Argument parsing (L2434-3198) -- Process all CLI flags and populate the build configuration
  2. Version resolution -- Determine Triton version from TRITON_VERSION file and resolve backend versions
  3. Base image setup -- Generate the Docker base stage with OS packages and build tools
  4. Core build (L1951-2047) -- Compile libtritonserver.so and the tritonserver executable using CMake
  5. Backend builds (L2075-2138) -- For each enabled backend, clone the repository, configure with CMake, and compile
  6. Final image assembly -- Copy all build artifacts into a minimal runtime image
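
The flow above can be sketched in Python as a planner that only lists stages and executes nothing. This is a minimal illustration, not the code in build.py: `read_triton_version` and `plan_build` are hypothetical names, and the assumption that a non-containerized build skips the base-image and final-image stages is inferred from the mode descriptions rather than taken from the script.

```python
from pathlib import Path


def read_triton_version(repo_root: Path) -> str:
    """Return the version string stored in the TRITON_VERSION file."""
    return (Path(repo_root) / "TRITON_VERSION").read_text().strip()


def plan_build(version, backends, container_build=True):
    """Return the ordered stage descriptions for a build (hypothetical helper)."""
    steps = [f"resolve version {version}"]
    if container_build:
        # Assumption: only containerized builds generate a Docker base stage.
        steps.append("generate base image stage")
    steps.append("build core (libtritonserver.so, tritonserver)")
    # One build step per enabled backend, in the order given.
    steps.extend(f"build backend {name}" for name in backends)
    if container_build:
        # Assumption: only containerized builds assemble a runtime image.
        steps.append("assemble final runtime image")
    return steps
```

Calling `plan_build(read_triton_version(Path(".")), ["python", "onnxruntime"])` from the repository root would return the stage list for a containerized build with two backends.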

Usage

Full-Featured Source Build

# Build everything from source
python3 build.py --enable-all

Selective Source Build

# Build with specific backends and endpoints
python3 build.py \
  --backend tensorrt \
  --backend python \
  --backend onnxruntime \
  --endpoint http \
  --endpoint grpc \
  --enable-gpu

Debug Build

# Build with debug symbols for profiling
python3 build.py \
  --backend python \
  --endpoint http \
  --endpoint grpc \
  --build-type Debug \
  --enable-gpu

Non-Containerized Local Build

# Build locally without Docker
python3 build.py \
  --no-container-build \
  --build-dir /tmp/triton-build \
  --backend python \
  --endpoint http \
  --endpoint grpc \
  --enable-gpu

Dry Run to Inspect Build Plan

# Show what would be built without executing
python3 build.py --enable-all --dryrun

Code Reference

Source Location

File | Lines | Description
build.py | L2434-3198 | Main entry point, argument parsing, and build orchestration
build.py | L1951-2047 | core_build() -- Compiles the Triton server core library and executable
build.py | L2075-2138 | backend_build() -- Compiles each enabled backend from source
CMakeLists.txt | L27-269 | Top-level CMake configuration defining project options, dependencies, and targets
src/CMakeLists.txt | L27-300 | Server executable CMake configuration, linking core library and backends

Signature

python3 build.py \
  [-v] \
  [--enable-all | --backend <name>[:<tag>] --endpoint <name> --filesystem <name>] \
  [--no-container-build --build-dir <dir>] \
  [--dryrun] \
  [--build-type Release|Debug|RelWithDebInfo|MinSizeRel] \
  [-j <N>] \
  [--enable-gpu] \
  [--repo-tag <component>:<tag>] \
  [--extra-core-cmake-arg <name>=<value>]
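
To make the signature concrete, the following is a minimal argparse sketch that accepts the flags listed above. It is an illustration only and does not reproduce the parser in build.py: the function name `make_parser` and the destination names `verbose` and `jobs` are assumptions chosen for readability.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags in the signature (sketch, not build.py)."""
    parser = argparse.ArgumentParser(prog="build.py")
    # Boolean switches default to False and become True when present.
    parser.add_argument("-v", dest="verbose", action="store_true")
    parser.add_argument("--enable-all", action="store_true")
    parser.add_argument("--enable-gpu", action="store_true")
    parser.add_argument("--no-container-build", action="store_true")
    parser.add_argument("--dryrun", action="store_true")
    # Single-valued options.
    parser.add_argument("--build-dir")
    parser.add_argument("--build-type", default="Release")
    parser.add_argument("-j", dest="jobs", type=int)
    # Repeatable options accumulate their values into a list.
    for flag in ("--backend", "--endpoint", "--filesystem",
                 "--repo-tag", "--extra-core-cmake-arg"):
        parser.add_argument(flag, action="append", default=[])
    return parser
```

For example, `make_parser().parse_args(["--backend", "python", "--backend", "onnxruntime", "-j", "16"])` collects both backends into `args.backend` and stores the job count in `args.jobs`.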

Import

# build.py is a standalone script; key internal constants:
DEFAULT_TRITON_VERSION_MAP  # Maps components to default version tags (build.py:L73-83)
CORE_BACKENDS = ["ensemble"]  # Always included
# Uses standard library: argparse, os, subprocess, sys, pathlib, shutil

Key Parameters

Parameter | Default | Description
-v | False | Enable verbose build output
--enable-all | False | Enable all backends, endpoints, and filesystems
--backend | (none, repeatable) | Backend to build, with an optional version tag (name:tag)
--endpoint | (none, repeatable) | Network endpoint to enable: http, grpc, sagemaker, vertex-ai
--filesystem | (none, repeatable) | Cloud filesystem to enable: gcs, s3, azure_storage
--enable-gpu | False | Enable CUDA/TensorRT GPU support
--no-container-build | False | Build locally instead of in Docker containers
--build-dir | (auto) | Directory for build artifacts (used with --no-container-build)
--build-type | Release | CMake build type: Release, Debug, RelWithDebInfo, or MinSizeRel
--dryrun | False | Show build plan without executing
--repo-tag | (none, repeatable) | Override the repository tag for a specific component (component:tag)
-j | (auto) | Number of parallel compilation jobs
--extra-core-cmake-arg | (none, repeatable) | Additional CMake argument for the core build, given as <name>=<value> and passed to CMake as -D<name>=<value>
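
Several parameters take a `name:tag` value (`--backend`, `--repo-tag`). The sketch below shows one way such a value could be split; `split_spec` is a hypothetical helper, and falling back to a caller-supplied default when the tag is missing or empty is an assumption, not documented build.py behavior.

```python
def split_spec(spec, default_tag=None):
    """Split 'name:tag' into (name, tag); use default_tag when no tag is given."""
    name, _, tag = spec.partition(":")
    if not name:
        raise ValueError(f"missing name in spec {spec!r}")
    # An absent or empty tag falls back to the default.
    return name, (tag or default_tag)


def tags_by_name(specs, default_tag=None):
    """Map each name to its tag for a list of repeatable flag values."""
    return dict(split_spec(spec, default_tag) for spec in specs)
```

With this helper, `tags_by_name(["python:main", "onnxruntime"], default_tag="r0")` maps `python` to `main` and `onnxruntime` to the placeholder default `r0`.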

I/O Contract

Inputs

Input | Type | Description
Cloned repository | Directory | The cloned Triton server repo containing build.py, CMakeLists.txt, and src/
Docker daemon | Service | Running Docker daemon (required for containerized builds)
Build dependencies | System packages | CMake, GCC/Clang, CUDA toolkit (required for non-containerized builds)
Feature selections | CLI flags | Backend, endpoint, filesystem, and GPU selections
Build configuration | CLI flags | Build type, parallelism, extra CMake args

Outputs

Output | Type | Description
Docker image (containerized) | Container image | Image containing /opt/tritonserver/bin/tritonserver and all enabled backends under /opt/tritonserver/backends/
Local build directory (non-containerized) | Directory | Build artifacts at <build-dir>/opt/tritonserver/ with bin/, lib/, and backends/
Build log | stdout/stderr | Detailed build output including CMake configuration, compilation progress, and any errors
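
After a non-containerized build, the output layout described above can be checked with a few lines of Python. This is a hedged sketch: `missing_install_dirs` is a hypothetical helper that tests only for the three subdirectories named in the table and says nothing about their contents.

```python
from pathlib import Path

# Subdirectories the output table lists under <build-dir>/opt/tritonserver/.
EXPECTED_SUBDIRS = ("bin", "lib", "backends")


def missing_install_dirs(build_dir):
    """Return the expected subdirectories that are absent from the install tree."""
    root = Path(build_dir) / "opt" / "tritonserver"
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]
```

An empty list from `missing_install_dirs("/tmp/triton-build")` indicates that all three directories exist under that build directory.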

Usage Examples

Example 1: Production GPU build with selected backends

cd server
python3 build.py \
  --backend tensorrt \
  --backend python \
  --endpoint http \
  --endpoint grpc \
  --filesystem s3 \
  --enable-gpu \
  --build-type Release \
  -j 16
# Produces a Docker image with TensorRT + Python backends, HTTP/gRPC endpoints, S3 support

Example 2: Debug build for backend development

cd server
python3 build.py \
  --backend python:main \
  --endpoint http \
  --endpoint grpc \
  --enable-gpu \
  --build-type Debug \
  --extra-core-cmake-arg TRITON_ENABLE_LOGGING=ON
# Produces a Docker image with debug symbols for GDB/profiling

Example 3: CPU-only local build

cd server
python3 build.py \
  --no-container-build \
  --build-dir /opt/triton-build \
  --backend python \
  --backend onnxruntime \
  --endpoint http \
  --endpoint grpc \
  --build-type Release
# Produces local install at /opt/triton-build/opt/tritonserver/
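
When these invocations are driven from a script or CI job, the command line can be assembled programmatically. The sketch below is illustrative only: `build_command` and its keyword names are hypothetical, it emits only flags documented on this page, and it returns the argument list without running it.

```python
import shlex


def build_command(backends=(), endpoints=(), filesystems=(), enable_gpu=False,
                  build_type="Release", build_dir=None, jobs=None, dryrun=False):
    """Assemble a build.py argument list from keyword options (hypothetical helper)."""
    cmd = ["python3", "build.py"]
    if build_dir is not None:
        # A build directory implies a local, non-containerized build in this sketch.
        cmd += ["--no-container-build", "--build-dir", str(build_dir)]
    for flag, values in (("--backend", backends), ("--endpoint", endpoints),
                         ("--filesystem", filesystems)):
        for value in values:
            cmd += [flag, value]
    if enable_gpu:
        cmd.append("--enable-gpu")
    cmd += ["--build-type", build_type]
    if jobs is not None:
        cmd += ["-j", str(jobs)]
    if dryrun:
        cmd.append("--dryrun")
    return cmd


def as_shell(cmd):
    """Render the argument list as a single shell-quoted string."""
    return shlex.join(cmd)
```

Passing `dryrun=True` first and inspecting `as_shell(build_command(...))` is a low-risk way to review a command before handing the list to `subprocess.run`.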
