Implementation:Triton inference server Server Compose Py

Field	Value
Page Type	Implementation
Title	Compose_Py
Namespace	Triton_inference_server_Server
Workflow	Custom_Container_Build
Domains	Container_Build, MLOps
Last Updated	2026-02-13 17:00 GMT

Overview

A Python script that builds custom Triton containers by copying selected backends out of NGC images.

Description

The compose.py script automates the creation of custom Triton Inference Server containers by generating and building a Dockerfile that selectively copies pre-compiled backend libraries from NVIDIA NGC base images. The script handles image pulling, Dockerfile generation, backend selection, and Docker image building in a single invocation.

The internal execution flow is:

Parse arguments -- Process CLI flags for backend, repoagent, cache, and container version selections
get_container_version -- Determine the NGC container version to use as the source image
create_argmap -- Build internal mapping of requested components to their image paths
start_dockerfile -- Generate the initial Dockerfile header with FROM and base setup
add_requested_backends -- Add COPY directives for each requested backend's shared libraries
add_requested_repoagents -- Add COPY directives for repository agents if requested
add_requested_caches -- Add COPY directives for cache implementations if requested
end_dockerfile -- Finalize the Dockerfile with entrypoint and environment setup
build_docker_image -- Invoke docker build to produce the final image

The generated Dockerfile uses a Docker multi-stage build: the full NGC image serves as a source stage, and only the selected components are copied from it into the final image, which starts from the min image.
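The flow above can be pictured with a small sketch. This is not the code in compose.py; it is a minimal illustration that assumes components live under `/opt/tritonserver/<kind>/<name>` and that the source stage is named `full`. The helper `add_components` and the image names are hypothetical.

```python
from typing import Dict, Iterable, List

# Assumed install prefix inside the images; illustrative, not read from compose.py.
TRITON_ROOT = "/opt/tritonserver"


def start_dockerfile(images: Dict[str, str]) -> List[str]:
    """Open two stages: the full image as copy source, the min image as base."""
    return [
        f"FROM {images['full']} AS full",
        f"FROM {images['min']}",
    ]


def add_components(lines: List[str], kind: str, names: Iterable[str]) -> None:
    """Append one COPY directive per requested component (hypothetical helper)."""
    for name in names:
        path = f"{TRITON_ROOT}/{kind}/{name}"
        lines.append(f"COPY --from=full {path} {path}")


def end_dockerfile(lines: List[str]) -> str:
    """Close the file with a placeholder environment line and return the text."""
    lines.append(f"ENV PATH={TRITON_ROOT}/bin:${{PATH}}")
    return "\n".join(lines) + "\n"


def compose_dockerfile(
    images: Dict[str, str],
    backends: Iterable[str],
    repoagents: Iterable[str] = (),
    caches: Iterable[str] = (),
) -> str:
    """Chain the steps in the same order as the execution flow."""
    lines = start_dockerfile(images)
    add_components(lines, "backends", backends)
    add_components(lines, "repoagents", repoagents)
    add_components(lines, "caches", caches)
    return end_dockerfile(lines)


# Placeholder image names, not real NGC tags.
example = compose_dockerfile(
    {"full": "example/triton:full", "min": "example/triton:min"},
    backends=["tensorrt", "python"],
)
print(example)
```

Keeping each step a function that appends lines mirrors the step names in the flow and makes the generated text easy to inspect before any `docker build` runs.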

Usage

Basic Compose Build

# Build with TensorRT and Python backends
python3 compose.py --backend tensorrt --backend python --enable-gpu

Compose with Specific Container Version

# Build using the 24.12 NGC container as source
python3 compose.py \
  --backend onnxruntime \
  --backend python \
  --container-version 24.12 \
  --enable-gpu

Dry Run to Inspect Generated Dockerfile

# Generate the Dockerfile without building
python3 compose.py \
  --backend tensorrt \
  --backend python \
  --dry-run
# Inspect the generated Dockerfile.compose
cat Dockerfile.compose
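After a dry run, the generated file can also be checked programmatically. The sketch below assumes the file contains `COPY --from=<stage> <src> <dst>` lines, as a multi-stage build would; the exact layout of the real Dockerfile.compose may differ, and `copied_paths` is a hypothetical helper.

```python
import re
from typing import List

# Matches "COPY --from=<stage> <src> <dst>"; the layout is an assumption.
COPY_FROM = re.compile(r"^COPY\s+--from=(\S+)\s+(\S+)\s+(\S+)", re.IGNORECASE)


def copied_paths(dockerfile_text: str) -> List[str]:
    """Return the source path of every COPY --from directive, in file order."""
    paths = []
    for line in dockerfile_text.splitlines():
        match = COPY_FROM.match(line.strip())
        if match:
            paths.append(match.group(2))
    return paths


# Illustrative input, not real compose.py output.
sample = """FROM example/triton:full AS full
FROM example/triton:min
COPY --from=full /opt/tritonserver/backends/python /opt/tritonserver/backends/python
"""
print(copied_paths(sample))
```

To check a real dry run, pass the file contents instead, for example `copied_paths(open("Dockerfile.compose").read())`.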

Custom Output Image Name

# Build with a custom image name
python3 compose.py \
  --backend tensorrt \
  --backend python \
  --output-name my-custom-triton \
  --enable-gpu

Code Reference

Source Location

File	Lines	Description
`compose.py`	L360-519	Main entry point and argument parsing
`compose.py`	L60-110	`start_dockerfile()` -- Generates initial Dockerfile with FROM directives and base setup
`compose.py`	L112-126	`add_requested_backends()` -- Adds COPY directives for selected backend libraries
`compose.py`	L172-186	`build_docker_image()` -- Invokes `docker build` to produce the final image

Signature

python3 compose.py \
  --backend <backend> [--backend <backend>...] \
  [--repoagent <name>] \
  [--cache <name>] \
  [--container-version <VER>] \
  [--enable-gpu] \
  [--output-name <name>] \
  [--image <min|full>,<image>] \
  [--dry-run] \
  [--skip-pull]

Import

# compose.py is a standalone script with standard library dependencies
import argparse
import os
import subprocess
import sys

Key Parameters

Parameter	Default	Description
`--backend`	(none, repeatable)	Backend name to include (e.g., tensorrt, python, onnxruntime)
`--repoagent`	(none, repeatable)	Repository agent to include
`--cache`	(none, repeatable)	Cache implementation to include
`--container-version`	Current release	NGC container version to use as source (e.g., 24.12)
`--image`	NGC defaults	Override min or full base image (`min,<image>` or `full,<image>`)
`--enable-gpu`	True	Enable GPU support in the output image
`--output-name`	tritonserver	Name for the output Docker image
`--dry-run`	False	Generate Dockerfile.compose without building
`--skip-pull`	False	Skip pulling NGC images (use local cache)
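One way to read the table is as an `argparse` configuration. The sketch below is not the parser in compose.py; it only mirrors the defaults listed above. It assumes that `--enable-gpu` accepts an optional boolean value and that `--image` may be repeated. `parse_bool` and `parse_image_overrides` are hypothetical helpers.

```python
import argparse
from typing import Dict, List


def parse_bool(value: str) -> bool:
    """Treat the string 'true' (any case) as True, anything else as False."""
    return str(value).lower() == "true"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="compose.py")
    # Repeatable selections accumulate into lists.
    parser.add_argument("--backend", action="append", default=[])
    parser.add_argument("--repoagent", action="append", default=[])
    parser.add_argument("--cache", action="append", default=[])
    parser.add_argument("--image", action="append", default=[])
    parser.add_argument("--container-version", default=None)
    # Bare flag or omitted flag both mean True; "--enable-gpu=False" turns it off.
    parser.add_argument("--enable-gpu", nargs="?", const=True, default=True,
                        type=parse_bool)
    parser.add_argument("--output-name", default="tritonserver")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--skip-pull", action="store_true")
    return parser


def parse_image_overrides(values: List[str]) -> Dict[str, str]:
    """Split 'min,<image>' / 'full,<image>' strings into a mapping."""
    overrides: Dict[str, str] = {}
    for value in values:
        kind, sep, image = value.partition(",")
        if kind not in ("min", "full") or not sep or not image:
            raise ValueError(
                f"expected 'min,<image>' or 'full,<image>', got {value!r}")
        overrides[kind] = image
    return overrides


args = build_parser().parse_args(
    ["--backend", "python", "--image", "min,example/triton:min"])
print(args.backend, args.enable_gpu, parse_image_overrides(args.image))
```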

I/O Contract

Inputs

Input	Type	Description
Cloned server repository	Directory	The cloned Triton server repo containing `compose.py`
Docker daemon	Service	Running Docker daemon accessible from the build host
NGC images	Container images	Pre-built NGC Triton images (pulled automatically or from local cache)
Backend selections	CLI flags	One or more `--backend` flags specifying which backends to include

Outputs

Output	Type	Description
`Dockerfile.compose`	File	Generated Dockerfile used for the build
Docker image	Container image	Tagged as `<output-name>` (default: `tritonserver`), containing only selected backends
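For automation, the contract above can be wrapped in a thin Python driver. This is a sketch under stated assumptions: it uses only flags listed on this page, assumes `Dockerfile.compose` is written to the directory the script runs in, and the function names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Sequence


def build_compose_command(
    backends: Sequence[str],
    output_name: str = "tritonserver",
    dry_run: bool = False,
    script: str = "compose.py",
) -> List[str]:
    """Assemble the argument list for one compose.py invocation."""
    cmd = [sys.executable, script]
    for backend in backends:
        cmd += ["--backend", backend]
    cmd += ["--output-name", output_name]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def run_compose(cmd: Sequence[str], repo_dir: str) -> Path:
    """Run the command inside the cloned server repo.

    Raises CalledProcessError on a non-zero exit. The returned path assumes
    Dockerfile.compose lands in repo_dir.
    """
    subprocess.run(list(cmd), cwd=repo_dir, check=True)
    return Path(repo_dir) / "Dockerfile.compose"


command = build_compose_command(["tensorrt", "python"], dry_run=True)
print(" ".join(command[1:]))
```

Calling `run_compose(command, "server")` would then execute the dry run against a checkout in `./server`; it is left out of the block so the sketch runs without Docker or the repository present.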

Usage Examples

Example 1: Minimal ONNX Runtime container

cd server
python3 compose.py --backend onnxruntime --enable-gpu
# Produces: tritonserver image with only the ONNX Runtime backend
docker images tritonserver

Example 2: Multi-backend container with repository agent

cd server
python3 compose.py \
  --backend tensorrt \
  --backend python \
  --backend onnxruntime \
  --repoagent checksum \
  --container-version 24.12 \
  --output-name triton-prod \
  --enable-gpu
# Produces: triton-prod image with three backends and the checksum repository agent

Example 3: CPU-only compose build

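cd server
# --enable-gpu defaults to True (see Key Parameters), so omitting it keeps GPU support.
# Disable it explicitly. The =False form assumes the flag accepts a boolean value;
# confirm with `python3 compose.py --help` for your checkout.
python3 compose.py \
  --backend python \
  --backend onnxruntime \
  --enable-gpu=False
# Produces: tritonserver image without GPU support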
