Implementation:Triton inference server Server Compose Py
| Field | Value |
|---|---|
| Page Type | Implementation |
| Title | Compose_Py |
| Namespace | Triton_inference_server_Server |
| Workflow | Custom_Container_Build |
| Domains | Container_Build, MLOps |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
Concrete Python script for building custom Triton containers by extracting selected backends from NGC images.
Description
The compose.py script automates the creation of custom Triton Inference Server containers by generating and building a Dockerfile that selectively copies pre-compiled backend libraries from NVIDIA NGC base images. The script handles image pulling, Dockerfile generation, backend selection, and Docker image building in a single invocation.
The internal execution flow is:
- Parse arguments -- Process CLI flags for backend, repoagent, cache, and container version selections
- get_container_version -- Determine the NGC container version to use as the source image
- create_argmap -- Build internal mapping of requested components to their image paths
- start_dockerfile -- Generate the initial Dockerfile header with FROM and base setup
- add_requested_backends -- Add COPY directives for each requested backend's shared libraries
- add_requested_repoagents -- Add COPY directives for repository agents if requested
- add_requested_caches -- Add COPY directives for cache implementations if requested
- end_dockerfile -- Finalize the Dockerfile with entrypoint and environment setup
- build_docker_image -- Invoke docker build to produce the final image
The generated Dockerfile uses Docker multi-stage builds, referencing the full NGC image as a build stage and copying only the selected components into the final image.
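The multi-stage pattern described above can be illustrated with a minimal sketch. The `sketch_*` functions below are hypothetical stand-ins, not the code in compose.py; the `/opt/tritonserver/backends/<name>` layout, the stage name `full`, and the closing `WORKDIR` line are assumptions made for illustration.

```python
from typing import Sequence


def sketch_start_dockerfile(full_image: str, min_image: str) -> str:
    # The "full" stage is only a source of files; the second FROM starts
    # the final image that the selected components are copied into.
    return f"FROM {full_image} AS full\nFROM {min_image}\n"


def sketch_add_backends(backends: Sequence[str]) -> str:
    # Assumed install layout: /opt/tritonserver/backends/<name>
    lines = []
    for name in backends:
        path = f"/opt/tritonserver/backends/{name}"
        lines.append(f"COPY --from=full {path} {path}\n")
    return "".join(lines)


def sketch_end_dockerfile() -> str:
    # Placeholder for the entrypoint and environment setup that the
    # real script performs; the actual directives are not shown here.
    return "WORKDIR /opt/tritonserver\n"


def sketch_compose(full_image: str, min_image: str, backends: Sequence[str]) -> str:
    # Concatenate header, per-backend COPY directives, and footer.
    return (
        sketch_start_dockerfile(full_image, min_image)
        + sketch_add_backends(backends)
        + sketch_end_dockerfile()
    )


if __name__ == "__main__":
    print(sketch_compose("example/full:tag", "example/min:tag", ["python", "onnxruntime"]))
```

Because only the `COPY --from=full` paths reach the final stage, unselected backends in the full image do not add to the size of the output image.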
Usage
Basic Compose Build
# Build with TensorRT and Python backends
python3 compose.py --backend tensorrt --backend python --enable-gpu
Compose with Specific Container Version
# Build using the 24.12 NGC container as source
python3 compose.py \
--backend onnxruntime \
--backend python \
--container-version 24.12 \
--enable-gpu
Dry Run to Inspect Generated Dockerfile
# Generate the Dockerfile without building
python3 compose.py \
--backend tensorrt \
--backend python \
--dry-run
# Inspect the generated Dockerfile.compose
cat Dockerfile.compose
Custom Output Image Name
# Build with a custom image name
python3 compose.py \
--backend tensorrt \
--backend python \
--output-name my-custom-triton \
--enable-gpu
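When compose.py is driven from a Python build script rather than a shell, the invocations above can be assembled as an argument list. This is a sketch only: `compose_command` is a hypothetical helper, and it uses just the flags shown in this section.

```python
from typing import List, Optional, Sequence


def compose_command(
    backends: Sequence[str],
    container_version: Optional[str] = None,
    output_name: Optional[str] = None,
    enable_gpu: bool = False,
    dry_run: bool = False,
) -> List[str]:
    """Build the argv list for a compose.py invocation (hypothetical helper)."""
    cmd = ["python3", "compose.py"]
    for backend in backends:
        # --backend is repeatable, so emit one flag per backend.
        cmd += ["--backend", backend]
    if container_version:
        cmd += ["--container-version", container_version]
    if output_name:
        cmd += ["--output-name", output_name]
    if enable_gpu:
        cmd.append("--enable-gpu")
    if dry_run:
        cmd.append("--dry-run")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True, cwd="server")`, which avoids shell quoting issues and raises on a non-zero exit status.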
Code Reference
Source Location
| File | Lines | Description |
|---|---|---|
| compose.py | L360-519 | Main entry point and argument parsing |
| compose.py | L60-110 | start_dockerfile() -- Generates initial Dockerfile with FROM directives and base setup |
| compose.py | L112-126 | add_requested_backends() -- Adds COPY directives for selected backend libraries |
| compose.py | L172-186 | build_docker_image() -- Invokes docker build to produce the final image |
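As a rough illustration of the final build step, the sketch below assembles and optionally runs a `docker build` command against the generated Dockerfile. The function names and the `dry_run` switch are hypothetical; the actual build_docker_image() may pass different or additional arguments.

```python
import subprocess
from typing import List


def docker_build_command(dockerfile: str, tag: str, context: str = ".") -> List[str]:
    # -f selects the Dockerfile, -t names the resulting image.
    return ["docker", "build", "-t", tag, "-f", dockerfile, context]


def sketch_build_image(dockerfile: str, tag: str, context: str = ".", dry_run: bool = False) -> List[str]:
    cmd = docker_build_command(dockerfile, tag, context)
    if not dry_run:
        # check=True raises CalledProcessError if the build fails.
        subprocess.run(cmd, check=True)
    return cmd
```

Returning the command even when it is not executed makes the step easy to log or inspect during a dry run.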
Signature
python3 compose.py \
--backend <backend> [--backend <backend>...] \
[--repoagent <name>] \
[--cache <name>] \
[--container-version <VER>] \
[--enable-gpu] \
[--output-name <name>] \
[--image <min|full>,<image>] \
[--dry-run] \
[--skip-pull]
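The signature above maps naturally onto an `argparse` parser. The following is a sketch under assumptions, not the parser from compose.py: repeatable flags are modeled with `action="append"`, and `--enable-gpu` is modeled as an optional-value flag so that a default of True can still be overridden with `--enable-gpu=False`.

```python
import argparse


def str_to_bool(value: str) -> bool:
    # Only the literal "true" (any case) is treated as True.
    return str(value).lower() == "true"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="compose.py")
    # Repeatable selections accumulate into lists.
    parser.add_argument("--backend", action="append", default=[])
    parser.add_argument("--repoagent", action="append", default=[])
    parser.add_argument("--cache", action="append", default=[])
    parser.add_argument("--image", action="append", default=[])
    parser.add_argument("--container-version")
    parser.add_argument("--output-name", default="tritonserver")
    # Assumption: bare --enable-gpu means True; --enable-gpu=False disables it.
    parser.add_argument("--enable-gpu", nargs="?", const=True, default=True, type=str_to_bool)
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--skip-pull", action="store_true")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```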
Import
# compose.py is a standalone script that depends only on the Python standard library
import argparse
import os
import subprocess
import sys
Key Parameters
| Parameter | Default | Description |
|---|---|---|
| --backend | (none, repeatable) | Backend name to include (e.g., tensorrt, python, onnxruntime) |
| --repoagent | (none, repeatable) | Repository agent to include |
| --cache | (none, repeatable) | Cache implementation to include |
| --container-version | Current release | NGC container version to use as source (e.g., 24.12) |
| --image | NGC defaults | Override min or full base image (min,<image> or full,<image>) |
| --enable-gpu | True | Enable GPU support in the output image |
| --output-name | tritonserver | Name for the output Docker image |
| --dry-run | False | Generate Dockerfile.compose without building |
| --skip-pull | False | Skip pulling NGC images (use local cache) |
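To show how `--container-version` and `--image` could interact, the sketch below resolves the two source images and applies any `min,<image>` or `full,<image>` overrides. `resolve_images` is a hypothetical helper, and the default tag pattern (`<version>-py3` and `<version>-py3-min` under `nvcr.io/nvidia/tritonserver`) is an assumption about NGC naming for GPU images, not something taken from compose.py.

```python
from typing import Dict, Sequence


def resolve_images(container_version: str, overrides: Sequence[str] = ()) -> Dict[str, str]:
    # Assumed NGC tag pattern for the default GPU images.
    images = {
        "full": f"nvcr.io/nvidia/tritonserver:{container_version}-py3",
        "min": f"nvcr.io/nvidia/tritonserver:{container_version}-py3-min",
    }
    for item in overrides:
        # Each override has the form "<min|full>,<image>".
        kind, sep, image = item.partition(",")
        if not sep or kind not in images or not image:
            raise ValueError(f"expected 'min,<image>' or 'full,<image>', got {item!r}")
        images[kind] = image
    return images
```

Validating the prefix up front gives a clear error for a malformed `--image` value instead of a failure later during the image pull.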
I/O Contract
Inputs
| Input | Type | Description |
|---|---|---|
| Cloned server repository | Directory | The cloned Triton server repo containing compose.py |
| Docker daemon | Service | Running Docker daemon accessible from the build host |
| NGC images | Container images | Pre-built NGC Triton images (pulled automatically or from local cache) |
| Backend selections | CLI flags | One or more --backend flags specifying which backends to include |
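The inputs listed above can be checked before a build is attempted. The sketch below is a hypothetical preflight helper: it verifies that compose.py exists in the repository directory, that a Docker client is on the PATH, and that at least one backend was selected. It does not confirm that the daemon is actually running.

```python
import shutil
from pathlib import Path
from typing import List, Sequence


def preflight(repo_dir: str, backends: Sequence[str], docker_cmd: str = "docker") -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if not (Path(repo_dir) / "compose.py").is_file():
        problems.append(f"compose.py not found in {repo_dir}")
    if shutil.which(docker_cmd) is None:
        problems.append(f"{docker_cmd} client not found on PATH")
    if not backends:
        problems.append("no --backend selected")
    return problems
```

Daemon reachability can be tested separately, for example by running `docker info` and checking its exit status.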
Outputs
| Output | Type | Description |
|---|---|---|
| Dockerfile.compose | File | Generated Dockerfile used for the build |
| Docker image | Container image | Tagged as <output-name> (default: tritonserver), containing only selected backends |
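One way to confirm that the generated Dockerfile.compose contains only the selected backends is to scan its COPY directives. The sketch below assumes backends are copied from paths containing `/backends/<name>`; both that layout and the `copied_backends` helper are illustrative assumptions rather than part of compose.py.

```python
from typing import List


def copied_backends(dockerfile_text: str) -> List[str]:
    """List backend names referenced by COPY directives, in first-seen order."""
    marker = "/backends/"
    found: List[str] = []
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0].upper() != "COPY":
            continue
        # Skip options such as --from=<stage>; the first remaining token is the source path.
        paths = [p for p in parts[1:] if not p.startswith("--")]
        if not paths or marker not in paths[0]:
            continue
        name = paths[0].split(marker, 1)[1].split("/", 1)[0]
        if name and name not in found:
            found.append(name)
    return found
```

After a `--dry-run`, the helper can be applied to the contents of Dockerfile.compose and the result compared with the list of requested backends.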
Usage Examples
Example 1: Minimal ONNX Runtime container
cd server
python3 compose.py --backend onnxruntime --enable-gpu
# Produces: tritonserver image with only ONNX Runtime backend
docker images tritonserver
Example 2: Multi-backend container with repository agent
cd server
python3 compose.py \
--backend tensorrt \
--backend python \
--backend onnxruntime \
--repoagent checksum \
--container-version 24.12 \
--output-name triton-prod \
--enable-gpu
# Produces: triton-prod image with three backends and checksum repo agent
Example 3: CPU-only compose build
cd server
python3 compose.py \
--backend python \
--backend onnxruntime
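# Intended result: tritonserver image without GPU support.
# Key Parameters lists --enable-gpu as defaulting to True, so confirm how your
# compose.py version disables GPU support before relying on a bare invocation.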