Implementation:Triton inference server Server Compose Py

From Leeroopedia
| Field | Value |
| --- | --- |
| Page Type | Implementation |
| Title | Compose_Py |
| Namespace | Triton_inference_server_Server |
| Workflow | Custom_Container_Build |
| Domains | Container_Build, MLOps |
| Last Updated | 2026-02-13 17:00 GMT |

Overview

compose.py is a Python script that builds custom Triton Inference Server containers by copying selected backends out of prebuilt NGC images.

Description

The compose.py script automates the creation of custom Triton Inference Server containers by generating and building a Dockerfile that selectively copies pre-compiled backend libraries from NVIDIA NGC base images. The script handles image pulling, Dockerfile generation, backend selection, and Docker image building in a single invocation.

The internal execution flow is:

  1. Parse arguments -- Process CLI flags for backend, repoagent, cache, and container version selections
  2. get_container_version -- Determine the NGC container version to use as the source image
  3. create_argmap -- Build internal mapping of requested components to their image paths
  4. start_dockerfile -- Generate the initial Dockerfile header with FROM and base setup
  5. add_requested_backends -- Add COPY directives for each requested backend's shared libraries
  6. add_requested_repoagents -- Add COPY directives for repository agents if requested
  7. add_requested_caches -- Add COPY directives for cache implementations if requested
  8. end_dockerfile -- Finalize the Dockerfile with entrypoint and environment setup
  9. build_docker_image -- Invoke docker build to produce the final image

The generated Dockerfile uses Docker multi-stage builds, referencing the full NGC image as a build stage and copying only the selected components into the final image.
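
The multi-stage idea described above can be pictured with a small Python sketch. It is illustrative only: the function name `render_compose_dockerfile`, the stage name `full`, the placeholder image references, and the one-directory-per-backend layout under `/opt/tritonserver/backends` are assumptions made for this example, not code taken from compose.py.

```python
# Illustrative sketch only; names and paths are assumptions, not compose.py's code.

def render_compose_dockerfile(full_image, min_image, backends):
    """Return Dockerfile text that copies selected backends out of a full image."""
    lines = [
        # Stage 1: the full image serves purely as a source of prebuilt files.
        f"FROM {full_image} AS full",
        # Stage 2: the smaller base image becomes the final image.
        f"FROM {min_image}",
    ]
    for name in backends:
        # Assumed layout: one directory per backend.
        path = f"/opt/tritonserver/backends/{name}"
        lines.append(f"COPY --from=full {path} {path}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder image references, chosen only for the demonstration.
    print(render_compose_dockerfile("example/full:tag", "example/min:tag",
                                    ["python", "onnxruntime"]))
```

Only the directories named in `backends` reach the final stage, which is why the composed image can be smaller than the full NGC image.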

Usage

Basic Compose Build

# Build with TensorRT and Python backends
python3 compose.py --backend tensorrt --backend python --enable-gpu

Compose with Specific Container Version

# Build using the 24.12 NGC container as source
python3 compose.py \
  --backend onnxruntime \
  --backend python \
  --container-version 24.12 \
  --enable-gpu

Dry Run to Inspect Generated Dockerfile

# Generate the Dockerfile without building
python3 compose.py \
  --backend tensorrt \
  --backend python \
  --dry-run
# Inspect the generated Dockerfile.compose
cat Dockerfile.compose

Custom Output Image Name

# Build with a custom image name
python3 compose.py \
  --backend tensorrt \
  --backend python \
  --output-name my-custom-triton \
  --enable-gpu

Code Reference

Source Location

| File | Lines | Description |
| --- | --- | --- |
| compose.py | L360-519 | Main entry point and argument parsing |
| compose.py | L60-110 | start_dockerfile() -- Generates initial Dockerfile with FROM directives and base setup |
| compose.py | L112-126 | add_requested_backends() -- Adds COPY directives for selected backend libraries |
| compose.py | L172-186 | build_docker_image() -- Invokes docker build to produce the final image |

Signature

python3 compose.py \
  --backend <backend> [--backend <backend>...] \
  [--repoagent <name>] \
  [--cache <name>] \
  [--container-version <VER>] \
  [--enable-gpu] \
  [--output-name <name>] \
  [--image <min|full>,<image>] \
  [--dry-run] \
  [--skip-pull]
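
To show how repeatable flags such as --backend accumulate, here is a minimal argparse sketch. It mirrors only part of the signature above (--enable-gpu and --image are left out), and the helper name `make_parser` is hypothetical; the parser defined inside compose.py may be configured differently.

```python
import argparse


def make_parser():
    """Build a reduced parser that mimics a few of the documented flags."""
    parser = argparse.ArgumentParser(prog="compose.py")
    # action="append" collects every occurrence of a repeatable flag into a list.
    parser.add_argument("--backend", action="append", default=[])
    parser.add_argument("--repoagent", action="append", default=[])
    parser.add_argument("--cache", action="append", default=[])
    parser.add_argument("--container-version")
    parser.add_argument("--output-name", default="tritonserver")
    # store_true flags are False unless present on the command line.
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--skip-pull", action="store_true")
    return parser


args = make_parser().parse_args(
    ["--backend", "tensorrt", "--backend", "python", "--dry-run"]
)
print(args.backend, args.output_name, args.dry_run)
```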

Import

# compose.py is a standalone script with standard library dependencies
import argparse
import os
import subprocess
import sys

Key Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| --backend | (none, repeatable) | Backend name to include (e.g., tensorrt, python, onnxruntime) |
| --repoagent | (none, repeatable) | Repository agent to include |
| --cache | (none, repeatable) | Cache implementation to include |
| --container-version | Current release | NGC container version to use as source (e.g., 24.12) |
| --image | NGC defaults | Override min or full base image (min,<image> or full,<image>) |
| --enable-gpu | True | Enable GPU support in the output image |
| --output-name | tritonserver | Name for the output Docker image |
| --dry-run | False | Generate Dockerfile.compose without building |
| --skip-pull | False | Skip pulling NGC images (use local cache) |
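
The `min,<image>` / `full,<image>` value format of --image can be split as in the sketch below. The helper `parse_image_overrides` and its error handling are hypothetical, written only to illustrate the format; compose.py may validate these values differently.

```python
def parse_image_overrides(values):
    """Map 'min,<image>' / 'full,<image>' strings to a {'min': ..., 'full': ...} dict."""
    overrides = {}
    for value in values:
        # Split on the first comma only, so the image reference stays whole.
        kind, sep, image = value.partition(",")
        if not sep or kind not in ("min", "full") or not image:
            raise ValueError(
                f"expected 'min,<image>' or 'full,<image>', got {value!r}"
            )
        overrides[kind] = image
    return overrides


# Placeholder image references, used only for the demonstration.
print(parse_image_overrides(["min,example/min:tag", "full,example/full:tag"]))
```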

I/O Contract

Inputs

| Input | Type | Description |
| --- | --- | --- |
| Cloned server repository | Directory | The cloned Triton server repo containing compose.py |
| Docker daemon | Service | Running Docker daemon accessible from the build host |
| NGC images | Container images | Pre-built NGC Triton images (pulled automatically or from local cache) |
| Backend selections | CLI flags | One or more --backend flags specifying which backends to include |

Outputs

| Output | Type | Description |
| --- | --- | --- |
| Dockerfile.compose | File | Generated Dockerfile used for the build |
| Docker image | Container image | Tagged as <output-name> (default: tritonserver), containing only selected backends |
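
When compose.py is driven from another Python program (for example a CI job), the inputs above can be turned into a command line as sketched here. The helpers `compose_command` and `run_compose` are hypothetical wrappers; they use only flags listed on this page and assume the script is run from the cloned `server` directory.

```python
import subprocess
import sys


def compose_command(backends, output_name=None, container_version=None, dry_run=False):
    """Build the argv list for a compose.py invocation without running it."""
    cmd = [sys.executable, "compose.py"]
    for name in backends:
        # One --backend flag per requested backend.
        cmd += ["--backend", name]
    if container_version:
        cmd += ["--container-version", container_version]
    if output_name:
        cmd += ["--output-name", output_name]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def run_compose(cmd, repo_dir="server"):
    """Run the command inside the cloned repository directory (assumed path)."""
    # check=True raises CalledProcessError if compose.py exits with a non-zero status.
    subprocess.run(cmd, cwd=repo_dir, check=True)


print(compose_command(["python", "onnxruntime"], output_name="triton-prod", dry_run=True))
```

Building the argument list separately from running it makes the invocation easy to log or inspect before any image is pulled or built.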

Usage Examples

Example 1: Minimal ONNX Runtime container

cd server
python3 compose.py --backend onnxruntime --enable-gpu
# Produces: tritonserver image with only ONNX Runtime backend
docker images tritonserver

Example 2: Multi-backend container with repository agent

cd server
python3 compose.py \
  --backend tensorrt \
  --backend python \
  --backend onnxruntime \
  --repoagent checksum \
  --container-version 24.12 \
  --output-name triton-prod \
  --enable-gpu
# Produces: triton-prod image with three backends and checksum repo agent

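Example 3: Build without an explicit --enable-gpu flag

cd server
python3 compose.py \
  --backend python \
  --backend onnxruntime
# Produces: tritonserver image with the Python and ONNX Runtime backends.
# Note: Key Parameters lists --enable-gpu as defaulting to True, so omitting
# the flag may not by itself yield a CPU-only image; use --dry-run and inspect
# Dockerfile.compose to confirm.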
