Implementation:Triton inference server Server GenEnsembleModelUtils

Knowledge Sources	Triton Inference Server
Domains	Testing, Model_Generation
Last Updated	2026-02-13 17:00 GMT

Overview

Generates ensemble model configurations and model files used in QA testing of Triton's ensemble scheduling feature.

Description

The `gen_ensemble_model_utils.py` module provides three key ensemble schedule classes -- `AddSubEnsembleSchedule`, `IdentityEnsembleSchedule`, and `SequenceEnsembleSchedule` -- each responsible for generating model repository artifacts for different ensemble pipeline topologies. It creates the `config.pbtxt` configuration files, wires up multi-stage model pipelines, and produces the backing model files for each ensemble member. Other QA test generators import this module to build composite ensemble models that exercise Triton's ability to chain models together.
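To make the "wires up multi-stage model pipelines" step concrete, the following is a minimal sketch of how a two-step ensemble `config.pbtxt` could be rendered as text. It is not the module's actual code: the helper names (`render_ensemble_config`, `_tensor`, `_step`), the member model names (`identity_a`, `identity_b`), and the tensor names are all hypothetical and chosen only for illustration. The field layout follows Triton's documented ensemble configuration format.

```python
def _tensor(name, dtype, dims):
    # One input/output tensor entry in protobuf text format.
    dims_txt = ", ".join(str(d) for d in dims)
    return f'  {{ name: "{name}" data_type: {dtype} dims: [ {dims_txt} ] }}'


def _step(model_name, input_map, output_map):
    # One ensemble step: maps the member model's tensor names (keys)
    # to ensemble-level tensor names (values).
    lines = ["    {", f'      model_name: "{model_name}"', "      model_version: -1"]
    for key, value in input_map.items():
        lines.append(f'      input_map {{ key: "{key}" value: "{value}" }}')
    for key, value in output_map.items():
        lines.append(f'      output_map {{ key: "{key}" value: "{value}" }}')
    lines.append("    }")
    return "\n".join(lines)


def render_ensemble_config(name, max_batch, dtype, dims, steps):
    # Hypothetical helper: assembles a single-input, single-output ensemble config.
    step_txt = ",\n".join(_step(*s) for s in steps)
    return "\n".join([
        f'name: "{name}"',
        'platform: "ensemble"',
        f"max_batch_size: {max_batch}",
        "input [", _tensor("INPUT0", dtype, dims), "]",
        "output [", _tensor("OUTPUT0", dtype, dims), "]",
        "ensemble_scheduling {",
        "  step [",
        step_txt,
        "  ]",
        "}",
        "",
    ])


# Two chained steps: the first step's output feeds the second via "intermediate".
config_text = render_ensemble_config(
    "demo_ensemble", 8, "TYPE_FP32", [16],
    steps=[
        ("identity_a", {"INPUT0": "INPUT0"}, {"OUTPUT0": "intermediate"}),
        ("identity_b", {"INPUT0": "intermediate"}, {"OUTPUT0": "OUTPUT0"}),
    ],
)
print(config_text)
```

The chaining happens purely through names: any ensemble tensor name that appears as an `output_map` value in one step and an `input_map` value in another connects the two steps.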

Usage

Import this module from other QA model generation scripts when you need to create ensemble model repositories for integration testing. Each schedule class provides a `create()` method to generate the full model repository structure.
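The "full model repository structure" mentioned above follows Triton's standard layout: one directory per model containing `config.pbtxt` and a numbered version subdirectory. As a rough sketch of that layout only (the helper `write_model_dir` and the model name `demo_ensemble` are assumptions, not part of the module), it could be produced like this:

```python
import os
import tempfile


def write_model_dir(models_dir, model_name, model_version, config_text):
    # Hypothetical helper: creates <models_dir>/<model_name>/<version>/ and
    # writes <models_dir>/<model_name>/config.pbtxt.
    model_dir = os.path.join(models_dir, model_name)
    version_dir = os.path.join(model_dir, str(model_version))
    os.makedirs(version_dir, exist_ok=True)
    config_path = os.path.join(model_dir, "config.pbtxt")
    with open(config_path, "w") as f:
        f.write(config_text)
    return config_path


# Use a throwaway directory so the sketch does not touch a real repository.
repo_root = tempfile.mkdtemp(prefix="ensemble_repo_")
written_path = write_model_dir(
    repo_root, "demo_ensemble", 1,
    'name: "demo_ensemble"\nplatform: "ensemble"\n',
)
print(written_path)
```

An ensemble has no model file of its own, so its version directory can stay empty, while the member models' version directories hold their backing model files.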

Code Reference

Source Location

Repository: Triton Inference Server
File: qa/common/gen_ensemble_model_utils.py
Lines: 1-1218

Signature

class AddSubEnsembleSchedule:
    def create(models_dir, max_batch, model_version=1): ...

class IdentityEnsembleSchedule:
    def create(models_dir, max_batch, model_version=1): ...

class SequenceEnsembleSchedule:
    def create(models_dir, max_batch, model_version=1): ...

def create_ensemble_modelconfig(model_name, max_batch, dtype, input_shapes, output_shapes, steps): ...
def create_ensemble_modelfile(models_dir, model_name, model_version): ...

Import

import gen_ensemble_model_utils as ensemble_utils

I/O Contract

Inputs

Name	Type	Required	Description
models_dir	string	Yes	Output directory for the generated model repository
max_batch	int	Yes	Maximum batch size for the generated models
model_version	int	No	Model version number (default: 1)
dtype	string	No	Data type for model tensors (e.g., TYPE_FP32); taken by create_ensemble_modelconfig rather than by create()

Outputs

Name	Type	Description
model_repository	directory	Complete model repository with config.pbtxt and model files for each ensemble member
config.pbtxt	file	Protobuf text configuration for each ensemble and its member models
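After generation, it can be useful to check which models in the output directory actually received a `config.pbtxt`. The sketch below shows one way to do that, assuming the standard `<models_dir>/<model_name>/config.pbtxt` layout. The function name `find_model_configs` and the model names in the demo are hypothetical.

```python
import os
import tempfile


def find_model_configs(models_dir):
    # Map each model directory name to its config.pbtxt path, if one exists.
    found = {}
    for entry in sorted(os.listdir(models_dir)):
        config_path = os.path.join(models_dir, entry, "config.pbtxt")
        if os.path.isfile(config_path):
            found[entry] = config_path
    return found


# Demo against a temporary repository with two placeholder models.
demo_root = tempfile.mkdtemp(prefix="ensemble_check_")
for model_name in ("demo_ensemble", "demo_member"):
    os.makedirs(os.path.join(demo_root, model_name, "1"))
    with open(os.path.join(demo_root, model_name, "config.pbtxt"), "w") as f:
        f.write(f'name: "{model_name}"\n')
# A directory without a config is skipped.
os.makedirs(os.path.join(demo_root, "incomplete_model"))

configs = find_model_configs(demo_root)
print(sorted(configs))
```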

Usage Examples

Generate AddSub Ensemble Models

import gen_ensemble_model_utils as ensemble_utils
ensemble_utils.AddSubEnsembleSchedule.create(
    models_dir="/tmp/ensemble_models",
    max_batch=8
)

Generate Identity Ensemble Models

import gen_ensemble_model_utils as ensemble_utils
ensemble_utils.IdentityEnsembleSchedule.create(
    models_dir="/tmp/identity_ensemble",
    max_batch=4,
    model_version=1
)

Related Pages

Environment:Triton_inference_server_Server_GPU_CUDA_Runtime
