Implementation:Triton inference server Server GenEnsembleModelUtils
| Knowledge Sources | |
|---|---|
| Domains | Testing, Model_Generation |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
Generates ensemble model configurations and model files used in QA testing of Triton's ensemble scheduling feature.
Description
The `gen_ensemble_model_utils.py` module provides three key ensemble schedule classes -- `AddSubEnsembleSchedule`, `IdentityEnsembleSchedule`, and `SequenceEnsembleSchedule` -- each responsible for generating model repository artifacts for different ensemble pipeline topologies. It creates the `config.pbtxt` configuration files, wires up multi-stage model pipelines, and produces the backing model files for each ensemble member. Other QA test generators import this module to build composite ensemble models that exercise Triton's ability to chain models together.
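To make the generated artifact concrete, the sketch below renders a minimal ensemble `config.pbtxt` as text. It is not code from `gen_ensemble_model_utils.py`: the function `render_ensemble_config`, the helper `_tensor`, the tensor names `INPUT0`, `OUTPUT0`, and `intermediate`, and the member model names are all hypothetical. Only the overall layout (`platform: "ensemble"` plus an `ensemble_scheduling` block whose `step` entries carry `input_map` and `output_map`) follows Triton's model configuration format.

```python
def _tensor(name, dtype, dims):
    """Format one input/output entry (hypothetical helper)."""
    dims_text = ", ".join(str(d) for d in dims)
    return f'  {{ name: "{name}" data_type: {dtype} dims: [ {dims_text} ] }}'


def render_ensemble_config(name, max_batch, dtype, dims, steps):
    """Render a minimal ensemble config.pbtxt as a string.

    `steps` is a list of (model_name, input_map, output_map) tuples. Each map
    goes from the member model's tensor name to the ensemble tensor name.
    """
    lines = [
        f'name: "{name}"',
        'platform: "ensemble"',
        f"max_batch_size: {max_batch}",
        "input [",
        _tensor("INPUT0", dtype, dims),
        "]",
        "output [",
        _tensor("OUTPUT0", dtype, dims),
        "]",
        "ensemble_scheduling {",
        "  step [",
    ]
    step_blocks = []
    for model_name, input_map, output_map in steps:
        block = [
            "    {",
            f'      model_name: "{model_name}"',
            "      model_version: -1",  # -1 selects the latest version
        ]
        for key, value in input_map.items():
            block.append(f'      input_map {{ key: "{key}" value: "{value}" }}')
        for key, value in output_map.items():
            block.append(f'      output_map {{ key: "{key}" value: "{value}" }}')
        block.append("    }")
        step_blocks.append("\n".join(block))
    # Entries of a repeated field written in list syntax are comma-separated.
    lines.append(",\n".join(step_blocks))
    lines += ["  ]", "}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two hypothetical identity models chained through an "intermediate" tensor.
    print(render_ensemble_config(
        "demo_ensemble", 8, "TYPE_FP32", [16],
        [
            ("identity_a", {"INPUT0": "INPUT0"}, {"OUTPUT0": "intermediate"}),
            ("identity_b", {"INPUT0": "intermediate"}, {"OUTPUT0": "OUTPUT0"}),
        ],
    ))
```

The second step consumes the `intermediate` tensor produced by the first, which is how an ensemble chains its member models.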
Usage
Import this module from other QA model generation scripts when you need to create ensemble model repositories for integration testing. Each schedule class provides a `create()` method to generate the full model repository structure.
Code Reference
Source Location
- Repository: Triton Inference Server
- File: qa/common/gen_ensemble_model_utils.py
- Lines: 1-1218
Signature
class AddSubEnsembleSchedule:
    def create(models_dir, max_batch, model_version=1): ...

class IdentityEnsembleSchedule:
    def create(models_dir, max_batch, model_version=1): ...

class SequenceEnsembleSchedule:
    def create(models_dir, max_batch, model_version=1): ...

def create_ensemble_modelconfig(model_name, max_batch, dtype, input_shapes, output_shapes, steps): ...
def create_ensemble_modelfile(models_dir, model_name, model_version): ...
Import
import gen_ensemble_model_utils as ensemble_utils
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| models_dir | string | Yes | Output directory for the generated model repository |
| max_batch | int | Yes | Maximum batch size for the generated models |
| model_version | int | No | Model version number (default: 1) |
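| dtype | string | No | Data type for model tensors (e.g., TYPE_FP32); per the signature above, accepted by `create_ensemble_modelconfig` rather than by the `create()` methods |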
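The `dtype` value is a Triton configuration type name such as `TYPE_FP32`. As an illustration only, the sketch below maps common dtype names to those configuration strings. The table `_CONFIG_DTYPES` and the function `to_config_dtype` are hypothetical and are not part of `gen_ensemble_model_utils.py`; only the `TYPE_*` names come from Triton's model configuration.

```python
# Hypothetical lookup table: dtype name -> Triton config.pbtxt data_type value.
_CONFIG_DTYPES = {
    "bool": "TYPE_BOOL",
    "uint8": "TYPE_UINT8",
    "uint16": "TYPE_UINT16",
    "uint32": "TYPE_UINT32",
    "uint64": "TYPE_UINT64",
    "int8": "TYPE_INT8",
    "int16": "TYPE_INT16",
    "int32": "TYPE_INT32",
    "int64": "TYPE_INT64",
    "float16": "TYPE_FP16",
    "float32": "TYPE_FP32",
    "float64": "TYPE_FP64",
}


def to_config_dtype(name):
    """Return the config.pbtxt type string for a dtype name, or raise ValueError."""
    try:
        return _CONFIG_DTYPES[name]
    except KeyError:
        raise ValueError(f"unsupported dtype: {name!r}") from None
```

Failing loudly on an unknown name keeps a typo from silently producing an invalid `config.pbtxt`.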
Outputs
| Name | Type | Description |
|---|---|---|
| model_repository | directory | Complete model repository with config.pbtxt and model files for each ensemble member |
| config.pbtxt | file | Protobuf text configuration for each ensemble and its member models |
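After generation, it can be useful to confirm that the output directory has the expected shape. The sketch below assumes the standard Triton repository layout (`<models_dir>/<model_name>/config.pbtxt` with numeric version subdirectories next to it). The function `find_models` is a hypothetical helper, not something this module provides.

```python
import os


def find_models(models_dir):
    """Map each model name to its sorted list of numeric version directories.

    A subdirectory counts as a model only if it contains a config.pbtxt file.
    """
    found = {}
    for entry in sorted(os.listdir(models_dir)):
        model_path = os.path.join(models_dir, entry)
        if not os.path.isfile(os.path.join(model_path, "config.pbtxt")):
            continue  # not a model directory
        versions = sorted(
            (d for d in os.listdir(model_path)
             if d.isdigit() and os.path.isdir(os.path.join(model_path, d))),
            key=int,  # numeric order, so "10" sorts after "2"
        )
        found[entry] = versions
    return found
```

Calling `find_models("/tmp/ensemble_models")` after a generation run would list each ensemble and member model together with its version directories, which makes a missing `config.pbtxt` or version folder easy to spot.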
Usage Examples
Generate AddSub Ensemble Models
import gen_ensemble_model_utils as ensemble_utils

ensemble_utils.AddSubEnsembleSchedule.create(
    models_dir="/tmp/ensemble_models",
    max_batch=8,
)
Generate Identity Ensemble Models
import gen_ensemble_model_utils as ensemble_utils

ensemble_utils.IdentityEnsembleSchedule.create(
    models_dir="/tmp/identity_ensemble",
    max_batch=4,
    model_version=1,
)