Implementation:Triton inference server Server GenEnsembleModelUtils

From Leeroopedia
Knowledge Sources
Domains: Testing, Model_Generation
Last Updated: 2026-02-13 17:00 GMT

Overview

Generates ensemble model configurations and model files used in QA testing of Triton's ensemble scheduling feature.

Description

The `gen_ensemble_model_utils.py` module provides three ensemble schedule classes -- `AddSubEnsembleSchedule`, `IdentityEnsembleSchedule`, and `SequenceEnsembleSchedule` -- each of which generates the model repository artifacts for one ensemble pipeline topology. The module writes the `config.pbtxt` configuration files, wires up the multi-stage model pipelines, and produces the backing model files for each ensemble member. Other QA test generators import it to build composite ensemble models that exercise Triton's ability to chain models together.
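To make the "wiring" concrete, here is a minimal sketch of how a two-step ensemble `config.pbtxt` could be rendered as text. It is not the module's own code: `Step`, `render_ensemble_config`, and the member model names are hypothetical, chosen only for illustration. The sketch assumes Triton's ensemble config layout, where each step's `input_map` and `output_map` pair a member model's tensor name (key) with an ensemble-level tensor name (value).

```python
from typing import Dict, NamedTuple


class Step(NamedTuple):
    """One stage of an ensemble pipeline (hypothetical helper type)."""
    model_name: str
    input_map: Dict[str, str]   # member input name -> ensemble tensor name
    output_map: Dict[str, str]  # member output name -> ensemble tensor name


def render_ensemble_config(name, max_batch, dtype, dims, inputs, outputs, steps):
    """Render an ensemble config.pbtxt as a string (illustrative only)."""
    dims_text = ", ".join(str(d) for d in dims)

    def tensors(kind, names):
        # All tensors share one dtype and shape to keep the sketch short.
        body = ",\n".join(
            f'  {{\n    name: "{n}"\n    data_type: {dtype}\n    dims: [ {dims_text} ]\n  }}'
            for n in names
        )
        return f"{kind} [\n{body}\n]"

    def mapping(field, pairs):
        return "".join(
            f'      {field} {{ key: "{k}" value: "{v}" }}\n' for k, v in pairs.items()
        )

    step_text = ",\n".join(
        "    {\n"
        f'      model_name: "{s.model_name}"\n'
        "      model_version: -1\n"
        f"{mapping('input_map', s.input_map)}"
        f"{mapping('output_map', s.output_map)}"
        "    }"
        for s in steps
    )
    return "\n".join([
        f'name: "{name}"',
        'platform: "ensemble"',
        f"max_batch_size: {max_batch}",
        tensors("input", inputs),
        tensors("output", outputs),
        "ensemble_scheduling {\n  step [\n" + step_text + "\n  ]\n}",
    ]) + "\n"


# Two-stage pipeline: an identity member feeds an add/sub member.
config = render_ensemble_config(
    name="example_addsub_ensemble",
    max_batch=8,
    dtype="TYPE_FP32",
    dims=[16],
    inputs=["INPUT0", "INPUT1"],
    outputs=["OUTPUT0", "OUTPUT1"],
    steps=[
        Step("identity_member", {"INPUT0": "INPUT0"}, {"OUTPUT0": "staged_input0"}),
        Step(
            "addsub_member",
            {"INPUT0": "staged_input0", "INPUT1": "INPUT1"},
            {"OUTPUT0": "OUTPUT0", "OUTPUT1": "OUTPUT1"},
        ),
    ],
)
print(config)
```

The intermediate tensor name `staged_input0` is what chains the two steps: it appears as an output of the first step and as an input of the second.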

Usage

Import this module from other QA model generation scripts when you need to create ensemble model repositories for integration testing. Each schedule class provides a `create()` method to generate the full model repository structure.

Code Reference

Source Location

Signature

class AddSubEnsembleSchedule:
    def create(models_dir, max_batch, model_version=1): ...

class IdentityEnsembleSchedule:
    def create(models_dir, max_batch, model_version=1): ...

class SequenceEnsembleSchedule:
    def create(models_dir, max_batch, model_version=1): ...

def create_ensemble_modelconfig(model_name, max_batch, dtype, input_shapes, output_shapes, steps): ...
def create_ensemble_modelfile(models_dir, model_name, model_version): ...

Import

import gen_ensemble_model_utils as ensemble_utils

I/O Contract

Inputs

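| Name | Type | Required | Description |
|---|---|---|---|
| models_dir | string | Yes | Output directory for the generated model repository |
| max_batch | int | Yes | Maximum batch size for the generated models |
| model_version | int | No | Model version number (default: 1) |
| dtype | string | No | Data type for model tensors (e.g., TYPE_FP32); per the signatures above, it is passed to `create_ensemble_modelconfig`, not to `create()` |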

Outputs

| Name | Type | Description |
|---|---|---|
| model_repository | directory | Complete model repository with config.pbtxt and model files for each ensemble member |
| config.pbtxt | file | Protobuf text configuration for each ensemble and its member models |
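The output layout described above can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: `write_ensemble_repo` and `list_configs` are hypothetical helpers, and the sketch assumes the usual Triton repository layout of `<models_dir>/<model_name>/config.pbtxt` next to a numbered version directory (which, for an ensemble, is typically empty because the ensemble has no backing model file of its own).

```python
import os
import tempfile


def write_ensemble_repo(models_dir, model_name, config_text, model_version=1):
    """Write <models_dir>/<model_name>/config.pbtxt plus a version directory."""
    model_dir = os.path.join(models_dir, model_name)
    # Assumption: the version directory must exist even when it holds no files.
    os.makedirs(os.path.join(model_dir, str(model_version)), exist_ok=True)
    config_path = os.path.join(model_dir, "config.pbtxt")
    with open(config_path, "w") as f:
        f.write(config_text)
    return config_path


def list_configs(models_dir):
    """Return {model name: config.pbtxt path} for every model directory found."""
    found = {}
    for entry in sorted(os.listdir(models_dir)):
        path = os.path.join(models_dir, entry, "config.pbtxt")
        if os.path.isfile(path):
            found[entry] = path
    return found


# Write one placeholder ensemble into a temporary repository and inspect it.
with tempfile.TemporaryDirectory() as models_dir:
    write_ensemble_repo(
        models_dir,
        "example_ensemble",
        'name: "example_ensemble"\nplatform: "ensemble"\n',
    )
    configs = list_configs(models_dir)
    version_dir_exists = os.path.isdir(
        os.path.join(models_dir, "example_ensemble", "1")
    )
    print(sorted(configs), version_dir_exists)
```

A check like `list_configs` can be handy after running a generator, to confirm that every expected ensemble and member model directory received its `config.pbtxt`.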

Usage Examples

Generate AddSub Ensemble Models

import gen_ensemble_model_utils as ensemble_utils
ensemble_utils.AddSubEnsembleSchedule.create(
    models_dir="/tmp/ensemble_models",
    max_batch=8
)

Generate Identity Ensemble Models

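import gen_ensemble_model_utils as ensemble_utils

# model_version is optional and defaults to 1; it is passed explicitly here.
ensemble_utils.IdentityEnsembleSchedule.create(
    models_dir="/tmp/identity_ensemble",
    max_batch=4,
    model_version=1
)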
