Implementation:Triton inference server Server GenQaImplicitModels

From Leeroopedia
Knowledge Sources
Domains: Testing, Model_Generation
Last Updated: 2026-02-13 17:00 GMT

Overview

Generates test models for sequence batching with implicit state management, where Triton, rather than the client, stores the state tensors and carries them between sequence steps.

Description

The `gen_qa_implicit_models.py` script creates models that use Triton's implicit state feature for sequence batching. In an explicit-state model, the client passes state in and out as ordinary input and output tensors on every request. In an implicit-state model, Triton holds the state tensors itself and feeds each step's output state back in as the next step's input state. The script generates models for the ONNX Runtime, TensorFlow, TensorRT, and Python backends, each with implicit state definitions in its model configuration. These models are used to test that Triton persists and restores state across sequence steps without client involvement.
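To make the distinction concrete, here is a toy sketch that contrasts the two styles using a running-sum model. It is not Triton code: `ImplicitStateServer`, `accumulate`, and the running-sum logic are hypothetical names and behavior chosen only for illustration, and the real server's bookkeeping is more involved.

```python
from typing import Dict, List, Tuple


def accumulate(value: int, state: int) -> Tuple[int, int]:
    """Toy stateful model: returns (output, new_state) as a running sum."""
    new_state = state + value
    return new_state, new_state


def run_explicit(values: List[int]) -> List[int]:
    """Explicit state: the caller carries the state between steps."""
    state = 0
    outputs = []
    for v in values:
        out, state = accumulate(v, state)
        outputs.append(out)
    return outputs


class ImplicitStateServer:
    """Hypothetical stand-in for a server that keeps state per sequence ID."""

    def __init__(self) -> None:
        self._states: Dict[int, int] = {}

    def infer(self, sequence_id: int, value: int,
              start: bool = False, end: bool = False) -> int:
        if start:
            # A new sequence begins from an initial state of zero.
            self._states[sequence_id] = 0
        out, new_state = accumulate(value, self._states[sequence_id])
        if end:
            # The state is discarded once the sequence finishes.
            del self._states[sequence_id]
        else:
            self._states[sequence_id] = new_state
        return out
```

In the implicit variant the caller sends only the sequence ID and the input value; two interleaved sequences keep separate state without the caller ever seeing it.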

Usage

Run this script to generate implicit state models before executing the sequence batching tests that validate Triton's internal state management. It is also used in CI pipelines to set up the model repositories for implicit state QA test suites.

Code Reference

Source Location

File: qa/common/gen_qa_implicit_models.py (path as used in the commands on this page)

Signature

def create_onnx_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_tf_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_plan_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_python_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_modelconfig(models_dir, model_name, max_batch, dtype, shape, state_pairs): ...
def create_models(models_dir, dtype, shape, no_batch=True): ...

Import

# Typically run as a standalone script
python qa/common/gen_qa_implicit_models.py --models_dir /tmp/models

I/O Contract

Inputs

| Name | Type | Required | Description |
|------|------|----------|-------------|
| models_dir | string | Yes | Output directory for generated model repository |
| dtype | string | No | Data type for model tensors (e.g., int32, fp32) |
| shape | list[int] | No | Tensor shape for model inputs and outputs |
| no_batch | bool | No | Whether to also generate no-batch model variants |
| state_pairs | list[tuple] | No | Pairs of (input_state, output_state) tensor names for implicit state |

Outputs

| Name | Type | Description |
|------|------|-------------|
| model_repository | directory | Model directories with versioned model files and config.pbtxt |
| config.pbtxt | file | Model configuration with sequence_batching and implicit state definitions |
| model files | file | Backend-specific model files (ONNX, SavedModel, TensorRT plan, Python) |
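
As a rough illustration of what the implicit state definitions in `config.pbtxt` look like, the sketch below renders a `sequence_batching` block from a list of state pairs. The function name `render_implicit_state_block` and the tensor names `INPUT_STATE` and `OUTPUT_STATE` are placeholders assumed for this example. The field names (`state`, `input_name`, `output_name`, `data_type`, `dims`) follow Triton's model configuration schema, but the exact text the script emits may differ.

```python
from typing import List, Sequence, Tuple


def render_implicit_state_block(
    state_pairs: Sequence[Tuple[str, str]],
    data_type: str,
    dims: Sequence[int],
) -> str:
    """Render a sequence_batching block with one state entry per pair.

    Hypothetical helper for illustration; not taken from the script.
    """
    dims_text = ", ".join(str(d) for d in dims)
    entries: List[str] = []
    for input_name, output_name in state_pairs:
        entries.append(
            "    {\n"
            f'      input_name: "{input_name}"\n'
            f'      output_name: "{output_name}"\n'
            f"      data_type: {data_type}\n"
            f"      dims: [ {dims_text} ]\n"
            "    }"
        )
    body = ",\n".join(entries)
    return "sequence_batching {\n  state [\n" + body + "\n  ]\n}\n"


if __name__ == "__main__":
    # Placeholder tensor names; a real model defines its own.
    print(render_implicit_state_block(
        [("INPUT_STATE", "OUTPUT_STATE")], "TYPE_INT32", [1]))
```

Each pair links the tensor Triton feeds into the model at a given step to the tensor it captures from the model's output for the next step.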

Usage Examples

Generate Implicit State Models

python qa/common/gen_qa_implicit_models.py \
    --models_dir /tmp/implicit_models

CI Pipeline Usage

MODELS_DIR="${DATADIR}/qa_implicit_models"
mkdir -p "$MODELS_DIR"
python qa/common/gen_qa_implicit_models.py --models_dir "$MODELS_DIR"
run_server "$MODELS_DIR"
python test_implicit_state.py
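
Between generating the models and starting the server, a pipeline could check that every generated model directory has the expected layout. The sketch below is a minimal example of such a check. It assumes the layout described under Outputs (a `config.pbtxt` plus at least one numeric version directory per model); the helper name `find_incomplete_models` is hypothetical and not part of the script.

```python
from pathlib import Path
from typing import List


def find_incomplete_models(models_dir: str) -> List[str]:
    """Return names of model directories that lack config.pbtxt
    or a numeric version subdirectory."""
    incomplete = []
    for model_dir in sorted(Path(models_dir).iterdir()):
        if not model_dir.is_dir():
            continue
        has_config = (model_dir / "config.pbtxt").is_file()
        has_version = any(
            child.is_dir() and child.name.isdigit()
            for child in model_dir.iterdir()
        )
        if not (has_config and has_version):
            incomplete.append(model_dir.name)
    return incomplete
```

An empty result means every model directory passed the check. A non-empty list names the directories worth inspecting before the server is started.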
