Implementation:Triton inference server Server GenQaImplicitModels
| Knowledge Sources | |
|---|---|
| Domains | Testing, Model_Generation |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
Generates test models for sequence batching with implicit state management, where Triton, rather than the client, carries the state tensors between sequence steps.
Description
The `gen_qa_implicit_models.py` script creates models that use Triton's implicit state feature for sequence batching. Unlike explicit state models, where the client passes state as ordinary input and output tensors on every request, implicit state models declare state tensors that Triton stores and feeds back between inference steps. The script generates models for the ONNX Runtime, TensorFlow, TensorRT, and Python backends, each with implicit state definitions in its model configuration. These models are used to test that Triton persists and restores state across sequence steps without client involvement.
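The difference between the two state styles can be illustrated with a toy simulation. This is only a sketch of the idea, not Triton code: `ToyImplicitStateServer`, `accumulate`, and the `start`/`end` flags are hypothetical names chosen for illustration, and the accumulator model is an assumption rather than what the generated models necessarily compute.

```python
from typing import Dict, Tuple


def accumulate(input_value: int, input_state: int) -> Tuple[int, int]:
    """Toy stateless model step: returns (output, output_state)."""
    total = input_state + input_value
    return total, total


class ToyImplicitStateServer:
    """Hypothetical stand-in for the server side; not a Triton API."""

    def __init__(self, initial_state: int = 0) -> None:
        self._initial_state = initial_state
        # State is kept per sequence ID on the server, never sent by the client.
        self._states: Dict[int, int] = {}

    def infer(self, sequence_id: int, value: int,
              start: bool = False, end: bool = False) -> int:
        if start:
            self._states[sequence_id] = self._initial_state
        state = self._states[sequence_id]
        output, new_state = accumulate(value, state)
        if end:
            # The sequence is finished, so its state is released.
            del self._states[sequence_id]
        else:
            self._states[sequence_id] = new_state
        return output


server = ToyImplicitStateServer()
outputs = [
    server.infer(1, 2, start=True),   # sequence 1 begins
    server.infer(2, 10, start=True),  # sequence 2 is interleaved
    server.infer(1, 3),
    server.infer(1, 4, end=True),
]
print(outputs)  # [2, 10, 5, 9]
```

The client only ever sends the sequence ID and the new input value. The running total for each sequence lives on the server side, which is the behavior the implicit state tests are meant to exercise.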
Usage
Run this script to generate implicit state models before executing sequence batching tests that validate Triton's internal state management. It is also used in CI pipelines to populate the model repositories for implicit state QA test suites.
Code Reference
Source Location
- Repository: Triton Inference Server
- File: qa/common/gen_qa_implicit_models.py
- Lines: 1-1427
Signature
def create_onnx_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_tf_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_plan_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_python_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_modelconfig(models_dir, model_name, max_batch, dtype, shape, state_pairs): ...
def create_models(models_dir, dtype, shape, no_batch=True): ...
Import
# Typically run as a standalone script
python qa/common/gen_qa_implicit_models.py --models_dir /tmp/models
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| models_dir | string | Yes | Output directory for generated model repository |
| dtype | string | No | Data type for model tensors (e.g., int32, fp32) |
| shape | list[int] | No | Tensor shape for model inputs and outputs |
| no_batch | bool | No | Whether to also generate no-batch model variants |
| state_pairs | list[tuple] | No | Pairs of (input_state, output_state) tensor names for implicit state |
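As a rough illustration of how `state_pairs` could map onto a model configuration, the sketch below renders a `sequence_batching` block with one `state` entry per pair. The helper `render_state_block` and the tensor names are hypothetical, and the real script may build its configuration text differently. The field names (`input_name`, `output_name`, `data_type`, `dims`) follow Triton's model configuration schema for implicit state.

```python
from typing import Iterable, List, Sequence, Tuple


def render_state_block(
    state_pairs: Iterable[Tuple[str, str]],
    data_type: str,
    dims: Sequence[int],
) -> str:
    """Render a sequence_batching block (hypothetical helper, for illustration)."""
    dims_text = ", ".join(str(d) for d in dims)
    entries: List[str] = []
    for input_name, output_name in state_pairs:
        # One state entry ties a state input tensor to its state output tensor.
        entries.append(
            "    {\n"
            f'      input_name: "{input_name}"\n'
            f'      output_name: "{output_name}"\n'
            f"      data_type: {data_type}\n"
            f"      dims: [ {dims_text} ]\n"
            "    }"
        )
    body = ",\n".join(entries)
    return "sequence_batching {\n  state [\n" + body + "\n  ]\n}\n"


# Assumed tensor names; the generated models may use different ones.
config_text = render_state_block(
    [("INPUT_STATE", "OUTPUT_STATE")], "TYPE_INT32", [1]
)
print(config_text)
```

Each entry tells Triton which output tensor to capture after a step and which input tensor to populate with it on the next step of the same sequence.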
Outputs
| Name | Type | Description |
|---|---|---|
| model_repository | directory | Model directories with versioned model files and config.pbtxt |
| config.pbtxt | file | Model configuration with sequence_batching and implicit state definitions |
| model files | file | Backend-specific model files (ONNX, SavedModel, TensorRT plan, Python) |
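A quick sanity check of the generated repository can catch a failed generation step before the server starts. The sketch below assumes only the standard Triton repository layout (one directory per model containing `config.pbtxt` and at least one numeric version directory). `find_layout_problems` and the model names in the demo are hypothetical and not part of the script.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def find_layout_problems(models_dir: str) -> Dict[str, List[str]]:
    """Return layout issues per model directory (hypothetical helper)."""
    problems: Dict[str, List[str]] = {}
    for model_dir in sorted(p for p in Path(models_dir).iterdir() if p.is_dir()):
        issues: List[str] = []
        if not (model_dir / "config.pbtxt").is_file():
            issues.append("missing config.pbtxt")
        versions = [p for p in model_dir.iterdir()
                    if p.is_dir() and p.name.isdigit()]
        if not versions:
            issues.append("no numeric version directory")
        elif not any(any(v.iterdir()) for v in versions):
            issues.append("version directories are empty")
        if issues:
            problems[model_dir.name] = issues
    return problems


# Demo on a throwaway repository with one complete and one broken model.
with tempfile.TemporaryDirectory() as root:
    good = Path(root) / "good_model"
    (good / "1").mkdir(parents=True)
    (good / "config.pbtxt").write_text('name: "good_model"\n')
    (good / "1" / "model.onnx").write_bytes(b"")  # placeholder model file
    (Path(root) / "broken_model").mkdir()
    problems = find_layout_problems(root)

print(problems)
```

The check is structural only. It does not parse `config.pbtxt` or confirm that the implicit state definitions are present.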
Usage Examples
Generate Implicit State Models
python qa/common/gen_qa_implicit_models.py \
--models_dir /tmp/implicit_models
CI Pipeline Usage
MODELS_DIR="${DATADIR}/qa_implicit_models"
mkdir -p "$MODELS_DIR"
python qa/common/gen_qa_implicit_models.py --models_dir "$MODELS_DIR"
run_server "$MODELS_DIR"
python test_implicit_state.py