Implementation:Triton inference server Server GenQaSequenceModels
| Knowledge Sources | |
|---|---|
| Domains | Testing, Model_Generation |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
Generates sequence batching test models that accumulate values across inference steps within a sequence.
Description
The `gen_qa_sequence_models.py` script produces models configured for Triton's sequence batcher, which routes every request that shares a sequence ID to the same model instance. Each generated model accepts a value input together with sequence control signals (start, end, ready, and correlation ID) and accumulates the input values across the sequence, so tests can verify that Triton maintains per-sequence state correctly. Models are generated for the TensorRT, ONNX, TensorFlow, TorchScript, and OpenVINO backends, each implementing the accumulation logic natively.
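The accumulation behavior described above can be approximated in plain Python. This is a minimal sketch, not the code the script emits: the `SequenceAccumulator` class and `run_sequence` helper are hypothetical names, and the exact handling of the ready signal (ignoring the input when it is false) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SequenceAccumulator:
    """Hypothetical stand-in for the state one sequence holds in a generated model."""

    total: int = 0

    def step(self, value: int, start: bool, ready: bool) -> int:
        # Assumption: a request that is not "ready" leaves the state untouched.
        if not ready:
            return self.total
        # The start signal resets the accumulator to the incoming value;
        # every later step adds the incoming value to the running total.
        self.total = value if start else self.total + value
        return self.total


def run_sequence(values):
    """Feed one sequence through a fresh accumulator and collect each step's output."""
    acc = SequenceAccumulator()
    return [
        acc.step(v, start=(i == 0), ready=True)
        for i, v in enumerate(values)
    ]
```

A test that sends the values 1, 2, 3 within one sequence would, under these assumptions, expect the running totals 1, 3, 6, and a second sequence would start again from its own first value.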
Usage
Execute this script to create sequence batcher test models before running QA tests that validate sequence batching behavior, including sequence routing, timeout handling, and state management.
Code Reference
Source Location
- Repository: Triton Inference Server
- File: qa/common/gen_qa_sequence_models.py
- Lines: 1-1238
Signature
def create_onnx_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_tf_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_plan_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_openvino_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_libtorch_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_modelconfig(models_dir, model_name, max_batch, dtype, shape): ...
def create_models(models_dir, dtype, shape, no_batch=True): ...
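For orientation, the following sketch renders the kind of `sequence_batching` block that a function such as `create_modelconfig` might write into `config.pbtxt`. The field names come from Triton's model configuration schema; the helper name `render_sequence_batching`, the control tensor names, and the idle timeout value are illustrative assumptions rather than the script's actual output.

```python
def render_sequence_batching(max_idle_us: int = 5_000_000) -> str:
    """Return an illustrative sequence_batching block in config.pbtxt text form."""
    # (tensor name, control kind) pairs; the tensor names are placeholders.
    flags = [
        ("START", "CONTROL_SEQUENCE_START"),
        ("END", "CONTROL_SEQUENCE_END"),
        ("READY", "CONTROL_SEQUENCE_READY"),
    ]
    lines = [
        "sequence_batching {",
        f"  max_sequence_idle_microseconds: {max_idle_us}",
    ]
    for name, kind in flags:
        # Boolean-style controls carry a [false, true] value pair.
        lines += [
            "  control_input {",
            f'    name: "{name}"',
            f"    control {{ kind: {kind} int32_false_true: [ 0, 1 ] }}",
            "  }",
        ]
    # The correlation ID control carries a data type instead of a value pair.
    lines += [
        "  control_input {",
        '    name: "CORRID"',
        "    control { kind: CONTROL_SEQUENCE_CORRID data_type: TYPE_UINT64 }",
        "  }",
        "}",
    ]
    return "\n".join(lines)
```

Calling `print(render_sequence_batching())` shows the four control inputs (start, end, ready, correlation ID) that the sequence batcher populates on each request.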
Import
# Typically run as a standalone script
python qa/common/gen_qa_sequence_models.py --models_dir /tmp/models
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| models_dir | string | Yes | Output directory for generated model repository |
| dtype | string | No | Data type for model tensors (e.g., int32, fp32) |
| shape | list[int] | No | Tensor shape for model inputs and outputs |
| no_batch | bool | No | Whether to also generate no-batch model variants |
Outputs
| Name | Type | Description |
|---|---|---|
| model_repository | directory | Model directories with sequence batcher configurations |
| config.pbtxt | file | Model configuration with sequence_batching control inputs and states |
| model files | file | Backend-specific model files implementing value accumulation across sequences |
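The outputs listed above follow Triton's standard repository layout, in which each model directory holds a `config.pbtxt` and one or more numeric version subdirectories. A small sanity check over the generated directory might look like the sketch below; `summarize_repository` is a hypothetical helper, not part of the script.

```python
from pathlib import Path


def summarize_repository(models_dir):
    """Map each model directory name to (has_config, sorted version numbers)."""
    summary = {}
    for model_dir in sorted(Path(models_dir).iterdir()):
        if not model_dir.is_dir():
            continue
        # Version directories are named with plain integers (1, 2, ...).
        versions = sorted(
            int(p.name)
            for p in model_dir.iterdir()
            if p.is_dir() and p.name.isdigit()
        )
        has_config = (model_dir / "config.pbtxt").is_file()
        summary[model_dir.name] = (has_config, versions)
    return summary
```

Running it against the `--models_dir` target after generation gives a quick view of which models were written and whether any lack a configuration file or a version directory.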
Usage Examples
Generate Sequence Models
python qa/common/gen_qa_sequence_models.py \
--models_dir /tmp/sequence_models
Use with Triton Server
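MODELS_DIR="/opt/triton_qa/sequence_models"
python qa/common/gen_qa_sequence_models.py --models_dir "$MODELS_DIR"
tritonserver --model-repository="$MODELS_DIR" &
# Wait until the server reports ready (assumes the default HTTP port 8000)
until curl -sf localhost:8000/v2/health/ready; do sleep 1; done
python -m pytest test_sequence_batcher.py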