Implementation:Triton inference server Server GenQaDynaSequenceModels

From Leeroopedia
Revision as of 13:57, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Triton_inference_server_Server_GenQaDynaSequenceModels.md)
Knowledge Sources
Domains: Testing, Model_Generation
Last Updated: 2026-02-13 17:00 GMT

Overview

Generates test models for dynamic sequence batching across all supported backends.

Description

The `gen_qa_dyna_sequence_models.py` script produces model repository artifacts for testing Triton's dynamic sequence batching. With dynamic sequence batching, requests from multiple independent sequences are combined into a single batch, and the sequence batcher supplies each model with control inputs: the correlation ID, sequence start and end flags, and a ready signal. The script generates models for the TensorRT, ONNX Runtime, TensorFlow SavedModel, TorchScript, and OpenVINO backends, each configured with these sequence batcher control inputs. Each model accumulates input values across a sequence, so QA tests can verify that sequences are tracked correctly and that per-sequence state is kept separate.
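The accumulation behavior described above can be illustrated with a small pure-Python simulation. This is only a sketch: the class `AccumulatorModel`, its `step` method, and the exact arithmetic are hypothetical stand-ins, not the logic of the generated models, which may combine the inputs and control signals differently.

```python
from dataclasses import dataclass, field


@dataclass
class AccumulatorModel:
    """Toy stand-in for a generated model: one running sum per sequence."""

    # Maps correlation ID -> running sum (hypothetical state layout).
    state: dict = field(default_factory=dict)

    def step(self, corrid, value, start=False, end=False, ready=True):
        # A slot that is not ready carries no real request, so skip it.
        if not ready:
            return None
        # START resets the accumulator for this correlation ID.
        if start:
            self.state[corrid] = 0
        self.state[corrid] += value
        result = self.state[corrid]
        # END releases the state held for the finished sequence.
        if end:
            del self.state[corrid]
        return result


model = AccumulatorModel()
# Two interleaved sequences, as a sequence batcher might schedule them.
outputs = [
    model.step(corrid=1, value=1, start=True),
    model.step(corrid=2, value=10, start=True),
    model.step(corrid=1, value=2),
    model.step(corrid=2, value=20, end=True),
    model.step(corrid=1, value=3, end=True),
]
print(outputs)  # [1, 10, 3, 30, 6]
```

A QA test in this style checks that each sequence's final output depends only on its own inputs, even when requests from other sequences are interleaved with it.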

Usage

Execute this script to generate dynamic sequence batcher test models prior to running the corresponding QA test suites. It is typically called from CI shell scripts that set up the model repository before launching the Triton server.

Code Reference

Source Location

Signature

def create_onnx_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_tf_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_plan_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_openvino_modelfile(models_dir, model_version, max_batch, dtype, shape): ...
def create_modelconfig(models_dir, model_name, max_batch, dtype, shape, corrid_type): ...
def create_models(models_dir, dtype, shape, no_batch=True): ...

Import

# Typically run as a standalone script
python qa/common/gen_qa_dyna_sequence_models.py --models_dir /tmp/models

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| models_dir | string | Yes | Output directory for generated model repository |
| dtype | string | No | Data type for model tensors (e.g., int32, fp32) |
| shape | list[int] | No | Tensor shape for model inputs and outputs |
| no_batch | bool | No | Whether to also generate no-batch model variants |
| corrid_type | string | No | Correlation ID type: uint32, uint64, or string |

Outputs

| Name | Type | Description |
|---|---|---|
| model_repository | directory | Model directories with versioned model files and config.pbtxt for each backend |
| config.pbtxt | file | Model configuration with sequence_batching and control input definitions |
| model files | file | Backend-specific model files (ONNX, SavedModel, TensorRT plan, TorchScript, OpenVINO) |
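To make the `sequence_batching` output more concrete, the following sketch renders one plausible control-input block as protobuf text. The helper `render_sequence_batching`, the `CORRID_DTYPES` mapping, and the tensor names `START`, `END`, `READY`, and `CORRID` are assumptions for illustration; the configuration the script actually writes may use different names and additional fields.

```python
# Hypothetical mapping from the corrid_type values listed above to Triton data types.
CORRID_DTYPES = {
    "uint32": "TYPE_UINT32",
    "uint64": "TYPE_UINT64",
    "string": "TYPE_STRING",
}

# Assumed tensor names paired with Triton's sequence control kinds.
FLAG_CONTROLS = [
    ("START", "CONTROL_SEQUENCE_START"),
    ("END", "CONTROL_SEQUENCE_END"),
    ("READY", "CONTROL_SEQUENCE_READY"),
]


def render_sequence_batching(corrid_type="uint64"):
    """Return a sequence_batching block as config.pbtxt text (illustrative only)."""
    lines = ["sequence_batching {"]
    for name, kind in FLAG_CONTROLS:
        lines += [
            "  control_input {",
            f'    name: "{name}"',
            "    control {",
            f"      kind: {kind}",
            # The tensor holds 0 for false and 1 for true.
            "      int32_false_true: [ 0, 1 ]",
            "    }",
            "  }",
        ]
    lines += [
        "  control_input {",
        '    name: "CORRID"',
        "    control {",
        "      kind: CONTROL_SEQUENCE_CORRID",
        f"      data_type: {CORRID_DTYPES[corrid_type]}",
        "    }",
        "  }",
        "}",
    ]
    return "\n".join(lines)


print(render_sequence_batching("uint32"))
```

Only the data type of the correlation ID control changes with `corrid_type`; the three flag controls are the same in every variant of this sketch.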

Usage Examples

Generate Dynamic Sequence Models

python qa/common/gen_qa_dyna_sequence_models.py \
    --models_dir /tmp/dyna_sequence_models

Integration in QA Test Script

# Generate the models into a scratch repository, then point Triton at it.
MODELS_DIR="${PWD}/qa_dyna_seq_models"
mkdir -p "$MODELS_DIR"
python qa/common/gen_qa_dyna_sequence_models.py --models_dir "$MODELS_DIR"
SERVER_ARGS="--model-repository=$MODELS_DIR"
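Before launching the server, a test harness may want to confirm that generation actually produced a usable repository. The sketch below assumes the standard Triton layout of one directory per model, each holding a `config.pbtxt` and at least one numeric version subdirectory. The function `find_incomplete_models` and the model names in the demo are hypothetical and not part of the script.

```python
import tempfile
from pathlib import Path


def find_incomplete_models(models_dir):
    """Return names of model directories lacking config.pbtxt or a version directory."""
    incomplete = []
    for model_dir in sorted(Path(models_dir).iterdir()):
        if not model_dir.is_dir():
            continue
        has_config = (model_dir / "config.pbtxt").is_file()
        # Version directories are named with an integer, e.g. "1".
        has_version = any(
            child.is_dir() and child.name.isdigit() for child in model_dir.iterdir()
        )
        if not (has_config and has_version):
            incomplete.append(model_dir.name)
    return incomplete


# Demo on a throwaway repository with one complete and one empty model directory.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good_model"
    (good / "1").mkdir(parents=True)
    (good / "config.pbtxt").write_text('name: "good_model"\n')
    (Path(tmp) / "bad_model").mkdir()
    print(find_incomplete_models(tmp))  # ['bad_model']
```

An empty result for the generated directory suggests every model has the files Triton expects; it does not validate the contents of those files.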
