Implementation:Triton inference server Server GenQaTrtFormatModels

Knowledge Sources	Triton Inference Server
Domains	Testing, Model_Generation
Last Updated	2026-02-13 17:00 GMT

Overview

Generates TensorRT models with various tensor memory formats (linear, CHW2, CHW4, CHW16, CHW32, HWC8) for format handling validation.

Description

The `gen_qa_trt_format_models.py` script creates TensorRT engine plans that use non-default tensor memory formats. It builds TensorRT networks with explicit format constraints on the input and output tensors and generates models for CHW2, CHW4, CHW16, CHW32, and HWC8, as well as the default linear format. QA tests use these models to verify that Triton reorders tensor data correctly when the model's internal format differs from the linear format that clients provide.
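To make the reordering concrete, here is a minimal pure-Python sketch of how a linear (CHW) tensor maps onto a channel-vectorized layout such as CHW4. It is an illustration only, not code from the script: the function name `linear_to_vectorized` is hypothetical, the batch dimension is omitted, and the index arithmetic follows the layout TensorRT documents for its CHW2/CHW4/CHW16/CHW32 formats (HWC8 is channels-last and is not covered here).

```python
def linear_to_vectorized(data, channels, height, width, vec):
    """Reorder a flat CHW (linear) tensor into a channel-vectorized layout.

    The result is laid out as [ceil(C / vec)][H][W][vec], zero-padded where
    the channel count is not a multiple of `vec`. Hypothetical helper for
    illustration; the batch dimension is left out.
    """
    if len(data) != channels * height * width:
        raise ValueError("data length does not match C * H * W")
    groups = (channels + vec - 1) // vec
    out = [0] * (groups * height * width * vec)
    for c in range(channels):
        for h in range(height):
            for w in range(width):
                # Offset of element (c, h, w) in the linear layout
                src = (c * height + h) * width + w
                # Offset of the same element in the vectorized layout
                dst = (((c // vec) * height + h) * width + w) * vec + (c % vec)
                out[dst] = data[src]
    return out


# A 3-channel, 1x2 tensor in linear order: channel 0, then 1, then 2.
linear = [0, 1, 2, 3, 4, 5]
chw4 = linear_to_vectorized(linear, channels=3, height=1, width=2, vec=4)
print(chw4)  # [0, 2, 4, 0, 1, 3, 5, 0]
```

Each group of four consecutive values holds the channels for one spatial position, with the fourth slot padded because the tensor has only three channels.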

Usage

Run this script when testing TensorRT-specific tensor format handling. It requires the TensorRT Python bindings and a CUDA-capable GPU, because the engine plans are built when the script runs.
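A quick preflight check can save a failed run. The sketch below only tests whether a Python package named `tensorrt` can be located; the helper name `tensorrt_available` is hypothetical, and the check does not confirm that a GPU is present or that engine building will succeed.

```python
import importlib.util


def tensorrt_available():
    """Return True if the `tensorrt` Python package can be located."""
    # find_spec looks the package up without importing it
    return importlib.util.find_spec("tensorrt") is not None


if not tensorrt_available():
    print("tensorrt is not importable; the generator script will likely fail")
```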

Code Reference

Source Location

Repository: Triton Inference Server
File: qa/common/gen_qa_trt_format_models.py
Lines: 1-456

Signature

def create_plan_modelfile(models_dir, model_version, max_batch, dtype, shape, input_format, output_format): ...
def create_modelconfig(models_dir, model_name, max_batch, dtype, shape, input_format, output_format): ...
def create_models(models_dir): ...
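As a rough sketch of what an explicit format constraint can look like, the snippet below sets the `allowed_formats` bitmask on every input and output of a TensorRT network definition. It is not taken from the script: `format_mask` and `constrain_io_formats` are hypothetical helpers, and the code is duck-typed so it assumes only that `network` exposes `num_inputs`, `num_outputs`, `get_input`, and `get_output`, as `tensorrt.INetworkDefinition` does.

```python
def format_mask(*formats):
    """Build a bitmask with one bit set per tensor format.

    Each format is converted with int(), so trt.TensorFormat members or
    plain integers both work.
    """
    mask = 0
    for fmt in formats:
        mask |= 1 << int(fmt)
    return mask


def constrain_io_formats(network, input_format, output_format):
    """Restrict every network input and output to a single memory format."""
    for i in range(network.num_inputs):
        network.get_input(i).allowed_formats = format_mask(input_format)
    for i in range(network.num_outputs):
        network.get_output(i).allowed_formats = format_mask(output_format)
```

With TensorRT installed, a call might look like `constrain_io_formats(network, trt.TensorFormat.CHW4, trt.TensorFormat.LINEAR)`, issued before the engine is built.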

Import

# Typically run as a standalone script
python qa/common/gen_qa_trt_format_models.py --models_dir /tmp/models

I/O Contract

Inputs

Name	Type	Required	Description
models_dir	string	Yes	Output directory for generated model repository
input_format	string	No	TensorRT tensor format for inputs (e.g., LINEAR, CHW4, HWC8)
output_format	string	No	TensorRT tensor format for outputs

Outputs

Name	Type	Description
model_repository	directory	Model directories with format-specific TensorRT plans
config.pbtxt	file	Model configuration specifying tensor formats
model.plan	file	TensorRT engine plan files with format constraints
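To confirm that a run produced the outputs listed above, a small helper can walk the repository. This is a sketch under the assumption that the output follows the standard Triton repository layout (`<models_dir>/<model_name>/config.pbtxt` plus `<models_dir>/<model_name>/<version>/model.plan`); the function name `list_plan_models` is hypothetical.

```python
from pathlib import Path


def list_plan_models(models_dir):
    """Map each model name to the version numbers that contain a model.plan."""
    found = {}
    for model_dir in sorted(Path(models_dir).iterdir()):
        # Skip stray files and directories without a model configuration
        if not model_dir.is_dir() or not (model_dir / "config.pbtxt").is_file():
            continue
        versions = sorted(
            int(v.name)
            for v in model_dir.iterdir()
            if v.is_dir() and v.name.isdigit() and (v / "model.plan").is_file()
        )
        found[model_dir.name] = versions
    return found
```

Calling `list_plan_models("/tmp/trt_format_models")` after generation returns a dictionary keyed by model name; an empty version list points at a model whose plan file is missing.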

Usage Examples

Generate TRT Format Models

python qa/common/gen_qa_trt_format_models.py \
    --models_dir /tmp/trt_format_models

Verify Specific Format

# Generate the models, then serve the repository with Triton in the background
MODELS_DIR="/tmp/trt_chw4_models"
python qa/common/gen_qa_trt_format_models.py --models_dir "$MODELS_DIR"
tritonserver --model-repository="$MODELS_DIR" &
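Because the server is started in the background, a test usually has to wait until it is ready before sending requests. The sketch below polls Triton's HTTP readiness endpoint (`/v2/health/ready`); it assumes the server listens on the default HTTP port 8000 on localhost, and the helper name `wait_until_ready` is hypothetical.

```python
import http.client
import time
import urllib.request


def wait_until_ready(base_url="http://localhost:8000", timeout_s=60.0, interval_s=1.0):
    """Poll the readiness endpoint until it returns 200 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            url = f"{base_url}/v2/health/ready"
            with urllib.request.urlopen(url, timeout=interval_s) as resp:
                if resp.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            # Server not reachable yet, or it answered with an error status
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A return value of `False` means the server did not report ready within the timeout, which is often a sign that one of the generated plans failed to load.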

Related Pages

Environment:Triton_inference_server_Server_GPU_CUDA_Runtime
