Implementation:Triton inference server Server GenQaTrtFormatModels
| Knowledge Sources | |
|---|---|
| Domains | Testing, Model_Generation |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
Generates TensorRT models with various tensor memory formats (linear, CHW2, CHW4, CHW16, CHW32, HWC8) to validate Triton's handling of tensor formats.
Description
The `gen_qa_trt_format_models.py` script creates TensorRT engine plans whose input and output tensors use non-default memory formats. It builds TensorRT networks with explicit format constraints on the I/O tensors and generates models for CHW2, CHW4, CHW16, CHW32, and HWC8, as well as the default linear format. QA tests use these models to verify that Triton reorders tensor data correctly when a model's I/O format differs from the linear layout that clients provide.
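To make the layout difference concrete, the following is a minimal pure-Python sketch of how a linear CHW tensor maps onto a channel-vectorized layout such as CHW4. It follows TensorRT's documented description of these formats (for CHW4, memory equivalent to a C array `[N][(C+3)/4][H][W][4]`). The function names are hypothetical and are not taken from the script; only a single batch item is handled.

```python
def linear_offset(c, h, w, height, width):
    """Flat offset of (c, h, w) in a row-major (linear) CHW tensor."""
    return (c * height + h) * width + w


def vectorized_offset(c, h, w, height, width, vec):
    """Flat offset of (c, h, w) in a CHW<vec> layout.

    The layout is equivalent to a C array [ceil(C/vec)][H][W][vec],
    so `vec` consecutive channels are interleaved per pixel.
    """
    return (((c // vec) * height + h) * width + w) * vec + (c % vec)


def pack_chw_vectorized(linear, channels, height, width, vec, pad=0):
    """Reorder a flat linear CHW list into a CHW<vec> list (hypothetical helper).

    Channels are padded up to a multiple of `vec` with `pad`.
    """
    groups = -(-channels // vec)  # ceiling division
    packed = [pad] * (groups * height * width * vec)
    for c in range(channels):
        for h in range(height):
            for w in range(width):
                src = linear_offset(c, h, w, height, width)
                dst = vectorized_offset(c, h, w, height, width, vec)
                packed[dst] = linear[src]
    return packed


# 3 channels, 1x2 image: channel 0 = [10, 11], channel 1 = [20, 21], channel 2 = [30, 31]
chw = [10, 11, 20, 21, 30, 31]
chw4 = pack_chw_vectorized(chw, channels=3, height=1, width=2, vec=4)
print(chw4)  # [10, 20, 30, 0, 11, 21, 31, 0]
```

With three channels and a vector width of four, each pixel occupies four slots and the fourth is padding, which is why the packed buffer is larger than the linear one.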
Usage
Run this script when testing TensorRT-specific tensor format handling. It requires the TensorRT libraries to be available when the script runs, because the engine plans are built during generation.
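Because the script depends on TensorRT, a quick preflight check can fail early with a clear message. This is a hedged sketch: `module_available` is a hypothetical helper, and it only confirms that the `tensorrt` Python package can be located, not that a usable GPU or a matching TensorRT version is present.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if not module_available("tensorrt"):
        raise SystemExit("TensorRT Python package not found; cannot build engine plans")
    print("tensorrt is importable")
```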
Code Reference
Source Location
- Repository: Triton Inference Server
- File: qa/common/gen_qa_trt_format_models.py
- Lines: 1-456
Signature
def create_plan_modelfile(models_dir, model_version, max_batch, dtype, shape, input_format, output_format): ...
def create_modelconfig(models_dir, model_name, max_batch, dtype, shape, input_format, output_format): ...
def create_models(models_dir): ...
Import
# Typically run as a standalone script
python qa/common/gen_qa_trt_format_models.py --models_dir /tmp/models
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| models_dir | string | Yes | Output directory for generated model repository |
| input_format | string | No | TensorRT tensor format for inputs (e.g., LINEAR, CHW4, HWC8) |
| output_format | string | No | TensorRT tensor format for outputs |
Outputs
| Name | Type | Description |
|---|---|---|
| model_repository | directory | Model directories with format-specific TensorRT plans |
| config.pbtxt | file | Model configuration specifying tensor formats |
| model.plan | file | TensorRT engine plan files with format constraints |
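As a sanity check on the outputs above, the sketch below walks a generated repository and reports which models have a `config.pbtxt` and which numeric version directories contain a `model.plan`. It assumes Triton's standard repository layout (`<repo>/<model>/config.pbtxt` and `<repo>/<model>/<version>/model.plan`); `list_plan_models` is a hypothetical helper, not part of the script.

```python
from pathlib import Path


def list_plan_models(repo_dir):
    """Map model name -> version directories (as strings) that contain model.plan.

    Directories without a config.pbtxt are skipped.
    """
    found = {}
    for model_dir in sorted(Path(repo_dir).iterdir()):
        if not (model_dir / "config.pbtxt").is_file():
            continue
        versions = [
            v.name
            for v in model_dir.iterdir()
            if v.is_dir() and v.name.isdigit() and (v / "model.plan").is_file()
        ]
        # Sort numerically so that version "10" comes after version "2".
        found[model_dir.name] = sorted(versions, key=int)
    return found
```

For example, `list_plan_models("/tmp/trt_format_models")` returns a dictionary keyed by model directory name; an empty version list indicates a model whose plan file was not written.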
Usage Examples
Generate TRT Format Models
python qa/common/gen_qa_trt_format_models.py \
--models_dir /tmp/trt_format_models
Verify Specific Format
MODELS_DIR="/tmp/trt_chw4_models"
# Generate the model repository, then serve it with Triton in the background.
python qa/common/gen_qa_trt_format_models.py --models_dir "$MODELS_DIR"
tritonserver --model-repository="$MODELS_DIR" &
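Since the server is started in the background, a test typically needs to wait until it is ready before sending requests. The following is a minimal sketch that polls Triton's HTTP readiness endpoint (`/v2/health/ready`). It assumes the server uses its default HTTP port 8000 on localhost; `wait_until_ready` is a hypothetical helper and the timeout values are arbitrary.

```python
import http.client
import time
import urllib.request


def wait_until_ready(url="http://localhost:8000/v2/health/ready",
                     timeout_s=60.0, interval_s=1.0):
    """Poll `url` until it answers HTTP 200 or `timeout_s` elapses.

    Returns True once the endpoint reports ready, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval_s) as resp:
                if resp.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            # Connection refused, timeouts, and non-2xx responses all mean "not ready yet".
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


if __name__ == "__main__":
    print("ready" if wait_until_ready() else "timed out waiting for tritonserver")
```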