Implementation:Triton inference server Server GenQaTrtFormatModels

Knowledge Sources
Domains: Testing, Model_Generation
Last Updated: 2026-02-13 17:00 GMT

Overview

Generates TensorRT models with various tensor memory formats (linear, CHW2, CHW4, CHW16, CHW32, HWC8) for format handling validation.

Description

The `gen_qa_trt_format_models.py` script creates TensorRT engine plans whose input and output tensors use specific memory formats, so that QA tests can exercise Triton's handling of tensor data layout. It builds TensorRT networks with explicit format constraints on the input and output tensors and generates models for CHW2, CHW4, CHW16, CHW32, and HWC8, as well as the default linear format. QA tests use these models to verify that Triton reorders data correctly when a model's tensor format differs from the linear format that clients provide.
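To make the reordering concrete, here is a minimal pure-Python sketch of how a linear CHW buffer maps onto a channel-vectorized layout such as CHW4. It follows the TensorRT convention that element (c, h, w) lands at [c / k][h][w][c % k], with channels padded up to a multiple of k. The function name `linear_to_vectorized_chw` and the sample data are illustrative assumptions; this is not code from the script or from Triton.

```python
def linear_to_vectorized_chw(data, channels, height, width, vec):
    """Repack a flat linear CHW buffer into a channel-vectorized layout.

    Hypothetical helper for illustration only. Element (c, h, w) moves to
    position [c // vec][h][w][c % vec]; missing channels are zero-padded
    so the channel count becomes a multiple of `vec`.
    """
    if len(data) != channels * height * width:
        raise ValueError("data length does not match C*H*W")
    groups = (channels + vec - 1) // vec  # ceil(C / vec)
    out = [0] * (groups * height * width * vec)
    for c in range(channels):
        for h in range(height):
            for w in range(width):
                src = (c * height + h) * width + w
                dst = (((c // vec) * height + h) * width + w) * vec + (c % vec)
                out[dst] = data[src]
    return out


# Three channels of a 1x2 image: [0, 1], [10, 11], [20, 21]
linear = [0, 1, 10, 11, 20, 21]
packed = linear_to_vectorized_chw(linear, channels=3, height=1, width=2, vec=4)
print(packed)  # [0, 10, 20, 0, 1, 11, 21, 0]
```

Each group of four values in the output holds the channel values for one pixel, and the fourth slot is padding because the tensor has only three channels.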

Usage

Run this script when testing TensorRT-specific tensor format handling. Because it builds engine plans, it requires TensorRT and a compatible GPU to be available when the script runs.

Code Reference

Source Location

Signature

def create_plan_modelfile(models_dir, model_version, max_batch, dtype, shape, input_format, output_format): ...
def create_modelconfig(models_dir, model_name, max_batch, dtype, shape, input_format, output_format): ...
def create_models(models_dir): ...

Import

# Typically run as a standalone script
python qa/common/gen_qa_trt_format_models.py --models_dir /tmp/models

I/O Contract

Inputs

Name | Type | Required | Description
models_dir | string | Yes | Output directory for generated model repository
input_format | string | No | TensorRT tensor format for inputs (e.g., LINEAR, CHW4, HWC8)
output_format | string | No | TensorRT tensor format for outputs

Outputs

Name | Type | Description
model_repository | directory | Model directories with format-specific TensorRT plans
config.pbtxt | file | Model configuration specifying tensor formats
model.plan | file | TensorRT engine plan files with format constraints
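As a quick sanity check on the outputs listed above, the following sketch walks a generated repository and reports model directories that lack a `config.pbtxt` or a versioned `model.plan`. It assumes the standard Triton repository layout (`<model_name>/config.pbtxt` and `<model_name>/<version>/model.plan`); the helper name `find_incomplete_models` is hypothetical and not part of the script.

```python
from pathlib import Path


def find_incomplete_models(models_dir):
    """Return names of model directories missing config.pbtxt or model.plan.

    Hypothetical helper; assumes the usual Triton layout where each model
    directory holds config.pbtxt plus numeric version subdirectories.
    """
    incomplete = []
    for model_dir in sorted(Path(models_dir).iterdir()):
        if not model_dir.is_dir():
            continue
        has_config = (model_dir / "config.pbtxt").is_file()
        # A model counts as having a plan if any numeric version dir has one.
        has_plan = any(
            (version_dir / "model.plan").is_file()
            for version_dir in model_dir.iterdir()
            if version_dir.is_dir() and version_dir.name.isdigit()
        )
        if not (has_config and has_plan):
            incomplete.append(model_dir.name)
    return incomplete
```

Calling `find_incomplete_models("/tmp/trt_format_models")` after a generation run should return an empty list if every model directory was written completely.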

Usage Examples

Generate TRT Format Models

python qa/common/gen_qa_trt_format_models.py \
    --models_dir /tmp/trt_format_models

Verify Specific Format

# Generate the models, then serve them with Triton in the background
MODELS_DIR="/tmp/trt_chw4_models"
python qa/common/gen_qa_trt_format_models.py --models_dir "$MODELS_DIR"
tritonserver --model-repository="$MODELS_DIR" &
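Since the server is started in the background, a test usually needs to wait until it is ready before sending requests. The sketch below polls Triton's HTTP readiness endpoint (`/v2/health/ready`) using only the standard library. The default port 8000, the timeout values, and the helper name `wait_until_ready` are assumptions for illustration.

```python
import time
import urllib.request


def wait_until_ready(base_url="http://localhost:8000", timeout_s=60.0, interval_s=1.0):
    """Poll the readiness endpoint until it returns HTTP 200 or time runs out.

    Hypothetical helper; base_url assumes Triton's default HTTP port.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            url = f"{base_url}/v2/health/ready"
            with urllib.request.urlopen(url, timeout=interval_s) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # Covers connection refused, timeouts, and HTTP error responses.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A test script might call `wait_until_ready()` right after launching `tritonserver` and abort if it returns `False`.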
