Implementation:Triton inference server Server L0 Trt Dynamic Shape Test

From Leeroopedia


L0 TRT Dynamic Shape Test

Source File: qa/L0_trt_dynamic_shape/test.sh
Language: Bash (404 lines)
Domains: Testing, TensorRT

Purpose

This QA shell script validates TensorRT dynamic shape inference in Triton Inference Server. It covers enforcement of shape dimension bounds, loading and selection of optimization profiles (with both dynamic and static shapes), rejection of an invalid profile index, and dynamic batching with one optimization profile per batch size.

Signature

#!/bin/bash
# Primary entry point: test.sh [REPO_VERSION]
# Environment variables:
#   NVIDIA_TRITON_SERVER_VERSION - Repository version
#   CUDA_VISIBLE_DEVICES - GPU device selection (set to 0)
#
# External tools:
#   perf_analyzer           - Performance analysis client
#   trt_dynamic_shape_test.py - Python test cases (TrtDynamicShapeTest class)

Key Components

Shape Boundary Enforcement

Verifies that inference requests with shapes outside the optimization profile bounds are rejected. For the model plan_float32_float32_float32-4-32, whose dynamic dimension ranges over [4, 32], the script sends a shape of 33 (above the maximum) and a shape of 3 (below the minimum) and expects each request to fail with an error message that states the allowed range.

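# Request a shape above the profile maximum (33 > 32); the request should be rejected.
$PERF_CLIENT -v -i grpc -u localhost:8001 -m plan_float32_float32_float32-4-32 \
    --shape INPUT0:33 --shape INPUT1:33 -t 1 -p2000 -b 1
# Substring that must appear in the client's error output.
EXPECTED_MESSAGE="model expected the shape of dimension 1 to be between 4 and 32 but received"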
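
The excerpt above shows the request and the expected text but not the comparison itself. A minimal sketch of one way to make that check, assuming the client output has been captured to a log file: the helper name check_expected_error and the simulated log line are hypothetical and are not taken from test.sh.

```shell
#!/bin/bash
# Hypothetical helper (not from test.sh): succeed only if the log file
# contains the expected error text as a fixed string, not a regex.
check_expected_error() {
    local log_file="$1" expected="$2"
    grep -qF -- "$expected" "$log_file"
}

# Simulated client log. A real run would redirect the perf_analyzer
# output into this file instead.
CLIENT_LOG="$(mktemp)"
echo "error: model expected the shape of dimension 1 to be between 4 and 32 but received 33" > "$CLIENT_LOG"

EXPECTED_MESSAGE="model expected the shape of dimension 1 to be between 4 and 32 but received"
if check_expected_error "$CLIENT_LOG" "$EXPECTED_MESSAGE"; then
    echo "PASS: out-of-range shape was rejected"
else
    echo "FAIL: expected error message not found"
fi
```

Matching with grep -F avoids treating characters in the message as regular expression syntax.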

Multiple Optimization Profiles

The test model plan_float32_float32_float32 contains 10 optimization profiles (indices 0-9) with varying min/opt/max shape configurations:

# Profile configurations (min, opt, max, index):
# [1, 1], [1, 16], [8, 33], 0
# [1, 1], [2, 16], [7, 32], 1
# [1, 1], [3, 16], [6, 32], 2
# [1, 1], [4, 16], [5, 32], 3
# [5, 1], [6, 16], [8, 32], 4
# [6, 1], [6, 16], [8, 32], 5
# [1, 1], [1, 16], [8, 32], 6
# [1, 33], [1, 33], [1, 33], 7 (static)
# [3, 33], [3, 33], [3, 33], 8 (static)
# [5, 33], [5, 33], [5, 33], 9 (static)
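
The profile list can be read as a set of bounds checks. The sketch below assumes each pair is [batch size, dimension], an interpretation suggested by this page's remark that profile 5 requires a minimum batch of 6. The helper profiles_accepting is hypothetical and only illustrates which profiles could accept a given shape; it is not Triton's implementation.

```shell
#!/bin/bash
# Bounds copied from the profile list on this page, indexed by profile
# number. Fields: min_batch min_dim max_batch max_dim (opt omitted).
PROFILES=(
    "1 1 8 33"    # 0
    "1 1 7 32"    # 1
    "1 1 6 32"    # 2
    "1 1 5 32"    # 3
    "5 1 8 32"    # 4
    "6 1 8 32"    # 5
    "1 1 8 32"    # 6
    "1 33 1 33"   # 7 (static)
    "3 33 3 33"   # 8 (static)
    "5 33 5 33"   # 9 (static)
)

# Hypothetical helper: print the indices of all profiles whose
# [min, max] range contains the given batch size and dimension.
profiles_accepting() {
    local batch="$1" dim="$2" idx min_b min_d max_b max_d out=""
    for idx in "${!PROFILES[@]}"; do
        read -r min_b min_d max_b max_d <<< "${PROFILES[$idx]}"
        if (( batch >= min_b && batch <= max_b && dim >= min_d && dim <= max_d )); then
            out+="$idx "
        fi
    done
    echo "${out% }"
}

profiles_accepting 4 16   # prints: 0 1 2 3 6
profiles_accepting 2 33   # prints: 0
```

Under this reading, a request with batch size 2 and dimension 33 fits only profile 0, so it has no match when only the static profiles 7, 8, and 9 are loaded.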

Test Cases

Test Name | Description | Configuration
test_load_specific_optimization_profile | Loads only profile 5 and validates inference | profile: ["5"]
test_load_default_optimization_profile | Uses default profile (first available) | Profile field cleared
test_select_optimization_profile (best fit) | Loads profiles 0-3, sends shape [4,16], expects profile 3 | profile: ["0","1","2","3"]
test_select_optimization_profile (allowed) | Loads profiles 0,5, sends shape [4,16], expects profile 0 (profile 5 requires min batch 6) | profile: ["0","5"]
test_load_wrong_optimization_profile | Attempts to load non-existent profile 100 | profile: ["100"]
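
The two selection cases can be reproduced with a small model of profile choice. The sketch below filters the loaded profiles by their bounds and then picks the one whose opt shape is nearest to the request. The distance metric (sum of absolute differences) is an assumption chosen because it reproduces the expected outcomes listed here; it is not taken from Triton's source, and select_profile is a hypothetical name.

```shell
#!/bin/bash
# Fields: min_batch min_dim opt_batch opt_dim max_batch max_dim,
# indexed by profile number (profiles 0-5 from this page).
PROFILES=(
    "1 1 1 16 8 33"   # 0
    "1 1 2 16 7 32"   # 1
    "1 1 3 16 6 32"   # 2
    "1 1 4 16 5 32"   # 3
    "5 1 6 16 8 32"   # 4
    "6 1 6 16 8 32"   # 5
)

# Hypothetical selector: among the loaded profile indices (arguments 3+),
# skip profiles whose bounds exclude [batch, dim], then print the index
# whose opt shape is closest. Prints "none" if no profile fits.
select_profile() {
    local batch="$1" dim="$2"; shift 2
    local idx best="none" best_dist=-1 dist db dd
    local min_b min_d opt_b opt_d max_b max_d
    for idx in "$@"; do
        read -r min_b min_d opt_b opt_d max_b max_d <<< "${PROFILES[$idx]}"
        (( batch < min_b || batch > max_b || dim < min_d || dim > max_d )) && continue
        db=$(( batch - opt_b )); dd=$(( dim - opt_d ))
        dist=$(( ${db#-} + ${dd#-} ))   # strip a leading "-" for abs value
        if (( best_dist < 0 || dist < best_dist )); then
            best="$idx"; best_dist="$dist"
        fi
    done
    echo "$best"
}

select_profile 4 16 0 1 2 3   # prints: 3 (opt shape [4, 16] matches exactly)
select_profile 4 16 0 5       # prints: 0 (profile 5 needs batch >= 6)
```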

Static Shape Profiles

Tests that static shape profiles (7, 8, 9) work correctly with the autocomplete feature (--strict-model-config=false). Validates that batch size 5 succeeds (max across profiles), batch size 6 fails, and batch size 2 with shape 33 fails because no profile supports batch dimension 2 with that shape.

(cd ${DATADIR}/plan_float32_float32_float32/ && \
    echo "instance_group { profile : [\"7\", \"8\", \"9\" ] }" >> config.pbtxt)
SERVER_ARGS="--model-repository=$DATADIR --strict-model-config=false"
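
The config edit above hard-codes the escaped profile list. A minimal sketch of a reusable variant follows, run against a scratch file so no real model repository is modified. The helper name append_profiles is hypothetical and does not appear in test.sh.

```shell
#!/bin/bash
# Hypothetical helper: append an instance_group block that restricts a
# model to the given optimization profile indices.
append_profiles() {
    local config="$1"; shift
    local list="" p
    for p in "$@"; do
        # Add a ", " separator before every entry except the first.
        list+="${list:+, }\"$p\""
    done
    echo "instance_group { profile : [ $list ] }" >> "$config"
}

# Work on a scratch copy rather than the real config.pbtxt.
SCRATCH="$(mktemp -d)"
touch "$SCRATCH/config.pbtxt"
append_profiles "$SCRATCH/config.pbtxt" 7 8 9
cat "$SCRATCH/config.pbtxt"
# prints: instance_group { profile : [ "7", "8", "9" ] }
```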

Dynamic Batching with Profiles

Tests profiles 10-17, each supporting a different fixed batch size (1-8) with dynamic shapes. Validates that dynamic_batching {} works correctly when combined with per-batch-size optimization profiles using 16 concurrent threads.

Test Flow

  1. Load single-profile model and test shape boundary enforcement
  2. Set up multi-profile model with dynamic shapes
  3. Test specific profile loading and validation
  4. Test default profile selection
  5. Test best-fit profile selection with verbose server logging
  6. Test profile selection respecting min dimension constraints
  7. Test error handling for invalid profile indices
  8. Test static shape profiles with autocomplete
  9. Test dynamic batching with per-batch-size profiles
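
A script that walks through steps like these usually records failures and keeps going, so that one failing step does not hide the results of later ones. The sketch below shows that pattern under stated assumptions: run_step is a hypothetical wrapper, and the true and false commands stand in for the real server and client invocations, which are not reproduced here.

```shell
#!/bin/bash
RET=0

# Hypothetical wrapper: run one step, report the outcome, and remember
# any failure in RET without aborting the remaining steps.
run_step() {
    local name="$1"; shift
    if "$@"; then
        echo "ok: $name"
    else
        echo "FAILED: $name"
        RET=1
    fi
}

# Placeholder commands; a real step would start the server, run the
# client or Python test, and check the logs.
run_step "placeholder step that succeeds" true
run_step "placeholder step that fails" false
run_step "later step still runs" true

if [ "$RET" -eq 0 ]; then
    echo "Test Passed"
else
    echo "Test FAILED"
fi
```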

Dependencies

  • perf_analyzer - NVIDIA performance analysis tool
  • trt_dynamic_shape_test.py - Python unittest-based test cases
  • ../common/util.sh - Common test utility functions
  • TensorRT plan models from qa_variable_model_repository
