Implementation:Triton inference server Server SequenceUtil

From Leeroopedia
Knowledge Sources
Domains Testing, Sequence_Batching
Last Updated 2026-02-13 17:00 GMT

Overview

Utility library for constructing, sending, and validating sequence batching inference requests in QA tests.

Description

The `sequence_util.py` module provides the `SequenceBatcherTest` base class and associated helper functions for testing Triton's sequence batcher. It manages sequence lifecycle by sending requests with proper start/end flags and correlation IDs, supports both direct and oldest scheduling strategies, and validates that accumulated results match expected values across multi-step sequences. The module handles concurrent sequence execution using threads, timeout detection, and cross-protocol (HTTP/gRPC) sequence request construction. Most sequence batching QA tests inherit from `SequenceBatcherTest` to gain these capabilities.
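
The lifecycle described above (a start flag opens a sequence, later requests share its correlation ID, an end flag closes it) can be illustrated with a small stand-in. This is only a sketch, not Triton code: `ToyAccumulator`, `run_sequence`, and the numeric values of `FLAG_START` and `FLAG_END` are hypothetical names chosen for illustration, and the model is assumed to accumulate by summing its inputs.

```python
# Toy stand-in for a sequence-aware accumulator model; not Triton code.
# FLAG_START and FLAG_END are hypothetical bit flags chosen for this sketch.
FLAG_START = 1
FLAG_END = 2


class ToyAccumulator:
    """Keeps one running sum per correlation ID, as a sequence model might."""

    def __init__(self):
        self._state = {}  # correlation ID -> running sum

    def infer(self, sequence_id, value, flags=0):
        if flags & FLAG_START:
            # A start flag opens a fresh sequence and resets any old state.
            self._state[sequence_id] = 0
        if sequence_id not in self._state:
            raise RuntimeError(f"sequence {sequence_id} was never started")
        self._state[sequence_id] += value
        result = self._state[sequence_id]
        if flags & FLAG_END:
            # An end flag releases the state held for this sequence.
            del self._state[sequence_id]
        return result


def run_sequence(model, sequence_id, steps):
    """Send each (input_value, flags) step in order; return the last output."""
    result = None
    for value, flags in steps:
        result = model.infer(sequence_id, value, flags)
    return result


model = ToyAccumulator()
steps = [(1, FLAG_START), (2, 0), (3, FLAG_END)]
final = run_sequence(model, 1001, steps)
```

Because state is keyed by correlation ID, requests from different sequences can interleave without affecting each other's accumulated result, which is the property the concurrent tests rely on.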

Usage

Inherit from `SequenceBatcherTest` in your test class to gain access to sequence-aware inference helpers, then call methods like `check_sequence_async` or `check_sequence` to send sequenced requests and validate accumulated outputs.

Code Reference

Source Location

Signature

class SequenceBatcherTest(unittest.TestCase):
    def check_sequence(self, trial, model_name, dtype, input_shapes,
                       steps, expected_result, sequence_id, timeout_ms=0): ...
    def check_sequence_async(self, trial, model_name, dtype, input_shapes,
                             steps, expected_result, sequence_id, timeout_ms=0): ...
    def check_sequence_shape_tensor_io(self, model_name, dtype, input_shapes,
                                       steps, expected_result, sequence_id): ...
    def check_status(self, model_name, batch_exec, expected_exec_cnt): ...

Import

import sys
sys.path.insert(0, "/path/to/qa/common")
from sequence_util import SequenceBatcherTest

I/O Contract

Inputs

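| Name | Type | Required | Description |
|------|------|----------|-------------|
| trial | string | Yes | Trial label passed as the first argument; the usage examples pass backend names such as "graphdef" and "onnx" |
| model_name | string | Yes | Name of the sequence model to test |
| dtype | numpy.dtype | Yes | Data type for sequence input tensors |
| input_shapes | tuple | Yes | Shape of the input tensor(s), for example (1,) |
| steps | list[tuple] | Yes | List of (input_value, flags) tuples representing sequence steps |
| expected_result | number | Yes | Expected accumulated result after all sequence steps |
| sequence_id | int/string | Yes | Correlation ID for the sequence |
| timeout_ms | int | No | Sequence idle timeout in milliseconds (0 for no timeout) |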
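
As a rough illustration of how `steps` and `expected_result` relate, the sketch below builds the (input_value, flags) tuples from a plain list of values. `build_steps` and the flag values are hypothetical helpers for this example, and the expected result assumes the model accumulates by summing its inputs, which matches the values used in the usage examples on this page.

```python
# FLAG_START and FLAG_END are hypothetical bit flags for this sketch.
FLAG_START = 1
FLAG_END = 2


def build_steps(values):
    """Pair each input value with the flags for its position in the sequence."""
    if not values:
        raise ValueError("a sequence needs at least one step")
    steps = []
    last = len(values) - 1
    for i, value in enumerate(values):
        flags = 0
        if i == 0:
            flags |= FLAG_START  # first request opens the sequence
        if i == last:
            flags |= FLAG_END    # last request closes it
        steps.append((value, flags))
    return steps


values = [1, 2, 3]
steps = build_steps(values)
expected_result = sum(values)  # assumes the model accumulates by summing inputs
```

A one-step sequence carries both flags on the same request, which is why the flags are combined with a bitwise OR rather than assigned.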

Outputs

| Name | Type | Description |
|------|------|-------------|
| assertion_results | None | Raises AssertionError via unittest if accumulated result does not match expected |
| status_check | None | Validates model execution statistics match expected batch counts |
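
Both outputs are side effects rather than return values: a passing check returns None and a failing one raises AssertionError. The sketch below shows that contract with a hypothetical `verify_batch_stats` helper. It is not the real `check_status`; it assumes `observed` and `batch_exec` are dictionaries mapping batch size to execution count, which this page does not spell out.

```python
import unittest


class StatsCheckSketch(unittest.TestCase):
    # Hypothetical helper, not the real check_status: `observed` and
    # `batch_exec` are assumed to map batch size -> execution count.
    def verify_batch_stats(self, observed, batch_exec, expected_exec_cnt):
        for batch_size, count in batch_exec.items():
            self.assertEqual(observed.get(batch_size, 0), count,
                             f"unexpected count for batch size {batch_size}")
        self.assertEqual(sum(observed.values()), expected_exec_cnt)


checker = StatsCheckSketch()

# Matching statistics: the call returns None and raises nothing.
ok = checker.verify_batch_stats({1: 2, 4: 1}, {1: 2, 4: 1}, 3)

# Mismatched statistics: the unittest assertion raises AssertionError.
try:
    checker.verify_batch_stats({1: 2}, {1: 2, 4: 1}, 3)
    mismatch_raised = False
except AssertionError:
    mismatch_raised = True
```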

Usage Examples

Basic Sequence Test

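import numpy as np
from sequence_util import SequenceBatcherTest

# Placeholder flag values for illustration; this page does not define
# FLAG_START / FLAG_END, so use the start/end flags your test setup provides.
FLAG_START, FLAG_END = 1, 2

class MySequenceTest(SequenceBatcherTest):
    def test_simple_sequence(self):
        # Steps 1 + 2 + 3 should accumulate to 6 for sequence 1001.
        self.check_sequence("graphdef", "simple_sequence", np.int32,
                            (1,), [(1, FLAG_START), (2, 0), (3, FLAG_END)],
                            expected_result=6, sequence_id=1001)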

Async Concurrent Sequences

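import threading
import numpy as np

# Runs inside a SequenceBatcherTest method; FLAG_START / FLAG_END are the
# same flags used in the basic example.
threads = []
for seq_id in range(1, 5):
    t = threading.Thread(target=self.check_sequence_async,
                         args=("onnx", "seq_model", np.float32, (1,),
                               [(1.0, FLAG_START), (2.0, FLAG_END)],
                               3.0, seq_id))
    threads.append(t)
    t.start()
# Wait for every sequence to finish before the test method returns.
for t in threads:
    t.join()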
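
An exception raised inside a Python worker thread, including AssertionError, does not propagate to the thread that started it, so a failed check in a thread will not fail the calling test on its own. One common pattern is to collect exceptions in the workers and inspect them after joining. The sketch below is a generic illustration of that pattern; `run_concurrently`, `passing_check`, and `failing_check` are hypothetical names, and whether `sequence_util.py` already provides an equivalent mechanism is not stated on this page.

```python
import threading


def run_concurrently(targets):
    """Run callables in threads; return the exceptions they raised."""
    errors = []
    lock = threading.Lock()

    def guarded(fn):
        try:
            fn()
        except Exception as ex:  # AssertionError is a subclass of Exception
            with lock:
                errors.append(ex)

    threads = [threading.Thread(target=guarded, args=(fn,)) for fn in targets]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


def passing_check():
    assert 1 + 2 == 3


def failing_check():
    assert 1 + 2 == 4, "accumulated result mismatch"


errors = run_concurrently([passing_check, failing_check])
```

Inside a test method, re-raising the first collected exception after the joins (for example `if errors: raise errors[0]`) would surface the failure to unittest.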
