Implementation:Triton inference server Server SequenceUtil
| Knowledge Sources | |
|---|---|
| Domains | Testing, Sequence_Batching |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
Utility library for constructing, sending, and validating sequence batching inference requests in QA tests.
Description
The `sequence_util.py` module provides the `SequenceBatcherTest` base class and associated helper functions for testing Triton's sequence batcher. It manages sequence lifecycle by sending requests with proper start/end flags and correlation IDs, supports both direct and oldest scheduling strategies, and validates that accumulated results match expected values across multi-step sequences. The module handles concurrent sequence execution using threads, timeout detection, and cross-protocol (HTTP/gRPC) sequence request construction. Most sequence batching QA tests inherit from `SequenceBatcherTest` to gain these capabilities.
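The lifecycle described above can be illustrated without a running server. The following is a minimal sketch, not code from `sequence_util.py`: `FakeAccumulator`, `run_interleaved`, and the numeric values chosen for `FLAG_START` and `FLAG_END` are hypothetical stand-ins. They mimic a model that keeps one running sum per correlation ID, so that interleaved sequences do not disturb each other's state.

```python
# Illustrative bit flags; the constants used by the real tests may differ.
FLAG_START = 1
FLAG_END = 2


class FakeAccumulator:
    """Hypothetical in-memory stand-in for an accumulating sequence model."""

    def __init__(self):
        self._state = {}  # correlation ID -> running sum

    def infer(self, sequence_id, value, flags=0):
        if flags & FLAG_START:
            self._state[sequence_id] = 0  # a start flag resets the sequence state
        if sequence_id not in self._state:
            raise RuntimeError(f"sequence {sequence_id} was never started")
        self._state[sequence_id] += value
        result = self._state[sequence_id]
        if flags & FLAG_END:
            del self._state[sequence_id]  # an end flag releases the state
        return result


def run_interleaved(model, sequences):
    """Send one step per sequence in round-robin order; return the last result of each."""
    finals = {}
    longest = max(len(steps) for steps in sequences.values())
    for i in range(longest):
        for seq_id, steps in sequences.items():
            if i < len(steps):
                value, flags = steps[i]
                finals[seq_id] = model.infer(seq_id, value, flags)
    return finals


sequences = {
    1001: [(1, FLAG_START), (2, 0), (3, FLAG_END)],
    1002: [(10, FLAG_START), (20, FLAG_END)],
}
finals = run_interleaved(FakeAccumulator(), sequences)
```

Although the two sequences alternate step by step, each final value depends only on the steps sent under its own correlation ID. This is the property the validation helpers check against `expected_result`.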
Usage
Inherit from `SequenceBatcherTest` in your test class to gain access to sequence-aware inference helpers, then call methods like `check_sequence_async` or `check_sequence` to send sequenced requests and validate accumulated outputs.
Code Reference
Source Location
- Repository: Triton Inference Server
- File: qa/common/sequence_util.py
- Lines: 1-1193
Signature
class SequenceBatcherTest(unittest.TestCase):
    def check_sequence(self, trial, model_name, dtype, input_shapes,
                       steps, expected_result, sequence_id, timeout_ms=0): ...
    def check_sequence_async(self, trial, model_name, dtype, input_shapes,
                             steps, expected_result, sequence_id, timeout_ms=0): ...
    def check_sequence_shape_tensor_io(self, model_name, dtype, input_shapes,
                                       steps, expected_result, sequence_id): ...
    def check_status(self, model_name, batch_exec, expected_exec_cnt): ...
Import
import sys
sys.path.insert(0, "/path/to/qa/common")
from sequence_util import SequenceBatcherTest
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
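| trial | string | Yes | Identifier of the backend or platform under test, passed as the first argument (the examples use "graphdef" and "onnx") |
| model_name | string | Yes | Name of the sequence model to test |
| dtype | numpy.dtype | Yes | Data type for sequence input tensors |
| input_shapes | tuple | Yes | Shape of the sequence input tensor, for example (1,) |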
| steps | list[tuple] | Yes | List of (input_value, flags) tuples representing sequence steps |
| expected_result | number | Yes | Expected accumulated result after all sequence steps |
| sequence_id | int/string | Yes | Correlation ID for the sequence |
| timeout_ms | int | No | Sequence idle timeout in milliseconds (0 for no timeout) |
Outputs
| Name | Type | Description |
|---|---|---|
| assertion_results | None | Raises AssertionError via unittest if accumulated result does not match expected |
| status_check | None | Validates model execution statistics match expected batch counts |
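As a hedged illustration of the input contract, the sketch below builds a `steps` list in the `(input_value, flags)` form and derives the matching `expected_result`, assuming the model under test simply sums its inputs. `make_steps` and the flag values are hypothetical and not part of `sequence_util.py`.

```python
# Illustrative bit flags; the constants used by the real tests may differ.
FLAG_START = 1
FLAG_END = 2


def make_steps(values):
    """Return (steps, expected_result) for a sequence of input values.

    The first step carries the start flag and the last step carries the end
    flag; a single-step sequence carries both.
    """
    if not values:
        raise ValueError("a sequence needs at least one step")
    steps = []
    last = len(values) - 1
    for i, value in enumerate(values):
        flags = 0
        if i == 0:
            flags |= FLAG_START
        if i == last:
            flags |= FLAG_END
        steps.append((value, flags))
    # Assumes an accumulating model, so the expected result is the plain sum.
    return steps, sum(values)


steps, expected_result = make_steps([1, 2, 3])
single_steps, single_expected = make_steps([5])
```

Deriving `expected_result` from the same values that populate `steps` keeps the two arguments from drifting apart when a test is edited.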
Usage Examples
Basic Sequence Test
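import numpy as np
from sequence_util import SequenceBatcherTest

# FLAG_START and FLAG_END are the sequence start/end control flags; they are
# assumed to be defined or imported elsewhere in the test module.
class MySequenceTest(SequenceBatcherTest):
    def test_simple_sequence(self):
        # Three steps accumulate 1 + 2 + 3 = 6 under correlation ID 1001.
        self.check_sequence("graphdef", "simple_sequence", np.int32,
                            (1,), [(1, FLAG_START), (2, 0), (3, FLAG_END)],
                            expected_result=6, sequence_id=1001)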
Async Concurrent Sequences
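import threading
import numpy as np

# Runs inside a SequenceBatcherTest test method, so `self` is the test case.
threads = []
for seq_id in range(1, 5):  # four sequences with correlation IDs 1-4
    t = threading.Thread(target=self.check_sequence_async,
                         args=("onnx", "seq_model", np.float32, (1,),
                               [(1.0, FLAG_START), (2.0, FLAG_END)],
                               3.0, seq_id))
    threads.append(t)
    t.start()
# Wait for every sequence to finish before the test method returns.
for t in threads:
    t.join()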
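One caveat with plain `threading.Thread` workers is that an exception raised inside a worker, including an `AssertionError`, does not propagate to the thread that started it. The sketch below shows one way to collect such failures and inspect them afterward. It is an illustration under stated assumptions and is not taken from `sequence_util.py`: `run_concurrently` and `fake_check` are hypothetical names, and `fake_check` merely stands in for `check_sequence_async`.

```python
import threading


def run_concurrently(target, arg_sets):
    """Run target(*args) on one thread per entry and return captured exceptions."""
    errors = []
    lock = threading.Lock()

    def worker(args):
        try:
            target(*args)
        except Exception as exc:  # AssertionError is a subclass of Exception
            with lock:
                errors.append((args, exc))

    threads = [threading.Thread(target=worker, args=(args,)) for args in arg_sets]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before reporting
    return errors


def fake_check(seq_id, values, expected_result):
    """Hypothetical stand-in for check_sequence_async: sum the values and compare."""
    assert sum(values) == expected_result, f"sequence {seq_id} mismatch"


errors = run_concurrently(
    fake_check,
    [(1, [1.0, 2.0], 3.0), (2, [1.0, 2.0], 4.0)],
)
failed_ids = sorted(args[0] for args, _ in errors)
```

In a real test, the main thread could then fail the test case if `errors` is non-empty, so that a mismatch in any concurrent sequence is reported rather than silently lost.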