Implementation:Triton inference server Server L0 Model Namespacing Test
L0 Model Namespacing Test
Source File: qa/L0_model_namespacing/test.py
Language: Python (361 lines)
Domains: Testing, Multi_Tenancy
Purpose
This Python test module validates Triton Inference Server's model namespacing feature for multi-tenant deployments. Model namespacing allows multiple model repositories to contain models with the same name while keeping them isolated within their respective namespaces. The tests verify correct behavior for both poll-based and explicit model control modes, including scenarios with no duplication, model name collisions, ensemble name collisions, and dynamic model resolution at runtime.
Signature
# Key classes and functions:
class AddSubChecker:
    """Inference checker for add/sub models (OUTPUT0 = INPUT0 + INPUT1, OUTPUT1 = INPUT0 - INPUT1)"""
    def __init__(self, checker_client=None)
    def infer(self, model)

class SubAddChecker(AddSubChecker):
    """Inference checker for sub/add models (OUTPUT0 = INPUT0 - INPUT1, OUTPUT1 = INPUT0 + INPUT1)"""
    def infer(self, model)

class ModelNamespacePoll(tu.TestResultCollector):
    """Tests for poll-based model control with namespacing"""
    def test_no_duplication(self)
    def test_duplication(self)
    def test_ensemble_duplication(self)
    def test_dynamic_resolution(self)

class ModelNamespaceExplicit(tu.TestResultCollector):
    """Tests for explicit model control with namespacing"""
    def test_no_duplication(self)
    def test_duplication(self)
    def test_ensemble_duplication(self)
    def test_dynamic_resolution(self)
Key Components
Inference Checkers
Two helper classes perform inference and validate outputs:
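import numpy as np

class AddSubChecker:
    def __init__(self, checker_client=None):
        # Default to the Triton HTTP client library when none is supplied
        if checker_client is None:
            import tritonclient.http as checker_client
        # Client used by infer(); the URL assumes Triton's default HTTP port
        self.client_ = checker_client.InferenceServerClient("localhost:8000")

        # Two 16-element INT32 inputs that carry the same data
        self.inputs_ = []
        self.inputs_.append(checker_client.InferInput("INPUT0", [16], "INT32"))
        self.inputs_.append(checker_client.InferInput("INPUT1", [16], "INT32"))
        input_data = np.arange(start=0, stop=16, dtype=np.int32)
        for infer_input in self.inputs_:
            infer_input.set_data_from_numpy(input_data)
        self.expected_outputs_ = {
            "add": (input_data + input_data),
            "sub": (input_data - input_data),
        }

    def infer(self, model):
        # Run inference and compare both outputs against the expected values
        res = self.client_.infer(model, self.inputs_)
        np.testing.assert_allclose(res.as_numpy("OUTPUT0"), self.expected_outputs_["add"])
        np.testing.assert_allclose(res.as_numpy("OUTPUT1"), self.expected_outputs_["sub"])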
SubAddChecker inverts the expected outputs (OUTPUT0 = sub, OUTPUT1 = add).
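The expected values used by the two checkers can be reproduced without a server. The following is a minimal sketch, assuming both inputs carry the same `np.arange(16)` data as in the excerpt above; the dictionary names `addsub_expected` and `subadd_expected` are illustrative and do not appear in the test module.

```python
import numpy as np

# Same input data as the checker: 0..15 as INT32, used for both INPUT0 and INPUT1
input_data = np.arange(start=0, stop=16, dtype=np.int32)

add = input_data + input_data  # element-wise sum
sub = input_data - input_data  # element-wise difference, all zeros here

# Hypothetical names, for illustration only
addsub_expected = {"OUTPUT0": add, "OUTPUT1": sub}
subadd_expected = {"OUTPUT0": sub, "OUTPUT1": add}
```

Because the sum and the difference differ for every nonzero element, a response from an add/sub model cannot pass the sub/add check, which is what lets the tests tell which composing model an ensemble actually used.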
Poll-Based Tests (ModelNamespacePoll)
| Test | Description |
|---|---|
| test_no_duplication | Validates that namespacing works on repositories with unique model names. All models (simple_addsub, composing_addsub, simple_subadd, composing_subadd) are individually accessible. |
| test_duplication | Two repos each have a composing_model with different behavior. Ensembles correctly use their repo's version. Direct inference on composing_model returns an ambiguity error. |
| test_ensemble_duplication | Two repos share an ensemble name (simple_ensemble). Each correctly routes to its composing model. Direct inference on simple_ensemble returns an ambiguity error. |
| test_dynamic_resolution | Removes composing_model from one repo at runtime. Both ensembles fall back to the remaining repo's version. Re-adding restores original behavior. |
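The behavior these tests expect can be summarized with a toy model. The sketch below is not Triton's implementation: it only encodes the rules described in the table, and `resolve_direct`, `resolve_composing`, `AmbiguousModelError`, and the `repos` mapping are all hypothetical names introduced for illustration.

```python
class AmbiguousModelError(Exception):
    """Stand-in for the server's ambiguity error (hypothetical)."""


def resolve_direct(repos, name):
    """Resolve a model requested by name alone, with no namespace context."""
    owners = [ns for ns, models in repos.items() if name in models]
    if not owners:
        raise KeyError(name)
    if len(owners) > 1:
        raise AmbiguousModelError(f"ambiguity: '{name}' exists in {sorted(owners)}")
    return owners[0]


def resolve_composing(repos, ensemble_ns, name):
    """Resolve a composing model on behalf of an ensemble in ensemble_ns."""
    # Prefer the ensemble's own namespace, otherwise fall back to any other owner
    if name in repos[ensemble_ns]:
        return ensemble_ns
    return resolve_direct(repos, name)


# Layout corresponding to test_duplication: both repos define composing_model
repos = {
    "addsub_repo": {"simple_addsub", "composing_model"},
    "subadd_repo": {"simple_subadd", "composing_model"},
}
```

In this model, `resolve_composing(repos, "addsub_repo", "composing_model")` yields `"addsub_repo"`, while `resolve_direct(repos, "composing_model")` raises. Removing `composing_model` from `addsub_repo` makes both calls resolve to `"subadd_repo"`, which mirrors the fallback checked by test_dynamic_resolution.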
Explicit Control Tests (ModelNamespaceExplicit)
These tests mirror the poll-based cases but drive the server through explicit model load and unload. Models are loaded via self.client_.load_model(model), which cascades to the composing models an ensemble depends on. The dynamic resolution test triggers re-resolution by explicitly reloading models after the repository contents change.
def test_dynamic_resolution(self):
    # Step 1: Remove composing_model from addsub_repo
    shutil.move(composing_before_path, composing_after_path)
    for model in ["simple_addsub", "simple_subadd"]:
        self.client_.load_model(model)
    # Both ensembles now use subadd_repo's composing_model
    for model in ["simple_subadd", "simple_addsub", "composing_model"]:
        self.subadd_.infer(model)

    # Step 2: Restore composing_model to addsub_repo
    shutil.move(composing_after_path, composing_before_path)
    for model in ["simple_addsub"]:
        self.client_.load_model(model)  # Triggers cascading re-load
    # Original behavior restored
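The file manipulation in the two steps above can be tried in isolation. This is a minimal sketch that runs in a temporary directory with no server involved; the directory layout and the `list_models` helper are assumptions made for illustration, not the test's actual setup.

```python
import os
import shutil
import tempfile


def list_models(repo_path):
    """Return the model directory names found in a repository (hypothetical helper)."""
    return sorted(os.listdir(repo_path))


with tempfile.TemporaryDirectory() as td:
    # Assumed layout: <td>/addsub_repo/{composing_model,simple_addsub}
    repo = os.path.join(td, "addsub_repo")
    composing_before_path = os.path.join(repo, "composing_model")
    composing_after_path = os.path.join(td, "composing_model")
    os.makedirs(composing_before_path)
    os.makedirs(os.path.join(repo, "simple_addsub"))

    # Step 1: move composing_model out of the repository
    shutil.move(composing_before_path, composing_after_path)
    models_after_removal = list_models(repo)

    # Step 2: move it back to its original location
    shutil.move(composing_after_path, composing_before_path)
    models_after_restore = list_models(repo)
```

Moving the directory rather than deleting it keeps the model files intact, so the second step restores the repository exactly.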
Ambiguity Error Handling
When a model name exists in multiple namespaces, direct inference attempts produce an InferenceServerException containing "ambiguity" in the error message:
try:
    self.addsub_.infer("composing_model")
    self.assertTrue(False, "expected error for inferring ambiguous named model")
except InferenceServerException as ex:
    self.assertIn("ambiguity", ex.message())
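The same check can also be written with unittest's `assertRaises` context manager. The sketch below is self-contained and uses stand-ins: `FakeServerException` and `fake_infer` are hypothetical replacements for the client's exception type and the inference call, and the error text is invented apart from the word "ambiguity".

```python
import unittest


class FakeServerException(Exception):
    """Stand-in exception exposing message(), as the excerpt above relies on."""

    def __init__(self, msg):
        super().__init__(msg)
        self._msg = msg

    def message(self):
        return self._msg


def fake_infer(model):
    # Stand-in for an inference call on a name present in two namespaces
    raise FakeServerException(f"ambiguity detected for model '{model}'")


class AmbiguityTest(unittest.TestCase):
    def test_ambiguous_name(self):
        # The test fails automatically if no exception is raised
        with self.assertRaises(FakeServerException) as ctx:
            fake_infer("composing_model")
        self.assertIn("ambiguity", ctx.exception.message())


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AmbiguityTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

This form removes the need for the explicit `assertTrue(False, ...)` line, at the cost of losing the custom failure message.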
Test Flow
- Initialize inference checkers (AddSubChecker, SubAddChecker) and HTTP client
- Verify server health (live and ready)
- Execute test-specific model operations (load, infer, validate)
- For dynamic resolution tests, manipulate model files at runtime and re-verify
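The health check in the second step can be sketched as a small helper. It relies on the `is_server_live()` and `is_server_ready()` methods of the Triton HTTP client; the helper name `check_server_health` and the `StubClient` class are hypothetical, and the stub exists only so the sketch runs without a server.

```python
def check_server_health(client):
    """Return True only when the server reports both live and ready."""
    return client.is_server_live() and client.is_server_ready()


class StubClient:
    """Hypothetical stand-in that mimics the two health methods."""

    def __init__(self, live, ready):
        self._live = live
        self._ready = ready

    def is_server_live(self):
        return self._live

    def is_server_ready(self):
        return self._ready


healthy = check_server_health(StubClient(live=True, ready=True))
not_ready = check_server_health(StubClient(live=True, ready=False))

# Against a real server (assuming the default HTTP port) this would look like:
#   import tritonclient.http as httpclient
#   check_server_health(httpclient.InferenceServerClient("localhost:8000"))
```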
Dependencies
- tritonclient.http - Triton HTTP client library
- numpy - Array operations and assertions
- test_util (tu) - Common test result collector
- TRITON_QA_ROOT_DIR environment variable for test utilities path
- NAMESPACE_TESTING_DIRCTORY environment variable for dynamic resolution test file paths