Implementation:Triton inference server Server L0 Model Namespacing Test

L0 Model Namespacing Test

Source File: qa/L0_model_namespacing/test.py
Language: Python (361 lines)
Domains: Testing, Multi_Tenancy

Purpose

This Python test module validates Triton Inference Server's model namespacing feature for multi-tenant deployments. Model namespacing allows multiple model repositories to contain models with the same name while keeping them isolated within their respective namespaces. The tests verify correct behavior for both poll-based and explicit model control modes, including scenarios with no duplication, model name collisions, ensemble name collisions, and dynamic model resolution at runtime.

Signature

# Key classes and functions:
class AddSubChecker:
    """Inference checker for add/sub models (OUTPUT0 = INPUT0 + INPUT1, OUTPUT1 = INPUT0 - INPUT1)"""
    def __init__(self, checker_client=None)
    def infer(self, model)

class SubAddChecker(AddSubChecker):
    """Inference checker for sub/add models (OUTPUT0 = INPUT0 - INPUT1, OUTPUT1 = INPUT0 + INPUT1)"""
    def infer(self, model)

class ModelNamespacePoll(tu.TestResultCollector):
    """Tests for poll-based model control with namespacing"""
    def test_no_duplication(self)
    def test_duplication(self)
    def test_ensemble_duplication(self)
    def test_dynamic_resolution(self)

class ModelNamespaceExplicit(tu.TestResultCollector):
    """Tests for explicit model control with namespacing"""
    def test_no_duplication(self)
    def test_duplication(self)
    def test_ensemble_duplication(self)
    def test_dynamic_resolution(self)

Key Components

Inference Checkers

Two helper classes perform inference and validate outputs:

class AddSubChecker:
    def __init__(self, checker_client=None):
        self.inputs_ = []
        self.inputs_.append(checker_client.InferInput("INPUT0", [16], "INT32"))
        self.inputs_.append(checker_client.InferInput("INPUT1", [16], "INT32"))
        input_data = np.arange(start=0, stop=16, dtype=np.int32)
        self.expected_outputs_ = {
            "add": (input_data + input_data),
            "sub": (input_data - input_data),
        }

    def infer(self, model):
        res = self.client_.infer(model, self.inputs_)
        np.testing.assert_allclose(res.as_numpy("OUTPUT0"), self.expected_outputs_["add"])
        np.testing.assert_allclose(res.as_numpy("OUTPUT1"), self.expected_outputs_["sub"])

SubAddChecker inverts the expected outputs (OUTPUT0 = sub, OUTPUT1 = add).

Poll-Based Tests (ModelNamespacePoll)

Test	Description
test_no_duplication	Validates that namespacing works on repositories with unique model names. All models (`simple_addsub`, `composing_addsub`, `simple_subadd`, `composing_subadd`) are individually accessible.
test_duplication	Two repos each have a `composing_model` with different behavior. Ensembles correctly use their repo's version. Direct inference on `composing_model` returns an ambiguity error.
test_ensemble_duplication	Two repos share an ensemble name (`simple_ensemble`). Each correctly routes to its composing model. Direct inference on `simple_ensemble` returns an ambiguity error.
test_dynamic_resolution	Removes `composing_model` from one repo at runtime. Both ensembles fall back to the remaining repo's version. Re-adding restores original behavior.

Explicit Control Tests (ModelNamespaceExplicit)

Similar test cases as poll-based but using explicit model load/unload. Models are loaded via self.client_.load_model(model) with cascading dependency resolution. The dynamic resolution test triggers re-resolution through explicit model reload.

def test_dynamic_resolution(self):
    # Step 1: Remove composing_model from addsub_repo
    shutil.move(composing_before_path, composing_after_path)
    for model in ["simple_addsub", "simple_subadd"]:
        self.client_.load_model(model)
    # Both ensembles now use subadd_repo's composing_model
    for model in ["simple_subadd", "simple_addsub", "composing_model"]:
        self.subadd_.infer(model)

    # Step 2: Restore composing_model to addsub_repo
    shutil.move(composing_after_path, composing_before_path)
    for model in ["simple_addsub"]:
        self.client_.load_model(model)  # Triggers cascading re-load
    # Original behavior restored

Ambiguity Error Handling

When a model name exists in multiple namespaces, direct inference attempts produce an InferenceServerException containing "ambiguity" in the error message:

try:
    self.addsub_.infer("composing_model")
    self.assertTrue(False, "expected error for inferring ambiguous named model")
except InferenceServerException as ex:
    self.assertIn("ambiguity", ex.message())

Test Flow

Initialize inference checkers (AddSubChecker, SubAddChecker) and HTTP client
Verify server health (live and ready)
Execute test-specific model operations (load, infer, validate)
For dynamic resolution tests, manipulate model files at runtime and re-verify

Dependencies

tritonclient.http - Triton HTTP client library
numpy - Array operations and assertions
test_util (tu) - Common test result collector
TRITON_QA_ROOT_DIR environment variable for test utilities path
NAMESPACE_TESTING_DIRCTORY environment variable for dynamic resolution test file paths

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment