Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Triton inference server Server L0 Model Namespacing Test

From Leeroopedia


L0 Model Namespacing Test

Source File: qa/L0_model_namespacing/test.py
Language: Python (361 lines)
Domains: Testing, Multi_Tenancy

Purpose

This Python test module validates Triton Inference Server's model namespacing feature for multi-tenant deployments. Model namespacing allows multiple model repositories to contain models with the same name while keeping them isolated within their respective namespaces. The tests verify correct behavior for both poll-based and explicit model control modes, including scenarios with no duplication, model name collisions, ensemble name collisions, and dynamic model resolution at runtime.

Signature

# Key classes and functions:
class AddSubChecker:
    """Inference checker for add/sub models (OUTPUT0 = INPUT0 + INPUT1, OUTPUT1 = INPUT0 - INPUT1)"""
    def __init__(self, checker_client=None)
    def infer(self, model)

class SubAddChecker(AddSubChecker):
    """Inference checker for sub/add models (OUTPUT0 = INPUT0 - INPUT1, OUTPUT1 = INPUT0 + INPUT1)"""
    def infer(self, model)

class ModelNamespacePoll(tu.TestResultCollector):
    """Tests for poll-based model control with namespacing"""
    def test_no_duplication(self)
    def test_duplication(self)
    def test_ensemble_duplication(self)
    def test_dynamic_resolution(self)

class ModelNamespaceExplicit(tu.TestResultCollector):
    """Tests for explicit model control with namespacing"""
    def test_no_duplication(self)
    def test_duplication(self)
    def test_ensemble_duplication(self)
    def test_dynamic_resolution(self)

Key Components

Inference Checkers

Two helper classes perform inference and validate outputs:

class AddSubChecker:
    def __init__(self, checker_client=None):
        self.inputs_ = []
        self.inputs_.append(checker_client.InferInput("INPUT0", [16], "INT32"))
        self.inputs_.append(checker_client.InferInput("INPUT1", [16], "INT32"))
        input_data = np.arange(start=0, stop=16, dtype=np.int32)
        self.expected_outputs_ = {
            "add": (input_data + input_data),
            "sub": (input_data - input_data),
        }

    def infer(self, model):
        res = self.client_.infer(model, self.inputs_)
        np.testing.assert_allclose(res.as_numpy("OUTPUT0"), self.expected_outputs_["add"])
        np.testing.assert_allclose(res.as_numpy("OUTPUT1"), self.expected_outputs_["sub"])

SubAddChecker inverts the expected outputs (OUTPUT0 = sub, OUTPUT1 = add).

Poll-Based Tests (ModelNamespacePoll)

Test Description
test_no_duplication Validates that namespacing works on repositories with unique model names. All models (simple_addsub, composing_addsub, simple_subadd, composing_subadd) are individually accessible.
test_duplication Two repos each have a composing_model with different behavior. Ensembles correctly use their repo's version. Direct inference on composing_model returns an ambiguity error.
test_ensemble_duplication Two repos share an ensemble name (simple_ensemble). Each correctly routes to its composing model. Direct inference on simple_ensemble returns an ambiguity error.
test_dynamic_resolution Removes composing_model from one repo at runtime. Both ensembles fall back to the remaining repo's version. Re-adding restores original behavior.

Explicit Control Tests (ModelNamespaceExplicit)

Similar test cases as poll-based but using explicit model load/unload. Models are loaded via self.client_.load_model(model) with cascading dependency resolution. The dynamic resolution test triggers re-resolution through explicit model reload.

def test_dynamic_resolution(self):
    # Step 1: Remove composing_model from addsub_repo
    shutil.move(composing_before_path, composing_after_path)
    for model in ["simple_addsub", "simple_subadd"]:
        self.client_.load_model(model)
    # Both ensembles now use subadd_repo's composing_model
    for model in ["simple_subadd", "simple_addsub", "composing_model"]:
        self.subadd_.infer(model)

    # Step 2: Restore composing_model to addsub_repo
    shutil.move(composing_after_path, composing_before_path)
    for model in ["simple_addsub"]:
        self.client_.load_model(model)  # Triggers cascading re-load
    # Original behavior restored

Ambiguity Error Handling

When a model name exists in multiple namespaces, direct inference attempts produce an InferenceServerException containing "ambiguity" in the error message:

try:
    self.addsub_.infer("composing_model")
    self.assertTrue(False, "expected error for inferring ambiguous named model")
except InferenceServerException as ex:
    self.assertIn("ambiguity", ex.message())

Test Flow

  1. Initialize inference checkers (AddSubChecker, SubAddChecker) and HTTP client
  2. Verify server health (live and ready)
  3. Execute test-specific model operations (load, infer, validate)
  4. For dynamic resolution tests, manipulate model files at runtime and re-verify

Dependencies

  • tritonclient.http - Triton HTTP client library
  • numpy - Array operations and assertions
  • test_util (tu) - Common test result collector
  • TRITON_QA_ROOT_DIR environment variable for test utilities path
  • NAMESPACE_TESTING_DIRCTORY environment variable for dynamic resolution test file paths

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment