Principle:Triton inference server Server Model Repository Layout

From Leeroopedia
Knowledge Sources
Domains MLOps, Model_Serving
Last Updated 2026-02-13 17:00 GMT

Overview

A filesystem convention that organizes model artifacts into a hierarchical directory structure for discovery and versioning by an inference server.

Description

The Model Repository Layout principle defines a standard directory hierarchy where each model occupies a named subdirectory containing numbered version folders and an optional configuration file. This convention allows inference servers to automatically discover, load, and version models without explicit registration APIs. The pattern supports multiple storage backends (local filesystem, cloud object stores like S3, GCS, and Azure Blob Storage) through a uniform directory structure.

The key insight is that directory names become model identifiers and numeric subdirectories become version identifiers, enabling convention-over-configuration model management. This eliminates the need for a separate model registry in simple deployments.
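The mapping from directory names to identifiers can be sketched in a few lines of Python. This is an illustrative sketch, not Triton's actual discovery code: the function name `discover_models` is hypothetical, and the rule that a version name must be a positive integer without a leading zero is an assumption made for this example.

```python
import re
from pathlib import Path

# Assumption for this sketch: a version directory name is a positive
# integer written without a leading zero (e.g. "1", "2", "10").
VERSION_NAME = re.compile(r"[1-9][0-9]*")


def discover_models(repo_root):
    """Return {model_name: [version, ...]} for a repository laid out by convention."""
    models = {}
    for model_dir in sorted(Path(repo_root).iterdir()):
        if not model_dir.is_dir():
            continue  # stray files at the repository root are not models
        versions = sorted(
            int(entry.name)
            for entry in model_dir.iterdir()
            if entry.is_dir() and VERSION_NAME.fullmatch(entry.name)
        )
        # The directory name itself serves as the model identifier.
        models[model_dir.name] = versions
    return models
```

No registry lookup is involved: the function derives every identifier from the filesystem alone, which is the convention-over-configuration idea described above.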

Usage

Use this principle when deploying models to any inference serving system that uses filesystem-based model discovery. It is the foundational organizational pattern for Triton Inference Server and must be established before any model can be served. Apply this when setting up new model repositories on local storage or cloud object stores.

Theoretical Basis

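The repository follows a strict three-level directory hierarchy (repository root, model directories, version directories), with the model definition files stored inside each version directory: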

<model-repository-path>/
    <model-name>/
        [config.pbtxt]
        [<output-labels-file> ...]
        <version>/
            <model-definition-file>
        <version>/
            <model-definition-file>
        ...
    <model-name>/
        [config.pbtxt]
        [<output-labels-file> ...]
        <version>/
            <model-definition-file>
        ...
    ...
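As a rough illustration of the layout above, the following sketch builds a small repository skeleton in a temporary directory. The helper `make_model_version` and the model names `image_classifier` and `text_encoder` are hypothetical, and the definition files are empty placeholders rather than real model artifacts.

```python
import tempfile
from pathlib import Path


def make_model_version(repo_root, model_name, version, definition_file="model.onnx"):
    """Create <repo_root>/<model_name>/<version>/<definition_file> as an empty placeholder."""
    version_dir = Path(repo_root) / model_name / str(version)
    version_dir.mkdir(parents=True, exist_ok=True)
    artifact = version_dir / definition_file
    artifact.touch()  # placeholder only; a real deployment copies the exported model here
    return artifact


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as repo:
        # Two versions of one model, one version of another (names are made up).
        make_model_version(repo, "image_classifier", 1)
        make_model_version(repo, "image_classifier", 2)
        make_model_version(repo, "text_encoder", 1, definition_file="model.pt")
        # An optional per-model configuration file sits next to the version folders.
        (Path(repo) / "image_classifier" / "config.pbtxt").touch()

        for path in sorted(Path(repo).rglob("*")):
            print(path.relative_to(repo))
```

Running the script prints each created path relative to the repository root, which makes it easy to compare the result against the tree shown above.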

Key rules:

  • The top-level directory is the model repository root
  • Each subdirectory at the first level is a model name
  • Each numeric subdirectory of a model directory (e.g., 1, 2) is a model version; Triton ignores subdirectories whose names are not numeric or that start with 0
  • Model definition files have backend-specific default names: model.onnx (ONNX), model.plan (TensorRT), model.pt (PyTorch), model.py (Python backend)
  • config.pbtxt may be omitted for backends that can auto-complete the model configuration
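The rules above lend themselves to a simple pre-deployment check. The sketch below is one possible way to verify that every version directory of a model contains the expected definition file. The function `missing_definition_files` and the dictionary keys (`onnx`, `tensorrt`, `pytorch`, `python`) are illustrative labels chosen for this example, not Triton configuration values; the file names are the ones listed above.

```python
import re
from pathlib import Path

# File names taken from the rules above; the keys are labels for this sketch only.
DEFAULT_DEFINITION_FILES = {
    "onnx": "model.onnx",
    "tensorrt": "model.plan",
    "pytorch": "model.pt",
    "python": "model.py",
}

# Assumption: version directories are named with a positive integer, no leading zero.
VERSION_NAME = re.compile(r"[1-9][0-9]*")


def missing_definition_files(model_dir, backend):
    """Return the version numbers under model_dir that lack the expected definition file."""
    expected = DEFAULT_DEFINITION_FILES[backend]
    version_dirs = [
        entry
        for entry in Path(model_dir).iterdir()
        if entry.is_dir() and VERSION_NAME.fullmatch(entry.name)
    ]
    return sorted(
        int(entry.name)
        for entry in version_dirs
        if not (entry / expected).exists()
    )
```

An empty result means every version directory holds a file with the expected name. The check does not inspect file contents, so it cannot tell whether the artifact is a valid model.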
