Implementation:Triton inference server Server Model Repository Directory Convention

From Leeroopedia
Knowledge Sources
Domains MLOps, Model_Serving
Last Updated 2026-02-13 17:00 GMT

Overview

Concrete filesystem convention for organizing model artifacts into Triton Inference Server's required directory hierarchy.

Description

The Model Repository Directory Convention defines the exact filesystem layout that Triton Inference Server expects when loading models. Each model occupies a named directory containing one or more numerically named version subdirectories and an optional config.pbtxt configuration file. The server scans this directory structure at startup, and again at intervals when it runs in poll mode (--model-control-mode=poll), to discover and load models.

Usage

Use this convention whenever preparing models for deployment on Triton Inference Server. This is the first step in any Triton deployment workflow: the model repository must be correctly structured before the server is launched.
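A pre-launch check can catch a malformed repository before the server starts. The sketch below is a minimal example, not a Triton tool: check_model_repo is a hypothetical helper name, and it assumes that only version directories named with digits and not starting with 0 count as valid, which matches the upstream model repository documentation as far as we know.

```shell
# check_model_repo is a hypothetical helper, not part of Triton.
# It returns non-zero if any model directory lacks a usable version subdirectory.
check_model_repo() {
  repo="$1"
  status=0
  for model_dir in "$repo"/*/; do
    [ -d "$model_dir" ] || continue
    model="$(basename "$model_dir")"
    found=0
    for version_dir in "$model_dir"*/; do
      [ -d "$version_dir" ] || continue
      version="$(basename "$version_dir")"
      # Assumption: valid names contain only digits and do not start with 0.
      case "$version" in
        0*|*[!0-9]*|'') ;;
        *) found=1 ;;
      esac
    done
    if [ "$found" -eq 0 ]; then
      echo "no numeric version directory: $model" >&2
      status=1
    fi
  done
  return "$status"
}
```

One way to use it is to gate the launch on the check: check_model_repo model_repository && tritonserver --model-repository=model_repository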

Code Reference

Source Location

  • Repository: triton-inference-server/server
  • File: docs/user_guide/model_repository.md
  • Lines: L36-72 (Repository Layout), L267-378 (Model file naming per backend)

Signature

# Create model repository with required hierarchy
mkdir -p <model-repository-path>/<model-name>/<version>/

# Example: Deploy an ONNX model
mkdir -p models/densenet_onnx/1/
cp model.onnx models/densenet_onnx/1/model.onnx

# Optional: Add configuration
cat > models/densenet_onnx/config.pbtxt << 'EOF'
name: "densenet_onnx"
platform: "onnxruntime_onnx"
max_batch_size: 0
input [
  { name: "data_0", data_type: TYPE_FP32, dims: [ 3, 224, 224 ] }
]
output [
  { name: "fc6_1", data_type: TYPE_FP32, dims: [ 1000 ] }
]
EOF

Import

# No import needed: this is a filesystem convention
# Consumed by: tritonserver --model-repository=<path>

I/O Contract

Inputs

Name | Type | Required | Description
model-repository-path | string (filesystem path) | Yes | Root directory for all models
model-name | string (directory name) | Yes | Name of the model (becomes the model identifier)
version | integer (directory name) | Yes | Numeric version directory (e.g., 1, 2, 3); non-numeric or zero-prefixed names are ignored
model-definition-file | file | Yes | Backend-specific model file (model.onnx, model.plan, model.pt, model.py)
config.pbtxt | protobuf text file | No | Model configuration (optional for auto-complete backends)

Outputs

Name | Type | Description
model repository | directory tree | Well-formed directory hierarchy readable by tritonserver
model identity | string | Model name derived from directory name
model versions | integer set | Available versions derived from numeric subdirectories
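To illustrate how model identities and versions fall out of the directory names, the following sketch walks a repository and prints one "<model-name> <version>" pair per line. list_models is a hypothetical helper name; the sketch only approximates the server's own discovery, and it assumes that non-numeric and zero-prefixed directory names are skipped.

```shell
# list_models is a hypothetical helper; it prints "<model-name> <version>" pairs
# derived purely from directory names under the repository root.
list_models() {
  repo="$1"
  for version_dir in "$repo"/*/*/; do
    [ -d "$version_dir" ] || continue
    version="$(basename "$version_dir")"
    # Assumption: skip names that are not plain positive integers.
    case "$version" in
      0*|*[!0-9]*|'') continue ;;
    esac
    model="$(basename "$(dirname "$version_dir")")"
    echo "$model $version"
  done
}
```

Calling list_models model_repository on a repository that follows the convention lists every model and version that the layout exposes. Which of those versions the server actually serves depends on each model's version policy.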

Usage Examples

Minimal ONNX Model Repository

# 1. Create the repository structure
mkdir -p model_repository/simple_onnx/1/

# 2. Copy the model file under the default name model.onnx (Triton infers the ONNX Runtime backend from it)
cp my_model.onnx model_repository/simple_onnx/1/model.onnx

# 3. Launch the server (no config.pbtxt needed: Triton auto-completes the configuration for ONNX models)
tritonserver --model-repository=model_repository

Multi-Model Repository with Multiple Versions

# Create a repository with two models: text_classifier (two versions) and image_detector (one version)
mkdir -p model_repository/text_classifier/1/
mkdir -p model_repository/text_classifier/2/
mkdir -p model_repository/image_detector/1/

cp classifier_v1.onnx model_repository/text_classifier/1/model.onnx
cp classifier_v2.onnx model_repository/text_classifier/2/model.onnx
cp detector.plan model_repository/image_detector/1/model.plan

# Add explicit configuration for TensorRT model
cat > model_repository/image_detector/config.pbtxt << 'EOF'
name: "image_detector"
platform: "tensorrt_plan"
max_batch_size: 8
input [
  { name: "input", data_type: TYPE_FP32, dims: [ 3, 640, 640 ] }
]
output [
  { name: "detections", data_type: TYPE_FP32, dims: [ 100, 6 ] }
]
EOF
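Placing two version directories under text_classifier does not by itself make both versions available: unless a version policy is specified, Triton serves only the latest version. The sketch below adds a partial config.pbtxt that requests all versions. It assumes auto-complete fills in the inputs and outputs for the ONNX model, and it reuses the model_repository/text_classifier path from the example above.

```shell
# Assumes the model_repository/text_classifier layout from the example above.
mkdir -p model_repository/text_classifier

# Partial configuration: only the version policy is set explicitly.
# Assumption: the remaining fields are auto-completed by Triton for ONNX models.
cat > model_repository/text_classifier/config.pbtxt << 'EOF'
name: "text_classifier"
platform: "onnxruntime_onnx"
version_policy: { all { } }
EOF
```

Other policies follow the same shape, for example version_policy: { latest { num_versions: 2 } } or version_policy: { specific { versions: [ 1 ] } }.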

Cloud Storage Repository (S3)

# Triton supports S3, GCS, and Azure Blob Storage
# Same directory structure, different path prefix
tritonserver --model-repository=s3://my-bucket/models

# Structure in S3:
# s3://my-bucket/models/
#     model_a/
#         config.pbtxt
#         1/
#             model.onnx

Related Pages

Implements Principle
