Implementation:Triton inference server Server Model Repository Directory Convention
| Knowledge Sources | |
|---|---|
| Domains | MLOps, Model_Serving |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
Concrete filesystem convention for organizing model artifacts into Triton Inference Server's required directory hierarchy.
Description
The Model Repository Directory Convention defines the exact filesystem layout that Triton Inference Server expects when loading models. Each model occupies a named directory containing numerically named version subdirectories and an optional config.pbtxt configuration file. The server scans this directory structure at startup, and again periodically when repository polling is enabled, to discover and load models.
Usage
Use this convention whenever preparing models for deployment on Triton Inference Server. It is the first step in any Triton deployment workflow: the model repository must be correctly structured before the server is launched.
Code Reference
Source Location
- Repository: triton-inference-server/server
- File: docs/user_guide/model_repository.md
- Lines: L36-72 (Repository Layout), L267-378 (Model file naming per backend)
Signature
# Create model repository with required hierarchy
mkdir -p <model-repository-path>/<model-name>/<version>/
# Example: Deploy an ONNX model
mkdir -p models/densenet_onnx/1/
cp model.onnx models/densenet_onnx/1/model.onnx
# Optional: Add configuration
cat > models/densenet_onnx/config.pbtxt << 'EOF'
name: "densenet_onnx"
platform: "onnxruntime_onnx"
max_batch_size: 0
input [
  { name: "data_0", data_type: TYPE_FP32, dims: [ 3, 224, 224 ] }
]
output [
  { name: "fc6_1", data_type: TYPE_FP32, dims: [ 1000 ] }
]
EOF
Import
# No import needed — this is a filesystem convention
# Consumed by: tritonserver --model-repository=<path>
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model-repository-path | string (filesystem path) | Yes | Root directory for all models |
| model-name | string (directory name) | Yes | Name of the model (becomes the model identifier) |
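| version | integer (directory name) | Yes | Numerically named version directory (e.g., 1, 2, 3); at least one is required per model, and names that are non-numeric or start with 0 are ignored |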
| model-definition-file | file | Yes | Backend-specific model file (model.onnx, model.plan, model.pt, model.py) |
| config.pbtxt | protobuf text file | No | Model configuration (optional for auto-complete backends) |
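The model-definition-file row implies a fixed pairing between the default file name and the backend that loads it. A minimal sketch of that pairing, assuming the four default file names listed in the table; `backend_for_file` is a hypothetical helper and not part of Triton, and the backend names are the identifiers commonly used in a config.pbtxt `backend` field.

```shell
#!/usr/bin/env bash
# backend_for_file is a hypothetical helper, not part of Triton.
# It maps a default model file name to the backend that conventionally loads it.
backend_for_file() {
  case "$(basename "$1")" in
    model.onnx) echo "onnxruntime" ;;
    model.plan) echo "tensorrt" ;;
    model.pt)   echo "pytorch" ;;
    model.py)   echo "python" ;;
    # Any other name needs an explicit config.pbtxt (assumption of this sketch).
    *)          echo "unknown"; return 1 ;;
  esac
}

backend_for_file models/densenet_onnx/1/model.onnx   # prints: onnxruntime
```

A non-default file name makes the helper return a non-zero status, which is a convenient signal that the model needs an explicit configuration.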
Outputs
| Name | Type | Description |
|---|---|---|
| model repository | directory tree | Well-formed directory hierarchy readable by tritonserver |
| model identity | string | Model name derived from directory name |
| model versions | integer set | Available versions derived from numeric subdirectories |
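The outputs above (model identity and model versions) can be derived from directory names alone. The following is a rough sketch of that derivation, not Triton's own discovery code; `list_model_versions` is a hypothetical helper, the demo paths are illustrative, and the filter assumes that version directories which are non-numeric or start with 0 are skipped.

```shell
#!/usr/bin/env bash
set -eu

# list_model_versions is a hypothetical helper, not a Triton command.
# Prints "<model-name> <version>" for every numerically named version
# directory found directly under each model directory.
list_model_versions() {
  local repo="$1" model_dir version_dir version
  for model_dir in "$repo"/*/; do
    [ -d "$model_dir" ] || continue
    for version_dir in "$model_dir"*/; do
      [ -d "$version_dir" ] || continue
      version="$(basename "$version_dir")"
      # Keep only names made of digits that do not start with 0.
      case "$version" in
        ''|0*|*[!0-9]*) continue ;;
      esac
      echo "$(basename "$model_dir") $version"
    done
  done
}

# Demo on a throwaway repository; "assets" is not a version and is skipped.
repo="$(mktemp -d)"
mkdir -p "$repo/text_classifier/1" "$repo/text_classifier/2" "$repo/text_classifier/assets"
list_model_versions "$repo"
```

Running a check like this before launching the server is a cheap way to spot a model directory that has no usable version subdirectory.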
Usage Examples
Minimal ONNX Model Repository
# 1. Create the repository structure
mkdir -p model_repository/simple_onnx/1/
# 2. Copy the model file under the default name model.onnx, from which Triton can infer the backend
cp my_model.onnx model_repository/simple_onnx/1/model.onnx
# 3. Launch the server (no config.pbtxt needed: the ONNX Runtime backend can auto-complete the configuration)
tritonserver --model-repository=model_repository
Multi-Model Repository with Multiple Versions
# Create repository with two models, each with two versions
mkdir -p model_repository/text_classifier/1/
mkdir -p model_repository/text_classifier/2/
mkdir -p model_repository/image_detector/1/
cp classifier_v1.onnx model_repository/text_classifier/1/model.onnx
cp classifier_v2.onnx model_repository/text_classifier/2/model.onnx
cp detector.plan model_repository/image_detector/1/model.plan
# Add explicit configuration for TensorRT model
cat > model_repository/image_detector/config.pbtxt << 'EOF'
name: "image_detector"
platform: "tensorrt_plan"
max_batch_size: 8
input [
  { name: "input", data_type: TYPE_FP32, dims: [ 3, 640, 640 ] }
]
output [
  { name: "detections", data_type: TYPE_FP32, dims: [ 100, 6 ] }
]
EOF
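When a model carries several version directories, it helps to know which one is numerically highest, since Triton's default version policy serves only the latest version unless config.pbtxt specifies otherwise. A minimal sketch for finding it, assuming versions are compared as integers; `latest_version` is a hypothetical helper, not a Triton command, and the demo paths are illustrative.

```shell
#!/usr/bin/env bash
set -eu

# latest_version is a hypothetical helper, not a Triton command.
# Prints the highest numerically named version directory of one model,
# or returns non-zero if the model has no usable version directory.
latest_version() {
  local model_dir="$1" version_dir version latest=0
  for version_dir in "$model_dir"/*/; do
    [ -d "$version_dir" ] || continue
    version="$(basename "$version_dir")"
    # Skip names that are not plain positive integers.
    case "$version" in
      ''|0*|*[!0-9]*) continue ;;
    esac
    # Numeric comparison, so 10 ranks above 9.
    if [ "$version" -gt "$latest" ]; then latest="$version"; fi
  done
  if [ "$latest" -gt 0 ]; then echo "$latest"; else return 1; fi
}

repo="$(mktemp -d)"
mkdir -p "$repo/text_classifier/1" "$repo/text_classifier/2" "$repo/text_classifier/10"
latest_version "$repo/text_classifier"   # prints: 10
```

The comparison is numeric rather than lexicographic because a plain sorted directory listing would place 10 before 2.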
Cloud Storage Repository (S3)
# Triton supports S3, GCS, and Azure Blob Storage
# Same directory structure, different path prefix
tritonserver --model-repository=s3://my-bucket/models
# Structure in S3:
# s3://my-bucket/models/
#   model_a/
#     config.pbtxt
#     1/
#       model.onnx