
Principle:Pytorch Serve Model Archiving

From Leeroopedia

Overview

Model Archiving is the principle of packaging model artifacts -- weights, handler code, configuration, and dependency specifications -- into a self-contained, portable archive for reproducible deployment. The resulting Model Archive (.mar) is the fundamental deployment unit in TorchServe, ensuring that every artifact required to serve a model travels as a single immutable package.

Field              Value
Principle Name     Model Archiving
Workflow           Model_Deployment
Domains            Model_Packaging, DevOps
Knowledge Sources  TorchServe
Last Updated       2026-02-13 00:00 GMT

Description

The model archive format addresses a fundamental challenge in ML deployment: ensuring that all components needed to serve a model are correctly bundled, versioned, and transportable. Without archiving, deploying a model requires manually coordinating model weights, handler scripts, configuration files, label mappings, and Python dependencies across environments -- a process prone to drift and failure.

Archive Structure

A .mar file is a ZIP archive with the following structure:

my_model.mar
  +-- MAR-INF/
  |     +-- MANIFEST.json       # Metadata: model name, version, handler, runtime
  +-- model.pt                  # Serialized model weights (TorchScript, state_dict, ONNX, .so)
  +-- model.py                  # (Optional) Model class definition for eager mode
  +-- handler.py                # Inference handler (or reference to built-in handler)
  +-- model_config.yaml         # (Optional) YAML serving configuration
  +-- index_to_name.json        # (Optional) Class label mapping
  +-- requirements.txt          # (Optional) Python dependencies
  +-- extra_file_1.json         # (Optional) Additional files (tokenizer configs, etc.)

MANIFEST.json

The manifest is auto-generated during archiving and contains:

{
  "createdOn": "2026-02-13T00:00:00Z",
  "runtime": "python",
  "model": {
    "modelName": "my_model",
    "serializedFile": "model.pt",
    "handler": "handler.py",
    "modelFile": "model.py",
    "modelVersion": "1.0",
    "configFile": "model_config.yaml"
  },
  "archiverVersion": "0.11.1"
}
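Since the manifest is plain JSON inside a ZIP, serving infrastructure (or a CI check) can read it without extracting the archive. A minimal sketch, using an in-memory toy archive as a stand-in for a real .mar:

```python
import io
import json
import zipfile

# Build a tiny in-memory archive first (stand-in for a real .mar).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(
        "MAR-INF/MANIFEST.json",
        json.dumps({"runtime": "python", "model": {"modelName": "my_model"}}),
    )

# Read the manifest in place: no extraction needed.
with zipfile.ZipFile(buf) as zf:
    manifest = json.loads(zf.read("MAR-INF/MANIFEST.json"))

print(manifest["model"]["modelName"])  # -> my_model
```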

Archive Formats

The archiver supports three output formats:

Format      Flag Value   Description                        Use Case
Default     default      ZIP archive with .mar extension    Standard deployment, model store
TGZ         tgz          Gzipped tar archive                Integration with container pipelines
No Archive  no-archive   Flat directory (no compression)    Development and debugging
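The table above maps cleanly onto output names. This hypothetical helper (illustrative only, not part of TorchServe) shows the artifact each flag value produces:

```python
# Hypothetical helper mapping the --archive-format flag value to the
# output artifact; names and suffixes follow the table above.
def artifact_name(model_name: str, archive_format: str = "default") -> str:
    suffixes = {
        "default": ".mar",     # ZIP archive
        "tgz": ".tar.gz",      # gzipped tar archive
        "no-archive": "/",     # flat directory, no compression
    }
    try:
        return model_name + suffixes[archive_format]
    except KeyError:
        raise ValueError(f"unknown archive format: {archive_format!r}")

print(artifact_name("resnet18"))          # -> resnet18.mar
print(artifact_name("resnet18", "tgz"))   # -> resnet18.tar.gz
```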

Key Design Decisions

  • Immutability: Once created, a .mar file is an immutable snapshot. Redeployment requires creating a new archive, preventing in-place modification that could cause inconsistencies.
  • Self-Containment: All artifacts needed to serve the model are inside the archive. The only external dependency is the Python runtime and packages specified in requirements.txt.
  • Manifest-Driven: The MANIFEST.json provides a machine-readable description of the archive contents, enabling the serving infrastructure to automatically discover the handler, model file, and configuration without convention-based assumptions.
  • Force Overwrite Protection: By default, the archiver refuses to overwrite an existing .mar file. The --force flag must be explicitly provided, preventing accidental overwrite of production archives.
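The force-overwrite guard in the last bullet is simple to state in code. This is a behavioral sketch of that check, not TorchServe's actual implementation:

```python
import os
import tempfile

# Sketch of the overwrite guard: refuse to clobber an existing archive
# unless force is explicitly requested (mirrors the --force flag).
def export_archive(path: str, data: bytes, force: bool = False) -> None:
    if os.path.exists(path) and not force:
        raise FileExistsError(f"{path} exists; pass force=True to overwrite")
    with open(path, "wb") as f:
        f.write(data)

workdir = tempfile.mkdtemp()
mar = os.path.join(workdir, "my_model.mar")

export_archive(mar, b"v1")               # first write succeeds
try:
    export_archive(mar, b"v2")           # refused: archive already exists
except FileExistsError as e:
    print("blocked:", e)
export_archive(mar, b"v2", force=True)   # explicit force overwrites
```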

Usage

Command-Line Interface

# Basic model archiving
torch-model-archiver \
  --model-name resnet18 \
  --version 1.0 \
  --serialized-file resnet18.pt \
  --handler image_classifier \
  --export-path model_store/

# Full archiving with all options
torch-model-archiver \
  --model-name bert_classifier \
  --version 2.0 \
  --model-file model.py \
  --serialized-file bert_weights.pt \
  --handler handler.py \
  --extra-files "tokenizer_config.json,vocab.txt,index_to_name.json" \
  --config-file model_config.yaml \
  --requirements-file requirements.txt \
  --export-path model_store/ \
  --archive-format default \
  --force

Programmatic API

from model_archiver import ModelArchiverConfig
from model_archiver.model_packaging import generate_model_archive

config = ModelArchiverConfig(
    model_name="resnet18",
    handler="image_classifier",
    version="1.0",
    serialized_file="resnet18.pt",
    export_path="model_store/",
    force=True,
)

generate_model_archive(config)

Deployment Workflow

  1. Train the model and save weights.
  2. Archive the model with torch-model-archiver.
  3. Deploy by placing the .mar in the model store directory.
  4. Register the model via the Management API or at server startup.

Theoretical Basis

Immutable Artifact Pattern

Model archiving follows the Immutable Artifact pattern from continuous delivery. Each archive is a versioned, immutable build artifact that flows through the deployment pipeline unchanged. This ensures:

  • Reproducibility: The same .mar file produces the same serving behavior regardless of when or where it is deployed.
  • Traceability: Each archive can be traced back to its source artifacts via the manifest version.
  • Rollback Safety: Previous versions of the archive can be redeployed instantly.

Self-Contained Deployment Unit

The .mar format is analogous to container images (Docker) or serverless deployment packages (AWS Lambda ZIP). It bundles code, data, and configuration into a single unit, reducing the configuration surface area that must be managed during deployment.

Separation of Build and Runtime

The archiver creates a clear boundary between the build phase (training, serialization, packaging) and the runtime phase (serving). This separation enables:

  • Different teams to own build vs. runtime.
  • CI/CD pipelines to validate archives before deployment.
  • Model registries to store and version archives independently of the serving infrastructure.
