
Principle:Pytorch Serve Model Archiving

From Leeroopedia

Overview

Model Archiving is the principle of packaging model artifacts -- weights, handler code, configuration, and dependency specifications -- into a self-contained, portable archive for reproducible deployment. The resulting Model Archive (.mar) is the fundamental deployment unit in TorchServe, ensuring that every artifact required to serve a model travels as a single immutable package.

Field              Value
Principle Name     Model Archiving
Workflow           Model_Deployment
Domains            Model_Packaging, DevOps
Knowledge Sources  TorchServe
Last Updated       2026-02-13 00:00 GMT

Description

The model archive format addresses a fundamental challenge in ML deployment: ensuring that all components needed to serve a model are correctly bundled, versioned, and transportable. Without archiving, deploying a model requires manually coordinating model weights, handler scripts, configuration files, label mappings, and Python dependencies across environments -- a process prone to drift and failure.

Archive Structure

A .mar file is a ZIP archive with the following structure:

my_model.mar
  +-- MAR-INF/
  |     +-- MANIFEST.json       # Metadata: model name, version, handler, runtime
  +-- model.pt                  # Serialized model weights (TorchScript, state_dict, ONNX, .so)
  +-- model.py                  # (Optional) Model class definition for eager mode
  +-- handler.py                # Inference handler (or reference to built-in handler)
  +-- model_config.yaml         # (Optional) YAML serving configuration
  +-- index_to_name.json        # (Optional) Class label mapping
  +-- requirements.txt          # (Optional) Python dependencies
  +-- extra_file_1.json         # (Optional) Additional files (tokenizer configs, etc.)

MANIFEST.json

The manifest is auto-generated during archiving and contains:

{
  "createdOn": "2026-02-13T00:00:00Z",
  "runtime": "python",
  "model": {
    "modelName": "my_model",
    "serializedFile": "model.pt",
    "handler": "handler.py",
    "modelFile": "model.py",
    "modelVersion": "1.0",
    "configFile": "model_config.yaml"
  },
  "archiverVersion": "0.11.1"
}
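Since the manifest is plain JSON inside a ZIP, serving infrastructure (or a CI check) can read it without extracting the archive. A minimal sketch, using an in-memory toy archive as a stand-in for a real .mar:

```python
import io
import json
import zipfile

# Build a tiny in-memory archive first (stand-in for a real .mar).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(
        "MAR-INF/MANIFEST.json",
        json.dumps({"runtime": "python", "model": {"modelName": "my_model"}}),
    )

# Read the manifest in place: no extraction needed.
with zipfile.ZipFile(buf) as zf:
    manifest = json.loads(zf.read("MAR-INF/MANIFEST.json"))

print(manifest["model"]["modelName"])  # -> my_model
```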

Archive Formats

The archiver supports three output formats:

Format      Flag Value   Description                        Use Case
Default     default      ZIP archive with .mar extension    Standard deployment, model store
TGZ         tgz          Gzipped tar archive                Integration with container pipelines
No Archive  no-archive   Flat directory (no compression)    Development and debugging
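The table above maps cleanly onto output names. This hypothetical helper (illustrative only, not part of TorchServe) shows the artifact each flag value produces:

```python
# Hypothetical helper mapping the --archive-format flag value to the
# output artifact; names and suffixes follow the table above.
def artifact_name(model_name: str, archive_format: str = "default") -> str:
    suffixes = {
        "default": ".mar",     # ZIP archive
        "tgz": ".tar.gz",      # gzipped tar archive
        "no-archive": "/",     # flat directory, no compression
    }
    try:
        return model_name + suffixes[archive_format]
    except KeyError:
        raise ValueError(f"unknown archive format: {archive_format!r}")

print(artifact_name("resnet18"))          # -> resnet18.mar
print(artifact_name("resnet18", "tgz"))   # -> resnet18.tar.gz
```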

Key Design Decisions

  • Immutability: Once created, a .mar file is an immutable snapshot. Redeployment requires creating a new archive, preventing in-place modification that could cause inconsistencies.
  • Self-Containment: All artifacts needed to serve the model are inside the archive. The only external dependency is the Python runtime and packages specified in requirements.txt.
  • Manifest-Driven: The MANIFEST.json provides a machine-readable description of the archive contents, enabling the serving infrastructure to automatically discover the handler, model file, and configuration without convention-based assumptions.
  • Force Overwrite Protection: By default, the archiver refuses to overwrite an existing .mar file. The --force flag must be explicitly provided, preventing accidental overwrite of production archives.
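The force-overwrite guard in the last bullet is simple to state in code. This is a behavioral sketch of that check, not TorchServe's actual implementation:

```python
import os
import tempfile

# Sketch of the overwrite guard: refuse to clobber an existing archive
# unless force is explicitly requested (mirrors the --force flag).
def export_archive(path: str, data: bytes, force: bool = False) -> None:
    if os.path.exists(path) and not force:
        raise FileExistsError(f"{path} exists; pass force=True to overwrite")
    with open(path, "wb") as f:
        f.write(data)

workdir = tempfile.mkdtemp()
mar = os.path.join(workdir, "my_model.mar")

export_archive(mar, b"v1")               # first write succeeds
try:
    export_archive(mar, b"v2")           # refused: archive already exists
except FileExistsError as e:
    print("blocked:", e)
export_archive(mar, b"v2", force=True)   # explicit force overwrites
```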

Usage

Command-Line Interface

# Basic model archiving
torch-model-archiver \
  --model-name resnet18 \
  --version 1.0 \
  --serialized-file resnet18.pt \
  --handler image_classifier \
  --export-path model_store/

# Full archiving with all options
torch-model-archiver \
  --model-name bert_classifier \
  --version 2.0 \
  --model-file model.py \
  --serialized-file bert_weights.pt \
  --handler handler.py \
  --extra-files "tokenizer_config.json,vocab.txt,index_to_name.json" \
  --config-file model_config.yaml \
  --requirements-file requirements.txt \
  --export-path model_store/ \
  --archive-format default \
  --force

Programmatic API

from model_archiver import ModelArchiverConfig
from model_archiver.model_packaging import generate_model_archive

config = ModelArchiverConfig(
    model_name="resnet18",
    handler="image_classifier",
    version="1.0",
    serialized_file="resnet18.pt",
    export_path="model_store/",
    force=True,
)

generate_model_archive(config)

Deployment Workflow

  1. Train the model and save weights.
  2. Archive the model with torch-model-archiver.
  3. Deploy by placing the .mar in the model store directory.
  4. Register the model via the Management API or at server startup.

Theoretical Basis

Immutable Artifact Pattern

Model archiving follows the Immutable Artifact pattern from continuous delivery. Each archive is a versioned, immutable build artifact that flows through the deployment pipeline unchanged. This ensures:

  • Reproducibility: The same .mar file produces the same serving behavior regardless of when or where it is deployed.
  • Traceability: Each archive can be traced back to its source artifacts via the manifest version.
  • Rollback Safety: Previous versions of the archive can be redeployed instantly.

Self-Contained Deployment Unit

The .mar format is analogous to container images (Docker) or serverless deployment packages (AWS Lambda ZIP). It bundles code, data, and configuration into a single unit, reducing the configuration surface area that must be managed during deployment.

Separation of Build and Runtime

The archiver creates a clear boundary between the build phase (training, serialization, packaging) and the runtime phase (serving). This separation enables:

  • Different teams to own build vs. runtime.
  • CI/CD pipelines to validate archives before deployment.
  • Model registries to store and version archives independently of the serving infrastructure.
