Principle:Pytorch Serve Model Archiving
Overview
Model Archiving is the principle of packaging model artifacts -- weights, handler code, configuration, and dependency specifications -- into a self-contained, portable archive format (.mar) for reproducible deployment. The Model Archive (.mar) is the fundamental deployment unit in TorchServe, ensuring that all artifacts required to serve a model travel together as a single immutable package.
| Field | Value |
|---|---|
| Principle Name | Model Archiving |
| Workflow | Model_Deployment |
| Domains | Model_Packaging, DevOps |
| Knowledge Sources | TorchServe |
| Last Updated | 2026-02-13 00:00 GMT |
Description
The model archive format addresses a fundamental challenge in ML deployment: ensuring that all components needed to serve a model are correctly bundled, versioned, and transportable. Without archiving, deploying a model requires manually coordinating model weights, handler scripts, configuration files, label mappings, and Python dependencies across environments -- a process prone to drift and failure.
Archive Structure
A .mar file is a ZIP archive with the following structure:
```
my_model.mar
+-- MAR-INF/
|   +-- MANIFEST.json        # Metadata: model name, version, handler, runtime
+-- model.pt                 # Serialized model weights (TorchScript, state_dict, ONNX, .so)
+-- model.py                 # (Optional) Model class definition for eager mode
+-- handler.py               # Inference handler (or reference to built-in handler)
+-- model_config.yaml        # (Optional) YAML serving configuration
+-- index_to_name.json       # (Optional) Class label mapping
+-- requirements.txt         # (Optional) Python dependencies
+-- extra_file_1.json        # (Optional) Additional files (tokenizer configs, etc.)
```
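Because a `.mar` file is a standard ZIP archive, it can be built and inspected with stdlib tooling alone. The sketch below assembles a minimal, illustrative archive with Python's `zipfile` module and lists its members; the file contents here are placeholders, not output of `torch-model-archiver`.

```python
import json
import tempfile
import zipfile
from pathlib import Path

# Build a minimal, illustrative .mar by hand (a real archive is produced
# by torch-model-archiver; this only mirrors the documented layout).
mar_path = Path(tempfile.mkdtemp()) / "my_model.mar"
manifest = {
    "runtime": "python",
    "model": {
        "modelName": "my_model",
        "serializedFile": "model.pt",
        "handler": "handler.py",
        "modelVersion": "1.0",
    },
}
with zipfile.ZipFile(mar_path, "w") as zf:
    zf.writestr("MAR-INF/MANIFEST.json", json.dumps(manifest))
    zf.writestr("model.pt", b"fake weights")      # placeholder bytes
    zf.writestr("handler.py", "# handler stub\n")  # placeholder handler

# A .mar is a plain ZIP: list its members.
with zipfile.ZipFile(mar_path) as zf:
    names = zf.namelist()
print(names)
```

The same `zipfile.ZipFile(...).namelist()` call works on a real archive, which is a convenient way to verify what actually got packaged.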
MANIFEST.json
The manifest is auto-generated during archiving and contains:
```json
{
    "createdOn": "2026-02-13T00:00:00Z",
    "runtime": "python",
    "model": {
        "modelName": "my_model",
        "serializedFile": "model.pt",
        "handler": "handler.py",
        "modelFile": "model.py",
        "modelVersion": "1.0",
        "configFile": "model_config.yaml"
    },
    "archiverVersion": "0.11.1"
}
```
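Since the manifest is plain JSON, consumers can read it with `json.loads` and pull out the fields the server needs. A minimal sketch, using the manifest text shown on this page (illustrative values, not a generated manifest):

```python
import json

# Manifest text as shown above (illustrative values).
manifest_text = """
{
  "createdOn": "2026-02-13T00:00:00Z",
  "runtime": "python",
  "model": {
    "modelName": "my_model",
    "serializedFile": "model.pt",
    "handler": "handler.py",
    "modelFile": "model.py",
    "modelVersion": "1.0",
    "configFile": "model_config.yaml"
  },
  "archiverVersion": "0.11.1"
}
"""
manifest = json.loads(manifest_text)
model = manifest["model"]

# The serving side can discover everything it needs from the manifest alone.
handler = model["handler"]
weights = model["serializedFile"]
version = model["modelVersion"]
print(handler, weights, version)
```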
Archive Formats
The archiver supports three output formats:
| Format | Flag Value | Description | Use Case |
|---|---|---|---|
| Default | `default` | ZIP archive with `.mar` extension | Standard deployment, model store |
| TGZ | `tgz` | Gzipped tar archive | Integration with container pipelines |
| No Archive | `no-archive` | Flat directory (no compression) | Development and debugging |
Key Design Decisions
- Immutability: Once created, a `.mar` file is an immutable snapshot. Redeployment requires creating a new archive, preventing in-place modification that could cause inconsistencies.
- Self-Containment: All artifacts needed to serve the model are inside the archive. The only external dependencies are the Python runtime and the packages specified in `requirements.txt`.
- Manifest-Driven: The `MANIFEST.json` provides a machine-readable description of the archive contents, enabling the serving infrastructure to automatically discover the handler, model file, and configuration without convention-based assumptions.
- Force Overwrite Protection: By default, the archiver refuses to overwrite an existing `.mar` file. The `--force` flag must be explicitly provided, preventing accidental overwrite of production archives.
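The force-overwrite rule can be illustrated with a small guard. `archive_output_path` below is a hypothetical helper written for this sketch, not part of the archiver's API; it only mirrors the behavior described above.

```python
import tempfile
from pathlib import Path


def archive_output_path(export_path: str, model_name: str, force: bool = False) -> Path:
    """Hypothetical sketch of the overwrite guard: refuse to return a
    destination that already exists unless force is set."""
    dest = Path(export_path) / f"{model_name}.mar"
    if dest.exists() and not force:
        raise FileExistsError(f"{dest} exists; pass force=True (--force) to overwrite")
    return dest


# Usage: the first call succeeds; a second call on an existing file raises
# unless force=True is passed, mirroring the --force CLI flag.
with tempfile.TemporaryDirectory() as d:
    p = archive_output_path(d, "resnet18")
    p.write_bytes(b"")                        # simulate a created archive
    try:
        archive_output_path(d, "resnet18")    # no force -> error
        raised = False
    except FileExistsError:
        raised = True
    forced = archive_output_path(d, "resnet18", force=True)  # force -> ok
print(raised, forced.name)
```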
Usage
Command-Line Interface
```shell
# Basic model archiving
torch-model-archiver \
    --model-name resnet18 \
    --version 1.0 \
    --serialized-file resnet18.pt \
    --handler image_classifier \
    --export-path model_store/

# Full archiving with all options
torch-model-archiver \
    --model-name bert_classifier \
    --version 2.0 \
    --model-file model.py \
    --serialized-file bert_weights.pt \
    --handler handler.py \
    --extra-files "tokenizer_config.json,vocab.txt,index_to_name.json" \
    --config-file model_config.yaml \
    --requirements-file requirements.txt \
    --export-path model_store/ \
    --archive-format default \
    --force
```
Programmatic API
```python
from model_archiver import ModelArchiverConfig
from model_archiver.model_packaging import generate_model_archive

config = ModelArchiverConfig(
    model_name="resnet18",
    handler="image_classifier",
    version="1.0",
    serialized_file="resnet18.pt",
    export_path="model_store/",
    force=True,
)
generate_model_archive(config)
```
Deployment Workflow
- Train the model and save weights.
- Archive the model with `torch-model-archiver`.
- Deploy by placing the `.mar` in the model store directory.
- Register the model via the Management API or at server startup.
Theoretical Basis
Immutable Artifact Pattern
Model archiving follows the Immutable Artifact pattern from continuous delivery. Each archive is a versioned, immutable build artifact that flows through the deployment pipeline unchanged. This ensures:
- Reproducibility: The same `.mar` file produces the same serving behavior regardless of when or where it is deployed.
- Traceability: Each archive can be traced back to its source artifacts via the manifest version.
- Rollback Safety: Previous versions of the archive can be redeployed instantly.
Self-Contained Deployment Unit
The .mar format is analogous to container images (Docker) or serverless deployment packages (AWS Lambda ZIP). It bundles code, data, and configuration into a single unit, reducing the configuration surface area that must be managed during deployment.
Separation of Build and Runtime
The archiver creates a clear boundary between the build phase (training, serialization, packaging) and the runtime phase (serving). This separation enables:
- Different teams to own build vs. runtime.
- CI/CD pipelines to validate archives before deployment.
- Model registries to store and version archives independently of the serving infrastructure.
Related Pages
- Implementation:Pytorch_Serve_Generate_Model_Archive - The `generate_model_archive()` function that creates archives
- Principle:Pytorch_Serve_Model_Artifact_Configuration - YAML config files bundled inside archives
- Principle:Pytorch_Serve_Inference_Handler_Development - Handlers packaged within archives
- Principle:Pytorch_Serve_Model_Registration - Registering archived models on a running server