
Heuristic:Onnx Protobuf 2GB Limit Workaround

From Leeroopedia




Knowledge Sources
Domains Serialization, Large_Models
Last Updated 2026-02-10 02:00 GMT

Overview

Workaround for the Protocol Buffers 2GB serialization limit by using external data storage for large ONNX models.

Description

Protocol Buffers has a hard limit of 2 GiB (2,147,483,648 bytes) for a single serialized message. Since ONNX models store tensor weights inline in the ModelProto by default, large models (e.g., LLMs with billions of parameters) easily exceed this limit. The ONNX library provides external data storage as the standard workaround: tensor weights are written to separate files on disk, and the ModelProto stores only references (file path, offset, length) to these external files. This approach allows models of any size to be serialized and loaded correctly.

Usage

Use this heuristic when:

  • Saving a model that has total tensor data approaching or exceeding 2GB
  • Getting a `ValueError` with message "The proto size is larger than the 2 GB limit"
  • Building large models programmatically (use `ModelContainer` to defer serialization)
  • Validating large models (use `check_model` with a file path, not bytes)
  • Running shape inference on large models (use `infer_shapes_path` instead of `infer_shapes`)

The Insight (Rule of Thumb)

  • Action: Use `onnx.save_model()` with `save_as_external_data=True` for any model that might exceed 2GB. Use file-path-based APIs (`check_model(path)`, `infer_shapes_path()`) for large model validation and inference.
  • Value: The `MAXIMUM_PROTOBUF` constant is 2,147,483,648 bytes (2 GiB).
  • Trade-off: External data adds file I/O overhead and requires the external data files to remain alongside the .onnx file at known relative paths. Models are no longer self-contained single files.
  • Alternative: Use `ModelContainer` class to build large models in-memory without premature serialization, then save with external data at the end.

Reasoning

The 2GB limit is a fundamental Protocol Buffers constraint, not an ONNX design choice. Since ONNX uses protobuf for serialization, this limit cascades to all ONNX models. The external data mechanism was introduced specifically to handle modern deep learning models (LLMs, large vision models) whose weights far exceed 2GB. The checker, shape inference, and version converter all have path-based APIs that stream data from disk instead of loading the entire model into a single protobuf message.

The `extract_model` utility also automatically handles this: when an extracted sub-model exceeds 2GB, it saves external data to `output_path.data` automatically.

Code evidence from `onnx/checker.py:35-36`:

# Limitation of single protobuf file is 2GiB
MAXIMUM_PROTOBUF = 2147483648

Error handling from `onnx/serialization.py:104-109`:

try:
    result = proto.SerializeToString()
except ValueError as e:
    if proto.ByteSize() >= onnx.checker.MAXIMUM_PROTOBUF:
        raise ValueError(
            "The proto size is larger than the 2 GB limit. "
            "Please use save_as_external_data to save tensors separately from the model file."
        ) from e

Path-based checking from `onnx/checker.py:137-139` (docstring):

# If model is a path, the function checks model path first.
# If the model bytes size is larger than 2GB, function should be
# called using model path.

ModelContainer purpose from `onnx/model_container.py:4-5`:

# Implements an API to store large tensors outside the main ModelProto,
# it avoids copying large initializers when defining the model...
