Workflow:Onnx Onnx External Data Handling

Knowledge Sources	ONNX External Data Guide, Python API Overview
Domains	ML_Infrastructure, Large_Model_Support
Last Updated	2026-02-10 02:30 GMT

Overview

End-to-end process for managing large ONNX models by converting tensor data to external files and loading models with externally stored weights.

Description

This workflow covers ONNX models whose tensor data (weights, biases, embeddings) exceeds the 2 GB protobuf serialization limit or is too large to embed efficiently in the model file. The external data mechanism stores raw tensor data in separate binary files, and each affected TensorProto entry in the model records a reference to its file. This allows models of arbitrary size while keeping the model structure file compact. The workflow covers both directions: converting an in-memory model to use external data, and loading a model that already uses external data files.

Usage

Execute this workflow when you need to:

Save a model with weights exceeding 2GB that cannot be serialized as a single protobuf
Reduce the size of the model structure file by storing large tensors externally
Load and manipulate models that use external data storage
Convert between inline and external data representations for deployment flexibility
Share models where the structure and weights need to be managed separately

Execution Steps

Step 1: Load the Model

Load the ONNX model into memory. If the model already uses external data, the loader resolves external data file references relative to the model file's directory. For models with external data in a non-standard location, disable automatic external data loading and specify the data directory explicitly.

Key considerations:

By default, external data is loaded automatically from the same directory as the model file
Pass load_external_data=False to onnx.load to load only the model structure without resolving tensor data
Call onnx.external_data_helper.load_external_data_for_model with an explicit base directory when the data files are stored elsewhere
External data files are referenced by relative paths from the model file location

Step 2: Convert to External Data Format

Transform in-memory tensor data into external data references. This step marks the model's TensorProto entries as externally stored so that they point to external files instead of carrying raw data inline. The data itself is written to disk when the model is saved (Step 3). A size threshold controls which tensors are externalized: only tensors at or above the threshold are converted.

Key considerations:

The all_tensors_to_one_file parameter controls whether all tensors share a single external file or each gets its own file
The location parameter specifies the external file name as a path relative to the directory of the output model file; it applies when all tensors share one file
The size_threshold parameter (in bytes, default 1024) determines the minimum tensor size for externalization; set it to 0 to externalize everything
The convert_attribute parameter controls whether attribute tensors (e.g., in Constant nodes) are also externalized

Step 3: Save with External Data

Persist the model and its external data files to disk. The save function writes the tensor data to the specified external files and saves the model structure with external data references.

Key considerations:

The save_as_external_data parameter on save_model combines conversion and saving into a single call
External data files are written relative to the directory that contains the model file
Ensure the output directory exists and has sufficient disk space for the external data
The model file itself remains small, containing only the graph structure and external data pointers

Step 4: Verify External Data Integrity

Reload the saved model to verify that external data references resolve correctly and the model validates successfully. Run the checker against the model path (not the in-memory proto) to validate models exceeding 2GB.

Key considerations:

Use onnx.checker.check_model with the file path for large models rather than the in-memory ModelProto
Use onnx.shape_inference.infer_shapes_path for shape inference on large models; it reads the model from a path and writes the result to a path instead of returning a ModelProto
Verify that the external data files exist at the expected relative paths
The ModelContainer class provides an alternative for managing large models with multiple external data blobs
