Implementation:Huggingface Optimum Save Utils
| Knowledge Sources | |
|---|---|
| Domains | Serialization, Preprocessing |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
A concrete tool, provided by the Huggingface Optimum library, for loading and saving model preprocessors (tokenizers, processors, feature extractors, image processors) alongside optimized models.
Description
This module provides two functions for managing preprocessor persistence:
- maybe_load_preprocessors — Attempts to load all available preprocessors (AutoTokenizer, AutoProcessor, AutoFeatureExtractor, AutoImageProcessor) from a model directory or Hub repo. Each load attempt is wrapped in a try/except so that missing preprocessor types are silently skipped.
- maybe_save_preprocessors — Loads all available preprocessors from a source location and saves them to a destination directory. This ensures that when exporting or optimizing a model, the associated preprocessors are preserved alongside the model files.
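The "try each loader, skip the ones that fail" behaviour described above can be sketched without any dependency on transformers. This is a minimal illustration and not the library's code: `load_available`, `load_tokenizer`, and `load_image_processor` are hypothetical names, and the two loader functions merely stand in for calls such as `AutoTokenizer.from_pretrained`.

```python
from typing import Any, Callable, Iterable, List


def load_available(loaders: Iterable[Callable[[str], Any]], source: str) -> List[Any]:
    """Call each loader on `source` and keep only the results that succeed."""
    loaded = []
    for load in loaders:
        try:
            loaded.append(load(source))
        except Exception:
            # A missing preprocessor type is expected here, so skip it quietly.
            continue
    return loaded


# Hypothetical stand-ins for the real Auto* loaders (assumption, not Optimum code).
def load_tokenizer(source: str) -> str:
    return f"tokenizer for {source}"


def load_image_processor(source: str) -> str:
    raise OSError(f"{source} has no image processor config")


result = load_available([load_tokenizer, load_image_processor], "my-text-model")
print(result)  # only the tokenizer stand-in is returned
```

Catching a broad `Exception` keeps the caller simple but also hides unrelated failures such as network errors, which is worth keeping in mind when an expected preprocessor is missing from the result.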
Usage
Use maybe_save_preprocessors during model export or optimization to ensure all associated preprocessors are saved alongside the exported model. Optimum's model export pipelines call this function internally.
Code Reference
Source Location
- Repository: Huggingface_Optimum
- File: optimum/utils/save_utils.py
- Lines: 1-93
Signature
def maybe_load_preprocessors(
    src_name_or_path: Union[str, Path],
    subfolder: str = "",
    trust_remote_code: bool = False,
) -> List:
    """Attempt to load all available preprocessors from a model source."""

def maybe_save_preprocessors(
    src_name_or_path: Union[str, Path],
    dest_dir: Union[str, Path],
    src_subfolder: str = "",
    trust_remote_code: bool = False,
):
    """Load preprocessors from source and save them to dest_dir.

    Args:
        src_name_or_path: Model name or path to load preprocessors from.
        dest_dir: Directory to save preprocessors to.
        src_subfolder: Subfolder within the model directory.
        trust_remote_code: Whether to allow running arbitrary code.
    """
Import
from optimum.utils.save_utils import maybe_save_preprocessors, maybe_load_preprocessors
I/O Contract
Inputs (maybe_save_preprocessors)
| Name | Type | Required | Description |
|---|---|---|---|
| src_name_or_path | Union[str, Path] | Yes | Model name on Hub or local path to load preprocessors from |
| dest_dir | Union[str, Path] | Yes | Directory to save preprocessors to |
| src_subfolder | str | No | Subfolder within the model directory (default: "") |
| trust_remote_code | bool | No | Allow running arbitrary code from preprocessors (default: False) |
Outputs (maybe_save_preprocessors)
| Name | Type | Description |
|---|---|---|
| (none) | None | Preprocessors are saved as files in dest_dir |
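The load-then-save contract in the tables above can be illustrated with a small, self-contained sketch. It assumes only that each preprocessor exposes a `save_pretrained(directory)` method, as transformers preprocessors do. `StubTokenizer` and `save_all` are hypothetical names, and creating the destination directory up front is an assumption of this sketch, not a statement about the library's behaviour.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterable, Union


class StubTokenizer:
    """Hypothetical stand-in for a preprocessor with a save_pretrained method."""

    def save_pretrained(self, save_directory: Union[str, Path]) -> None:
        path = Path(save_directory) / "tokenizer_config.json"
        path.write_text(json.dumps({"stub": True}))


def save_all(preprocessors: Iterable, dest_dir: Union[str, Path]) -> Path:
    """Write every preprocessor into dest_dir and return the directory path."""
    dest = Path(dest_dir)
    # Assumption: make sure the destination exists before saving into it.
    dest.mkdir(parents=True, exist_ok=True)
    for preprocessor in preprocessors:
        preprocessor.save_pretrained(dest)
    return dest


with tempfile.TemporaryDirectory() as tmp:
    out = save_all([StubTokenizer()], Path(tmp) / "exported_model")
    saved_files = sorted(p.name for p in out.iterdir())

print(saved_files)
```

Because the function returns nothing in the real API, the only observable result is the set of files in the destination directory, which is what the sketch inspects.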
Outputs (maybe_load_preprocessors)
| Name | Type | Description |
|---|---|---|
| preprocessors | List | List of successfully loaded preprocessor instances |
Usage Examples
Saving Preprocessors During Export
from optimum.utils.save_utils import maybe_save_preprocessors

# Save all preprocessors from a Hub model to a local directory
maybe_save_preprocessors(
    src_name_or_path="bert-base-uncased",
    dest_dir="./exported_model",
)
# The ./exported_model directory now contains tokenizer files
Loading Available Preprocessors
from optimum.utils.save_utils import maybe_load_preprocessors

preprocessors = maybe_load_preprocessors("bert-base-uncased")
for p in preprocessors:
    print(type(p).__name__)
# Example output (the exact classes depend on the installed transformers version):
# BertTokenizerFast
# BertTokenizerFast (from AutoProcessor)
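As the example suggests, the returned list can contain more than one instance of the same class when two loaders resolve to the same preprocessor. If that matters to the caller, one option is to keep only the first instance of each class. The helper below is a hypothetical convenience and not part of Optimum; `FakeTokenizer` and `FakeImageProcessor` are placeholder classes used so that the sketch runs without transformers.

```python
from typing import Any, Dict, List


def unique_by_class(preprocessors: List[Any]) -> List[Any]:
    """Keep the first instance of each concrete class, preserving order."""
    seen: Dict[type, Any] = {}
    for p in preprocessors:
        # setdefault only stores p if its class has not been seen yet.
        seen.setdefault(type(p), p)
    return list(seen.values())


# Placeholder classes standing in for real preprocessor types (assumption).
class FakeTokenizer:
    pass


class FakeImageProcessor:
    pass


loaded = [FakeTokenizer(), FakeTokenizer(), FakeImageProcessor()]
names = [type(p).__name__ for p in unique_by_class(loaded)]
print(names)
```

Saving duplicates is usually harmless because the second save overwrites the same files, so this filtering is mainly useful when inspecting or logging what was loaded.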