Implementation: Axolotl Load Processor (axolotl-ai-cloud)
| Knowledge Sources | |
|---|---|
| Domains | Multimodal, Vision_Language |
| Last Updated | 2026-02-06 23:00 GMT |
Overview
A concrete tool provided by the Axolotl framework for loading the multimodal processors used in vision-language model training.
Description
The load_processor function loads the appropriate processor for a vision-language model. It handles processor class selection (AutoProcessor or model-specific), Mistral-specific tokenizer patching, image size auto-detection from the processor, and trust_remote_code configuration. For Mistral models, it patches the tokenizer backend and uses a custom Mistral3Processor.
Usage
Called conditionally during dataset loading when cfg.processor_type is set or the model is detected as multimodal. The processor is passed through to data preparation and training.
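The conditional call pattern can be illustrated with a small sketch. The glue function `maybe_load_processor` and the `is_multimodal` predicate are assumptions for illustration, not Axolotl APIs; only `load_processor(cfg, tokenizer)` itself matches the documented signature.

```python
def maybe_load_processor(cfg, tokenizer, load_processor, is_multimodal):
    """Return a processor only when the config or model requires one,
    mirroring the conditional invocation during dataset loading."""
    if cfg.get("processor_type") or is_multimodal(cfg):
        return load_processor(cfg, tokenizer)
    return None  # text-only training: no processor needed

# Usage with stand-in callables (no model download required):
proc = maybe_load_processor(
    {"processor_type": "AutoProcessor"},
    tokenizer=None,
    load_processor=lambda cfg, tok: "processor-instance",
    is_multimodal=lambda cfg: False,
)
print(proc)  # processor-instance
```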
Code Reference
Source Location
- Repository: axolotl
- File: src/axolotl/loaders/processor.py
- Lines: L17-82
Signature
```python
def load_processor(
    cfg: DictDefault,
    tokenizer: PreTrainedTokenizerBase,
) -> ProcessorMixin:
    """Load a multimodal processor for vision-language models.

    Args:
        cfg: Config with processor_type, processor_config, trust_remote_code,
            tokenizer_use_mistral_common, image_size.
        tokenizer: Pre-loaded tokenizer instance.

    Returns:
        ProcessorMixin: Configured processor instance with image_size
            potentially populated from processor metadata.
    """
```
Import
```python
from axolotl.loaders.processor import load_processor
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| cfg | DictDefault | Yes | Config with processor_type (explicit class name), processor_config (HF model ID), trust_remote_code, tokenizer_use_mistral_common, image_size |
| tokenizer | PreTrainedTokenizerBase | Yes | Pre-loaded tokenizer passed to processor |
Outputs
| Name | Type | Description |
|---|---|---|
| return | ProcessorMixin | Configured processor (AutoProcessor, Mistral3Processor, or VoxtralProcessor) |
Usage Examples
Loading an AutoProcessor

```python
from axolotl.loaders.processor import load_processor
from axolotl.loaders.tokenizer import load_tokenizer

cfg.processor_type = "AutoProcessor"
cfg.processor_config = "meta-llama/Llama-3.2-11B-Vision"

tokenizer = load_tokenizer(cfg)
processor = load_processor(cfg, tokenizer)

# Image size is auto-detected from the processor metadata
print(cfg.image_size)  # (560, 560)
```