Implementation:Huggingface Optimum ExporterConfig Generate Dummy Inputs

| Field | Value |
| --- | --- |
| Page Type | Implementation |
| Source Repository | https://github.com/huggingface/optimum |
| Source Files | optimum/exporters/base.py, optimum/utils/input_generators.py |
| Domains | NLP, Computer_Vision, Export |
| Last Updated | 2026-02-15 00:00 GMT |

Overview

This implementation provides the concrete APIs for generating synthetic model inputs for export tracing and validation. It consists of the ExporterConfig.generate_dummy_inputs orchestration method and the DummyInputGenerator hierarchy of modality-specific input generators.

API Reference

ExporterConfig.generate_dummy_inputs

Source location: optimum/exporters/base.py, lines 224-241

Purpose: Generates a complete dictionary of dummy inputs for the model by delegating to the appropriate DummyInputGenerator subclasses registered in the config's DUMMY_INPUT_GENERATOR_CLASSES tuple.

Signature:

def generate_dummy_inputs(self, framework: str = "pt", **kwargs) -> Dict:

Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| framework | str | "pt" | The framework for which to create dummy inputs ("pt" for PyTorch). |
| **kwargs | | | Keyword overrides for the default dummy shapes (e.g., batch_size=4, sequence_length=32). |

Returns: Dict[str, Union[torch.Tensor, np.ndarray]] -- A dictionary mapping input names to dummy tensors.

Raises: RuntimeError if no generator can produce a required input.
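
The shape overrides described above can be pictured with a small standalone sketch. It does not use Optimum: ToyShapeGenerator, DEFAULT_SHAPES, and shape_of are hypothetical names, and the defaults of 2 and 16 are assumed only because they match the [2, 16] shapes shown in the usage example on this page. Nested lists stand in for tensors.

```python
from typing import List, Tuple

# Hypothetical defaults, chosen to match the [2, 16] shapes in the usage example.
DEFAULT_SHAPES = {"batch_size": 2, "sequence_length": 16}


class ToyShapeGenerator:
    """Stand-in for a text input generator; this is not the Optimum class."""

    def __init__(
        self,
        batch_size: int = DEFAULT_SHAPES["batch_size"],
        sequence_length: int = DEFAULT_SHAPES["sequence_length"],
        **kwargs,
    ):
        # Overrides this generator does not know about are accepted and ignored,
        # so one set of keyword arguments can be shared by several generators.
        self.batch_size = batch_size
        self.sequence_length = sequence_length

    def generate(self, input_name: str) -> List[List[int]]:
        # Nested lists of zeros stand in for a tensor of the requested shape.
        return [[0] * self.sequence_length for _ in range(self.batch_size)]


def shape_of(nested: List[List[int]]) -> Tuple[int, int]:
    """Return (rows, columns) of a rectangular nested list."""
    return (len(nested), len(nested[0]))


default_gen = ToyShapeGenerator()
custom_gen = ToyShapeGenerator(batch_size=4, sequence_length=32, num_channels=3)

print(shape_of(default_gen.generate("input_ids")))  # (2, 16)
print(shape_of(custom_gen.generate("input_ids")))   # (4, 32)
```

In this sketch the extra num_channels argument is silently dropped by the text-like generator, which is one plausible way a single **kwargs dictionary can serve generators for different modalities.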

Internal logic:

def generate_dummy_inputs(self, framework: str = "pt", **kwargs) -> Dict:
    dummy_inputs_generators = self._create_dummy_input_generator_classes(**kwargs)
    dummy_inputs = {}
    for input_name in self.inputs:
        input_was_inserted = False
        for dummy_input_gen in dummy_inputs_generators:
            if dummy_input_gen.supports_input(input_name):
                dummy_inputs[input_name] = dummy_input_gen.generate(
                    input_name, framework=framework,
                    int_dtype=self.int_dtype, float_dtype=self.float_dtype
                )
                input_was_inserted = True
                break
        if not input_was_inserted:
            raise RuntimeError(
                f'Could not generate dummy input for "{input_name}".'
            )
    return dummy_inputs

The method:

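  1. Instantiates the generator classes listed in self.DUMMY_INPUT_GENERATOR_CLASSES via self._create_dummy_input_generator_classes, forwarding any shape overrides passed in **kwargs.
  2. Iterates over each input name declared in self.inputs.
  3. For each input, selects the first generator whose supports_input returns True.
  4. Calls generate on that generator, passing the framework together with the config's int_dtype and float_dtype, and stores the result under the input name.
  5. Raises RuntimeError if no generator supports the input.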
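
The first-match behaviour in the steps above means that generator order matters when two generators claim the same input. The following is a minimal standalone sketch of that dispatch, not the Optimum code: ToyGenerator, ToyTextGenerator, ToyFallbackGenerator, and dispatch are hypothetical names, and strings stand in for tensors.

```python
from typing import Dict, Sequence, Tuple


class ToyGenerator:
    """Hypothetical base class mirroring the supports_input/generate interface."""

    SUPPORTED_INPUT_NAMES: Tuple[str, ...] = ()

    def supports_input(self, input_name: str) -> bool:
        return any(input_name.startswith(p) for p in self.SUPPORTED_INPUT_NAMES)

    def generate(self, input_name: str) -> str:
        # A string tagged with the producing class stands in for a tensor.
        return f"{type(self).__name__}:{input_name}"


class ToyTextGenerator(ToyGenerator):
    SUPPORTED_INPUT_NAMES = ("input_ids", "attention_mask")


class ToyFallbackGenerator(ToyGenerator):
    # Deliberately overlaps with ToyTextGenerator on attention_mask.
    SUPPORTED_INPUT_NAMES = ("attention_mask", "pixel_values")


def dispatch(input_names: Sequence[str], generators: Sequence[ToyGenerator]) -> Dict[str, str]:
    dummy_inputs = {}
    for name in input_names:
        for gen in generators:
            if gen.supports_input(name):
                dummy_inputs[name] = gen.generate(name)
                break  # first match wins; later generators are not consulted
        else:
            # Runs only when the inner loop finished without a break.
            raise RuntimeError(f'Could not generate dummy input for "{name}".')
    return dummy_inputs


generators = [ToyTextGenerator(), ToyFallbackGenerator()]
result = dispatch(["input_ids", "attention_mask", "pixel_values"], generators)
print(result["attention_mask"])  # ToyTextGenerator:attention_mask
print(result["pixel_values"])    # ToyFallbackGenerator:pixel_values
```

Reversing the generators list in this sketch would hand attention_mask to ToyFallbackGenerator instead, and asking for an input that neither class lists (for example "bbox") raises RuntimeError.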

DummyInputGenerator (Base Class)

Source location: optimum/utils/input_generators.py, lines 93-132

Purpose: Abstract base class defining the interface for all dummy input generators.

Signature:

from abc import ABC, abstractmethod
from typing import Any


class DummyInputGenerator(ABC):
    SUPPORTED_INPUT_NAMES = ()

    def supports_input(self, input_name: str) -> bool:
        ...

    @abstractmethod
    def generate(
        self,
        input_name: str,
        framework: str = "pt",
        int_dtype: str = "int64",
        float_dtype: str = "fp32",
    ) -> Any:
        ...

Key methods:

| Method | Description |
| --- | --- |
| supports_input(input_name) | Returns True if input_name starts with any of SUPPORTED_INPUT_NAMES |
| generate(input_name, framework, int_dtype, float_dtype) | Abstract method; produces a concrete tensor for the given input name |
| random_int_tensor(shape, max_value, ...) | Static helper to generate random integer tensors |
| random_mask_tensor(shape, padding_side, ...) | Static helper to generate attention mask tensors (with left or right padding) |
| random_float_tensor(shape, ...) | Static helper to generate random float tensors |
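
Two of the behaviours listed above are easy to misread: supports_input is a prefix test rather than a substring test, and a mask helper must decide which side carries the padding. The sketch below illustrates both in plain Python. It is not the library implementation: the free functions supports_input and padded_mask are hypothetical, the mask is deterministic (the padding length is passed in as num_padding) rather than random, and it assumes padding positions are marked with zeros, as is conventional for attention masks.

```python
from typing import List, Tuple


def supports_input(input_name: str, supported: Tuple[str, ...]) -> bool:
    """Prefix test: True if input_name starts with any supported name."""
    return any(input_name.startswith(prefix) for prefix in supported)


def padded_mask(
    batch_size: int,
    sequence_length: int,
    num_padding: int,
    padding_side: str = "right",
) -> List[List[int]]:
    """Build a mask with ones for real tokens and zeros for padding."""
    if not 0 <= num_padding <= sequence_length:
        raise ValueError("num_padding must be between 0 and sequence_length")
    if padding_side not in ("left", "right"):
        raise ValueError('padding_side must be "left" or "right"')
    real = [1] * (sequence_length - num_padding)
    pad = [0] * num_padding
    row = real + pad if padding_side == "right" else pad + real
    # Copy the row for each batch element so rows are independent lists.
    return [list(row) for _ in range(batch_size)]


# An indexed name matches through its prefix.
print(supports_input("past_key_values.0.key", ("past_key_values",)))  # True
# A name that merely contains a supported name does not match.
print(supports_input("decoder_input_ids", ("input_ids",)))            # False

print(padded_mask(1, 5, 2, padding_side="right"))  # [[1, 1, 1, 0, 0]]
print(padded_mask(1, 5, 2, padding_side="left"))   # [[0, 0, 1, 1, 1]]
```

The prefix rule is why, in this sketch, a generator that lists input_ids does not pick up decoder_input_ids; a separate entry is needed for the decoder name.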

Modality-Specific Subclasses

The following table lists the primary DummyInputGenerator subclasses defined in optimum/utils/input_generators.py:

| Class | Line | Supported Inputs |
| --- | --- | --- |
| DummyTextInputGenerator | 363 | input_ids, attention_mask, token_type_ids, position_ids |
| DummyXPathSeqInputGenerator | 467 | XPath-based inputs for MarkupLM |
| DummyDecoderTextInputGenerator | 519 | decoder_input_ids, decoder_attention_mask |
| DummyDecisionTransformerInputGenerator | 530 | Decision transformer states, actions, rewards, returns |
| DummySeq2SeqDecoderTextInputGenerator | 567 | Seq2seq decoder inputs including encoder_outputs |
| DummyBboxInputGenerator | 755 | bbox coordinates for document AI |
| DummyVisionInputGenerator | 795 | pixel_values, pixel_mask |
| DummyAudioInputGenerator | 883 | input_features, input_values |
| DummyTimestepInputGenerator | 926 | timestep for diffusion models |
| DummyPix2StructInputGenerator | 1077 | Pix2Struct flattened patches |
| DummySpeechT5InputGenerator | 1394 | SpeechT5 spectrogram and speaker embeddings |
| DummyCodegenDecoderTextInputGenerator | 1504 | CodeGen decoder inputs |
| DummyEncodecInputGenerator | 1539 | EnCodec audio codec inputs |
| DummyTransformerTimestepInputGenerator | 1597 | Transformer-based diffusion timesteps |
| DummyTransformerVisionInputGenerator | 1608 | Vision inputs for diffusion transformers |
| DummyTransformerTextInputGenerator | 1612 | Text inputs for diffusion transformers |
| DummyFluxTransformerVisionInputGenerator | 1630 | Flux model vision inputs |
| DummyFluxTransformerTextInputGenerator | 1651 | Flux model text inputs |
| DummyPatchTSTInputGenerator | 1674 | PatchTST time-series inputs |
| DummyVisionStaticInputGenerator | 1741 | Static-shape vision inputs |

Import

from optimum.exporters.base import ExporterConfig
from optimum.utils.input_generators import (
    DummyInputGenerator,
    DummyTextInputGenerator,
    DummyVisionInputGenerator,
    DummyAudioInputGenerator,
)

Input/Output Summary

| API | Input | Output |
| --- | --- | --- |
| ExporterConfig.generate_dummy_inputs | Framework string + optional shape overrides | Dict[str, Tensor] mapping input names to tensors |
| DummyInputGenerator.generate | Input name, framework, dtype settings | Single tensor for the specified input |

Usage Example

from optimum.exporters import TasksManager
from transformers import AutoConfig

# Get export config for BERT text-classification
config_constructor = TasksManager.get_exporter_config_constructor(
    exporter="onnx",
    model_type="bert",
    task="text-classification",
    library_name="transformers",
)
model_config = AutoConfig.from_pretrained("bert-base-uncased")
export_config = config_constructor(model_config)

# Generate dummy inputs
dummy_inputs = export_config.generate_dummy_inputs(framework="pt")
# Returns: {
#     "input_ids": tensor of shape [2, 16],
#     "attention_mask": tensor of shape [2, 16],
#     "token_type_ids": tensor of shape [2, 16],
# }

# Generate with custom shapes
dummy_inputs = export_config.generate_dummy_inputs(
    framework="pt",
    batch_size=4,
    sequence_length=128,
)
