Implementation:Huggingface Optimum ExporterConfig Generate Dummy Inputs
| Field | Value |
|---|---|
| Page Type | Implementation |
| Source Repository | https://github.com/huggingface/optimum |
| Source Files | optimum/exporters/base.py, optimum/utils/input_generators.py |
| Domains | NLP, Computer_Vision, Export |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
This implementation provides the concrete APIs for generating synthetic model inputs for export tracing and validation. It consists of the ExporterConfig.generate_dummy_inputs orchestration method and the DummyInputGenerator hierarchy of modality-specific input generators.
API Reference
ExporterConfig.generate_dummy_inputs
Source location: optimum/exporters/base.py, lines 224-241
Purpose: Generates a complete dictionary of dummy inputs for the model by delegating to the appropriate DummyInputGenerator subclasses registered in the config's DUMMY_INPUT_GENERATOR_CLASSES tuple.
Signature:
def generate_dummy_inputs(self, framework: str = "pt", **kwargs) -> Dict:
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| framework | str | "pt" | The framework for which to create dummy inputs ("pt" for PyTorch). |
| **kwargs | | | Override default shapes (e.g., batch_size=4, sequence_length=32). |
Returns: Dict[str, Union[torch.Tensor, np.ndarray]] -- A dictionary mapping input names to dummy tensors.
Raises: RuntimeError if no generator can produce a required input.
Internal logic:
def generate_dummy_inputs(self, framework: str = "pt", **kwargs) -> Dict:
    dummy_inputs_generators = self._create_dummy_input_generator_classes(**kwargs)
    dummy_inputs = {}
    for input_name in self.inputs:
        input_was_inserted = False
        for dummy_input_gen in dummy_inputs_generators:
            if dummy_input_gen.supports_input(input_name):
                dummy_inputs[input_name] = dummy_input_gen.generate(
                    input_name, framework=framework,
                    int_dtype=self.int_dtype, float_dtype=self.float_dtype
                )
                input_was_inserted = True
                break
        if not input_was_inserted:
            raise RuntimeError(
                f'Could not generate dummy input for "{input_name}".'
            )
    return dummy_inputs
The method:
- Instantiates all generator classes listed in self.DUMMY_INPUT_GENERATOR_CLASSES
- Iterates over each input name declared in self.inputs
- For each input, finds the first generator that supports it (via supports_input)
- Calls generate on that generator to produce the tensor
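The first-match dispatch described above can be illustrated without Optimum installed. The following is a minimal sketch, not the library's code: ToyGenerator, ToyTextGenerator, ToyVisionGenerator and dispatch_dummy_inputs are hypothetical names, and nested lists stand in for tensors.

```python
from typing import Any, Dict, List, Sequence


class ToyGenerator:
    """Hypothetical stand-in for a DummyInputGenerator subclass."""

    SUPPORTED_INPUT_NAMES: Sequence[str] = ()

    def supports_input(self, input_name: str) -> bool:
        # Prefix match against the names this generator declares.
        return any(input_name.startswith(name) for name in self.SUPPORTED_INPUT_NAMES)

    def generate(self, input_name: str) -> Any:
        raise NotImplementedError


class ToyTextGenerator(ToyGenerator):
    SUPPORTED_INPUT_NAMES = ("input_ids", "attention_mask")

    def generate(self, input_name: str) -> List[List[int]]:
        # A fixed 2 x 4 nested list stands in for an integer tensor.
        return [[1] * 4 for _ in range(2)]


class ToyVisionGenerator(ToyGenerator):
    SUPPORTED_INPUT_NAMES = ("pixel_values",)

    def generate(self, input_name: str) -> List[List[float]]:
        # A fixed 2 x 3 nested list stands in for a float tensor.
        return [[0.0] * 3 for _ in range(2)]


def dispatch_dummy_inputs(
    input_names: Sequence[str], generators: Sequence[ToyGenerator]
) -> Dict[str, Any]:
    """Ask each generator in order; the first one that supports the name wins."""
    dummy_inputs: Dict[str, Any] = {}
    for input_name in input_names:
        for generator in generators:
            if generator.supports_input(input_name):
                dummy_inputs[input_name] = generator.generate(input_name)
                break
        else:
            # No generator claimed this input name.
            raise RuntimeError(f'Could not generate dummy input for "{input_name}".')
    return dummy_inputs


generators = [ToyTextGenerator(), ToyVisionGenerator()]
result = dispatch_dummy_inputs(["input_ids", "pixel_values"], generators)
```

Because the inner loop stops at the first match, the order of the generator list decides which generator handles a name that more than one of them supports.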
DummyInputGenerator (Base Class)
Source location: optimum/utils/input_generators.py, lines 93-132
Purpose: Abstract base class defining the interface for all dummy input generators.
Signature:
class DummyInputGenerator(ABC):
    SUPPORTED_INPUT_NAMES = ()

    def supports_input(self, input_name: str) -> bool:
        ...

    @abstractmethod
    def generate(
        self,
        input_name: str,
        framework: str = "pt",
        int_dtype: str = "int64",
        float_dtype: str = "fp32",
    ) -> Any:
        ...
Key methods:
| Method | Description |
|---|---|
| supports_input(input_name) | Returns True if input_name starts with any of SUPPORTED_INPUT_NAMES |
| generate(input_name, framework, int_dtype, float_dtype) | Abstract method; produces a concrete tensor for the given input name |
| random_int_tensor(shape, max_value, ...) | Static helper to generate random integer tensors |
| random_mask_tensor(shape, padding_side, ...) | Static helper to generate attention mask tensors (with left or right padding) |
| random_float_tensor(shape, ...) | Static helper to generate random float tensors |
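To make the left versus right padding distinction concrete, here is a small pure-Python sketch of what a padded attention mask looks like. It is an illustration only: padded_mask and its valid_length argument are hypothetical, nested lists replace framework tensors, and how the real random_mask_tensor helper chooses the unpadded length is not covered here.

```python
from typing import List


def padded_mask(
    batch_size: int,
    sequence_length: int,
    valid_length: int,
    padding_side: str = "right",
) -> List[List[int]]:
    """Build a batch of 0/1 masks with valid_length ones per row.

    Hypothetical helper: ones mark real tokens, zeros mark padding.
    """
    if padding_side not in ("left", "right"):
        raise ValueError(f"padding_side must be 'left' or 'right', got {padding_side!r}")
    if not 0 <= valid_length <= sequence_length:
        raise ValueError("valid_length must lie between 0 and sequence_length")

    ones = [1] * valid_length
    zeros = [0] * (sequence_length - valid_length)
    # Right padding puts the zeros after the real tokens, left padding before them.
    row = ones + zeros if padding_side == "right" else zeros + ones
    return [list(row) for _ in range(batch_size)]


right = padded_mask(2, 5, valid_length=3)
left = padded_mask(2, 5, valid_length=3, padding_side="left")
```

With these arguments, each row of right is [1, 1, 1, 0, 0] and each row of left is [0, 0, 1, 1, 1].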
Modality-Specific Subclasses
The following table lists the primary DummyInputGenerator subclasses defined in optimum/utils/input_generators.py:
| Class | Line | Supported Inputs |
|---|---|---|
| DummyTextInputGenerator | 363 | input_ids, attention_mask, token_type_ids, position_ids |
| DummyXPathSeqInputGenerator | 467 | XPath-based inputs for MarkupLM |
| DummyDecoderTextInputGenerator | 519 | decoder_input_ids, decoder_attention_mask |
| DummyDecisionTransformerInputGenerator | 530 | Decision transformer states, actions, rewards, returns |
| DummySeq2SeqDecoderTextInputGenerator | 567 | Seq2seq decoder inputs including encoder_outputs |
| DummyBboxInputGenerator | 755 | bbox coordinates for document AI |
| DummyVisionInputGenerator | 795 | pixel_values, pixel_mask |
| DummyAudioInputGenerator | 883 | input_features, input_values |
| DummyTimestepInputGenerator | 926 | timestep for diffusion models |
| DummyPix2StructInputGenerator | 1077 | Pix2Struct flattened patches |
| DummySpeechT5InputGenerator | 1394 | SpeechT5 spectrogram and speaker embeddings |
| DummyCodegenDecoderTextInputGenerator | 1504 | CodeGen decoder inputs |
| DummyEncodecInputGenerator | 1539 | EnCodec audio codec inputs |
| DummyTransformerTimestepInputGenerator | 1597 | Transformer-based diffusion timesteps |
| DummyTransformerVisionInputGenerator | 1608 | Vision inputs for diffusion transformers |
| DummyTransformerTextInputGenerator | 1612 | Text inputs for diffusion transformers |
| DummyFluxTransformerVisionInputGenerator | 1630 | Flux model vision inputs |
| DummyFluxTransformerTextInputGenerator | 1651 | Flux model text inputs |
| DummyPatchTSTInputGenerator | 1674 | PatchTST time-series inputs |
| DummyVisionStaticInputGenerator | 1741 | Static-shape vision inputs |
Import
from optimum.exporters.base import ExporterConfig
from optimum.utils.input_generators import (
    DummyInputGenerator,
    DummyTextInputGenerator,
    DummyVisionInputGenerator,
    DummyAudioInputGenerator,
)
Input/Output Summary
| API | Input | Output |
|---|---|---|
| ExporterConfig.generate_dummy_inputs | Framework string + optional shape overrides | Dict[str, Tensor] mapping input names to tensors |
| DummyInputGenerator.generate | Input name, framework, dtype settings | Single tensor for the specified input |
Usage Example
from optimum.exporters import TasksManager
from transformers import AutoConfig

# Get export config for BERT text-classification
config_constructor = TasksManager.get_exporter_config_constructor(
    exporter="onnx",
    model_type="bert",
    task="text-classification",
    library_name="transformers",
)
model_config = AutoConfig.from_pretrained("bert-base-uncased")
export_config = config_constructor(model_config)

# Generate dummy inputs
dummy_inputs = export_config.generate_dummy_inputs(framework="pt")
# Returns: {
#     "input_ids": tensor of shape [2, 16],
#     "attention_mask": tensor of shape [2, 16],
#     "token_type_ids": tensor of shape [2, 16],
# }

# Generate with custom shapes
dummy_inputs = export_config.generate_dummy_inputs(
    framework="pt",
    batch_size=4,
    sequence_length=128,
)
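The effect of the shape overrides in the second call can be pictured as keyword arguments layered over a set of defaults. The sketch below is an assumption-laden illustration, not Optimum's mechanism: DEFAULT_SHAPES, resolve_shapes and expected_shape are hypothetical names, and the default values are simply taken from the [2, 16] shapes shown in the example above.

```python
from typing import Dict, Tuple

# Hypothetical defaults, chosen to match the [2, 16] shapes in the usage example.
DEFAULT_SHAPES: Dict[str, int] = {"batch_size": 2, "sequence_length": 16}


def resolve_shapes(**overrides: int) -> Dict[str, int]:
    """Return the default shapes with any keyword overrides applied on top."""
    for name, value in overrides.items():
        if value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value}")
    # Later entries win, so overrides replace defaults with the same key.
    return {**DEFAULT_SHAPES, **overrides}


def expected_shape(shapes: Dict[str, int]) -> Tuple[int, int]:
    """Shape a text input such as input_ids would have under these settings."""
    return (shapes["batch_size"], shapes["sequence_length"])


default = expected_shape(resolve_shapes())
custom = expected_shape(resolve_shapes(batch_size=4, sequence_length=128))
```

Under these assumptions, default is (2, 16) and custom is (4, 128), matching the shapes one would expect from the two calls in the usage example.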