Principle:Huggingface Optimum Dummy Input Generation
| Field | Value |
|---|---|
| Page Type | Principle |
| Source Repository | https://github.com/huggingface/optimum |
| Domains | NLP, Computer_Vision, Export |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Dummy Input Generation is the technique for generating synthetic model inputs with correct shapes and types to enable symbolic tracing and export validation. It produces concrete tensor inputs that allow the export system to trace the model's computation graph without requiring real data.
Description
Exporting a model to ONNX or TFLite requires tracing its computation graph with concrete tensor inputs, because the exporters capture the graph structure by recording the operations performed on actual tensors. Dummy input generators create properly shaped tensors for each model architecture without needing real data.
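To illustrate why tracing needs concrete inputs, here is a minimal, framework-free sketch of operation recording. `TracedValue`, `trace`, and `model` are hypothetical names invented for this illustration and are not part of Optimum, ONNX, or TFLite; real exporters record tensor operations rather than scalar arithmetic.

```python
class TracedValue:
    """Wraps a concrete number and records every operation applied to it."""

    def __init__(self, value, graph, name):
        self.value = value
        self.graph = graph  # shared list of (op, lhs, rhs, out) tuples
        self.name = name

    def _record(self, op, other, result):
        # Name the right-hand operand: traced values by name, constants by repr.
        other_name = other.name if isinstance(other, TracedValue) else repr(other)
        out = TracedValue(result, self.graph, f"t{len(self.graph)}")
        self.graph.append((op, self.name, other_name, out.name))
        return out

    def __add__(self, other):
        rhs = other.value if isinstance(other, TracedValue) else other
        return self._record("add", other, self.value + rhs)

    def __mul__(self, other):
        rhs = other.value if isinstance(other, TracedValue) else other
        return self._record("mul", other, self.value * rhs)


def model(x, w):
    # Stand-in for a model's forward pass (hypothetical).
    return x * w + 1


def trace(fn, **dummy_inputs):
    """Run fn once on dummy inputs and return the recorded graph and output."""
    graph = []
    wrapped = {name: TracedValue(v, graph, name) for name, v in dummy_inputs.items()}
    output = fn(**wrapped)
    return graph, output.value


graph, result = trace(model, x=2.0, w=3.0)
for node in graph:
    print(node)
```

The graph only exists because `model` was actually executed on some values; the particular values do not matter for the recorded structure, which is why synthetic inputs are sufficient.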
The system generates inputs such as:
- Text inputs -- `input_ids`, `attention_mask`, `token_type_ids` (integer tensors with appropriate vocabulary range)
- Vision inputs -- `pixel_values` (float tensors with correct height, width, and channel dimensions)
- Audio inputs -- `input_features`, `input_values` (float tensors matching feature extraction output shapes)
- Past key-values -- `past_key_values` (nested float tensors matching the model's hidden state dimensions)
- Decoder inputs -- `decoder_input_ids`, `encoder_outputs` for sequence-to-sequence models
- Specialized inputs -- Bounding boxes, point coordinates, timesteps, and other architecture-specific inputs
The system uses a registry of architecture-specific generators that read shape information from the model's normalized config. Default shapes are defined in `DEFAULT_DUMMY_SHAPES`:
| Shape Parameter | Default Value | Description |
|---|---|---|
| `batch_size` | 2 | Number of samples in the batch |
| `sequence_length` | 16 | Length of text sequences |
| `num_choices` | 4 | Number of choices for multiple-choice tasks |
| `width` | 64 | Image width in pixels |
| `height` | 64 | Image height in pixels |
| `num_channels` | 3 | Number of image channels |
| `feature_size` | 80 | Number of audio features (e.g., MEL bins) |
| `nb_max_frames` | 3000 | Number of audio frames |
| `audio_sequence_length` | 16000 | Raw audio sequence length |
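As a rough sketch of how these defaults translate into tensor shapes, the snippet below copies the table values into a dictionary and derives a shape per input name. The helper `dummy_shape` is hypothetical, and the axis orders (channels-first images, `feature_size` before `nb_max_frames`) are assumptions made for illustration rather than a statement of the library's exact layout.

```python
# Values copied from the table above.
DEFAULT_DUMMY_SHAPES = {
    "batch_size": 2,
    "sequence_length": 16,
    "num_choices": 4,
    "width": 64,
    "height": 64,
    "num_channels": 3,
    "feature_size": 80,
    "nb_max_frames": 3000,
    "audio_sequence_length": 16000,
}


def dummy_shape(input_name, shapes=DEFAULT_DUMMY_SHAPES):
    """Hypothetical helper: map an input name to an assumed tensor shape."""
    batch = shapes["batch_size"]
    if input_name in ("input_ids", "attention_mask", "token_type_ids"):
        return (batch, shapes["sequence_length"])
    if input_name == "pixel_values":
        # Assumed channels-first layout.
        return (batch, shapes["num_channels"], shapes["height"], shapes["width"])
    if input_name == "input_features":
        # Assumed (batch, features, frames) layout.
        return (batch, shapes["feature_size"], shapes["nb_max_frames"])
    if input_name == "input_values":
        return (batch, shapes["audio_sequence_length"])
    raise KeyError(f"no shape rule for {input_name!r}")


for name in ("input_ids", "pixel_values", "input_features", "input_values"):
    print(name, dummy_shape(name))

# A caller-supplied value replaces the default for that parameter only.
single_sample = {**DEFAULT_DUMMY_SHAPES, "batch_size": 1}
print(dummy_shape("input_ids", single_sample))
```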
Usage
Use Dummy Input Generation when exporting models to traced formats that require concrete input tensors for graph construction. It is invoked automatically during the export process via the `ExporterConfig.generate_dummy_inputs` method.
Typical scenarios:
- ONNX export via `torch.onnx.export`, which requires example inputs for tracing
- TFLite export, which requires concrete tensors for graph freezing
- Export validation, where dummy inputs are fed through both the original and exported models to compare outputs
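The export-validation scenario can be sketched without any ML framework. In the snippet below, `reference_model` and `exported_model` are hypothetical stand-ins for the original and exported models, the small constant added in `exported_model` simulates numerical drift, and the tolerance `atol=1e-5` is an assumed value, not one taken from Optimum.

```python
import random


def make_dummy_input_ids(batch_size=2, sequence_length=16, vocab_size=100, seed=0):
    """Build a batch of random token ids as nested lists (no real data needed)."""
    rng = random.Random(seed)
    return [
        [rng.randrange(vocab_size) for _ in range(sequence_length)]
        for _ in range(batch_size)
    ]


def reference_model(input_ids):
    # Hypothetical stand-in for the original model's forward pass.
    return [[0.5 * tok + 1.0 for tok in row] for row in input_ids]


def exported_model(input_ids):
    # Hypothetical stand-in for the exported model: same math plus tiny drift.
    return [[(0.5 * tok + 1.0) + 1e-7 for tok in row] for row in input_ids]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 2-D nested lists."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def validate(atol=1e-5):
    """Feed the same dummy batch to both models and compare within atol."""
    dummy = make_dummy_input_ids()
    diff = max_abs_diff(reference_model(dummy), exported_model(dummy))
    return diff <= atol, diff


ok, diff = validate()
print("outputs match within tolerance:", ok)
```

Because both models see identical inputs, any difference beyond the tolerance points at the export itself rather than at the data.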
Users can override default shapes by passing keyword arguments to `generate_dummy_inputs` (e.g., custom batch size, sequence length).
Theoretical Basis
Dummy Input Generation uses the Template Method Pattern with architecture-specific subclasses. The architecture consists of:
- `DummyInputGenerator` (ABC) -- Defines the interface with two key methods:
  - `supports_input(input_name)` -- Checks whether this generator can produce the named input
  - `generate(input_name, framework, int_dtype, float_dtype)` -- Produces a concrete tensor for the named input
  - Each subclass declares a `SUPPORTED_INPUT_NAMES` tuple listing the input names it handles
- Modality-specific subclasses -- Each implements generation logic for its modality:
  - `DummyTextInputGenerator` -- Generates `input_ids`, `attention_mask`, `token_type_ids`
  - `DummyVisionInputGenerator` -- Generates `pixel_values`, `pixel_mask`
  - `DummyAudioInputGenerator` -- Generates `input_features`, `input_values`
  - `DummyTimestepInputGenerator` -- Generates `timestep` inputs for diffusion models
  - `DummyBboxInputGenerator` -- Generates bounding box coordinates
  - And 20+ additional specialized subclasses
- `ExporterConfig` orchestration -- The `DUMMY_INPUT_GENERATOR_CLASSES` tuple on each `ExporterConfig` subclass lists which generators to use. The `generate_dummy_inputs` method iterates over declared input names, finds the first generator that supports each input, and calls `generate`.
This design ensures that:
- New input modalities can be added by creating new generator subclasses
- Each model architecture declares exactly which generators it needs
- Shape information is read from the model's normalized config, adapting to each model's dimensions
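The structure described above can be sketched in plain Python. This is a simplified illustration, not the library's code: the class and attribute names mirror those in the text, but the constructor signatures, the `INPUT_NAMES` attribute, the dict-based config, the `ToyVisionTextExporterConfig` class, and the use of nested lists instead of framework tensors are all assumptions made so the example runs on its own.

```python
import random
from abc import ABC, abstractmethod


class DummyInputGenerator(ABC):
    SUPPORTED_INPUT_NAMES = ()

    def supports_input(self, input_name):
        return input_name in self.SUPPORTED_INPUT_NAMES

    @abstractmethod
    def generate(self, input_name, framework="list", int_dtype="int64", float_dtype="fp32"):
        """Return a dummy value; framework and dtypes are ignored in this sketch."""


class DummyTextInputGenerator(DummyInputGenerator):
    SUPPORTED_INPUT_NAMES = ("input_ids", "attention_mask", "token_type_ids")

    def __init__(self, config, batch_size=2, sequence_length=16, **_):
        self.vocab_size = config["vocab_size"]  # shape info read from the config
        self.batch_size = batch_size
        self.sequence_length = sequence_length

    def generate(self, input_name, framework="list", int_dtype="int64", float_dtype="fp32"):
        # Token ids span the vocabulary; masks and type ids are 0 or 1.
        high = self.vocab_size if input_name == "input_ids" else 2
        return [[random.randrange(high) for _ in range(self.sequence_length)]
                for _ in range(self.batch_size)]


class DummyVisionInputGenerator(DummyInputGenerator):
    SUPPORTED_INPUT_NAMES = ("pixel_values",)

    def __init__(self, config, batch_size=2, height=64, width=64, **_):
        self.num_channels = config.get("num_channels", 3)
        self.batch_size, self.height, self.width = batch_size, height, width

    def generate(self, input_name, framework="list", int_dtype="int64", float_dtype="fp32"):
        return [[[[random.random() for _ in range(self.width)]
                  for _ in range(self.height)]
                 for _ in range(self.num_channels)]
                for _ in range(self.batch_size)]


class ExporterConfig:
    DUMMY_INPUT_GENERATOR_CLASSES = ()
    INPUT_NAMES = ()  # hypothetical attribute listing the declared inputs

    def __init__(self, config):
        self.config = config

    def generate_dummy_inputs(self, framework="list", **shape_overrides):
        generators = [cls(self.config, **shape_overrides)
                      for cls in self.DUMMY_INPUT_GENERATOR_CLASSES]
        dummy_inputs = {}
        for name in self.INPUT_NAMES:
            for generator in generators:
                if generator.supports_input(name):  # first match wins
                    dummy_inputs[name] = generator.generate(name, framework=framework)
                    break
            else:
                raise RuntimeError(f"no generator supports input {name!r}")
        return dummy_inputs


class ToyVisionTextExporterConfig(ExporterConfig):
    DUMMY_INPUT_GENERATOR_CLASSES = (DummyTextInputGenerator, DummyVisionInputGenerator)
    INPUT_NAMES = ("input_ids", "attention_mask", "pixel_values")


toy_config = ToyVisionTextExporterConfig({"vocab_size": 100, "num_channels": 3})
inputs = toy_config.generate_dummy_inputs(batch_size=3, height=8, width=8)
print({name: len(value) for name, value in inputs.items()})
```

Adding a new modality in this sketch means writing one more generator subclass and listing it in the relevant config's `DUMMY_INPUT_GENERATOR_CLASSES`; the orchestration loop itself stays unchanged.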
Related Pages
- Implemented by: Implementation:Huggingface_Optimum_ExporterConfig_Generate_Dummy_Inputs
- Heuristic:Huggingface_Optimum_Dummy_Input_Shape_Defaults
- Heuristic:Huggingface_Optimum_Version_Conditional_Behavior