Implementation:Huggingface Transformers Create Dummy Models

From Leeroopedia
Knowledge Sources
Domains: Testing_Infrastructure, Model_Architecture
Last Updated: 2026-02-13 20:00 GMT

Overview

Concrete tool for creating tiny (minimal-size) versions of all model architectures in the Transformers library for fast CI testing.

Description

The create_dummy_models.py utility (1477 lines) generates the tiny-random-* models used across the CI test suite. For each model config class, it retrieves the processor classes from the auto mappings and builds processors from Hub checkpoints (with recursive fallback strategies). It then obtains a tiny config from the corresponding model tester class, reduces each tokenizer's vocabulary to roughly 1024 tokens using train_new_from_iterator, and shrinks image processor sizes to match the tiny config. Finally, it instantiates each model architecture with the tiny config and saves it. Composite models (encoder-decoder, vision-text) are handled as special cases. The script also supports multiprocessing and uploading to the HuggingFace Hub, and it generates detailed reports.
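
One of the steps above, shrinking image processor sizes to match a tiny config, can be illustrated with a minimal sketch. It uses plain dictionaries as stand-ins for real config and image processor objects. The function name `shrink_image_sizes` and the keys `image_size`, `size`, and `crop_size` are assumptions made for illustration, not the script's actual code.

```python
from copy import deepcopy


def shrink_image_sizes(processor_kwargs: dict, tiny_config: dict) -> dict:
    """Return a copy of processor_kwargs whose size fields match tiny_config.

    Hypothetical helper: both arguments are plain dicts standing in for a
    real image processor and a real tiny model config.
    """
    target = tiny_config.get("image_size")
    if target is None:
        # Nothing to align with, so leave the processor settings untouched.
        return deepcopy(processor_kwargs)

    shrunk = deepcopy(processor_kwargs)
    for key in ("size", "crop_size"):
        value = shrunk.get(key)
        if isinstance(value, dict):
            # Replace every dimension (height, width, shortest_edge, ...).
            shrunk[key] = {dim: target for dim in value}
        elif isinstance(value, int):
            shrunk[key] = target
    return shrunk


original = {"size": {"height": 224, "width": 224}, "crop_size": 224, "do_resize": True}
tiny = shrink_image_sizes(original, {"image_size": 30})
print(tiny)
```

The helper returns a copy rather than mutating its input, so the full-size settings remain available for comparison.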

Usage

Run periodically to regenerate tiny models when new model architectures are added, or when model test infrastructure changes require updated tiny models.

Code Reference

Source Location

Signature

from typing import Dict, List, Optional

from transformers import PretrainedConfig, PreTrainedTokenizer, ProcessorMixin

def get_tiny_config(config_class: type) -> PretrainedConfig:
    """Retrieve a minimal configuration from the model tester class."""

def convert_tokenizer(tokenizer, tiny_config) -> PreTrainedTokenizer:
    """Reduce tokenizer vocabulary to ~1024 tokens."""

def build_processor(config_class: type) -> Optional[ProcessorMixin]:
    """Build the appropriate processor for a model config class."""

def create_tiny_models(
    output_dir: str,
    all_model_classes: Optional[List[type]] = None,
    upload: bool = False,
    organization: str = "hf-internal-testing",
    num_workers: int = 1,
) -> Dict:
    """Create tiny random models for all architectures."""

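To show how the `num_workers` argument and per-architecture failures might fit together, here is a minimal sketch of a fan-out loop. It is not the script's implementation: `build_one` and `build_all` are hypothetical names, the architecture names are placeholders, and a thread pool from `concurrent.futures` stands in for the multiprocessing that the script is described as supporting, so that the example stays self-contained.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def build_one(name: str) -> str:
    """Hypothetical builder: pretend to create one tiny model directory."""
    if name == "BrokenModel":
        raise ValueError("no tiny config available")
    return f"tiny-random-{name}"


def build_all(
    names: List[str],
    builder: Callable[[str], str],
    num_workers: int = 1,
) -> Dict[str, Dict[str, str]]:
    """Run builder for every name and record success or failure per name."""

    def safe_build(name: str) -> Dict[str, str]:
        try:
            return {"status": "created", "detail": builder(name)}
        except Exception as exc:  # one failure must not stop the other builds
            return {"status": "failed", "detail": f"{type(exc).__name__}: {exc}"}

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        outcomes = list(pool.map(safe_build, names))  # map preserves input order
    return dict(zip(names, outcomes))


results = build_all(["BertModel", "BrokenModel", "ViTModel"], build_one, num_workers=2)
for name, outcome in results.items():
    print(name, outcome["status"], outcome["detail"])
```

Catching the exception inside each task keeps a single failing architecture from aborting the whole run, which is what makes a created/failed summary possible afterwards.
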
Import

python utils/create_dummy_models.py --output_dir tiny_models/ --upload

I/O Contract

Inputs

Name            Type  Required  Description
--output_dir    str   Yes       Directory to save tiny models
--upload        flag  No        Upload to HuggingFace Hub
--organization  str   No        Hub organization (default: hf-internal-testing)
--num_workers   int   No        Number of parallel workers
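
The inputs listed here can be mirrored with a small `argparse` sketch. It reproduces only the four options named on this page, with the documented default for `--organization`. The default of 1 for `--num_workers` is taken from the signature shown earlier, and the helper name `make_parser` is hypothetical. The script's real parser may define these options differently or include others.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Build a parser for the options listed in the inputs table (a sketch)."""
    parser = argparse.ArgumentParser(description="Create tiny random models.")
    parser.add_argument("--output_dir", type=str, required=True,
                        help="Directory to save tiny models")
    parser.add_argument("--upload", action="store_true",
                        help="Upload to the Hub")
    parser.add_argument("--organization", type=str, default="hf-internal-testing",
                        help="Hub organization")
    parser.add_argument("--num_workers", type=int, default=1,
                        help="Number of parallel workers")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = make_parser().parse_args(["--output_dir", "./tiny_models/", "--num_workers", "4"])
print(args.output_dir, args.upload, args.organization, args.num_workers)
```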

Outputs

Name               Type         Description
Model directories  Directories  Tiny model files for each architecture
Report             JSON         Summary of created/failed models
Hub uploads        Hub repos    Uploaded tiny-random-* repos (if --upload)
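
As an illustration of the JSON report output, the following sketch summarizes per-architecture outcomes into created and failed entries and writes them to a file. The report layout, the `write_report` helper, the file name `report.json`, and the sample entries are all assumptions. The script's actual report format is not specified on this page.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def write_report(outcomes: Dict[str, str], output_dir: Path) -> Path:
    """Write a created/failed summary as JSON and return the file path.

    outcomes maps an architecture name to "" on success or to an error message.
    """
    report = {
        "created": sorted(name for name, err in outcomes.items() if not err),
        "failed": {name: err for name, err in sorted(outcomes.items()) if err},
    }
    path = output_dir / "report.json"
    path.write_text(json.dumps(report, indent=2), encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    sample = {"ViTModel": "", "BertModel": "", "BrokenModel": "no tiny config available"}
    report_path = write_report(sample, Path(tmp))
    # Read the file back before the temporary directory is removed.
    loaded = json.loads(report_path.read_text(encoding="utf-8"))

print(loaded)
```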

Usage Examples

Creating Tiny Models

# Create tiny models locally
python utils/create_dummy_models.py --output_dir ./tiny_models/

# Create and upload to Hub
python utils/create_dummy_models.py --output_dir ./tiny_models/ \
    --upload --organization hf-internal-testing --num_workers 4
