Implementation:Datajuicer Data juicer ImageDiffusionMapper

From Leeroopedia
Knowledge Sources
Domains: Data_Processing, Mapping
Last Updated: 2026-02-14 16:00 GMT

Overview

A concrete Data-Juicer operator that generates new images with Stable Diffusion for data augmentation.

Description

ImageDiffusionMapper is a mapper operator that generates new images from existing images and their captions using a HuggingFace diffusion model (default: Stable Diffusion v1.4). It performs image-to-image transformation with a configurable strength (how far the output may deviate from the reference image), guidance scale (how closely the output follows the text prompt), and number of augmented images per sample. If no caption is provided, it can generate one using a BLIP2 model. The operator runs in batched mode with CUDA acceleration and requires approximately 8 GB of GPU memory.
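To make the strength and guidance scale settings concrete, the sketch below mirrors how Stable Diffusion image-to-image pipelines commonly interpret them: strength determines how many of the scheduled denoising steps are actually run, and guidance scale weights the text-conditioned noise prediction against the unconditional one (classifier-free guidance). The function names and the 50-step schedule are illustrative assumptions and are not part of Data-Juicer's API.

```python
from typing import List


def effective_denoising_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps run for a given strength (illustrative helper).

    With strength 0 the reference image is returned essentially unchanged;
    with strength 1 the full schedule runs and the reference is mostly ignored.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return min(int(num_inference_steps * strength), num_inference_steps)


def apply_guidance(noise_uncond: List[float],
                   noise_text: List[float],
                   guidance_scale: float) -> List[float]:
    """Classifier-free guidance: move from the unconditional prediction
    toward the text-conditioned one, scaled by guidance_scale."""
    return [u + guidance_scale * (t - u)
            for u, t in zip(noise_uncond, noise_text)]


if __name__ == "__main__":
    # Defaults from the operator signature: strength=0.8, guidance_scale=7.5.
    print(effective_denoising_steps(50, 0.8))
    print(apply_guidance([0.0, 1.0], [1.0, 1.0], 7.5))
```

A guidance scale of 1.0 reduces to the plain text-conditioned prediction; larger values push the result further toward the prompt, usually at some cost in diversity.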

Usage

Use this operator when you need to expand training datasets with realistic new images generated by a diffusion model, keeping them semantically consistent with their captions while adding visual diversity.

Code Reference

Source Location

Signature

@OPERATORS.register_module("image_diffusion_mapper")
class ImageDiffusionMapper(Mapper):
    def __init__(self,
                 hf_diffusion: str = "CompVis/stable-diffusion-v1-4",
                 trust_remote_code: bool = False,
                 torch_dtype: str = "fp32",
                 revision: str = "main",
                 strength: float = 0.8,
                 guidance_scale: float = 7.5,
                 aug_num: PositiveInt = 1,
                 keep_original_sample: bool = True,
                 caption_key: Optional[str] = None,
                 hf_img2seq: str = "Salesforce/blip2-opt-2.7b",
                 save_dir: str = None,
                 *args, **kwargs):

Import

from data_juicer.ops.mapper.image_diffusion_mapper import ImageDiffusionMapper

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| hf_diffusion | str | No | Diffusion model name on HuggingFace, defaults to "CompVis/stable-diffusion-v1-4" |
| trust_remote_code | bool | No | Whether to trust remote code of HF models, defaults to False |
| torch_dtype | str | No | Floating point type for model: fp32, fp16, or bf16; defaults to "fp32" |
| revision | str | No | Specific model version (branch, tag, or commit id), defaults to "main" |
| strength | float | No | Extent to transform reference image (0 to 1), defaults to 0.8 |
| guidance_scale | float | No | How closely generated images match text prompt, defaults to 7.5 |
| aug_num | PositiveInt | No | Number of augmented images per sample, defaults to 1 |
| keep_original_sample | bool | No | Whether to keep original sample, defaults to True |
| caption_key | Optional[str] | No | Key name in samples for captions; if None, captions are auto-generated |
| hf_img2seq | str | No | HuggingFace model for caption generation, defaults to "Salesforce/blip2-opt-2.7b" |
| save_dir | str | No | Directory to store generated images; if not specified, saves in same directory as input |
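The constraints listed for these inputs can be checked before any model is loaded. The following is a minimal sketch of such a pre-flight check; `DiffusionMapperParams` and its `validate` method are hypothetical helpers written for illustration, not part of Data-Juicer, and they cover only the constraints stated on this page (the allowed `torch_dtype` values, the 0 to 1 range for `strength`, and a positive `aug_num`).

```python
from dataclasses import dataclass
from typing import Optional

# Floating point types listed for torch_dtype on this page.
ALLOWED_DTYPES = {"fp32", "fp16", "bf16"}


@dataclass
class DiffusionMapperParams:
    """Hypothetical container mirroring a subset of the operator's inputs."""
    hf_diffusion: str = "CompVis/stable-diffusion-v1-4"
    torch_dtype: str = "fp32"
    strength: float = 0.8
    guidance_scale: float = 7.5
    aug_num: int = 1
    keep_original_sample: bool = True
    caption_key: Optional[str] = None
    save_dir: Optional[str] = None

    def validate(self) -> None:
        """Raise ValueError if a documented constraint is violated."""
        if self.torch_dtype not in ALLOWED_DTYPES:
            raise ValueError(f"torch_dtype must be one of {sorted(ALLOWED_DTYPES)}")
        if not 0.0 <= self.strength <= 1.0:
            raise ValueError("strength must be between 0 and 1")
        if self.aug_num < 1:
            raise ValueError("aug_num must be a positive integer")


if __name__ == "__main__":
    DiffusionMapperParams().validate()  # defaults pass
    try:
        DiffusionMapperParams(strength=1.2).validate()
    except ValueError as err:
        print(f"rejected: {err}")
```

Failing fast on a bad value is cheaper than discovering it after the diffusion and captioning models have been downloaded and moved to the GPU.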

Outputs

| Name | Type | Description |
|---|---|---|
| samples | Dict | Transformed samples with generated image paths and augmented entries |
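When planning storage and GPU time, it can help to estimate how many samples come out of the operator. The sketch below assumes that each input sample yields `aug_num` generated samples, plus the original when `keep_original_sample` is True; that accounting is an assumption inferred from the parameter descriptions, and `expected_output_samples` is a hypothetical helper rather than a Data-Juicer function.

```python
def expected_output_samples(num_inputs: int,
                            aug_num: int = 1,
                            keep_original_sample: bool = True) -> int:
    """Estimate the output dataset size (assumed accounting, see lead-in)."""
    if num_inputs < 0:
        raise ValueError("num_inputs must be non-negative")
    if aug_num < 1:
        raise ValueError("aug_num must be a positive integer")
    per_input = aug_num + (1 if keep_original_sample else 0)
    return num_inputs * per_input


if __name__ == "__main__":
    # 100 inputs, 3 augmentations each, originals kept.
    print(expected_output_samples(100, aug_num=3, keep_original_sample=True))
```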

Usage Examples

process:
  - image_diffusion_mapper:
      hf_diffusion: "CompVis/stable-diffusion-v1-4"
      strength: 0.8
      guidance_scale: 7.5
      aug_num: 1
      keep_original_sample: true
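The same settings can also be passed to the operator directly from Python. The sketch below is a hedged illustration: the keyword names come from the signature on this page, while `diffusion_mapper_kwargs` and `build_mapper` are hypothetical helpers. The import is done lazily inside `build_mapper` because constructing the operator is expected to require Data-Juicer, the model weights, and a suitable GPU.

```python
def diffusion_mapper_kwargs() -> dict:
    """Keyword arguments matching the YAML configuration for this operator."""
    return {
        "hf_diffusion": "CompVis/stable-diffusion-v1-4",
        "strength": 0.8,
        "guidance_scale": 7.5,
        "aug_num": 1,
        "keep_original_sample": True,
    }


def build_mapper():
    """Construct the operator (hypothetical helper).

    Requires Data-Juicer to be installed; calling this may download model
    weights, so it is not invoked at import time.
    """
    from data_juicer.ops.mapper.image_diffusion_mapper import ImageDiffusionMapper

    return ImageDiffusionMapper(**diffusion_mapper_kwargs())


if __name__ == "__main__":
    # Only inspect the arguments here; call build_mapper() on a GPU machine.
    for name, value in diffusion_mapper_kwargs().items():
        print(f"{name} = {value!r}")
```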
