

Implementation:Datajuicer Data juicer SDXLPrompt2PromptMapper

From Leeroopedia
Knowledge Sources
Domains Data_Processing, Mapping
Last Updated 2026-02-14 16:00 GMT

Overview

A concrete tool, provided by Data-Juicer, for generating paired images with SDXL diffusion models.

Description

SDXLPrompt2PromptMapper is a mapper operator that generates pairs of similar images from two text prompts using the Stable Diffusion XL (SDXL) model with a Prompt2Prompt pipeline. It reads the two prompts from the sample fields named by text_key and text_key_second and generates the corresponding image pair, with generation controlled by the num_inference_steps and guidance_scale parameters. The generated images are saved under unique timestamped filenames in a configurable output directory (output_dir). The operator requires CUDA acceleration, and both text keys must be set before it can process a sample.
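The exact filename scheme is not spelled out on this page. As a minimal sketch of one way to produce unique timestamped names for an image pair, assuming a hypothetical helper `make_pair_paths` and a `.png` extension (neither is taken from the Data-Juicer source):

```python
import os
import uuid
from datetime import datetime


def make_pair_paths(output_dir: str, ext: str = ".png") -> tuple[str, str]:
    """Return two absolute paths that share one timestamped stem.

    Hypothetical helper for illustration only; this is not
    Data-Juicer's actual naming code.
    """
    # Microsecond timestamp plus a short random suffix, so two calls
    # landing in the same microsecond still get distinct names.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
    stem = f"{stamp}_{uuid.uuid4().hex[:8]}"
    base = os.path.abspath(output_dir)
    return (
        os.path.join(base, f"{stem}_1{ext}"),
        os.path.join(base, f"{stem}_2{ext}"),
    )
```

Sharing one stem between the two files keeps each pair easy to match up when listing the output directory.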

Usage

Use this operator when you need to generate paired image data, within the data pipeline, for training diffusion-based image editing or style transfer models.

Code Reference

Source Location

Signature

@OPERATORS.register_module("sdxl_prompt2prompt_mapper")
class SDXLPrompt2PromptMapper(Mapper):
    def __init__(
        self,
        hf_diffusion: str = "stabilityai/stable-diffusion-xl-base-1.0",
        trust_remote_code=False,
        torch_dtype: str = "fp32",
        num_inference_steps: float = 50,
        guidance_scale: float = 7.5,
        text_key=None,
        text_key_second=None,
        output_dir=DATA_JUICER_ASSETS_CACHE,
        *args,
        **kwargs,
    ):

Import

from data_juicer.ops.mapper.sdxl_prompt2prompt_mapper import SDXLPrompt2PromptMapper

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| hf_diffusion | str | No | Diffusion model name on HuggingFace (default: stabilityai/stable-diffusion-xl-base-1.0) |
| trust_remote_code | bool | No | Whether to trust remote code of HF models (default: False) |
| torch_dtype | str | No | Floating point type for loading the model (default: fp32) |
| num_inference_steps | float | No | Number of inference steps; higher values generally improve quality at the cost of longer generation time (default: 50) |
| guidance_scale | float | No | Guidance scale for text-image alignment (default: 7.5) |
| text_key | str | Yes | Key name for the first caption in the pair |
| text_key_second | str | Yes | Key name for the second caption in the pair |
| output_dir | str | No | Storage location for generated images (default: DATA_JUICER_ASSETS_CACHE) |
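Both text_key and text_key_second are listed as required even though the signature defaults them to None. A caller-side check along the following lines could catch a missing key or an empty caption before the model is loaded. `check_pair_sample` is a hypothetical helper written for this page, not part of Data-Juicer:

```python
def check_pair_sample(sample: dict, text_key, text_key_second) -> list[str]:
    """Return the two prompts, or raise ValueError if the pair is unusable.

    Hypothetical pre-flight check; not a Data-Juicer API.
    """
    # Both key names must be configured (the signature defaults are None).
    if not text_key or not text_key_second:
        raise ValueError("both text_key and text_key_second must be set")

    prompts = []
    for key in (text_key, text_key_second):
        value = sample.get(key)
        # Each prompt must be a non-empty string.
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"sample field {key!r} must be a non-empty string")
        prompts.append(value)
    return prompts
```

Running the check once per sample costs little compared with a diffusion pass, so failing early here avoids wasting GPU time on malformed records.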

Outputs

| Name | Type | Description |
|---|---|---|
| sample[image_path1] | str | Absolute path to the first generated image |
| sample[image_path2] | str | Absolute path to the second generated image |

Usage Examples

process:
  - sdxl_prompt2prompt_mapper:
      hf_diffusion: 'stabilityai/stable-diffusion-xl-base-1.0'
      num_inference_steps: 50
      guidance_scale: 7.5
      text_key: 'caption1'
      text_key_second: 'caption2'
      output_dir: '/path/to/output'
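With text_key set to 'caption1' and text_key_second set to 'caption2', each input sample needs both of those fields. The sketch below builds a couple of such records and serializes them as JSON Lines, assuming the dataset is supplied in that format. The captions are invented for illustration and `to_jsonl` is a hypothetical helper:

```python
import json

# Illustrative records only; the captions are made up for this example.
samples = [
    {"caption1": "a chocolate cake", "caption2": "a confetti apple cake"},
    {"caption1": "a cat on a sofa", "caption2": "a dog on a sofa"},
]


def to_jsonl(records) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records) + "\n"


jsonl_text = to_jsonl(samples)
```

Each pair of captions differs in only a few words, which suits an operator described as generating pairs of similar images.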
