
Implementation:Datajuicer Data juicer Difference Area Generator Mapper

From Leeroopedia
Knowledge Sources
Domains: Image Comparison, Object Detection, Change Detection
Last Updated: 2026-02-14 16:00 GMT

Overview

Identifies and localizes regions of difference between two similar images by comparing their captions and bounding box contents, producing annotated bounding boxes highlighting the differing areas.

Description

Difference_Area_Generator_Mapper is the first stage of the ImgDiff pipeline. It processes image pairs to identify and filter regions with significant visual differences through an 8-step process:

  1. Similarity Filtering -- Filters out image pairs with large differences using an image pair similarity filter (CLIP-based)
  2. Caption Comparison -- Compares the two captions using difflib and NLTK lemmatization to identify differing nouns (via the compare_text_index helper)
  3. Image Segmentation -- Segments both images using FastSAM to identify potential object regions
  4. Cropping -- Crops sub-images from both images based on the segmentation bounding boxes
  5. Object Validation -- Uses BLIP image-text matching to check whether each cropped sub-image contains one of the "valid objects", i.e. the differing nouns identified in step 2
  6. Difference Detection -- Applies a second round of similarity filtering on cropped region pairs to detect actual visual differences
  7. NMS Filtering -- Removes overlapping bounding boxes using IoU-based non-maximum suppression (via the iou_filter helper)
  8. Cache Cleanup -- Removes all temporary cropped images from the cache directory
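Step 2 can be illustrated with a minimal sketch of a word-level caption diff built on difflib. This is not the operator's compare_text_index helper: the function name differing_words is hypothetical, and the NLTK lemmatization and noun filtering described above are deliberately left out so the example runs with the standard library only.

```python
import difflib
from typing import List, Tuple


def differing_words(caption1: str, caption2: str) -> Tuple[List[str], List[str]]:
    """Return the words found only in caption1 and only in caption2.

    Hypothetical helper for illustration; the real operator additionally
    lemmatizes the words and keeps nouns only.
    """
    words1 = caption1.lower().split()
    words2 = caption2.lower().split()
    matcher = difflib.SequenceMatcher(a=words1, b=words2, autojunk=False)

    only_in_1: List[str] = []
    only_in_2: List[str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # "replace", "delete" and "insert" all mark a span where the captions diverge
        only_in_1.extend(words1[i1:i2])
        only_in_2.extend(words2[j1:j2])
    return only_in_1, only_in_2


removed, added = differing_words(
    "A red car parked on the street",
    "A blue car parked on the street",
)
print(removed, added)  # ['red'] ['blue']
```

In this example the differing words are adjectives, so a noun-only filter such as the one described in step 2 would need extra handling to recover the object they modify.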

The operator uses three fused sub-operators:

  • image_pair_similarity_filter (CLIP-based)
  • image_segment_mapper (FastSAM-based)
  • image_text_matching_filter (BLIP-based)

Helper functions include is_noun (POS tag check), compare_text_index (caption diff with lemmatization), and iou_filter (NMS-style bounding box deduplication).
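The NMS-style deduplication attributed to iou_filter can be sketched as a greedy filter over box indices. The sketch below is an assumption about how such a filter commonly works, not the operator's code: the names iou and greedy_iou_filter, the (x1, y1, x2, y2) box layout, the per-box scores, and the 0.5 threshold are all illustrative choices.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_iou_filter(boxes: List[Box], scores: List[float], iou_thresh: float = 0.5) -> List[int]:
    """Return indices of the boxes kept by greedy NMS, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any box already kept
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in kept):
            kept.append(i)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_iou_filter(boxes, scores))  # [0, 2]
```

Returning indices rather than boxes makes it easy to filter any parallel arrays (crops, labels, scores) with the same selection.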

The operator declares _accelerator = "cuda" and writes intermediate results, such as the cropped sub-images, to the DATA_JUICER_ASSETS_CACHE directory.
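One way to guarantee that temporary crops are removed from a cache directory, as the cleanup step requires, is a context manager that owns a unique sub-directory. This is a sketch under stated assumptions rather than the operator's implementation: scratch_dir is a hypothetical name, and a temporary directory stands in for the real cache location.

```python
import os
import shutil
import tempfile
import uuid
from contextlib import contextmanager


@contextmanager
def scratch_dir(cache_root: str):
    """Create a unique sub-directory under cache_root and delete it on exit."""
    path = os.path.join(cache_root, f"imgdiff_{uuid.uuid4().hex}")
    os.makedirs(path)
    try:
        yield path
    finally:
        # Runs on success and on error, so crops never accumulate in the cache
        shutil.rmtree(path, ignore_errors=True)


cache_root = tempfile.mkdtemp()  # stand-in for the real cache directory
with scratch_dir(cache_root) as tmp:
    crop_path = os.path.join(tmp, "crop_0.jpg")
    with open(crop_path, "wb") as f:
        f.write(b"placeholder bytes for a cropped region")
    print(os.path.exists(crop_path))  # True
print(os.path.exists(tmp))  # False
```

A unique sub-directory per call also keeps parallel workers from deleting each other's crops.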

Usage

Use this operator as the first stage of the ImgDiff pipeline to identify bounding box regions that differ between two similar images. It is typically followed by Difference_Caption_Generator_Mapper to generate textual descriptions of the detected differences.

Code Reference

Source Location

  • Repository: Datajuicer_Data_juicer
  • File: data_juicer/ops/mapper/imgdiff_difference_area_generator_mapper.py
  • Lines: 1-436

Signature

class Difference_Area_Generator_Mapper(Mapper):
    _accelerator = "cuda"

    def __init__(
        self,
        image_pair_similarity_filter_args: Optional[Dict] = {},
        image_segment_mapper_args: Optional[Dict] = {},
        image_text_matching_filter_args: Optional[Dict] = {},
        *args, **kwargs,
    ):

Import

from data_juicer.ops.mapper.imgdiff_difference_area_generator_mapper import Difference_Area_Generator_Mapper

I/O Contract

Inputs

| Name | Type | Required | Description |
|------|------|----------|-------------|
| image_pair_similarity_filter_args | Dict | No | Arguments for the image pair similarity filter. Default keys: min_score_1, max_score_1, min_score_2, max_score_2, hf_clip="openai/clip-vit-base-patch32" |
| image_segment_mapper_args | Dict | No | Arguments for image segmentation. Default: imgsz=1024, conf=0.05, iou=0.5, model_path="FastSAM-x.pt" |
| image_text_matching_filter_args | Dict | No | Arguments for image-text matching. Default: min_score=0.1, max_score=1.0, hf_blip="Salesforce/blip-itm-base-coco" |
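Because each argument dict is optional, a caller will usually pass only the keys to change. The sketch below shows one plausible way partial overrides can be layered on top of the documented defaults; merge_args and DEFAULT_SEGMENT_ARGS are hypothetical names, and whether the operator merges its arguments exactly this way is an assumption.

```python
from typing import Dict, Optional

# Segmentation defaults as documented in the inputs listed above
DEFAULT_SEGMENT_ARGS = {
    "imgsz": 1024,
    "conf": 0.05,
    "iou": 0.5,
    "model_path": "FastSAM-x.pt",
}


def merge_args(defaults: Dict, overrides: Optional[Dict] = None) -> Dict:
    """Return a new dict in which caller-supplied keys replace the defaults."""
    merged = dict(defaults)  # copy, so the shared defaults are never mutated
    merged.update(overrides or {})
    return merged


print(merge_args(DEFAULT_SEGMENT_ARGS, {"imgsz": 512, "conf": 0.1}))
# {'imgsz': 512, 'conf': 0.1, 'iou': 0.5, 'model_path': 'FastSAM-x.pt'}
```

Copying before updating matters here: mutating a shared default dict would leak one caller's overrides into every later instance.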

Sample Fields

| Name | Type | Required | Description |
|------|------|----------|-------------|
| image_path1 | str | Yes | Path to the first image |
| image_path2 | str | Yes | Path to the second image |
| caption1 | str | Yes | Caption for the first image |
| caption2 | str | Yes | Caption for the second image |

Outputs

| Name | Type | Description |
|------|------|-------------|
| sample[Fields.meta][MetaKeys.bbox_tag] | np.ndarray | Filtered bounding boxes (Nx4) for regions with detected differences. Returns zeros if no differences found. |
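Since an all-zero array signals that no differences were found, downstream code may want to test for that placeholder before using the boxes. A minimal sketch, assuming the placeholder is entirely zero; has_differences is a hypothetical helper, and it accepts any Nx4 array-like, including nested lists and NumPy arrays.

```python
def has_differences(bboxes) -> bool:
    """True if at least one coordinate in an Nx4 box array is non-zero."""
    return any(value != 0 for row in bboxes for value in row)


print(has_differences([[0, 0, 0, 0]]))      # False: placeholder, nothing detected
print(has_differences([[12, 30, 64, 48]]))  # True: one region was reported
```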

Usage Examples

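from data_juicer.ops.mapper.imgdiff_difference_area_generator_mapper import Difference_Area_Generator_Mapper

# Basic usage: every sub-operator runs with its default arguments
mapper = Difference_Area_Generator_Mapper()

# With custom similarity thresholds and segmentation settings
mapper = Difference_Area_Generator_Mapper(
    image_pair_similarity_filter_args={
        "min_score_1": 0.2,
        "max_score_1": 0.9,
        "min_score_2": 0.1,
        "max_score_2": 0.8,
    },
    image_segment_mapper_args={
        "imgsz": 512,
        "conf": 0.1,
    },
)

# Process a sample (the two image files must exist on disk)
sample = {
    "image_path1": "/path/to/image1.jpg",
    "image_path2": "/path/to/image2.jpg",
    "caption1": "A red car parked on the street",
    "caption2": "A blue car parked on the street",
}
result = mapper.process_single(sample, rank=0)
# The detected boxes are stored under result[Fields.meta][MetaKeys.bbox_tag]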
