Implementation:Datajuicer Data juicer Difference Area Generator Mapper

Knowledge Sources	Datajuicer_Data_juicer
Domains	Image Comparison, Object Detection, Change Detection
Last Updated	2026-02-14 16:00 GMT

Overview

Identifies and localizes regions of difference between two similar images by comparing their captions and bounding box contents, producing annotated bounding boxes highlighting the differing areas.

Description

Difference_Area_Generator_Mapper is the first stage of the ImgDiff pipeline. It processes image pairs to identify and filter regions with significant visual differences through an 8-step process:

Similarity Filtering -- Filters out image pairs with large differences using an image pair similarity filter (CLIP-based)
Caption Comparison -- Compares the two captions using difflib and NLTK lemmatization to identify differing nouns (via the compare_text_index helper)
Image Segmentation -- Segments both images using FastSAM to identify potential object regions
Cropping -- Crops sub-images from both images based on the segmentation bounding boxes
Object Validation -- Uses BLIP image-text matching to determine if cropped sub-images contain the identified "valid objects"
Difference Detection -- Applies a second round of similarity filtering on cropped region pairs to detect actual visual differences
NMS Filtering -- Removes overlapping bounding boxes using IoU-based non-maximum suppression (via the iou_filter helper)
Cache Cleanup -- Removes all temporary cropped images from the cache directory
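
The caption comparison step can be illustrated with a minimal sketch that uses only difflib. It is not the operator's compare_text_index helper: the function name differing_words is hypothetical, and the NLTK lemmatization and noun filtering described above are deliberately left out so the example has no external dependencies.

```python
import difflib


def differing_words(caption1: str, caption2: str):
    """Return the words unique to each caption, found via a token-level diff.

    Hypothetical helper for illustration only; lemmatization and the
    noun check (is_noun) used by the real operator are omitted here.
    """
    tokens1 = caption1.lower().split()
    tokens2 = caption2.lower().split()
    matcher = difflib.SequenceMatcher(a=tokens1, b=tokens2, autojunk=False)

    only_in_1, only_in_2 = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" contributes to both sides; "delete"/"insert" to one side.
        if tag in ("replace", "delete"):
            only_in_1.extend(tokens1[i1:i2])
        if tag in ("replace", "insert"):
            only_in_2.extend(tokens2[j1:j2])
    return only_in_1, only_in_2


words1, words2 = differing_words(
    "A dog sitting on the grass",
    "A cat sitting on the grass",
)
```

In a full pipeline, the differing words would then be lemmatized and restricted to nouns before being used as the "valid objects" checked in the Object Validation step.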

The operator uses three fused sub-operators:

image_pair_similarity_filter (CLIP-based)
image_segment_mapper (FastSAM-based)
image_text_matching_filter (BLIP-based)

Helper functions include is_noun (POS tag check), compare_text_index (caption diff with lemmatization), and iou_filter (NMS-style bounding box deduplication).
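
The NMS-style deduplication can be sketched as a greedy IoU filter. This is an illustration under stated assumptions, not the operator's iou_filter helper: the names box_iou and greedy_iou_filter are hypothetical, boxes are assumed to be (x1, y1, x2, y2) corner coordinates, and earlier boxes in the list are assumed to take priority over later ones.

```python
def box_iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes (assumed format)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def greedy_iou_filter(boxes, iou_thr=0.5):
    """Keep a box only if it overlaps every already-kept box by at most iou_thr."""
    kept = []
    for box in boxes:
        if all(box_iou(box, k) <= iou_thr for k in kept):
            kept.append(box)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
kept = greedy_iou_filter(boxes)
```

Here the second box overlaps the first with an IoU of 0.81 and is dropped, while the third box does not overlap anything and is kept.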

The operator requires CUDA acceleration (_accelerator = "cuda") and caches intermediate results in DATA_JUICER_ASSETS_CACHE.

Usage

Use this operator as the first stage of the ImgDiff pipeline to identify bounding box regions that differ between two similar images. It is typically followed by Difference_Caption_Generator_Mapper to generate textual descriptions of the detected differences.

Code Reference

Source Location

Repository: Datajuicer_Data_juicer
File: data_juicer/ops/mapper/imgdiff_difference_area_generator_mapper.py
Lines: 1-436

Signature

class Difference_Area_Generator_Mapper(Mapper):
    _accelerator = "cuda"

    def __init__(
        self,
        image_pair_similarity_filter_args: Optional[Dict] = {},
        image_segment_mapper_args: Optional[Dict] = {},
        image_text_matching_filter_args: Optional[Dict] = {},
        *args, **kwargs,
    ):

Import

from data_juicer.ops.mapper.imgdiff_difference_area_generator_mapper import Difference_Area_Generator_Mapper

I/O Contract

Inputs

Name	Type	Required	Description
image_pair_similarity_filter_args	Dict	No	Arguments for image pair similarity filter. Defaults cover min_score_1, max_score_1, min_score_2, max_score_2, and hf_clip="openai/clip-vit-base-patch32"
image_segment_mapper_args	Dict	No	Arguments for image segmentation. Default: imgsz=1024, conf=0.05, iou=0.5, model_path="FastSAM-x.pt"
image_text_matching_filter_args	Dict	No	Arguments for image-text matching. Default: min_score=0.1, max_score=1.0, hf_blip="Salesforce/blip-itm-base-coco"
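
When only some keys are supplied, a reasonable expectation is that the remaining keys keep their defaults. The sketch below shows one way such a merge could work; it is an assumption about the behavior, not a copy of the operator's code, and the names DEFAULT_SEGMENT_ARGS and merge_args are hypothetical. The default values are those listed in the table above.

```python
# Defaults as listed in the Inputs table for image_segment_mapper_args.
DEFAULT_SEGMENT_ARGS = {
    "imgsz": 1024,
    "conf": 0.05,
    "iou": 0.5,
    "model_path": "FastSAM-x.pt",
}


def merge_args(defaults, overrides=None):
    """Return a new dict with user overrides applied on top of the defaults."""
    merged = dict(defaults)  # copy, so the shared defaults are never mutated
    merged.update(overrides or {})
    return merged


segment_args = merge_args(DEFAULT_SEGMENT_ARGS, {"imgsz": 512, "conf": 0.1})
```

Copying before updating matters because the constructor signature uses dict literals as default arguments; mutating such a shared object in place would leak overrides between instances.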

Sample Fields

Name	Type	Required	Description
image_path1	str	Yes	Path to the first image
image_path2	str	Yes	Path to the second image
caption1	str	Yes	Caption for the first image
caption2	str	Yes	Caption for the second image

Outputs

Name	Type	Description
sample[Fields.meta][MetaKeys.bbox_tag]	np.ndarray	Filtered bounding boxes (Nx4) for regions with detected differences. Returns zeros if no differences found.

Usage Examples

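# Import the operator (same path as in the Import section)
from data_juicer.ops.mapper.imgdiff_difference_area_generator_mapper import Difference_Area_Generator_Mapper

# Basic usage with default sub-operator arguments
mapper = Difference_Area_Generator_Mapper()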

# With custom similarity thresholds
mapper = Difference_Area_Generator_Mapper(
    image_pair_similarity_filter_args={
        "min_score_1": 0.2,
        "max_score_1": 0.9,
        "min_score_2": 0.1,
        "max_score_2": 0.8,
    },
    image_segment_mapper_args={
        "imgsz": 512,
        "conf": 0.1,
    },
)

# Process a sample
sample = {
    "image_path1": "/path/to/image1.jpg",
    "image_path2": "/path/to/image2.jpg",
    "caption1": "A red car parked on the street",
    "caption2": "A blue car parked on the street",
}
result = mapper.process_single(sample, rank=0)
