Implementation:Datajuicer Data juicer DetectCharacterAttributesMapper

From Leeroopedia
Knowledge Sources
Domains Multimodal Processing, Character Analysis, Object Detection
Last Updated 2026-02-14 16:00 GMT

Overview

Extracts and classifies attributes of main characters in an image using a multi-model pipeline combining object detection, image-text matching, and language model inference.

Description

DetectCharacterAttributesMapper is a multimodal operator that builds per-character annotations from images. Given an image, a caption, and a list of main character names, it performs the following steps:

  1. Character Location -- Uses the DetectCharacterLocationsMapper (which internally uses YOLOE for detection and BLIP for image-text matching) to locate main characters in the image with bounding boxes
  2. Character Classification -- For each character, queries a LLaVA multimodal LLM (default: llava-v1.6-vicuna-7b-hf) to classify it into one of five categories: object, animal, person, text, or other
  3. Feature Extraction -- Extracts characteristic phrases (color, material, action, etc.) from the caption text for each character using the same LLM
  4. Visual Verification -- Crops each character's bounding box region and verifies extracted features against the actual visual content using yes/no LLM queries
  5. Category-Specific Expansion -- Based on the character's class, generates additional class-specific features (e.g., clothing/age for persons, color/action for animals)

The final output includes bounding boxes and validated characteristic lists for each main character, stored in the main_character_attributes_list field under the sample's meta key.
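The five steps above can be sketched as a plain control flow. This is a minimal illustration, not the operator's actual code: every callable passed in (classify, extract_from_caption, verify_in_crop, expand_for_class) is a hypothetical stand-in for a model call, and the order-preserving de-duplication is an assumption of the sketch. Only the three output keys come from this page.

```python
from typing import Callable, Dict, List

BBox = List[float]  # assumed to hold four numbers; the coordinate convention is not stated here


def annotate_characters(
    locations: Dict[str, BBox],
    classify: Callable[[str, BBox], str],
    extract_from_caption: Callable[[str], List[str]],
    verify_in_crop: Callable[[BBox, str], bool],
    expand_for_class: Callable[[str, str, BBox], List[str]],
) -> List[dict]:
    """Combine the per-character steps into one list of annotation records."""
    records = []
    for name, bbox in locations.items():  # step 1 result: character name -> bbox
        category = classify(name, bbox)  # step 2: one of the five classes
        candidates = extract_from_caption(name)  # step 3: phrases from the caption
        verified = [p for p in candidates if verify_in_crop(bbox, p)]  # step 4
        extra = expand_for_class(name, category, bbox)  # step 5
        # Drop duplicates while keeping first-seen order (assumption of this sketch).
        characteristics = list(dict.fromkeys(verified + extra))
        records.append(
            {
                "main_character": name,
                "bbox": bbox,
                "characteristic_list": characteristics,
            }
        )
    return records


# Toy stand-ins so the sketch runs without any model.
demo = annotate_characters(
    locations={"boy": [10, 20, 110, 220]},
    classify=lambda name, bbox: "person",
    extract_from_caption=lambda name: ["blue shirt", "red hat"],
    verify_in_crop=lambda bbox, phrase: phrase != "red hat",
    expand_for_class=lambda name, category, bbox: ["child", "blue shirt"],
)
print(demo)
```

In the toy run, "red hat" is rejected by the verification stand-in, so only phrases that pass step 4 or come from step 5 reach characteristic_list.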

Requires CUDA acceleration. Registered as a tagging operator and supports fused image loading.

Usage

Use this operator when building character-centric datasets that require detailed per-character attribute annotations. It is suitable for image-text datasets where fine-grained character understanding is needed for downstream tasks such as character-driven story generation or visual question answering.

Code Reference

Source Location

  • Repository: Datajuicer_Data_juicer
  • File: data_juicer/ops/mapper/detect_character_attributes_mapper.py
  • Lines: 1-313

Signature

class DetectCharacterAttributesMapper(Mapper):
    _accelerator = "cuda"

    def __init__(
        self,
        detect_character_locations_mapper_args: Optional[Dict] = {},
        *args, **kwargs,
    ):

Import

from data_juicer.ops.mapper.detect_character_attributes_mapper import DetectCharacterAttributesMapper

I/O Contract

Inputs

Name | Type | Required | Description
detect_character_locations_mapper_args | Dict | No | Arguments passed to the character location sub-operator: detection and matching thresholds, and model paths. Defaults: llava-v1.6-vicuna-7b-hf, blip-itm-base-coco, yoloe-11l-seg.pt, iou_threshold=0.7, matching_score_threshold=0.4
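A caller who wants to change one threshold while keeping the rest explicit can build the full argument dict up front. The sketch below assumes nothing about how the operator itself treats missing keys; it only layers overrides on top of the default values quoted above. ASSUMED_DEFAULTS and resolve_location_args are hypothetical names, and only the three key names that appear on this page are used.

```python
from typing import Any, Dict, Optional

# Default values quoted on this page; the key names are the ones shown in its usage example.
ASSUMED_DEFAULTS: Dict[str, Any] = {
    "iou_threshold": 0.7,
    "matching_score_threshold": 0.4,
    "yoloe_path": "yoloe-11l-seg.pt",
}


def resolve_location_args(overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return a new dict with caller overrides layered on top of the defaults."""
    merged = dict(ASSUMED_DEFAULTS)  # copy so the defaults are never mutated
    merged.update(overrides or {})  # None and {} both mean "no overrides"
    return merged


resolved = resolve_location_args({"iou_threshold": 0.5})
print(resolved)
```

The helper takes None rather than {} as its default so that no dict object is shared between calls.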

Sample Fields

Name | Type | Required | Description
main_character_list | list[str] | Yes | List of main character names to detect and analyze
images | list[str] | Yes | List of image paths
text | str | Yes | Caption text describing the image content
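Because the operator loads several large models, it can be worth checking a sample's shape before running it. The helper below is a hypothetical pre-flight check, not part of Data-Juicer; it assumes the field names exactly as listed above and only tests presence and basic types.

```python
from typing import Any, Dict, List


def missing_or_invalid_fields(sample: Dict[str, Any]) -> List[str]:
    """Return the names of required fields that are absent or of the wrong type."""

    def is_str_list(value: Any) -> bool:
        return isinstance(value, list) and all(isinstance(v, str) for v in value)

    checks = {
        "main_character_list": is_str_list,
        "images": is_str_list,
        "text": lambda value: isinstance(value, str),
    }
    return [
        name
        for name, is_valid in checks.items()
        if name not in sample or not is_valid(sample[name])
    ]


good = {
    "main_character_list": ["boy", "dog"],
    "images": ["/path/to/image.jpg"],
    "text": "A boy with his dog.",
}
bad = {"main_character_list": "boy", "images": ["/path/to/image.jpg"]}
print(missing_or_invalid_fields(good))
print(missing_or_invalid_fields(bad))
```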

Outputs

Name | Type | Description
sample[Fields.meta]["main_character_attributes_list"] | list[dict] | List of dictionaries, each containing "main_character" (str), "bbox" (list), and "characteristic_list" (list[str])
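As a rough sketch of how the output list might be consumed downstream, the helper below groups characteristic phrases by character name. The function name is hypothetical and the records are made-up values for illustration; only the three dictionary keys come from the contract above.

```python
from typing import Dict, List


def characteristics_by_character(records: List[dict]) -> Dict[str, List[str]]:
    """Map each character name to its characteristic phrases."""
    summary: Dict[str, List[str]] = {}
    for record in records:
        # Merge repeated names instead of overwriting (assumption of this sketch).
        summary.setdefault(record["main_character"], []).extend(
            record["characteristic_list"]
        )
    return summary


# Illustrative records only; these values were not produced by the operator.
records = [
    {
        "main_character": "boy",
        "bbox": [10, 20, 110, 220],
        "characteristic_list": ["blue shirt", "sitting"],
    },
    {
        "main_character": "dog",
        "bbox": [120, 90, 200, 210],
        "characteristic_list": ["golden fur"],
    },
]
summary = characteristics_by_character(records)
print(summary)
```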

Usage Examples

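from data_juicer.ops.mapper.detect_character_attributes_mapper import DetectCharacterAttributesMapper
from data_juicer.utils.constant import Fields

# Basic usage: rely on the default detection, matching, and LLM settings
mapper = DetectCharacterAttributesMapper()

# With custom detection parameters, passed through to the location sub-operator
mapper = DetectCharacterAttributesMapper(
    detect_character_locations_mapper_args={
        "iou_threshold": 0.5,
        "matching_score_threshold": 0.3,
        "yoloe_path": "yoloe-11l-seg.pt",
    }
)

# Process a single sample (the operator requires a CUDA device)
sample = {
    "main_character_list": ["boy", "dog"],
    "images": ["/path/to/image.jpg"],
    "text": "A boy in a blue shirt sitting on a fence with his golden retriever.",
}
result = mapper.process_single(sample, rank=0)
# One dict per character with "main_character", "bbox", and "characteristic_list"
attributes = result[Fields.meta]["main_character_attributes_list"]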
