Implementation:Datajuicer Data juicer DetectMainCharacterMapper
| Knowledge Sources | |
|---|---|
| Domains | Data_Processing, Mapping |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
Concrete tool, provided by Data-Juicer, for identifying and extracting main character names from images and their captions.
Description
DetectMainCharacterMapper is a mapper operator that uses a multimodal language model to identify main characters in an image based on both the image and its text description. It constructs a detailed prompt combining the image and caption, sends it to a multimodal LLM (default: LLaVA), and parses the JSON response to extract a list of main characters (people, animals, key objects) with their distinct characteristics. Samples with fewer characters than a configurable minimum threshold can be filtered out. Results are stored in the sample's metadata under main_character_list. Requires CUDA acceleration. It extends the Mapper base class.
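The parse-and-filter step described above can be sketched in a few lines. This is a minimal illustration, not the operator's actual code: the helper names `parse_main_characters` and `annotate_or_drop`, the `meta` key, the use of `main_character_list` as the JSON key in the model reply, and the use of `None` to signal a filtered sample are all assumptions made for the example.

```python
import json
from typing import Any, Dict, List, Optional


def parse_main_characters(response_text: str) -> List[Any]:
    """Pull a character list out of a model reply that should contain JSON.

    Assumes (hypothetically) that the reply embeds one JSON object with a
    "main_character_list" key; any surrounding chatter is ignored.
    """
    start = response_text.find("{")
    end = response_text.rfind("}")
    if start == -1 or end <= start:
        return []  # no JSON object found in the reply
    try:
        payload = json.loads(response_text[start:end + 1])
    except json.JSONDecodeError:
        return []  # malformed JSON is treated as "no characters"
    characters = payload.get("main_character_list", [])
    return characters if isinstance(characters, list) else []


def annotate_or_drop(
    sample: Dict[str, Any],
    response_text: str,
    min_character_num: int = 0,
) -> Optional[Dict[str, Any]]:
    """Attach the parsed list to the sample's metadata, or return None when
    the sample has fewer characters than the minimum threshold."""
    characters = parse_main_characters(response_text)
    if len(characters) < min_character_num:
        return None  # hypothetical signal for "filtered out"
    meta = dict(sample.get("meta", {}))  # "meta" is an illustrative key name
    meta["main_character_list"] = characters
    return {**sample, "meta": meta}
```

Treating an unparseable reply as an empty list means that, with a threshold of 1 or more, such samples are dropped rather than raising an error. Whether the real operator behaves this way is not stated in this page.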
Usage
Import this operator when you need to identify the main characters in images, for example as the entry point of a character-analysis pipeline.
Code Reference
Source Location
- Repository: Datajuicer_Data_juicer
- File: data_juicer/ops/mapper/detect_main_character_mapper.py
Signature
@OPERATORS.register_module("detect_main_character_mapper")
class DetectMainCharacterMapper(Mapper):
    def __init__(self,
                 mllm_mapper_args: Optional[Dict] = {},
                 filter_min_character_num: int = 0,
                 *args, **kwargs):
Import
from data_juicer.ops.mapper.detect_main_character_mapper import DetectMainCharacterMapper
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| mllm_mapper_args | Optional[Dict] | No | Arguments for the multimodal language model mapper that control caption generation. With the default empty dict, fixed values are used (max_new_tokens=256, temperature=0.2, hf_model=llava-hf/llava-v1.6-vicuna-7b-hf) |
| filter_min_character_num | int | No | Minimum number of main characters required; samples below this threshold are filtered out. Default: 0 |
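One plausible reading of how `mllm_mapper_args` interacts with the fixed values is a per-key fallback: keys the caller supplies are kept, and missing keys are filled from the fixed set. The sketch below illustrates that reading only. `FIXED_MLLM_ARGS` and `resolve_mllm_args` are hypothetical names, and the per-key merge itself is an assumption rather than documented behavior.

```python
from typing import Any, Dict, Optional

# Fixed values as listed in the Inputs table; the constant name is hypothetical.
FIXED_MLLM_ARGS: Dict[str, Any] = {
    "max_new_tokens": 256,
    "temperature": 0.2,
    "hf_model": "llava-hf/llava-v1.6-vicuna-7b-hf",
}


def resolve_mllm_args(user_args: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return a new dict: fixed values first, then any user-supplied overrides.

    Assumption: user-supplied keys win over the fixed values.
    """
    resolved = dict(FIXED_MLLM_ARGS)   # copy so the fixed values are never mutated
    resolved.update(user_args or {})   # None and {} both mean "use fixed values"
    return resolved
```

The helper takes `None` as its default and copies before merging, which avoids sharing one mutable dict across calls.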
Outputs
| Name | Type | Description |
|---|---|---|
| samples | Dict | Transformed samples whose metadata gains a main_character_list entry containing character names and descriptions |
Usage Examples
YAML Configuration
process:
  - detect_main_character_mapper:
      filter_min_character_num: 1