Implementation:Datajuicer Data juicer ExtractNicknameMapper

From Leeroopedia
Knowledge Sources
Domains Data_Processing, Mapping
Last Updated 2026-02-14 16:00 GMT

Overview

Concrete tool, provided by Data-Juicer, for extracting nickname relationships between characters in text.

Description

ExtractNicknameMapper is a mapper operator that identifies and extracts nickname relationships from input text using an API-based language model. It sends text to the model with a Chinese system prompt instructing it to identify speaker, addressee, and nickname triples. The structured markdown response is parsed via a verbose regex pattern with cross-validation checks. Results are stored in metadata under the configured nickname key as relationship records containing source entity, target entity, relation description, and relation keywords.
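The parse-and-validate step described above can be sketched roughly as follows. This is a minimal illustration, not the operator's own code: the English response layout, the regex, the function name `parse_nickname_response`, and the record field names are all assumptions made for this example (the real operator uses a Chinese prompt and its own pattern). Cross-validation is modelled here as checking that the names repeated in the last bullet agree with the speaker and addressee bullets.

```python
import re

# Hypothetical English response layout, chosen only for illustration.
NICKNAME_PATTERN = re.compile(
    r"""
    \#\#\#\s*Address\s+form\s*(\d+)\s*       # block heading with an index
    -\s*\*\*Speaker\*\*\s*:\s*(.*?)\s*       # who is speaking
    -\s*\*\*Addressee\*\*\s*:\s*(.*?)\s*     # who is being addressed
    -\s*\*\*(.*?)'s\s+nickname\s+for\s+(.*?)\*\*\s*:\s*(.*?)\s*
    (?=\#\#\#|\Z)                            # stop at the next block or end of text
    """,
    re.VERBOSE | re.DOTALL,
)


def parse_nickname_response(raw):
    """Turn a structured markdown response into relationship records."""
    records = []
    for _idx, speaker, addressee, spk_check, adr_check, nickname in NICKNAME_PATTERN.findall(raw):
        speaker, addressee, nickname = speaker.strip(), addressee.strip(), nickname.strip()
        # Cross-validation: the names repeated in the last bullet must match
        # the speaker and addressee bullets, otherwise the block is discarded.
        if speaker != spk_check.strip() or addressee != adr_check.strip():
            continue
        if not (speaker and addressee and nickname):
            continue
        records.append({
            # Field names are assumptions, not the operator's actual schema.
            "source_entity": speaker,
            "target_entity": addressee,
            "relation_description": f"{speaker} addresses {addressee} as {nickname}",
            "relation_keywords": ["nickname"],
        })
    return records
```

Blocks that fail the consistency check are silently skipped, so a partially malformed response still yields whatever valid triples it contains.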

Usage

Use when you need to build character relationship graphs from narrative text, extracting interpersonal nickname and address-form data to understand social dynamics in story-based datasets.
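As a rough sketch of the downstream use case, the extracted records can be folded into a simple adjacency structure. The function name `build_nickname_graph` and the record keys (`source_entity`, `target_entity`, `relation_description`) are hypothetical stand-ins for whatever the operator stores under the configured nickname key.

```python
from collections import defaultdict


def build_nickname_graph(records):
    """Group relationship records into {speaker: {addressee: [descriptions]}}."""
    graph = defaultdict(lambda: defaultdict(list))
    for rec in records:
        # Key names below are assumptions made for this example.
        src = rec["source_entity"]
        tgt = rec["target_entity"]
        desc = rec["relation_description"]
        # Keep each description once per directed edge.
        if desc not in graph[src][tgt]:
            graph[src][tgt].append(desc)
    # Convert to plain dicts so the result is easy to serialize or compare.
    return {src: dict(targets) for src, targets in graph.items()}
```

Edges are directed because address forms are usually asymmetric: how one character addresses another says nothing about the reverse direction.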

Code Reference

Source Location

Signature

@OPERATORS.register_module("extract_nickname_mapper")
class ExtractNicknameMapper(Mapper):
    def __init__(self,
                 api_model: str = "gpt-4o",
                 *,
                 nickname_key: str = MetaKeys.nickname,
                 api_endpoint: Optional[str] = None,
                 response_path: Optional[str] = None,
                 system_prompt: Optional[str] = None,
                 input_template: Optional[str] = None,
                 output_pattern: Optional[str] = None,
                 try_num: PositiveInt = 3,
                 drop_text: bool = False,
                 model_params: Dict = {},
                 sampling_params: Dict = {},
                 **kwargs):

Import

from data_juicer.ops.mapper.extract_nickname_mapper import ExtractNicknameMapper

I/O Contract

Inputs

Name | Type | Required | Description
api_model | str | No | API model name, defaults to "gpt-4o"
nickname_key | str | No | Key name to store nickname relationships in meta field, defaults to MetaKeys.nickname
api_endpoint | Optional[str] | No | URL endpoint for the API
response_path | Optional[str] | No | Path to extract content from API response
system_prompt | Optional[str] | No | System prompt for the task
input_template | Optional[str] | No | Template for building the model input
output_pattern | Optional[str] | No | Regular expression for parsing model output
try_num | PositiveInt | No | Number of retry attempts on error, defaults to 3
drop_text | bool | No | Whether to drop text from output, defaults to False
model_params | Dict | No | Parameters for initializing the API model
sampling_params | Dict | No | Extra parameters passed to API call
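The retry behaviour implied by try_num can be sketched as a bounded loop around the model call. This is only an illustration under stated assumptions: `extract_with_retries`, `call_model`, and `parse` are hypothetical names, and the sketch retries only when an exception is raised, since the source describes try_num as retries "on error" without further detail.

```python
import logging
from typing import Callable, List

logger = logging.getLogger(__name__)


def extract_with_retries(
    call_model: Callable[[str], str],
    parse: Callable[[str], List[dict]],
    prompt: str,
    try_num: int = 3,
) -> List[dict]:
    """Call the model up to try_num times, returning the first parsed result."""
    for attempt in range(1, try_num + 1):
        try:
            # Any exception from the API call or the parser counts as a failure.
            return parse(call_model(prompt))
        except Exception as exc:
            logger.warning("attempt %d/%d failed: %s", attempt, try_num, exc)
    # All attempts failed: fall back to an empty result.
    return []
```

Returning an empty list after the final failure lets a pipeline continue with the remaining samples rather than abort on one bad response.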

Outputs

Name | Type | Description
samples | Dict | Transformed samples with nickname relationships stored in meta field

Usage Examples

process:
  - extract_nickname_mapper:
      api_model: "gpt-4o"
      try_num: 3
      drop_text: false
