Implementation:Datajuicer Data juicer CalibrateResponseMapper

Knowledge Sources	Datajuicer_Data_juicer
Domains	Data_Processing, Mapping
Last Updated	2026-02-14 16:00 GMT

Overview

Concrete tool for calibrating the response (answer) in question-answer pairs based on reference text provided by Data-Juicer.

Description

CalibrateResponseMapper is a mapper operator that calibrates only the response portion of a QA pair using a reference text and an API-based language model. It extends CalibrateQAMapper with a specialized Chinese system prompt that instructs the model to refine only the answer, making it more detailed and accurate while ensuring it still answers the original question. The parse_output method returns None for the query and the stripped raw output as the calibrated response, leaving the original question unchanged. It extends the Mapper base class (via CalibrateQAMapper).

Usage

Import when you need to enhance answers in QA datasets while preserving the original questions.

Code Reference

Source Location

Repository: Datajuicer_Data_juicer
File: data_juicer/ops/mapper/calibrate_response_mapper.py

Signature

@OPERATORS.register_module("calibrate_response_mapper")
class CalibrateResponseMapper(CalibrateQAMapper):
    def __init__(self,
                 api_model: str = "gpt-4o",
                 *,
                 api_endpoint: Optional[str] = None,
                 response_path: Optional[str] = None,
                 system_prompt: Optional[str] = None,
                 input_template: Optional[str] = None,
                 reference_template: Optional[str] = None,
                 qa_pair_template: Optional[str] = None,
                 output_pattern: Optional[str] = None,
                 try_num: PositiveInt = 3,
                 model_params: Dict = {},
                 sampling_params: Dict = {},
                 **kwargs):

Import

from data_juicer.ops.mapper.calibrate_response_mapper import CalibrateResponseMapper

I/O Contract

Inputs

Name	Type	Required	Description
api_model	str	No	API model name. Default: "gpt-4o"
api_endpoint	Optional[str]	No	URL endpoint for the API
response_path	Optional[str]	No	Path to extract content from the API response. Defaults to 'choices.0.message.content'
system_prompt	Optional[str]	No	System prompt for the calibration task
input_template	Optional[str]	No	Template for building the model input
reference_template	Optional[str]	No	Template for formatting the reference text
qa_pair_template	Optional[str]	No	Template for formatting question-answer pairs
output_pattern	Optional[str]	No	Regular expression for parsing model output
try_num	PositiveInt	No	Number of retry attempts on API call or parsing error. Default: 3
model_params	Dict	No	Parameters for initializing the API model
sampling_params	Dict	No	Extra parameters passed to the API call (e.g. temperature, top_p)

Outputs

Name	Type	Description
samples	Dict	Transformed samples with calibrated response field updated

Usage Examples

YAML Configuration

process:
  - calibrate_response_mapper:
      api_model: gpt-4o
      try_num: 3

Related Pages

Environment:Datajuicer_Data_juicer_Python_Runtime_Environment

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment