Implementation:Datajuicer Data juicer CalibrateResponseMapper
| Knowledge Sources | |
|---|---|
| Domains | Data_Processing, Mapping |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
Concrete tool for calibrating the response (answer) in question-answer pairs based on reference text provided by Data-Juicer.
Description
CalibrateResponseMapper is a mapper operator that calibrates only the response portion of a QA pair using a reference text and an API-based language model. It extends CalibrateQAMapper with a specialized Chinese system prompt that instructs the model to refine only the answer, making it more detailed and accurate while ensuring it still answers the original question. The parse_output method returns None for the query and the stripped raw output as the calibrated response, leaving the original question unchanged. It extends the Mapper base class (via CalibrateQAMapper).
Usage
Import when you need to enhance answers in QA datasets while preserving the original questions.
Code Reference
Source Location
- Repository: Datajuicer_Data_juicer
- File: data_juicer/ops/mapper/calibrate_response_mapper.py
Signature
@OPERATORS.register_module("calibrate_response_mapper")
class CalibrateResponseMapper(CalibrateQAMapper):
def __init__(self,
api_model: str = "gpt-4o",
*,
api_endpoint: Optional[str] = None,
response_path: Optional[str] = None,
system_prompt: Optional[str] = None,
input_template: Optional[str] = None,
reference_template: Optional[str] = None,
qa_pair_template: Optional[str] = None,
output_pattern: Optional[str] = None,
try_num: PositiveInt = 3,
model_params: Dict = {},
sampling_params: Dict = {},
**kwargs):
Import
from data_juicer.ops.mapper.calibrate_response_mapper import CalibrateResponseMapper
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| api_model | str | No | API model name. Default: "gpt-4o" |
| api_endpoint | Optional[str] | No | URL endpoint for the API |
| response_path | Optional[str] | No | Path to extract content from the API response. Defaults to 'choices.0.message.content' |
| system_prompt | Optional[str] | No | System prompt for the calibration task |
| input_template | Optional[str] | No | Template for building the model input |
| reference_template | Optional[str] | No | Template for formatting the reference text |
| qa_pair_template | Optional[str] | No | Template for formatting question-answer pairs |
| output_pattern | Optional[str] | No | Regular expression for parsing model output |
| try_num | PositiveInt | No | Number of retry attempts on API call or parsing error. Default: 3 |
| model_params | Dict | No | Parameters for initializing the API model |
| sampling_params | Dict | No | Extra parameters passed to the API call (e.g. temperature, top_p) |
Outputs
| Name | Type | Description |
|---|---|---|
| samples | Dict | Transformed samples with calibrated response field updated |
Usage Examples
YAML Configuration
process:
- calibrate_response_mapper:
api_model: gpt-4o
try_num: 3