Implementation:Datajuicer Data juicer HumanPreferenceAnnotationMapper
| Knowledge Sources | |
|---|---|
| Domains | Data_Processing, Mapping |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
Concrete Data-Juicer operator for collecting human preference annotations through Label Studio.
Description
HumanPreferenceAnnotationMapper extends LabelStudioAnnotationMapper to implement a human preference annotation workflow. It presents a prompt and two candidate answers in Label Studio for human evaluation, using a custom XML configuration that renders the prompt and both answers side by side in a styled UI. The operator reads the prompt and answers from each sample using configurable keys, submits them as annotation tasks, and processes the completed annotations to determine which answer was preferred. It then writes the preferred answer to the sample's 'chosen' field and the other answer to its 'rejected' field. This supports data collection for RLHF (Reinforcement Learning from Human Feedback).
Usage
Use this operator when you need human preference judgments on response pairs for RLHF training data: it compares two candidate answers to the same prompt and records which one a human annotator prefers.
Code Reference
Source Location
- Repository: Datajuicer_Data_juicer
- File: data_juicer/ops/mapper/annotation/human_preference_annotation_mapper.py
Signature
@OPERATORS.register_module("human_preference_annotation_mapper")
class HumanPreferenceAnnotationMapper(LabelStudioAnnotationMapper):
    def __init__(self, label_config_file: str = None,
                 answer1_key: str = "answer1",
                 answer2_key: str = "answer2",
                 prompt_key: str = "prompt",
                 chosen_key: str = "chosen",
                 rejected_key: str = "rejected",
                 **kwargs):
Import
from data_juicer.ops.mapper.annotation.human_preference_annotation_mapper import HumanPreferenceAnnotationMapper
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| label_config_file | str | No | Path to a custom Label Studio label config file. Default: None (uses built-in config) |
| answer1_key | str | No | Key for the first answer in the sample. Default: "answer1" |
| answer2_key | str | No | Key for the second answer in the sample. Default: "answer2" |
| prompt_key | str | No | Key for the prompt/question in the sample. Default: "prompt" |
| chosen_key | str | No | Key to store the chosen answer. Default: "chosen" |
| rejected_key | str | No | Key to store the rejected answer. Default: "rejected" |
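As a rough illustration of how the key parameters above could map a sample onto an annotation task, the sketch below builds a task dictionary in the `{"data": {...}}` shape that Label Studio accepts for imported tasks. `format_preference_task` is a hypothetical helper written for this page, not the operator's actual method, and the field names under `data` (`prompt`, `answer1`, `answer2`) are an assumption about what the labeling config references.

```python
from typing import Any, Dict


def format_preference_task(
    sample: Dict[str, Any],
    prompt_key: str = "prompt",
    answer1_key: str = "answer1",
    answer2_key: str = "answer2",
) -> Dict[str, Any]:
    """Hypothetical helper: turn one sample into a Label Studio-style task.

    The keys under "data" are assumed names; a real labeling config may
    reference different variables.
    """
    # Fail early if the sample does not carry the configured fields.
    missing = [k for k in (prompt_key, answer1_key, answer2_key) if k not in sample]
    if missing:
        raise KeyError(f"sample is missing required keys: {missing}")

    # Map the configurable sample keys onto fixed task field names.
    return {
        "data": {
            "prompt": sample[prompt_key],
            "answer1": sample[answer1_key],
            "answer2": sample[answer2_key],
        }
    }


sample = {"question": "What is 2 + 2?", "response_a": "4", "response_b": "5"}
task = format_preference_task(
    sample,
    prompt_key="question",
    answer1_key="response_a",
    answer2_key="response_b",
)
```

Mapping configurable sample keys onto fixed task fields means a single labeling config can serve datasets whose column names differ.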
Outputs
| Name | Type | Description |
|---|---|---|
| sample[chosen_key] | str | The answer selected as preferred by the human annotator |
| sample[rejected_key] | str | The answer not selected by the human annotator |
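The way these two output fields are populated can be sketched as follows. This is a minimal illustration, assuming the annotator's choice has already been reduced to the string `"left"` (first answer) or `"right"` (second answer); `apply_preference` is a hypothetical helper and the operator's real result parsing may differ.

```python
from typing import Any, Dict


def apply_preference(
    sample: Dict[str, Any],
    selected: str,
    answer1_key: str = "answer1",
    answer2_key: str = "answer2",
    chosen_key: str = "chosen",
    rejected_key: str = "rejected",
) -> Dict[str, Any]:
    """Hypothetical helper: write chosen/rejected fields from a preference.

    `selected` is assumed to be "left" (first answer preferred) or
    "right" (second answer preferred).
    """
    if selected not in ("left", "right"):
        raise ValueError(f"unexpected selection: {selected!r}")

    first, second = sample[answer1_key], sample[answer2_key]

    # Work on a shallow copy so the caller's sample is left unchanged.
    updated = dict(sample)
    if selected == "left":
        updated[chosen_key], updated[rejected_key] = first, second
    else:
        updated[chosen_key], updated[rejected_key] = second, first
    return updated


result = apply_preference(
    {"prompt": "What is 2 + 2?", "answer1": "4", "answer2": "5"},
    selected="left",
)
```

Here `result["chosen"]` holds the first answer and `result["rejected"]` the second; passing `selected="right"` swaps them.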
Usage Examples
process:
  - human_preference_annotation_mapper:
      prompt_key: "question"
      answer1_key: "response_a"
      answer2_key: "response_b"
      chosen_key: "preferred"
      rejected_key: "not_preferred"