Implementation:Hiyouga LLaMA Factory Feedback Processor
| Knowledge Sources | |
|---|---|
| Domains | Data Processing, Preference Learning |
| Last Updated | 2026-02-06 19:00 GMT |
Overview
Dataset processor for KTO (Kahneman-Tversky Optimization) training that encodes examples into paired target and KL-reference sequences with desirable/undesirable preference tags.
Description
The FeedbackDatasetProcessor class extends DatasetProcessor to prepare training data for KTO-style preference learning. For each example, it determines whether the response is desirable or undesirable based on content presence, then encodes both the target sequence and a KL-reference sequence. The KL-reference is created by shifting responses by +1 across the batch to produce mismatched prompt-completion pairs. Prompt tokens are masked with IGNORE_INDEX in labels, and a boolean kto_tag tracks the preference direction per example.
Usage
Use this processor when preparing datasets for KTO training. It is selected automatically by the data loading pipeline when the training stage requires feedback-style preference data. The processor validates that each batch contains both desirable and undesirable examples, logging a warning if only one preference type is present.
Code Reference
Source Location
- Repository: Hiyouga_LLaMA_Factory
- File: src/llamafactory/data/processor/feedback.py
- Lines: 1-129
Signature
class FeedbackDatasetProcessor(DatasetProcessor):
def _encode_data_example(
self,
prompt: list[dict[str, str]],
response: list[dict[str, str]],
kl_response: list[dict[str, str]],
system: Optional[str],
tools: Optional[str],
images: list["ImageInput"],
videos: list["VideoInput"],
audios: list["AudioInput"],
) -> tuple[list[int], list[int], list[int], list[int], bool]
def preprocess_dataset(self, examples: dict[str, list[Any]]) -> dict[str, list[Any]]
def print_data_example(self, example: dict[str, list[int]]) -> None
Import
from llamafactory.data.processor.feedback import FeedbackDatasetProcessor
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| examples | dict[str, list[Any]] |
Yes | Batch of raw examples with keys _prompt, _response, _system, _tools, _images, _videos, _audios |
| _prompt[i] | list[dict[str, str]] |
Yes | Conversation prompt messages (must have odd length) |
| _response[i] | list[dict[str, str]] |
Yes | Response pair: index 0 is the desired response, index 1 is the undesired response (must have at least 2 entries) |
Outputs
| Name | Type | Description |
|---|---|---|
| input_ids | list[list[int]] |
Tokenized target input sequences |
| attention_mask | list[list[int]] |
Attention masks for target sequences (all ones) |
| labels | list[list[int]] |
Target labels with prompt tokens masked as IGNORE_INDEX |
| kl_input_ids | list[list[int]] |
Tokenized KL-reference input sequences (mismatched pairs) |
| kl_attention_mask | list[list[int]] |
Attention masks for KL-reference sequences |
| kl_labels | list[list[int]] |
KL-reference labels with prompt tokens masked |
| kto_tags | list[bool] |
True for desirable examples, False for undesirable |
Usage Examples
from llamafactory.data.processor.feedback import FeedbackDatasetProcessor
# Instantiate with required dependencies
processor = FeedbackDatasetProcessor(
template=template,
tokenizer=tokenizer,
processor=None,
data_args=data_args,
)
# Preprocess a batch of examples
model_inputs = processor.preprocess_dataset(examples)
# model_inputs contains: input_ids, labels, kl_input_ids, kl_labels, kto_tags, etc.
# Debug: print a single example
processor.print_data_example(model_inputs[0])
Related Pages
- Hiyouga_LLaMA_Factory_Processor_Utils - Provides the DatasetProcessor base class and infer_seqlen utility used for sequence truncation
- Hiyouga_LLaMA_Factory_Pairwise_Processor - Alternative processor for DPO-style pairwise preference training
- Hiyouga_LLaMA_Factory_Supervised_Processor - Processor for standard supervised fine-tuning
- Hiyouga_LLaMA_Factory_Data_Args - DataArguments controlling cutoff_len and other processing parameters