Implementation:OpenRLHF OpenRLHF Rejection sampling processor

Knowledge Sources	OpenRLHF
Domains	Alignment, Data_Processing
Last Updated	2026-02-07 00:00 GMT

Overview

A concrete tool, provided by OpenRLHF, for selecting the best of N sampled responses per prompt via rejection sampling.

Description

The rejection_sampling_processor function takes a list of scored generation objects (input, output, reward), groups them by input prompt, and keeps only the highest-reward response for each prompt. The result is a filtered SFT-compatible dataset.
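The grouping-and-selection step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the OpenRLHF source: the helper name select_best_per_prompt and the sample records are hypothetical, and keeping the first response seen on a reward tie is an assumption the description does not specify.

```python
from typing import Dict, List


def select_best_per_prompt(objs: List[Dict]) -> List[Dict]:
    """Keep the highest-reward response for each unique prompt (hypothetical helper)."""
    best: Dict[str, Dict] = {}
    for obj in objs:
        prompt = obj["input"]
        reward = float(obj["reward"])  # assumption: rewards may arrive as strings
        # Strict ">" keeps the first response seen when rewards tie (an assumption).
        if prompt not in best or reward > best[prompt]["reward"]:
            best[prompt] = {"input": prompt, "output": obj["output"], "reward": reward}
    # Dicts preserve insertion order, so prompts come out in first-seen order.
    return list(best.values())


# Hypothetical scored generations: two candidates for one prompt, one for another.
scored = [
    {"input": "What is 2+2?", "output": "5", "reward": -1.0},
    {"input": "What is 2+2?", "output": "4", "reward": 2.5},
    {"input": "Name a prime.", "output": "7", "reward": 1.0},
]
filtered = select_best_per_prompt(scored)
```

Here filtered holds two records, one per unique prompt, and the lower-scored answer to the first prompt is discarded.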

Usage

Call this processor after batch vLLM generation and batch reward-model inference have produced the scored responses. Its output forms a new SFT dataset for retraining.
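One way to hand the filtered records to a later SFT run is to persist them as JSON Lines. The sketch below assumes the downstream step reads one JSON object per line; the helper names write_jsonl and read_jsonl, the file name rs_filtered.jsonl, and the sample records are all hypothetical and not taken from OpenRLHF.

```python
import json
import os
import tempfile
from typing import Dict, List


def write_jsonl(records: List[Dict], path: str) -> None:
    """Write one JSON object per line (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path: str) -> List[Dict]:
    """Read the file back, skipping blank lines (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Stand-in for the processor's output; real data would come from the "rs" processor.
filtered_data = [
    {"input": "What is 2+2?", "output": "4", "reward": 2.5},
    {"input": "Name a prime.", "output": "7", "reward": 1.0},
]

# Write to a temporary directory so the sketch leaves no files in the working tree.
out_path = os.path.join(tempfile.mkdtemp(), "rs_filtered.jsonl")
write_jsonl(filtered_data, out_path)
round_trip = read_jsonl(out_path)
```

Reading the file back and comparing it with the in-memory records is a cheap sanity check before launching a retraining job.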

Code Reference

Source Location

Repository: OpenRLHF
File: openrlhf/utils/processor.py
Lines: L40-53

Signature

def rejection_sampling_processor(args, objs):
    """
    Select best response per prompt by reward score.

    Args:
        args: CLI arguments (unused in this processor)
        objs: List of dicts with keys: "input", "output", "reward"

    Returns:
        List of dicts: [{"input": str, "output": str, "reward": float}]
            One entry per unique prompt with the highest-reward response.
    """

Import

from openrlhf.utils.processor import rejection_sampling_processor
# or
from openrlhf.utils.processor import get_processor
processor = get_processor("rs")

I/O Contract

Inputs

Name	Type	Required	Description
args	Namespace	Yes	CLI arguments (unused by this processor)
objs	List[Dict]	Yes	Scored generations: [{input, output, reward}, ...]

Outputs

Name	Type	Description
filtered	List[Dict]	Best response per prompt: [{input, output, reward}, ...]
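Because the contract above depends on every record carrying the three keys, a quick pre-check can catch malformed input before processing. The following is a minimal sketch: validate_scored_generations and REQUIRED_KEYS are hypothetical names, and treating a reward as valid when float() accepts it is an assumption.

```python
from typing import Dict, List

# Keys named in the I/O contract above.
REQUIRED_KEYS = ("input", "output", "reward")


def validate_scored_generations(objs: List[Dict]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for i, obj in enumerate(objs):
        missing = [k for k in REQUIRED_KEYS if k not in obj]
        if missing:
            problems.append(f"record {i}: missing keys {missing}")
            continue
        try:
            float(obj["reward"])  # assumption: any float-convertible value is acceptable
        except (TypeError, ValueError):
            problems.append(f"record {i}: reward {obj['reward']!r} is not numeric")
    return problems


# Hypothetical inputs: one well-formed batch and one with two faulty records.
good = [{"input": "q", "output": "a", "reward": "0.7"}]
bad = [{"input": "q", "output": "a"}, {"input": "q", "output": "b", "reward": "high"}]
good_problems = validate_scored_generations(good)
bad_problems = validate_scored_generations(bad)
```

Collecting all problems rather than raising on the first one makes it easier to fix a large generation file in a single pass.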

Usage Examples

from openrlhf.utils.processor import get_processor

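# args: the parsed CLI namespace.
# scored_generations: list of {"input", "output", "reward"} dicts produced by
# the generation and reward-model steps.
processor = get_processor("rs")
filtered_data = processor(args, scored_generations)
# filtered_data contains one best response per unique prompt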

Related Pages

Implements Principle

Principle:OpenRLHF_OpenRLHF_Rejection_Sampling
