Implementation:Huggingface Open r1 Compute Pass Rate

From Leeroopedia


Metadata

Field | Value
Source | Repo (https://github.com/huggingface/open-r1)
Domains | NLP, Data_Engineering
Last Updated | 2026-02-08 00:00 GMT

Overview

A concrete tool from Open-R1 that computes a model's pass rate on each dataset problem, using vLLM batch generation and reward-function scoring, and then filters the dataset by that pass rate.

Description

The compute_pass_rate.py script implements the full difficulty filtering pipeline:

  1. Loads a dataset and formats prompts as chat messages via make_conversation and apply_chat_template.
  2. Initializes a vLLM LLM engine for efficient batch generation.
  3. Generates N completions per prompt using SamplingParams.
  4. Scores each completion using get_reward_funcs (same functions used during GRPO training).
  5. Computes mean pass rate per problem using torch.nanmean.
  6. Filters problems to keep those whose mean pass rate lies strictly between pass_rate_min and pass_rate_max.
  7. Pushes both datasets to the Hugging Face Hub: the full generated dataset and the filtered subset.
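
Steps 5 and 6 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the script's own code: the script uses torch.nanmean, which is swapped here for a hand-written NaN-skipping mean so the example has no dependencies. The helper names nan_mean and keep_problem and the sample rewards are hypothetical.

```python
import math


def nan_mean(rewards):
    """Mean of the rewards, skipping NaN entries (dependency-free stand-in for torch.nanmean)."""
    valid = [r for r in rewards if not math.isnan(r)]
    # With no valid reward the mean is undefined, so report NaN.
    return sum(valid) / len(valid) if valid else math.nan


def keep_problem(rewards, pass_rate_min=0.1, pass_rate_max=0.9):
    """Keep a problem only if its mean pass rate is strictly inside the bounds."""
    # Any comparison with NaN is False, so problems without a valid reward are dropped.
    return pass_rate_min < nan_mean(rewards) < pass_rate_max


# Hypothetical per-problem rewards, one entry per generated completion.
problems = {
    "too_easy": [1.0, 1.0, 1.0, 1.0],
    "too_hard": [0.0, 0.0, 0.0, 0.0],
    "useful": [1.0, 0.0, math.nan, 1.0],
}

kept = [name for name, rewards in problems.items() if keep_problem(rewards)]
print(kept)  # ['useful']
```

Problems the model always solves or never solves fall outside the bounds and are removed, which leaves the ones that still carry a learning signal for GRPO.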

The script supports dataset sharding via dataset_start_index/dataset_end_index for Slurm parallelization across ~88 jobs.
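
One way to derive the index pairs for such shards is sketched below. The helper shard_bounds is hypothetical and not part of Open-R1, and the sketch assumes that dataset_end_index is exclusive, which should be checked against the script before use.

```python
def shard_bounds(num_rows, num_shards):
    """Split range(num_rows) into num_shards contiguous, near-equal [start, end) slices."""
    base, extra = divmod(num_rows, num_shards)
    bounds = []
    start = 0
    for shard in range(num_shards):
        # The first `extra` shards take one additional row so every row is covered.
        end = start + base + (1 if shard < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


bounds = shard_bounds(num_rows=10, num_shards=3)
print(bounds)  # [(0, 4), (4, 7), (7, 10)]
```

Each pair would then be passed to one job as dataset_start_index and dataset_end_index.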

Usage

Run as a standalone script before GRPO training to filter the training dataset by difficulty.

Code Reference

Source

Field | Value
Repository | open-r1
File | scripts/pass_rate_filtering/compute_pass_rate.py
Lines | L37-205

Signature

@dataclass
class PassRateScriptArguments(GRPOScriptArguments):
    output_dataset_name: Optional[str] = None
    pass_rate_min: float = 0.1
    pass_rate_max: float = 0.9
    dataset_start_index: Optional[int] = None
    dataset_end_index: Optional[int] = None
    dataset_split: str = "train"

Import

The file is not meant to be imported as a module; run it as a script:

python scripts/pass_rate_filtering/compute_pass_rate.py --config recipes/dataset_filtering/config_demo.yaml

I/O Contract

Inputs

Parameter | Type | Required | Description
HF dataset | Dataset (with prompt column) | Yes | The training dataset to filter, loaded from the Hugging Face Hub
Model | ModelConfig | Yes | The model used for generating completions (via vLLM)
Reward functions | GRPOScriptArguments | Yes | Same reward functions used during GRPO training, resolved via get_reward_funcs
pass_rate_min | float | No (default: 0.1) | Lower bound threshold for filtering
pass_rate_max | float | No (default: 0.9) | Upper bound threshold for filtering

Outputs

Output | Description
Full generations dataset | All generated completions with rewards, pushed to Hub under revision "gen"
Filtered subset | Problems where pass_rate_min < mean_reward < pass_rate_max, pushed to Hub under revision "pass_rate"

Usage Examples

# Example 1: Run with a YAML config file
python scripts/pass_rate_filtering/compute_pass_rate.py \
    --config recipes/dataset_filtering/config_demo.yaml

# Example 2: Run with explicit parameters
python scripts/pass_rate_filtering/compute_pass_rate.py \
    --dataset_name "HuggingFaceH4/aime-2024-prompts" \
    --model_name_or_path "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B" \
    --num_generations 16 \
    --pass_rate_min 0.1 \
    --pass_rate_max 0.9 \
    --output_dataset_name "my-org/aime-2024-filtered"

# Example 3: Run a shard for Slurm parallelization
python scripts/pass_rate_filtering/compute_pass_rate.py \
    --config recipes/dataset_filtering/config_demo.yaml \
    --dataset_start_index 0 \
    --dataset_end_index 1000
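
To launch several shards like Example 3, the per-shard command lines can be generated rather than typed by hand. The sketch below only builds the command strings. The shard size of 1000 and the total of 3000 rows are illustrative assumptions, shard_command is a hypothetical helper, and how the commands are submitted to Slurm is left out.

```python
import shlex

SCRIPT = "scripts/pass_rate_filtering/compute_pass_rate.py"
CONFIG = "recipes/dataset_filtering/config_demo.yaml"


def shard_command(start, end):
    """Build the command line for one shard covering rows start to end."""
    argv = [
        "python", SCRIPT,
        "--config", CONFIG,
        "--dataset_start_index", str(start),
        "--dataset_end_index", str(end),
    ]
    # shlex.join quotes arguments where needed, so the result is safe to paste into a shell.
    return shlex.join(argv)


# Assumed layout: 3000 rows split into shards of 1000 rows each.
commands = [shard_command(start, start + 1000) for start in range(0, 3000, 1000)]
for command in commands:
    print(command)
```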
