# Implementation: AllenAI Open Instruct `build_all_verifiers`
| Type | Function |
|---|---|
| Source | `open_instruct/ground_truth_utils.py:L925-962` |
| Dependencies | `open_instruct.ground_truth_utils`, `litellm`, `requests`, `numpy` |
| Last Updated | 2026-02-07 00:00 GMT |
## Overview
Concrete factory function for constructing all available reward verifiers for RLVR training, provided by the Open Instruct library.
## Description

`build_all_verifiers()` is a factory function that instantiates every registered verifier subclass and returns them as a dictionary mapping dataset names to verifier instances. The function:

- Iterates over all subclasses of `VerifierFunction` (except `LMJudgeVerifier`, which is handled separately).
- For each subclass, constructs its configuration from the experiment arguments and streaming config using `VerifierConfig.from_args()`.
- Instantiates the verifier and registers it by its lowercase name.
- Special-cases `CodeVerifier` to also create a `code_stdio` variant with a modified API endpoint.
- Iterates over all judge prompt types in `JUDGE_PROMPT_MAP` and creates `LMJudgeVerifier` instances for each.
- Applies optional verifier remapping (e.g., redirecting "math_hard" to use the "math" verifier).

The resulting dictionary is used during reward computation to look up the appropriate verifier for each example based on its `verifier_source` field.
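The iteration described above can be sketched as follows. This is a minimal, illustrative reconstruction, not the library's code: `VerifierFunction`, `LMJudgeVerifier`, and `JUDGE_PROMPT_MAP` mirror names from the source, but the class bodies, `build_all_verifiers_sketch`, and the judge map contents are hypothetical stand-ins, and the `code_stdio` special case and remapping step are omitted for brevity.

```python
# Hedged sketch of the factory loop described above. Class bodies are
# illustrative stand-ins, not the real open-instruct implementation.


class VerifierConfig:
    @classmethod
    def from_args(cls, args, streaming_config):
        # The real method pulls fields from both configs; stubbed here.
        return cls()


class VerifierFunction:
    def __init__(self, config):
        self.config = config


class MathVerifier(VerifierFunction):
    pass


class LMJudgeVerifier(VerifierFunction):
    def __init__(self, config, judge_type):
        super().__init__(config)
        self.judge_type = judge_type


# Hypothetical judge prompt types for illustration.
JUDGE_PROMPT_MAP = {"default": "...", "quality": "..."}


def build_all_verifiers_sketch(args, streaming_config=None):
    verifiers = {}
    # 1. Instantiate every registered subclass except the LLM judge.
    for subclass in VerifierFunction.__subclasses__():
        if subclass is LMJudgeVerifier:
            continue
        config = VerifierConfig.from_args(args, streaming_config)
        # 2. Register under a lowercase name derived from the class name.
        name = subclass.__name__.removesuffix("Verifier").lower()
        verifiers[name] = subclass(config)
    # 3. Create one LMJudgeVerifier per judge prompt type.
    for judge_type in JUDGE_PROMPT_MAP:
        config = VerifierConfig.from_args(args, streaming_config)
        verifiers[f"llm_judge_{judge_type}"] = LMJudgeVerifier(config, judge_type)
    return verifiers
```

The key design point is that `VerifierFunction.__subclasses__()` makes the registry self-extending: defining a new subclass is enough for the factory to pick it up.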
## Usage
Call this function during GRPO initialization to build the verifier registry. The returned dictionary is then passed to the reward computation pipeline (typically running inside the LLMRayActor or the DataPreparationActor).
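The lookup step in the reward pipeline can be illustrated with a short sketch. The `verifier_source` field and the verifier call signature follow this page; the `score_example` helper and the example dict layout are hypothetical.

```python
# Illustrative sketch: each training example carries a verifier_source field
# that selects the matching verifier from the registry built above.
# score_example and the example dict keys are assumptions for this sketch.

def score_example(verifiers, example):
    verifier = verifiers[example["verifier_source"].lower()]
    return verifier(
        tokenized_prediction=example["tokens"],
        prediction=example["prediction"],
        label=example["label"],
        query=example["query"],
    )
```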
## Code Reference

### Source Location

- Repository: Open Instruct
- File: `open_instruct/ground_truth_utils.py`
### Signature

```python
def build_all_verifiers(
    args,
    streaming_config=None,
) -> dict[str, VerifierFunction]:
```
### Import

```python
from open_instruct.ground_truth_utils import build_all_verifiers
```
## I/O Contract

### Inputs

| Name | Type | Description |
|---|---|---|
| `args` | `ExperimentConfig` (or compatible) | Main experiment configuration providing base settings for verifier construction. |
| `streaming_config` | `StreamingDataLoaderConfig` or `None` | Optional streaming config providing reward-specific fields (code API URL, LLM judge model, verification reward value, remap configuration). |
### Outputs

| Name | Type | Description |
|---|---|---|
| Return value | `dict[str, VerifierFunction]` | Dictionary mapping lowercase dataset/verifier names to verifier instances. Typical keys include `"math"`, `"ifeval"`, `"code"`, `"code_stdio"`, and `"llm_judge_*"` (various judge types). |
## Verifier Types

| Verifier Class | Name | Description |
|---|---|---|
| `MathVerifier` | `math` | Checks mathematical equivalence between model answer and ground truth using multiple normalization strategies. |
| `IFEvalVerifier` | `ifeval` | Verifies instruction-following constraints (word count, format, content requirements). |
| `CodeVerifier` | `code` | Executes generated code against test cases via an external API. |
| `CodeVerifier` | `code_stdio` | Variant of the code verifier using stdin/stdout test format. |
| `LMJudgeVerifier` | `llm_judge_*` | Uses an LLM (e.g., GPT-4o-mini) to judge response quality. |
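Because the factory discovers verifiers by subclassing, adding a custom verifier type only requires defining a new `VerifierFunction` subclass. The base-class shape below is an assumption for illustration (the library's actual base class and call signature may differ); `ExactMatchVerifier` is a toy example, not part of Open Instruct.

```python
# Hypothetical sketch: the factory iterates over subclasses of
# VerifierFunction, so a new verifier only needs to subclass it.
# This base class is an assumed shape, not the library's real one.

class VerifierFunction:
    def __init__(self, config=None):
        self.config = config

    def __call__(self, tokenized_prediction, prediction, label, query):
        raise NotImplementedError


class ExactMatchVerifier(VerifierFunction):
    """Toy verifier: reward 1.0 iff the stripped prediction appears in label."""

    def __call__(self, tokenized_prediction, prediction, label, query):
        return 1.0 if prediction.strip() in label else 0.0
```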
## Usage Examples

```python
from open_instruct.ground_truth_utils import build_all_verifiers
from open_instruct.grpo_utils import ExperimentConfig
from open_instruct.data_loader import StreamingDataLoaderConfig

args = ExperimentConfig()
streaming_config = StreamingDataLoaderConfig(
    apply_verifiable_reward=True,
    verification_reward=10.0,
    code_api_url="http://code-executor:1234/test_program",
    llm_judge_model="azure/gpt-4o-mini-standard",
)

verifiers = build_all_verifiers(args, streaming_config)
print(f"Available verifiers: {list(verifiers.keys())}")
# Output: Available verifiers: ['math', 'ifeval', 'code', 'code_stdio', 'llm_judge_default', ...]

# Use a verifier to score a response
math_verifier = verifiers["math"]
reward = math_verifier(
    tokenized_prediction=[1, 2, 3],
    prediction="The answer is \\boxed{42}",
    label=["42"],
    query="What is 6 * 7?",
)
```