Implementation:Alibaba ROLL AgenticPipeline Val
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Evaluation, Agentic_AI |
| Last Updated | 2026-02-07 20:00 GMT |
Overview
Concrete method for running validation in the agentic RL pipeline provided by the Alibaba ROLL library.
Description
The AgenticPipeline.val method runs validation episodes using the validation RolloutScheduler, collects episode scores, groups them by environment tag, and computes per-tag statistics (mean, max, min). It resets the validation dataset manager before each evaluation round.
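The grouping and per-tag statistics described above can be sketched as follows. This is a minimal illustration, not the ROLL source: the helper name `summarize_scores` and the `(tag, score)` input shape are assumptions made for the example, and only the metric key names are taken from the method's docstring.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def summarize_scores(episodes: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Group (tag, score) pairs by tag and compute mean/max/min.

    Hypothetical helper: the real method reads scores from rollout batches,
    whose layout is not shown on this page.
    """
    by_tag: Dict[str, List[float]] = defaultdict(list)
    for tag, score in episodes:
        by_tag[tag].append(score)

    metrics: Dict[str, float] = {}

    # Aggregate statistics over every episode, regardless of tag.
    all_scores = [s for scores in by_tag.values() for s in scores]
    if all_scores:
        metrics["val/score/mean"] = mean(all_scores)
        metrics["val/score/max"] = max(all_scores)
        metrics["val/score/min"] = min(all_scores)

    # Per-environment statistics, keyed by the environment tag.
    for tag, scores in by_tag.items():
        metrics[f"{tag}/score/mean"] = mean(scores)
        metrics[f"{tag}/score/max"] = max(scores)
        metrics[f"{tag}/score/min"] = min(scores)

    return metrics
```

Calling `summarize_scores([("sokoban", 1.0), ("sokoban", 0.5), ("frozenlake", 0.0)])` yields both aggregate `val/score/*` entries and separate `sokoban/score/*` and `frozenlake/score/*` entries.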
Usage
Called by the agentic pipeline at configured evaluation intervals.
Code Reference
Source Location
- Repository: Alibaba ROLL
- File: roll/pipeline/agentic/agentic_pipeline.py
- Lines: L578-612
Signature
def val(self, global_step: int) -> Dict[str, float]:
    """
    Run validation evaluation loop.

    Args:
        global_step: Current global training step

    Returns:
        Dict with validation metrics:
        - val/score/mean, val/score/max, val/score/min
        - {tag}/score/mean, {tag}/score/max, {tag}/score/min per environment
        - time/get_batch_cost_val
    """
Import
from roll.pipeline.agentic.agentic_pipeline import AgenticPipeline
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| global_step | int | Yes | Current training step for metric labeling |
Outputs
| Name | Type | Description |
|---|---|---|
| metrics | Dict[str, float] | Validation scores per environment and aggregate |
Usage Examples
# Called within the pipeline's run() method:
if step % eval_steps == 0:
    val_metrics = pipeline.val(global_step=step)
    print(val_metrics)
    # Illustrative, abridged output (max/min and timing keys omitted):
    # {"val/score/mean": 0.75, "sokoban/score/mean": 0.8, "frozenlake/score/mean": 0.7}
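Because the returned dictionary is flat, downstream code often needs to pull the per-environment entries back out. The sketch below shows one way to do that, assuming only the key pattern `{tag}/score/mean` from the docstring; the helper name `per_tag_means` is hypothetical and not part of ROLL.

```python
from typing import Dict


def per_tag_means(metrics: Dict[str, float]) -> Dict[str, float]:
    """Extract {tag: mean score} from a flat validation metrics dict.

    Hypothetical helper. The aggregate "val/..." entry is skipped, so an
    environment whose tag is literally "val" would be skipped as well.
    """
    suffix = "/score/mean"
    return {
        key[: -len(suffix)]: value
        for key, value in metrics.items()
        if key.endswith(suffix) and not key.startswith("val/")
    }
```

Applied to the illustrative output above, this returns one entry for `sokoban` and one for `frozenlake`, leaving out the aggregate and timing keys.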
Related Pages
Implements Principle
Requires Environment
Environment Dependencies
This implementation requires the following environment constraints:
Heuristics Applied
No specific heuristics apply to this implementation.