Implementation:Alibaba ROLL AgenticPipeline Val

From Leeroopedia


Knowledge Sources
Domains: Reinforcement_Learning, Evaluation, Agentic_AI
Last Updated: 2026-02-07 20:00 GMT

Overview

A concrete validation method for the agentic RL pipeline, provided by the Alibaba ROLL library.

Description

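The AgenticPipeline.val method runs validation episodes using the validation RolloutScheduler, collects the episode scores, and groups them by environment tag. It then computes the mean, max, and min score both across all episodes (val/score/*) and per tag ({tag}/score/*), and reports the batch collection time as time/get_batch_cost_val. The validation dataset manager is reset before each evaluation round.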
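The grouping and statistics step described above can be sketched in a few lines. This is a minimal illustration, not the ROLL implementation: the helper name summarize_scores and the (tag, score) input shape are assumptions made for the example, and only the output key names follow the method's docstring.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def summarize_scores(episodes: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Hypothetical helper: turn (tag, score) pairs into a flat metrics dict."""
    by_tag: Dict[str, List[float]] = defaultdict(list)
    all_scores: List[float] = []
    for tag, score in episodes:
        by_tag[tag].append(score)
        all_scores.append(score)

    metrics: Dict[str, float] = {}
    if not all_scores:
        # No episodes collected: return an empty dict rather than fail on mean().
        return metrics

    # Aggregate statistics over every episode, regardless of tag.
    metrics["val/score/mean"] = mean(all_scores)
    metrics["val/score/max"] = max(all_scores)
    metrics["val/score/min"] = min(all_scores)

    # Per-tag statistics, one triple of keys per environment tag.
    for tag, scores in by_tag.items():
        metrics[f"{tag}/score/mean"] = mean(scores)
        metrics[f"{tag}/score/max"] = max(scores)
        metrics[f"{tag}/score/min"] = min(scores)
    return metrics


# Made-up scores, for illustration only.
demo = summarize_scores([("sokoban", 1.0), ("sokoban", 0.5), ("frozenlake", 0.0)])
print(demo["val/score/mean"], demo["sokoban/score/mean"])
```

Keeping the result as a flat dict with slash-separated keys matches the return type in the signature and makes it easy to pass the metrics to a logger unchanged.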

Usage

The agentic pipeline calls this method from its run() loop at the configured evaluation interval.

Code Reference

Source Location

  • Repository: Alibaba ROLL
  • File: roll/pipeline/agentic/agentic_pipeline.py
  • Lines: L578-612

Signature

def val(self, global_step: int) -> Dict[str, float]:
    """
    Run validation evaluation loop.

    Args:
        global_step: Current global training step

    Returns:
        Dict with validation metrics:
        - val/score/mean, val/score/max, val/score/min
        - {tag}/score/mean, {tag}/score/max, {tag}/score/min per environment
        - time/get_batch_cost_val
    """

Import

from roll.pipeline.agentic.agentic_pipeline import AgenticPipeline

I/O Contract

Inputs

Name         Type  Required  Description
global_step  int   Yes       Current training step for metric labeling

Outputs

Name     Type              Description
metrics  Dict[str, float]  Validation scores per environment and aggregate
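Because the output is a flat dictionary, a caller that wants one number per environment has to select keys by pattern. The sketch below shows one way to do that, assuming the key layout from the docstring; per_tag_means is a hypothetical helper, and the values in the example dict are placeholders rather than real results.

```python
from typing import Dict


def per_tag_means(metrics: Dict[str, float]) -> Dict[str, float]:
    """Hypothetical helper: map each environment tag to its mean score."""
    suffix = "/score/mean"
    result: Dict[str, float] = {}
    for key, value in metrics.items():
        # Skip the aggregate "val/..." entry and non-score keys such as timing.
        if key.endswith(suffix) and not key.startswith("val/"):
            result[key[: -len(suffix)]] = value
    return result


# Placeholder metrics shaped like the documented return value.
example = {
    "val/score/mean": 0.75,
    "sokoban/score/mean": 0.8,
    "frozenlake/score/mean": 0.7,
    "time/get_batch_cost_val": 12.3,
}
print(per_tag_means(example))
```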

Usage Examples

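# Fragment from inside the pipeline's run() method; `pipeline`, `step`, and
# `eval_steps` are provided by the surrounding training loop.
if step % eval_steps == 0:
    val_metrics = pipeline.val(global_step=step)
    print(val_metrics)
    # Abridged example output (max/min and timing keys omitted):
    # {"val/score/mean": 0.75, "sokoban/score/mean": 0.8, "frozenlake/score/mean": 0.7}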
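To try the interval logic without a configured ROLL pipeline, the same pattern can be run against a stand-in object. Everything here is an assumption for illustration: StubPipeline and run_with_validation are hypothetical names, and the stub returns a fixed placeholder score instead of running rollouts.

```python
from typing import Dict, List, Tuple


class StubPipeline:
    """Stand-in for AgenticPipeline; NOT the real class."""

    def val(self, global_step: int) -> Dict[str, float]:
        # Fixed placeholder value; a real pipeline would compute this from rollouts.
        return {"val/score/mean": 0.75}


def run_with_validation(
    pipeline: StubPipeline, max_steps: int, eval_steps: int
) -> List[Tuple[int, Dict[str, float]]]:
    """Run a dummy training loop and validate every `eval_steps` steps."""
    history: List[Tuple[int, Dict[str, float]]] = []
    for step in range(max_steps):
        # ... a training step would go here ...
        if step % eval_steps == 0:
            history.append((step, pipeline.val(global_step=step)))
    return history


history = run_with_validation(StubPipeline(), max_steps=10, eval_steps=5)
print([step for step, _ in history])
```

With a zero-based step counter, the modulo check also fires at step 0, so this loop validates at steps 0 and 5. Whether a pre-training validation pass is wanted depends on the setup; starting the counter at 1 would skip it.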

Related Pages

Implements Principle

Requires Environment

Environment Dependencies

This implementation requires the following environment constraints:

Heuristics Applied

No specific heuristics apply to this implementation.
