
Implementation:Facebookresearch Habitat lab HabitatEvaluator evaluate agent

From Leeroopedia
Knowledge Sources
Domains Reinforcement_Learning, Evaluation
Last Updated 2026-02-15 02:00 GMT

Overview

A concrete evaluation loop for RL agents in Habitat environments, provided by habitat-baselines. It computes navigation metrics across episodes, with optional video recording.

Description

The HabitatEvaluator.evaluate_agent method runs a trained agent through evaluation episodes in vectorized environments. It handles batched inference with recurrent hidden states, collects per-episode metrics (success, SPL, distance_to_goal, etc.), supports video recording for visualization, and aggregates statistics across all episodes.
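The overall shape of such a loop can be sketched in plain Python. This is an illustrative sketch only: `MockEnv`, `collect_episodes`, and the metric names are placeholders, not the habitat-baselines API.

```python
import random

# Illustrative sketch: step vectorized environments in lockstep and collect
# one metrics dict per finished episode. MockEnv is a stand-in, not the
# habitat-baselines implementation.

class MockEnv:
    """Toy env: an episode ends after a random number of steps."""
    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.steps_left = self.rng.randint(1, 5)

    def step(self):
        self.steps_left -= 1
        done = self.steps_left == 0
        # Emit per-episode metrics only on episode end, as Habitat does.
        info = {"success": float(self.rng.random() > 0.5)} if done else {}
        return done, info


def collect_episodes(envs, num_episodes):
    """Step all envs together until num_episodes episodes have finished.

    A recurrent policy would also carry hidden states here, zeroing the
    state for an env slot whenever its episode ends.
    """
    episode_metrics = []
    while len(episode_metrics) < num_episodes:
        for env in envs:
            done, info = env.step()
            if done:
                episode_metrics.append(info)
                env.reset()  # immediately start the next episode in this slot
    return episode_metrics[:num_episodes]


metrics = collect_episodes([MockEnv(seed=i) for i in range(4)], num_episodes=10)
print(len(metrics))  # 10
```

The key design point mirrored here is that episode slots are recycled as soon as they finish, so the vectorized batch stays full for the whole evaluation run.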

Usage

Called by `PPOTrainer._eval_checkpoint` after a trained checkpoint is loaded; used both for final evaluation on validation/test splits and for periodic evaluation during training.

Code Reference

Source Location

  • Repository: habitat-lab
  • File: habitat-baselines/habitat_baselines/rl/ppo/habitat_evaluator.py
  • Lines: 39-340

Signature

class HabitatEvaluator(Evaluator):
    def evaluate_agent(
        self,
        agent,
        envs,
        config,
        checkpoint_index,
        step_id,
        writer,
        device,
        obs_transforms,
        env_spec,
        rank0_keys,
    ):
        """
        Evaluate agent across episodes.

        Args:
            agent: Trained policy agent
            envs: Vectorized evaluation environments
            config: Evaluation config
            checkpoint_index: Index of checkpoint being evaluated
            step_id: Training step for logging
            writer: TensorBoard/WandB writer
            device: torch device
            obs_transforms: Observation transforms to apply
            env_spec: Environment specification
            rank0_keys: Keys to log only on rank 0
        """

Import

from habitat_baselines.rl.ppo.habitat_evaluator import HabitatEvaluator

I/O Contract

Inputs

Name | Type | Required | Description
agent | PPO | Yes | Trained policy agent with `actor_critic` attribute
envs | VectorEnv | Yes | Vectorized evaluation environments
config | DictConfig | Yes | Evaluation configuration
checkpoint_index | int | Yes | Index of checkpoint being evaluated
device | torch.device | Yes | Device for inference

Outputs

Name | Type | Description
Metrics | Dict[str, float] | Aggregated metrics: distance_to_goal, success, spl, soft_spl
Videos | .mp4 files | Optional evaluation videos saved to video_dir
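The aggregated metrics are per-episode means. A minimal sketch of that reduction (metric names follow the table above; the values and aggregation logic are illustrative, not taken from habitat-baselines):

```python
# Illustrative: reduce a list of per-episode metric dicts into the
# aggregated Dict[str, float] shape described above. Values are made up.
per_episode = [
    {"distance_to_goal": 0.8, "success": 1.0, "spl": 0.72, "soft_spl": 0.75},
    {"distance_to_goal": 2.4, "success": 0.0, "spl": 0.00, "soft_spl": 0.31},
]

aggregated = {
    key: sum(ep[key] for ep in per_episode) / len(per_episode)
    for key in per_episode[0]
}
print(aggregated["success"])  # 0.5
```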

Usage Examples

Evaluate a Checkpoint

from habitat_baselines.rl.ppo.habitat_evaluator import HabitatEvaluator

evaluator = HabitatEvaluator()
evaluator.evaluate_agent(
    agent=trained_agent,
    envs=eval_envs,
    config=eval_config,
    checkpoint_index=0,
    step_id=1000000,
    writer=tb_writer,
    device=torch.device("cuda"),
    obs_transforms=obs_transforms,
    env_spec=env_spec,
    rank0_keys=set(),
)
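Video recording is enabled through the evaluation config rather than a method argument. A minimal fragment (key paths reflect recent habitat-baselines Hydra configs; verify them against your installed version's defaults):

```yaml
habitat_baselines:
  video_dir: "video_dir"      # directory where .mp4 files are written
  eval:
    video_option: ["disk"]    # leave empty to disable video recording
```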

Related Pages

Implements Principle

Requires Environment
