Implementation:Facebookresearch Habitat lab Evaluator

Knowledge Sources	Facebookresearch_Habitat_lab
Domains	Embodied_AI, Evaluation
Last Updated	2026-02-15 00:00 GMT

Overview

The Evaluator module defines an abstract Evaluator interface for running the evaluation loop on a loaded checkpoint, and provides a pause_envs utility function that removes finished environments from the active set during evaluation.

Description

Evaluator is an abstract base class with a single abstract method evaluate_agent that receives a loaded agent, vectorized environments, configuration, and logging utilities, and runs the evaluation loop. Subclasses implement environment-specific or project-specific evaluation logic.
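To illustrate the abstract-method contract described above, here is a minimal sketch that uses only the standard library. ToyEvaluator and CountingEvaluator are hypothetical names, and the two-argument evaluate_agent is a deliberate simplification of the real signature listed under Code Reference.

```python
import abc


class ToyEvaluator(abc.ABC):
    """Hypothetical stand-in for Evaluator with a reduced signature."""

    @abc.abstractmethod
    def evaluate_agent(self, agent, envs) -> None:
        """Subclasses run their evaluation loop here."""


class CountingEvaluator(ToyEvaluator):
    """Concrete subclass: implementing the abstract method makes it instantiable."""

    def __init__(self) -> None:
        self.calls = 0

    def evaluate_agent(self, agent, envs) -> None:
        # A real subclass would step the environments; this one only counts calls.
        self.calls += 1


# The abstract class itself cannot be instantiated.
try:
    ToyEvaluator()
    instantiated = True
except TypeError:
    instantiated = False

evaluator = CountingEvaluator()
evaluator.evaluate_agent(agent=None, envs=[])
```

The same rule applies to the real class: a subclass that does not override evaluate_agent raises TypeError on instantiation.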

The pause_envs function handles the bookkeeping needed when some environments finish their episodes during evaluation. It pauses the specified environments, then re-indexes the recurrent hidden states, masks, rewards, previous actions, observation batch, and RGB frame lists so that entries for the paused environments are dropped and the remaining active environments stay contiguous. Positions of the remaining environments can therefore shift after each call.
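The re-indexing step can be pictured with plain Python lists. The sketch below is not the library's implementation, which operates on tensors and a VectorEnv; drop_paused and its arguments are hypothetical names chosen for illustration.

```python
from typing import Any, Dict, List, Tuple


def drop_paused(
    envs_to_pause: List[int],
    num_envs: int,
    per_env_state: Dict[str, List[Any]],
) -> Tuple[List[int], Dict[str, List[Any]]]:
    """Hypothetical pure-Python analogue of the re-indexing bookkeeping."""
    paused = set(envs_to_pause)
    # Indices of environments that stay active, in their original order.
    keep = [i for i in range(num_envs) if i not in paused]
    # Apply the same index selection to every per-environment sequence.
    reindexed = {
        name: [values[i] for i in keep]
        for name, values in per_env_state.items()
    }
    return keep, reindexed


state = {
    "current_episode_reward": [0.5, 1.0, 0.0, 2.5],
    "prev_actions": [3, 1, 0, 2],
}
keep, state = drop_paused([1, 3], num_envs=4, per_env_state=state)
```

After the call, the environment that was at position 2 sits at position 1, which is why every per-environment structure has to be re-indexed together.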

Usage

Subclass Evaluator to implement custom evaluation loops. Use pause_envs within evaluation loops to handle environments that complete their episodes before others.
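As a rough illustration of why pausing matters when episodes have different lengths, the following toy loop runs until every environment has finished. ToyVectorEnv is a hypothetical stand-in, not the Habitat VectorEnv; only the num_envs and pause_at names mirror the real class, and the reward of 1.0 per step is made up for the example.

```python
from typing import Dict, List


class ToyVectorEnv:
    """Hypothetical vectorized env whose episodes end after fixed step counts."""

    def __init__(self, episode_lengths: List[int]) -> None:
        self._remaining = list(episode_lengths)

    @property
    def num_envs(self) -> int:
        return len(self._remaining)

    def step(self) -> List[bool]:
        # Advance every active env by one step and report which ones finished.
        self._remaining = [r - 1 for r in self._remaining]
        return [r == 0 for r in self._remaining]

    def pause_at(self, index: int) -> None:
        # Remove the env at this position from the active set.
        self._remaining.pop(index)


envs = ToyVectorEnv([2, 1, 3])
env_ids = [0, 1, 2]               # original identity of each active slot
episode_reward = [0.0, 0.0, 0.0]  # accumulated reward per active slot
finished: Dict[int, float] = {}

while envs.num_envs > 0:
    dones = envs.step()
    episode_reward = [r + 1.0 for r in episode_reward]
    envs_to_pause = [i for i, done in enumerate(dones) if done]
    for i in envs_to_pause:
        finished[env_ids[i]] = episode_reward[i]
    # Pause from the highest index down so lower indices stay valid.
    for i in reversed(envs_to_pause):
        envs.pause_at(i)
    keep = [i for i in range(len(dones)) if i not in envs_to_pause]
    env_ids = [env_ids[i] for i in keep]
    episode_reward = [episode_reward[i] for i in keep]
```

Tracking env_ids alongside the other per-environment state is one way to map results back to the original environments once positions have shifted.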

Code Reference

Source Location

Repository: Facebookresearch_Habitat_lab
File: habitat-baselines/habitat_baselines/rl/ppo/evaluator.py
Lines: 22-105

Signature

class Evaluator(abc.ABC):
    @abc.abstractmethod
    def evaluate_agent(
        self,
        agent: AgentAccessMgr,
        envs: VectorEnv,
        config: "DictConfig",
        checkpoint_index: int,
        step_id: int,
        writer: TensorboardWriter,
        device: torch.device,
        obs_transforms: List[ObservationTransformer],
        env_spec: EnvironmentSpec,
        rank0_keys: Set[str],
    ) -> None:

def pause_envs(
    envs_to_pause: List[int],
    envs: VectorEnv,
    test_recurrent_hidden_states: Tensor,
    not_done_masks: Tensor,
    current_episode_reward: Tensor,
    prev_actions: Tensor,
    batch: Dict[str, Tensor],
    rgb_frames: Union[List[List[Any]], List[List[ndarray]]],
) -> Tuple[VectorEnv, Tensor, Tensor, Tensor, Tensor, Dict[str, Tensor], List[List[Any]]]:

Import

from habitat_baselines.rl.ppo.evaluator import Evaluator, pause_envs

I/O Contract

Inputs (evaluate_agent)

Name	Type	Required	Description
agent	AgentAccessMgr	Yes	Loaded agent with policy to evaluate
envs	VectorEnv	Yes	Vectorized environments for evaluation
config	DictConfig	Yes	Evaluation configuration
checkpoint_index	int	Yes	ID of the checkpoint for logging
step_id	int	Yes	Training step of the checkpoint for logging
writer	TensorboardWriter	Yes	Logger for recording evaluation metrics
device	torch.device	Yes	PyTorch device for evaluation
obs_transforms	List[ObservationTransformer]	Yes	Observation transformations for the policy
env_spec	EnvironmentSpec	Yes	Environment action/observation space specifications
rank0_keys	Set[str]	Yes	Info dict keys that should only be recorded on rank 0

Inputs (pause_envs)

Name	Type	Required	Description
envs_to_pause	List[int]	Yes	Indices of environments to pause
envs	VectorEnv	Yes	The vectorized environments
test_recurrent_hidden_states	Tensor	Yes	RNN hidden states for all environments
not_done_masks	Tensor	Yes	Mask tensor indicating active episodes
current_episode_reward	Tensor	Yes	Accumulated rewards per environment
prev_actions	Tensor	Yes	Previous action tensor
batch	Dict[str, Tensor]	Yes	Current observation batch dictionary
rgb_frames	Union[List[List[Any]], List[List[ndarray]]]	Yes	Collected RGB frames per environment

Outputs (pause_envs)

Name	Type	Description
envs	VectorEnv	Updated vectorized environments with paused entries removed
test_recurrent_hidden_states	Tensor	Re-indexed hidden states
not_done_masks	Tensor	Re-indexed masks
current_episode_reward	Tensor	Re-indexed rewards
prev_actions	Tensor	Re-indexed previous actions
batch	Dict[str, Tensor]	Re-indexed observation batch
rgb_frames	List[List[Any]]	Re-indexed frame lists

Usage Examples

Basic Usage

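from habitat_baselines.rl.ppo.evaluator import Evaluator, pause_envs
from habitat_baselines.utils.common import batch_obs

class MyEvaluator(Evaluator):
    def evaluate_agent(
        self, agent, envs, config, checkpoint_index, step_id,
        writer, device, obs_transforms, env_spec, rank0_keys,
    ):
        # Simplified sketch: initialization and per-step updates of
        # hidden_states, masks, rewards, prev_actions, and rgb_frames
        # (one entry per environment) are project-specific and omitted here.
        batch = batch_obs(envs.reset(), device=device)

        # Run the evaluation loop until every environment has been paused
        while envs.num_envs > 0:
            # Assumes act() returns an object exposing env_actions and
            # rnn_hidden_states, and that actions are discrete; adapt as needed.
            action_data = agent.actor_critic.act(
                batch, hidden_states, prev_actions, masks,
            )
            hidden_states = action_data.rnn_hidden_states
            outputs = envs.step([a.item() for a in action_data.env_actions.cpu()])
            # Each output entry is (observation, reward, done, info)
            observations, step_rewards, dones, infos = [
                list(x) for x in zip(*outputs)
            ]
            batch = batch_obs(observations, device=device)

            # Pause finished environments and drop their per-environment state
            envs_to_pause = [i for i, done in enumerate(dones) if done]
            (envs, hidden_states, masks, rewards,
             prev_actions, batch, rgb_frames) = pause_envs(
                envs_to_pause, envs, hidden_states, masks,
                rewards, prev_actions, batch, rgb_frames,
            )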
