

Implementation:Facebookresearch Habitat lab Evaluator

From Leeroopedia
Revision as of 12:34, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Facebookresearch_Habitat_lab_Evaluator.md)
Knowledge Sources
Domains Embodied_AI, Evaluation
Last Updated 2026-02-15 00:00 GMT

Overview

The Evaluator module defines an abstract Evaluator interface for running evaluation loops over checkpoints and provides a pause_envs utility function for pausing completed environments during evaluation.

Description

Evaluator is an abstract base class with a single abstract method evaluate_agent that receives a loaded agent, vectorized environments, configuration, and logging utilities, and runs the evaluation loop. Subclasses implement environment-specific or project-specific evaluation logic.
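
The abstract-method contract described above can be illustrated with a minimal, self-contained sketch. `EvaluatorSketch`, `PrintingEvaluator`, and `can_instantiate` are hypothetical stand-ins, not names from habitat_baselines, and the argument list is shortened to two parameters for brevity.

```python
import abc


class EvaluatorSketch(abc.ABC):
    """Hypothetical stand-in for the Evaluator interface (shortened signature)."""

    @abc.abstractmethod
    def evaluate_agent(self, agent, envs) -> None:
        """Run the evaluation loop; subclasses must override this."""


class PrintingEvaluator(EvaluatorSketch):
    """Toy subclass that only records which agent it was asked to evaluate."""

    def __init__(self):
        self.evaluated = []

    def evaluate_agent(self, agent, envs) -> None:
        self.evaluated.append(agent)


def can_instantiate(cls) -> bool:
    """Return True if cls can be constructed, False if abc blocks it."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

Because `evaluate_agent` is declared with `abc.abstractmethod`, Python refuses to construct the base class directly, while a subclass that overrides the method can be constructed normally.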

The pause_envs function handles the bookkeeping needed when some environments finish their episodes during evaluation. It pauses the specified environments and re-indexes the recurrent hidden states, masks, rewards, previous actions, observation batches, and RGB frame lists so that they exclude the paused environments and the entries for the remaining active environments stay contiguous.
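
The re-indexing idea can be sketched with plain Python lists standing in for tensors. This is an illustration of the bookkeeping only, not the library's implementation; `drop_paused` and the sample state values are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def drop_paused(
    envs_to_pause: List[int],
    num_envs: int,
    per_env_state: Dict[str, List[Any]],
) -> Tuple[List[int], Dict[str, List[Any]]]:
    """Keep only the entries of each per-environment list that stay active."""
    paused = set(envs_to_pause)
    # Indices of the environments that remain active, in their original order.
    keep = [i for i in range(num_envs) if i not in paused]
    reindexed = {
        name: [values[i] for i in keep]
        for name, values in per_env_state.items()
    }
    return keep, reindexed


# Four environments; environments 1 and 3 have finished and are dropped.
state = {
    "current_episode_reward": [0.5, 1.0, 0.0, 2.5],
    "prev_actions": [3, 1, 0, 2],
}
keep, state = drop_paused([1, 3], num_envs=4, per_env_state=state)
```

After the call, every per-environment list has one entry per remaining environment, so position `i` in each list again refers to the same active environment.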

Usage

Subclass Evaluator to implement custom evaluation loops. Use pause_envs within evaluation loops to handle environments that complete their episodes before others.

Code Reference

Source Location

Signature

class Evaluator(abc.ABC):
    @abc.abstractmethod
    def evaluate_agent(
        self,
        agent: AgentAccessMgr,
        envs: VectorEnv,
        config: "DictConfig",
        checkpoint_index: int,
        step_id: int,
        writer: TensorboardWriter,
        device: torch.device,
        obs_transforms: List[ObservationTransformer],
        env_spec: EnvironmentSpec,
        rank0_keys: Set[str],
    ) -> None:

def pause_envs(
    envs_to_pause: List[int],
    envs: VectorEnv,
    test_recurrent_hidden_states: Tensor,
    not_done_masks: Tensor,
    current_episode_reward: Tensor,
    prev_actions: Tensor,
    batch: Dict[str, Tensor],
    rgb_frames: Union[List[List[Any]], List[List[ndarray]]],
) -> Tuple[VectorEnv, Tensor, Tensor, Tensor, Tensor, Dict[str, Tensor], List[List[Any]]]:

Import

from habitat_baselines.rl.ppo.evaluator import Evaluator, pause_envs

I/O Contract

Inputs (evaluate_agent)

Name | Type | Required | Description
agent | AgentAccessMgr | Yes | Loaded agent with policy to evaluate
envs | VectorEnv | Yes | Vectorized environments for evaluation
config | DictConfig | Yes | Evaluation configuration
checkpoint_index | int | Yes | ID of the checkpoint for logging
step_id | int | Yes | Training step of the checkpoint for logging
writer | TensorboardWriter | Yes | Logger for recording evaluation metrics
device | torch.device | Yes | PyTorch device for evaluation
obs_transforms | List[ObservationTransformer] | Yes | Observation transformations for the policy
env_spec | EnvironmentSpec | Yes | Environment action/observation space specifications
rank0_keys | Set[str] | Yes | Info dict keys that should only be recorded on rank 0

Inputs (pause_envs)

Name | Type | Required | Description
envs_to_pause | List[int] | Yes | Indices of environments to pause
envs | VectorEnv | Yes | The vectorized environments
test_recurrent_hidden_states | Tensor | Yes | RNN hidden states for all environments
not_done_masks | Tensor | Yes | Mask tensor indicating active episodes
current_episode_reward | Tensor | Yes | Accumulated rewards per environment
prev_actions | Tensor | Yes | Previous action tensor
batch | Dict[str, Tensor] | Yes | Current observation batch dictionary
rgb_frames | List[List] | Yes | Collected RGB frames per environment

Outputs (pause_envs)

Name | Type | Description
envs | VectorEnv | Updated vectorized environments with paused entries removed
test_recurrent_hidden_states | Tensor | Re-indexed hidden states
not_done_masks | Tensor | Re-indexed masks
current_episode_reward | Tensor | Re-indexed rewards
prev_actions | Tensor | Re-indexed previous actions
batch | Dict[str, Tensor] | Re-indexed observation batch
rgb_frames | List[List] | Re-indexed frame lists

Usage Examples

Basic Usage

from habitat_baselines.rl.ppo.evaluator import Evaluator, pause_envs

class MyEvaluator(Evaluator):
    def evaluate_agent(
        self, agent, envs, config, checkpoint_index, step_id,
        writer, device, obs_transforms, env_spec, rank0_keys,
    ):
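        observations = envs.reset()
        # Skeleton only: hidden_states, masks, rewards, prev_actions, batch,
        # and rgb_frames hold one entry per environment and must be
        # initialized here before the loop (setup omitted).
        while envs.num_envs > 0:
            # Simplified call; a real policy typically also needs the
            # recurrent state, previous actions, and masks.
            actions = agent.actor_critic.act(observations)
            outputs = envs.step(actions)
            # Assumes each output is an (observation, reward, done, info) tuple.
            observations, step_rewards, dones, infos = [
                list(x) for x in zip(*outputs)
            ]

            # Pause finished environments
            envs_to_pause = [i for i, done in enumerate(dones) if done]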
            (envs, hidden_states, masks, rewards,
             prev_actions, batch, rgb_frames) = pause_envs(
                envs_to_pause, envs, hidden_states, masks,
                rewards, prev_actions, batch, rgb_frames,
            )
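
One detail of the loop above is that, once an environment is paused, list positions refer to slots among the active environments rather than to the original environment numbers. The following runnable toy, which needs no simulator, sketches one way to track that. `ToyVectorEnv` and `run_toy_evaluation` are hypothetical stand-ins, not Habitat classes, and the episode lengths are assumed to be positive integers.

```python
from typing import Dict, List


class ToyVectorEnv:
    """Hypothetical stand-in for a vectorized environment; not the Habitat class."""

    def __init__(self, episode_lengths: List[int]):
        # Steps left before each active environment finishes its episode.
        self._remaining = list(episode_lengths)

    @property
    def num_envs(self) -> int:
        return len(self._remaining)

    def step(self) -> List[bool]:
        """Advance every active environment; return one done flag per slot."""
        self._remaining = [r - 1 for r in self._remaining]
        return [r == 0 for r in self._remaining]

    def pause_at(self, index: int) -> None:
        """Remove the environment at the given position from the active set."""
        self._remaining.pop(index)


def run_toy_evaluation(episode_lengths: List[int]) -> Dict[int, int]:
    """Return how many steps each original environment ran before pausing."""
    envs = ToyVectorEnv(episode_lengths)
    env_ids = list(range(envs.num_envs))  # original ID of each active slot
    steps_taken = [0] * envs.num_envs
    finished: Dict[int, int] = {}
    while envs.num_envs > 0:
        dones = envs.step()
        steps_taken = [s + 1 for s in steps_taken]
        envs_to_pause = [i for i, done in enumerate(dones) if done]
        # Pause from the highest index down so earlier positions stay valid.
        for i in reversed(envs_to_pause):
            finished[env_ids[i]] = steps_taken[i]
            envs.pause_at(i)
            env_ids.pop(i)
            steps_taken.pop(i)
    return finished
```

Keeping `env_ids` in step with the per-slot state is the same kind of bookkeeping that pause_envs performs for the tensors and frame lists it receives.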
