Implementation:Facebookresearch Habitat lab Evaluator
| Knowledge Sources | |
|---|---|
| Domains | Embodied_AI, Evaluation |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
The Evaluator module defines an abstract Evaluator interface for running evaluation loops over checkpoints and provides a pause_envs utility function for pausing completed environments during evaluation.
Description
Evaluator is an abstract base class with a single abstract method evaluate_agent that receives a loaded agent, vectorized environments, configuration, and logging utilities, and runs the evaluation loop. Subclasses implement environment-specific or project-specific evaluation logic.
The pause_envs function handles the bookkeeping needed when some environments finish their episodes during evaluation. It pauses the specified environments and drops their entries from the recurrent hidden states, masks, rewards, previous actions, observation batches, and RGB frame lists, so that every per-environment structure stays contiguous and aligned with the environments that remain active.
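The re-indexing idea can be illustrated without Habitat or PyTorch. The following is a minimal sketch, assuming plain Python lists in place of tensors; `surviving_indices` and `reindex_state` are hypothetical helper names introduced here for illustration and are not part of habitat-baselines.

```python
from typing import Any, Dict, List, Sequence


def surviving_indices(num_envs: int, envs_to_pause: Sequence[int]) -> List[int]:
    """Return the indices of the environments that stay active, in order."""
    paused = set(envs_to_pause)
    return [i for i in range(num_envs) if i not in paused]


def reindex_state(
    envs_to_pause: Sequence[int],
    per_env_state: Dict[str, List[Any]],
) -> Dict[str, List[Any]]:
    """Drop the paused entries from every per-environment list.

    Assumption: every value in per_env_state has one entry per environment.
    """
    num_envs = len(next(iter(per_env_state.values())))
    keep = surviving_indices(num_envs, envs_to_pause)
    return {
        name: [values[i] for i in keep]
        for name, values in per_env_state.items()
    }


# Four environments; environments 1 and 3 have finished.
state = {
    "current_episode_reward": [1.5, 0.0, 2.0, 0.5],
    "prev_actions": [3, 1, 0, 2],
    "rgb_frames": [["f0"], ["f1"], ["f2"], ["f3"]],
}
new_state = reindex_state([1, 3], state)
print(new_state["current_episode_reward"])  # [1.5, 2.0]
```

Computing one list of surviving indices and applying it to every structure keeps position i in each structure referring to the same environment after the pause.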
Usage
Subclass Evaluator to implement custom evaluation loops. Use pause_envs within evaluation loops to handle environments that complete their episodes before others.
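The subclassing contract follows the standard `abc` pattern. The sketch below uses a hypothetical stand-in class, `EvaluatorSketch`, that mirrors the shape of the interface so it runs without Habitat installed; `NoOpEvaluator` and its `last_checkpoint` attribute are likewise illustrative assumptions.

```python
import abc


class EvaluatorSketch(abc.ABC):
    """Hypothetical stand-in with the same method shape as Evaluator."""

    @abc.abstractmethod
    def evaluate_agent(self, agent, envs, config, checkpoint_index, step_id,
                       writer, device, obs_transforms, env_spec,
                       rank0_keys) -> None:
        ...


class NoOpEvaluator(EvaluatorSketch):
    """Concrete subclass: overriding evaluate_agent makes it instantiable."""

    def evaluate_agent(self, agent, envs, config, checkpoint_index, step_id,
                       writer, device, obs_transforms, env_spec,
                       rank0_keys) -> None:
        # A real evaluator would run the evaluation loop here.
        self.last_checkpoint = checkpoint_index


# The abstract base cannot be instantiated directly.
try:
    EvaluatorSketch()
    base_is_instantiable = True
except TypeError:
    base_is_instantiable = False

evaluator = NoOpEvaluator()
evaluator.evaluate_agent(None, None, None, 7, 1000, None, None, [], None, set())
```

A subclass that omits `evaluate_agent` fails at instantiation time with `TypeError`, which surfaces an incomplete evaluator before any evaluation starts.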
Code Reference
Source Location
- Repository: Facebookresearch_Habitat_lab
- File: habitat-baselines/habitat_baselines/rl/ppo/evaluator.py
- Lines: 22-105
Signature
class Evaluator(abc.ABC):
    @abc.abstractmethod
    def evaluate_agent(
        self,
        agent: AgentAccessMgr,
        envs: VectorEnv,
        config: "DictConfig",
        checkpoint_index: int,
        step_id: int,
        writer: TensorboardWriter,
        device: torch.device,
        obs_transforms: List[ObservationTransformer],
        env_spec: EnvironmentSpec,
        rank0_keys: Set[str],
    ) -> None:
        ...

def pause_envs(
    envs_to_pause: List[int],
    envs: VectorEnv,
    test_recurrent_hidden_states: Tensor,
    not_done_masks: Tensor,
    current_episode_reward: Tensor,
    prev_actions: Tensor,
    batch: Dict[str, Tensor],
    rgb_frames: Union[List[List[Any]], List[List[ndarray]]],
) -> Tuple[
    VectorEnv, Tensor, Tensor, Tensor, Tensor,
    Dict[str, Tensor], List[List[Any]],
]:
    ...
Import
from habitat_baselines.rl.ppo.evaluator import Evaluator, pause_envs
I/O Contract
Inputs (evaluate_agent)
| Name | Type | Required | Description |
|---|---|---|---|
| agent | AgentAccessMgr | Yes | Loaded agent with policy to evaluate |
| envs | VectorEnv | Yes | Vectorized environments for evaluation |
| config | DictConfig | Yes | Evaluation configuration |
| checkpoint_index | int | Yes | ID of the checkpoint for logging |
| step_id | int | Yes | Training step of the checkpoint for logging |
| writer | TensorboardWriter | Yes | Logger for recording evaluation metrics |
| device | torch.device | Yes | PyTorch device for evaluation |
| obs_transforms | List[ObservationTransformer] | Yes | Observation transformations for the policy |
| env_spec | EnvironmentSpec | Yes | Environment action/observation space specifications |
| rank0_keys | Set[str] | Yes | Info dict keys that should only be recorded on rank 0 |
Inputs (pause_envs)
| Name | Type | Required | Description |
|---|---|---|---|
| envs_to_pause | List[int] | Yes | Indices of environments to pause |
| envs | VectorEnv | Yes | The vectorized environments |
| test_recurrent_hidden_states | Tensor | Yes | RNN hidden states for all environments |
| not_done_masks | Tensor | Yes | Mask tensor indicating active episodes |
| current_episode_reward | Tensor | Yes | Accumulated rewards per environment |
| prev_actions | Tensor | Yes | Previous action tensor |
| batch | Dict[str, Tensor] | Yes | Current observation batch dictionary |
| rgb_frames | List[List] | Yes | Collected RGB frames per environment |
Outputs (pause_envs)
| Name | Type | Description |
|---|---|---|
| envs | VectorEnv | Updated vectorized environments with paused entries removed |
| test_recurrent_hidden_states | Tensor | Re-indexed hidden states |
| not_done_masks | Tensor | Re-indexed masks |
| current_episode_reward | Tensor | Re-indexed rewards |
| prev_actions | Tensor | Re-indexed previous actions |
| batch | Dict[str, Tensor] | Re-indexed observation batch |
| rgb_frames | List[List] | Re-indexed frame lists |
Usage Examples
Basic Usage
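from habitat_baselines.rl.ppo.evaluator import Evaluator, pause_envs


class MyEvaluator(Evaluator):
    def evaluate_agent(
        self, agent, envs, config, checkpoint_index, step_id,
        writer, device, obs_transforms, env_spec, rank0_keys,
    ):
        observations = envs.reset()
        # Per-environment state (hidden_states, masks, rewards, prev_actions,
        # batch, rgb_frames) must be initialized here with one entry per
        # environment; that setup is omitted from this outline.

        # Run the evaluation loop until every environment has been paused.
        while envs.num_envs > 0:
            # Simplified policy call; the actual arguments depend on the policy.
            actions = agent.actor_critic.act(observations)
            outputs = envs.step(actions)
            # Assumes each output is an (observation, reward, done, info) tuple.
            observations, step_rewards, dones, infos = [
                list(x) for x in zip(*outputs)
            ]

            # Pause environments whose episode has finished.
            envs_to_pause = [i for i, done in enumerate(dones) if done]
            (envs, hidden_states, masks, rewards,
             prev_actions, batch, rgb_frames) = pause_envs(
                envs_to_pause, envs, hidden_states, masks,
                rewards, prev_actions, batch, rgb_frames,
            )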
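To see why the `while envs.num_envs > 0` loop terminates, the control flow can be run against a toy environment. This is a sketch under stated assumptions: `ToyVectorEnv` is a hypothetical stand-in that borrows only the `num_envs` and `pause_at` names from the vectorized environment, its `step` takes no actions and returns only done flags, and `run_until_all_paused` is an illustrative helper rather than library code.

```python
from typing import List


class ToyVectorEnv:
    """Hypothetical stand-in for a vectorized environment."""

    def __init__(self, episode_lengths: List[int]) -> None:
        # Steps left before each active environment finishes its episode.
        self._remaining = list(episode_lengths)

    @property
    def num_envs(self) -> int:
        return len(self._remaining)

    def step(self) -> List[bool]:
        """Advance every active environment; return one done flag each."""
        self._remaining = [r - 1 for r in self._remaining]
        return [r == 0 for r in self._remaining]

    def pause_at(self, index: int) -> None:
        """Remove one environment from the active set."""
        del self._remaining[index]


def run_until_all_paused(envs: ToyVectorEnv, env_ids: List[int]) -> List[int]:
    """Step until no environment is active; return ids in finishing order."""
    finish_order = []
    while envs.num_envs > 0:
        dones = envs.step()
        envs_to_pause = [i for i, done in enumerate(dones) if done]
        # Pause from the highest index down so lower indices stay valid.
        for i in reversed(envs_to_pause):
            envs.pause_at(i)
            finish_order.append(env_ids.pop(i))
    return finish_order


order = run_until_all_paused(ToyVectorEnv([3, 1, 2]), env_ids=[0, 1, 2])
print(order)  # [1, 2, 0]
```

Each pause shrinks the active set, so the loop ends once the longest episode completes. The `env_ids` list plays the role of the per-environment state that must be re-indexed together with the environments.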