Principle:Facebookresearch Habitat lab Checkpointing and Evaluation

From Leeroopedia
Knowledge Sources
Domains Reinforcement_Learning, Evaluation
Last Updated 2026-02-15 02:00 GMT

Overview

Systematic evaluation of trained navigation agents across held-out episodes, computing standard embodied AI metrics such as Success, SPL, and Distance-to-Goal.

Description

Checkpointing and Evaluation is the process of saving trained policy checkpoints during training and later evaluating them on a fixed set of episodes. Evaluation runs the agent in inference mode (no gradient computation) across episodes, collecting per-episode metrics and aggregating them into summary statistics.

Key evaluation metrics in embodied navigation:

  • Success: Binary indicator of whether the agent reached within a threshold distance of the goal
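  • SPL (Success weighted by Path Length): Success weighted by the ratio of the shortest path length to the larger of the shortest and actual path lengths, so longer-than-optimal paths earn less credit
  • Distance to Goal: Geodesic (shortest-path) distance from agent to goal at episode termination
  • Soft SPL: Continuous relaxation of SPL that replaces the binary success term with fractional progress toward the goal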
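
The metrics listed above can be sketched for a single episode as follows. This is a minimal illustration, not Habitat-Lab's implementation: the `EpisodeOutcome` record, its field names, and the 0.2 default threshold are hypothetical, and the sketch assumes that success additionally requires the agent to have ended the episode itself.

```python
from dataclasses import dataclass


@dataclass
class EpisodeOutcome:
    """Hypothetical per-episode record; field names are illustrative."""
    start_distance: float  # shortest-path distance from start to goal
    final_distance: float  # distance to goal when the episode ended
    path_length: float     # length of the path the agent actually took
    called_stop: bool      # whether the agent ended the episode itself


def episode_metrics(ep: EpisodeOutcome, success_distance: float = 0.2) -> dict:
    # Success: the agent stopped within the threshold distance of the goal.
    success = float(ep.called_stop and ep.final_distance <= success_distance)

    # Path-efficiency term shared by SPL and Soft SPL.
    denom = max(ep.path_length, ep.start_distance)
    efficiency = ep.start_distance / denom if denom > 0 else 0.0

    # Soft SPL swaps binary success for fractional progress toward the goal.
    if ep.start_distance > 0:
        progress = max(0.0, 1.0 - ep.final_distance / ep.start_distance)
    else:
        progress = 0.0

    return {
        "success": success,
        "spl": success * efficiency,
        "soft_spl": progress * efficiency,
        "distance_to_goal": ep.final_distance,
    }
```

An agent that stops halfway to the goal along an optimal path scores 0 on Success and SPL but 0.5 on Soft SPL, which is why Soft SPL gives a more graded signal early in training.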

Usage

Use this after training is complete (or at regular intervals during training) to measure agent performance. Standard practice evaluates on all episodes in the validation or test split.
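
When several checkpoints are evaluated in sequence, they should be visited in training order. The sketch below assumes checkpoint files are named like `ckpt.<index>.pth`; that pattern and the helper name `sort_checkpoints` are assumptions for illustration and should be adapted to the naming scheme the training run actually used.

```python
import re
from pathlib import Path

# Assumed naming scheme: ckpt.<index>.pth
_CKPT_PATTERN = re.compile(r"^ckpt\.(\d+)\.pth$")


def sort_checkpoints(paths):
    """Return the checkpoint paths ordered by numeric index.

    Entries whose file name does not match the assumed pattern are skipped.
    Sorting numerically avoids the lexicographic trap where "ckpt.10.pth"
    would come before "ckpt.2.pth".
    """
    indexed = []
    for path in paths:
        match = _CKPT_PATTERN.match(Path(path).name)
        if match:
            indexed.append((int(match.group(1)), path))
    indexed.sort(key=lambda pair: pair[0])
    return [path for _, path in indexed]


# Example with plain file names; a real run might pass Path(folder).iterdir().
ordered = sort_checkpoints(["ckpt.10.pth", "notes.txt", "ckpt.2.pth"])
```

Each path in `ordered` can then be loaded and passed through the same evaluation procedure, giving one set of aggregated metrics per checkpoint.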

Theoretical Basis

SPL metric definition (Anderson et al., 2018):

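$$\mathrm{SPL} = \frac{1}{N} \sum_{i=1}^{N} S_i \, \frac{l_i}{\max(p_i, l_i)}$$

Where $S_i$ is the binary success indicator for episode $i$, $l_i$ is the shortest path length from start to goal, $p_i$ is the agent's actual path length, and $N$ is the number of evaluation episodes.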
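
As a worked check of the definition above, the following sketch averages the per-episode term over three made-up episodes. The function name `spl` and the numbers are illustrative only, and every shortest path length is assumed to be positive.

```python
def spl(successes, shortest_lengths, path_lengths):
    """Mean over episodes of S_i * l_i / max(p_i, l_i)."""
    if not (len(successes) == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have one entry per episode")
    if not successes:
        raise ValueError("at least one episode is required")
    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, path_lengths):
        total += s * l / max(p, l)
    return total / len(successes)


# Three illustrative episodes (made-up numbers):
#   success on the optimal path -> 1 * 10 / max(10, 10) = 1.0
#   success with a 2x detour    -> 1 * 10 / max(20, 10) = 0.5
#   failure                     -> 0
example = spl([1, 1, 0], [10.0, 10.0, 10.0], [10.0, 20.0, 5.0])
print(example)  # 0.5
```

The `max` in the denominator caps each episode's contribution at 1, so a measured path that comes out shorter than the reference shortest path cannot push SPL above the success rate.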

Evaluation loop pseudo-code:

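# Abstract evaluation process; env and agent are assumed to expose a
# Gym-style reset/step/act interface, and evaluation_episodes is non-empty.
from statistics import mean

metrics = []
for episode in evaluation_episodes:
    observation = env.reset()
    agent.reset()
    done = False  # reset the termination flag at the start of every episode
    while not done:
        action = agent.act(observation)
        observation, reward, done, info = env.step(action)
    metrics.append(info["metrics"])  # metrics reported at the final step
metric_keys = metrics[0].keys()
aggregated = {k: mean(m[k] for m in metrics) for k in metric_keys}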
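
To see the loop above run end to end, here is a self-contained toy version. `CorridorEnv` and `ScriptedAgent` are hypothetical stand-ins invented for this sketch, not Habitat-Lab classes; the environment is a 1-D corridor that reports success and SPL in `info["metrics"]`.

```python
from statistics import mean


class CorridorEnv:
    """Toy 1-D environment (hypothetical): walk from position 0 to `goal`."""

    def __init__(self, goal=3, max_steps=10):
        self.goal, self.max_steps = goal, max_steps

    def reset(self):
        self.pos, self.steps = 0, 0
        return self.pos

    def step(self, action):
        # action is +1 (forward) or -1 (backward); each step has length 1
        self.pos += action
        self.steps += 1
        reached = self.pos == self.goal
        done = reached or self.steps >= self.max_steps
        info = {"metrics": {
            "success": float(reached),
            "spl": float(reached) * self.goal / max(self.steps, self.goal),
        }}
        return self.pos, 0.0, done, info


class ScriptedAgent:
    """Takes one step backward at the start, then always moves forward."""

    def reset(self):
        self.t = 0

    def act(self, observation):
        self.t += 1
        return -1 if self.t == 1 else 1


def evaluate(env, agent, num_episodes):
    """Run `num_episodes` (at least one) and average the final-step metrics."""
    metrics = []
    for _ in range(num_episodes):
        observation = env.reset()
        agent.reset()
        done = False
        while not done:
            action = agent.act(observation)
            observation, reward, done, info = env.step(action)
        metrics.append(info["metrics"])
    return {k: mean(m[k] for m in metrics) for k in metrics[0]}


results = evaluate(CorridorEnv(goal=3), ScriptedAgent(), num_episodes=4)
```

The scripted agent needs five steps to cover a shortest path of three, so every episode succeeds with an SPL of 3/5 and the aggregated `results` report a success of 1.0 and an SPL of 0.6.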

Related Pages

Implemented By

Uses Heuristic
