Implementation:Facebookresearch Habitat lab IL Metrics

From Leeroopedia
Revision as of 12:35, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Facebookresearch_Habitat_lab_IL_Metrics.md)
Knowledge Sources
Domains Embodied_AI, Imitation_Learning
Last Updated 2026-02-15 00:00 GMT

Overview

The IL Metrics module provides Metric, VqaMetric, and NavMetric classes for tracking, averaging, and logging training and evaluation metrics in imitation learning pipelines for Embodied Question Answering.

Description

The Metric base class tracks a list of named metrics. For each metric it keeps three statistics: the cumulative mean (index 0), an exponential moving average with decay 0.95 (index 1), and the most recent value (index 2). The update method takes a list of values, one per metric name and in the same order, and refreshes all three statistics. get_stat_string returns a formatted string of the metric values, and get_stats returns the current values for a given mode, where mode is the statistic index (for example, mode=1 selects the exponential moving average). dump_log writes the full history of statistics to a JSON file when a log path is configured.
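The three statistics described above can be illustrated with a minimal sketch. It is not the library's code: RunningStats is a hypothetical stand-in for a single tracked metric, and seeding all three statistics with the first observation is an assumption.

```python
from typing import List, Optional

EMA_DECAY = 0.95  # decay stated in the description above


class RunningStats:
    """Hypothetical stand-in for one tracked metric (not the library class)."""

    def __init__(self) -> None:
        self.num_updates = 0
        # [cumulative mean, exponential moving average, latest value]
        self.stats: List[Optional[float]] = [None, None, None]

    def update(self, value: float) -> None:
        self.num_updates += 1
        if self.stats[0] is None:
            # Assumption: the first observation seeds all three statistics.
            self.stats = [value, value, value]
            return
        mean, ema, _ = self.stats
        n = self.num_updates
        # Incremental form of the arithmetic mean over all n observations.
        new_mean = (mean * (n - 1) + value) / n
        # Old EMA weighted by the decay, new value by (1 - decay).
        new_ema = EMA_DECAY * ema + (1.0 - EMA_DECAY) * value
        self.stats = [new_mean, new_ema, value]


tracker = RunningStats()
for v in [1.0, 0.0]:
    tracker.update(v)
cumulative_mean, ema, latest = tracker.stats
```

With a decay of 0.95 the moving average reacts slowly to a new value, while the cumulative mean weights every update equally. This is why the two can diverge noticeably early in training.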

VqaMetric extends Metric with a compute_ranks method that calculates answer accuracy and ranking positions from prediction scores and ground-truth labels. NavMetric extends Metric without additional methods, serving as a type-distinguished metric tracker for navigation tasks.
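As a rough illustration of the ranking idea, the sketch below computes ranks with NumPy arrays standing in for the torch tensors that compute_ranks accepts. The function name compute_ranks_sketch is hypothetical, and the rank definition (one plus the number of answers scored strictly higher than the correct one) is an assumption chosen to be consistent with "accuracy is 1 if rank == 1".

```python
from typing import Tuple

import numpy as np


def compute_ranks_sketch(
    scores: np.ndarray, labels: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """Hypothetical NumPy sketch, not the library's implementation."""
    # Score assigned to the ground-truth answer of each sample, shape (batch, 1).
    true_scores = scores[np.arange(len(labels)), labels][:, None]
    # Assumption: rank = 1 + number of answers scored strictly higher.
    ranks = (scores > true_scores).sum(axis=1) + 1
    # A sample counts as correct only when the true answer is ranked first.
    accuracy = (ranks == 1).astype(np.float64)
    return accuracy, ranks


scores = np.array([
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
])
labels = np.array([1, 2, 1])
accuracy, ranks = compute_ranks_sketch(scores, labels)
```

In this toy batch the correct answer is ranked first, third, and second respectively, so only the first sample counts toward accuracy.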

Usage

Use VqaMetric during VQA training/evaluation to track loss and accuracy with ranking. Use NavMetric for navigation-specific metrics. Use the base Metric for general-purpose metric tracking.

Code Reference

Source Location

Signature

class Metric:
    def __init__(self, info=None, metric_names=None, log_json=None):

class VqaMetric(Metric):
    def __init__(self, info=None, metric_names=None, log_json=None):
    def compute_ranks(
        self, scores: torch.Tensor, labels: torch.Tensor
    ) -> Tuple[np.ndarray, np.ndarray]:

class NavMetric(Metric):
    def __init__(self, info=None, metric_names=None, log_json=None):

Import

from habitat_baselines.il.metrics import Metric, VqaMetric, NavMetric

I/O Contract

Inputs (Metric.__init__)

Name | Type | Required | Description
info | dict | No | Metadata dictionary (e.g., epoch, split) displayed in stat strings
metric_names | list | No | Sorted list of metric names to track
log_json | str | No | File path for JSON log output; if None, logging to file is disabled
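The optional log path can be pictured with the sketch below. It only illustrates the "skip when log_json is None" behaviour described above. The helper name dump_history, its boolean return value, and the JSON payload layout are all hypothetical, not taken from the library.

```python
import json
import os
import tempfile
from typing import List, Optional


def dump_history(
    history: List[List[float]],
    metric_names: List[str],
    log_json: Optional[str],
) -> bool:
    """Hypothetical helper: write the history to JSON unless logging is disabled."""
    if log_json is None:
        # No path configured, so nothing is written.
        return False
    # Hypothetical payload layout, chosen for illustration only.
    payload = {"metric_names": metric_names, "stats": history}
    with open(log_json, "w") as f:
        json.dump(payload, f)
    return True


# Logging disabled: the call is a no-op.
skipped = dump_history([[0.5, 0.95, 0.0]], ["loss"], None)

# Logging enabled: the history is written and can be read back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "metrics.json")
    written = dump_history([[0.5, 0.95, 0.0]], ["loss"], path)
    with open(path) as f:
        reloaded = json.load(f)
```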

Inputs (VqaMetric.compute_ranks)

Name | Type | Required | Description
scores | torch.Tensor | Yes | Prediction scores tensor of shape (batch, num_answers)
labels | torch.Tensor | Yes | Ground-truth label indices tensor of shape (batch,)

Outputs (VqaMetric.compute_ranks)

Name | Type | Description
accuracy | np.ndarray | Binary accuracy array (1 if rank == 1, else 0)
ranks | np.ndarray | Rank of the correct answer for each sample

Usage Examples

Basic Usage

from habitat_baselines.il.metrics import VqaMetric

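metric = VqaMetric(
    info={"epoch": 1, "split": "train"},
    # Listed alphabetically, since metric_names is documented as a sorted list.
    metric_names=["accuracy", "loss"],
    log_json="logs/vqa_train.json",
)

# During training loop. dataloader, model, compute_loss, and compute_accuracy
# are placeholders for your own pipeline.
for batch_idx, batch in enumerate(dataloader):
    loss = compute_loss(model, batch)
    accuracy = compute_accuracy(model, batch)
    # Values follow the order of metric_names.
    metric.update([accuracy, loss.item()])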

# Print metrics
print(metric.get_stat_string(mode=1))  # EMA values

# Save log to JSON
metric.dump_log()

Computing Ranks

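import torch
import torch.nn.functional as F
from habitat_baselines.il.metrics import VqaMetric

vqa_metric = VqaMetric(
    info={"split": "val"},
    # Listed alphabetically, since metric_names is documented as a sorted list.
    metric_names=["accuracy", "loss", "mean_rank"],
)

# model and batch are placeholders for your own pipeline.
scores = model(batch)          # (batch_size, num_answers)
labels = batch["answer"]       # (batch_size,)
# Assumption: cross-entropy is the loss; substitute your own.
loss = F.cross_entropy(scores, labels)
accuracy, ranks = vqa_metric.compute_ranks(scores, labels)

# Values follow the order of metric_names.
vqa_metric.update([
    accuracy.mean(),
    loss.item(),
    ranks.mean(),
])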
