Implementation:Facebookresearch Habitat lab PPOTrainer train
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Training |
| Last Updated | 2026-02-15 02:00 GMT |
Overview
Concrete training loop for PPO/DD-PPO agents in Habitat environments, provided by habitat-baselines. This is the primary PointNav training entry point.
Description
The `PPOTrainer.train` method implements the complete DD-PPO training loop for navigation agents. It initializes environments, builds the policy, and enters the main loop: collecting rollouts via `_compute_actions_and_step_envs`, updating the policy via `PPO.update`, logging metrics, and saving checkpoints. It supports distributed training, resuming from a checkpoint, and SLURM job requeuing.
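The shape of that loop can be sketched schematically. Everything below is illustrative: the stub classes and names (`ToyEnvs`, `ToyAgent`, `rollout_len`) are stand-ins, not the habitat-baselines API, which operates on a `VectorEnv` and a `PPO`/`DDPPO` agent.

```python
import random

class ToyEnvs:
    """Stub for a vectorized environment batch."""
    def step(self, actions):
        # one (obs, reward, done) triple per environment
        return [(0.0, random.random(), False) for _ in actions]

class ToyAgent:
    """Stub agent: acts on all envs, then 'updates' from a rollout."""
    def act(self, n_envs):
        return [0] * n_envs
    def update(self, rollout):
        rewards = [r for _, r, _ in rollout]
        return {"reward": sum(rewards) / len(rewards)}

def train(num_updates=3, rollout_len=4, n_envs=2):
    envs, agent = ToyEnvs(), ToyAgent()
    metrics = []
    for _ in range(num_updates):
        # 1. collect a rollout (cf. _compute_actions_and_step_envs)
        rollout = []
        for _ in range(rollout_len):
            rollout.extend(envs.step(agent.act(n_envs)))
        # 2-3. advantage estimation + policy update (cf. _update_agent)
        metrics.append(agent.update(rollout))
        # 4. logging and checkpointing would happen here
    return metrics
```

The real method interleaves these phases with distributed synchronization and preemption checks, but the collect/update/log cycle is the same.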
Usage
Called by `execute_exp` when the run type is `train`. This is the default training method for PointNav, ObjectNav, and other single-policy RL tasks.
Code Reference
Source Location
- Repository: habitat-lab
- File: habitat-baselines/habitat_baselines/rl/ppo/ppo_trainer.py
- Lines: L655-801 (train method), L343-399 (_compute_actions_and_step_envs), L489-522 (_update_agent)
Signature
class PPOTrainer(BaseRLTrainer):
    def train(self) -> None:
        """
        Main method for training DD/PPO.

        Initializes environments and policy, then enters the training loop:
        1. Collect rollouts from vectorized environments
        2. Compute advantages (GAE)
        3. Update policy via PPO clipped objective
        4. Log metrics and save checkpoints
        """
Import
from habitat_baselines.rl.ppo.ppo_trainer import PPOTrainer
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| self.config | DictConfig | Yes | Complete experiment config (set during __init__) |
| self.envs | VectorEnv | Yes | Vectorized environments (created in _init_train) |
| self._agent | PPO/DDPPO | Yes | PPO agent wrapping the policy (created in _init_train) |
Outputs
| Name | Type | Description |
|---|---|---|
| Checkpoints | .pth files | Saved policy checkpoints at `ckpt.{N}.pth` |
| Logs | TensorBoard/WandB | Training metrics: value_loss, action_loss, entropy, reward, SPL, etc. |
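Since checkpoints follow the `ckpt.{N}.pth` pattern, the most recent one can be picked by parsing the index rather than sorting names lexically (which would put `ckpt.10.pth` before `ckpt.2.pth`). A small stdlib sketch; the helper name is illustrative, not part of habitat-baselines:

```python
import re

def latest_checkpoint(filenames):
    """Return the ckpt.{N}.pth name with the highest N, or None."""
    pattern = re.compile(r"^ckpt\.(\d+)\.pth$")
    best = None
    for name in filenames:
        m = pattern.match(name)
        if m and (best is None or int(m.group(1)) > best[0]):
            best = (int(m.group(1)), name)
    return best[1] if best else None
```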
Usage Examples
Launch PPO Training
from habitat_baselines.config.default import get_config
from habitat_baselines.rl.ppo.ppo_trainer import PPOTrainer
# Load config
config = get_config("pointnav/ppo_pointnav.yaml")
# Create trainer and run
trainer = PPOTrainer(config)
trainer.train()
Via Command Line
python -u habitat-baselines/habitat_baselines/run.py \
--exp-config habitat-baselines/habitat_baselines/config/pointnav/ppo_pointnav.yaml \
--run-type train
Related Pages
Implements Principle
Requires Environment
- Environment:Facebookresearch_Habitat_lab_CUDA_GPU_Training_Environment
- Environment:Facebookresearch_Habitat_lab_SLURM_Distributed_Environment
Uses Heuristic
- Heuristic:Facebookresearch_Habitat_lab_Force_Single_Threaded_PyTorch
- Heuristic:Facebookresearch_Habitat_lab_Mini_Batch_Environment_Divisibility
- Heuristic:Facebookresearch_Habitat_lab_DDPPO_Straggler_Preemption
- Heuristic:Facebookresearch_Habitat_lab_VER_Tuning_Guidelines
- Heuristic:Facebookresearch_Habitat_lab_Resume_State_Config_Override