Implementation:Facebookresearch Habitat lab PPOTrainer train skill

From Leeroopedia
Knowledge Sources
Domains Hierarchical_RL, Training
Last Updated 2026-02-15 02:00 GMT

Overview

PPOTrainer configured to train one skill at a time in rearrangement tasks. Each run produces a skill checkpoint file that the hierarchical policy later consumes.

Description

Skill training uses the same PPOTrainer.train method as regular PPO training; only the config differs, with skill-specific settings taken from rl_skill.yaml. Each skill is trained against a reward shaped for its sub-task (e.g., distance-to-object for navigation, grasp success for pick). The resulting checkpoint holds a policy that operates on a filtered subset of the observations and on a mapped sub-space of the full action space, which is the form NnSkillPolicy expects when it loads the checkpoint.
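The observation filtering and action mapping described above can be pictured with a small sketch. This is not the habitat-baselines implementation: it uses plain Python dicts and lists, and the function names `filter_observations` and `write_skill_action`, the observation keys, and the action layout are all hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Sequence


def filter_observations(
    obs: Dict[str, List[float]], skill_obs_keys: Sequence[str]
) -> Dict[str, List[float]]:
    """Keep only the observation keys the skill was trained on."""
    return {key: obs[key] for key in skill_obs_keys}


def write_skill_action(
    full_action_dim: int, skill_action: Sequence[float], start: int
) -> List[float]:
    """Place the skill's action into its slice of the full action vector.

    Entries outside the skill's slice are left at 0.0.
    """
    if start < 0 or start + len(skill_action) > full_action_dim:
        raise ValueError("skill action does not fit into the full action vector")
    full_action = [0.0] * full_action_dim
    full_action[start : start + len(skill_action)] = list(skill_action)
    return full_action


# Hypothetical full observation dict; a navigation skill sees only two keys.
full_obs = {
    "depth": [0.2, 0.4],
    "joint": [0.1, 0.0, -0.3],
    "goal_to_agent_gps_compass": [1.5, 0.3],
}
nav_obs = filter_observations(full_obs, ["depth", "goal_to_agent_gps_compass"])

# Hypothetical layout: 10 action dims in total, base velocity in the last two.
nav_action = write_skill_action(full_action_dim=10, skill_action=[0.5, -0.1], start=8)
```

The point of the sketch is that a skill policy never needs to know the full observation or action layout; the wrapper handles both directions of the translation.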

Usage

Run PPOTrainer.train() once per skill that needs to be learned, each time with that skill's config. The checkpoint path produced by each run is then referenced in the hierarchical policy config.
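A minimal sketch of the per-skill workflow, assuming the command-line layout used in the navigation example on this page. Only `NavToObjTask-v0` comes from this page; the pick task name, the `build_train_command` helper, and the checkpoint directory layout are assumptions made for illustration, and nothing is executed here.

```python
from typing import Dict, List

# Paths copied from the navigation example on this page.
RUN_SCRIPT = "habitat-baselines/habitat_baselines/run.py"
SKILL_CONFIG = "habitat-baselines/habitat_baselines/config/rearrange/rl_skill.yaml"


def build_train_command(task_type: str, policy_name: str) -> List[str]:
    """Return the argv list for one skill-training run (nothing is executed)."""
    return [
        "python", "-u", RUN_SCRIPT,
        "--exp-config", SKILL_CONFIG,
        "--run-type", "train",
        f"habitat_baselines.rl.policy.name={policy_name}",
        f"habitat.task.type={task_type}",
    ]


# Skill name -> task type. The "pick" entry is an assumed task name.
SKILL_TASKS: Dict[str, str] = {
    "nav": "NavToObjTask-v0",
    "pick": "RearrangePickTask-v0",
}

# One training command per skill.
commands = {
    skill: build_train_command(task, "PointNavResNetPolicy")
    for skill, task in SKILL_TASKS.items()
}

# Hypothetical checkpoint layout: one directory per skill. These are the
# paths the hierarchical policy config would then point at.
skill_checkpoints = {
    skill: f"checkpoints/{skill}/latest.pth" for skill in SKILL_TASKS
}
```

Each argv list could be handed to `subprocess.run(cmd, check=True)` to launch the run; keeping command construction separate from execution makes it easy to inspect the overrides first.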

Code Reference

Source Location

  • Repository: habitat-lab
  • Files: habitat-baselines/habitat_baselines/rl/ppo/ppo_trainer.py (training loop), habitat-baselines/habitat_baselines/rl/hrl/skills/skill.py (SkillPolicy base), habitat-baselines/habitat_baselines/rl/hrl/skills/nn_skill.py (NnSkillPolicy)
  • Lines: ppo_trainer.py:L655-801 (PPOTrainer.train), skills/skill.py:L20-74 (SkillPolicy.__init__)

Signature

# Same PPOTrainer.train() but invoked with skill config
class PPOTrainer(BaseRLTrainer):
    def train(self) -> None:
        """Train a single skill policy using PPO."""

# Skill checkpoint loaded later via:
class NnSkillPolicy(SkillPolicy):
    def __init__(
        self,
        wrap_policy,
        config,
        action_space: spaces.Space,
        filtered_obs_space: spaces.Space,
        filtered_action_space: spaces.Space,
        batch_size: int,
        pddl_domain_def=None,
        num_envs=None,
    ):
        ...

Import

from habitat_baselines.rl.ppo.ppo_trainer import PPOTrainer
from habitat_baselines.rl.hrl.skills.nn_skill import NnSkillPolicy

I/O Contract

Inputs

Name | Type | Required | Description
config | DictConfig | Yes | Skill-specific config (e.g., rl_skill.yaml with task overrides)

Outputs

Name | Type | Description
Skill checkpoint | .pth file | Trained skill policy checkpoint loadable by NnSkillPolicy

Usage Examples

Train Navigation Skill

python -u habitat-baselines/habitat_baselines/run.py \
    --exp-config habitat-baselines/habitat_baselines/config/rearrange/rl_skill.yaml \
    --run-type train \
    habitat_baselines.rl.policy.name=PointNavResNetPolicy \
    habitat.task.type=NavToObjTask-v0
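The Description names distance-to-object as the shaping signal for the navigation skill. A minimal sketch of such a reward, assuming a simple "progress toward the target" form, is shown here. The function `nav_shaped_reward`, the success radius, and the bonus value are hypothetical and are not taken from the habitat-lab reward measures.

```python
import math
from typing import Sequence


def nav_shaped_reward(
    prev_pos: Sequence[float],
    cur_pos: Sequence[float],
    target_pos: Sequence[float],
    success_radius: float = 0.3,  # hypothetical threshold
    success_bonus: float = 2.0,   # hypothetical bonus
) -> float:
    """Reward the reduction in distance to the target over one step.

    Positive when the agent moved closer, negative when it moved away,
    plus a bonus once the agent is within `success_radius` of the target.
    """
    prev_dist = math.dist(prev_pos, target_pos)
    cur_dist = math.dist(cur_pos, target_pos)
    reward = prev_dist - cur_dist
    if cur_dist <= success_radius:
        reward += success_bonus
    return reward


target = (3.0, 0.0)
closer = nav_shaped_reward((0.0, 0.0), (1.0, 0.0), target)   # moved toward target
farther = nav_shaped_reward((1.0, 0.0), (0.0, 0.0), target)  # moved away
arrived = nav_shaped_reward((2.0, 0.0), (3.0, 0.0), target)  # reached target
```

A pick skill would swap the distance term for a signal tied to grasp success, while the training entry point stays the same.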

