# Principle: facebookresearch/habitat-lab Low-Level Skill Training
| Knowledge Sources | |
|---|---|
| Domains | Hierarchical_RL, Reinforcement_Learning |
| Last Updated | 2026-02-15 02:00 GMT |
## Overview
Independent PPO training of individual manipulation skills (navigation, pick, place) that form the building blocks of a hierarchical policy.
## Description
Low-level Skill Training trains each atomic skill (navigate-to-object, pick, place, open, close) as an independent RL policy using PPO. Each skill operates on a subset of the full action space and receives task-specific reward shaping. The resulting checkpoints are later loaded into the hierarchical policy as frozen sub-policies.
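The "frozen sub-policies" step can be sketched independently of the training stack: after training, each checkpoint's parameters are marked non-trainable before the hierarchical policy uses them. The `Param` and `SubPolicy` classes below are minimal stand-ins for illustration, not habitat-lab types:

```python
class Param:
    """Minimal stand-in for a learnable parameter tensor."""

    def __init__(self, value):
        self.value = value
        self.requires_grad = True


class SubPolicy:
    """Stand-in for a trained skill policy loaded from a checkpoint."""

    def __init__(self, params):
        self.params = params

    def freeze(self):
        # Frozen: these parameters receive no gradient updates while
        # the high-level policy is trained on top of them.
        for p in self.params:
            p.requires_grad = False
        return self


def load_frozen_skills(checkpoints):
    """Map skill name -> frozen sub-policy from checkpointed parameters."""
    return {
        name: SubPolicy([Param(v) for v in values]).freeze()
        for name, values in checkpoints.items()
    }
```

In a real setup the freezing is typically done on the deep-learning framework's parameter objects; the mechanism (disable gradients, keep values fixed) is the same.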
Skills are defined by the SkillPolicy abstract base class, which specifies the observation filtering, action sub-space mapping, and termination condition interfaces.
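The three interfaces that `SkillPolicy` specifies can be sketched as an abstract base class. Method names, sensor keys, and the 10-dimensional action space below are illustrative assumptions, not the exact habitat-lab API:

```python
from abc import ABC, abstractmethod


class SkillPolicy(ABC):
    """Sketch of the skill interface described above; the real
    habitat-lab class uses different names and signatures."""

    @abstractmethod
    def filter_obs(self, obs: dict) -> dict:
        """Observation filtering: keep only the keys this skill consumes."""

    @abstractmethod
    def to_full_action(self, sub_action: list) -> list:
        """Action sub-space mapping into the full action vector."""

    @abstractmethod
    def should_terminate(self, obs: dict) -> bool:
        """Skill-specific termination condition."""


class PickSkill(SkillPolicy):
    OBS_KEYS = ("ee_pos", "obj_start_sensor")  # assumed sensor names

    def filter_obs(self, obs):
        return {k: v for k, v in obs.items() if k in self.OBS_KEYS}

    def to_full_action(self, sub_action):
        full = [0.0] * 10                      # assumed full action size
        full[: len(sub_action)] = sub_action   # arm joints occupy the prefix
        return full

    def should_terminate(self, obs):
        return bool(obs.get("is_holding", False))
```

The base class lets the hierarchical policy treat every skill uniformly: it filters observations, embeds the skill's actions, and polls termination without knowing which skill is running.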
## Usage
Train individual skills before assembling the hierarchical policy. Each skill is trained with its own config variant (e.g., rl_skill.yaml with task-specific overrides).
## Theoretical Basis
The skill decomposition follows the options framework:
- Skill = Option: Each skill is a temporally extended action with its own policy, initiation set, and termination condition
- Independent training: Skills are trained in isolation with shaped rewards specific to their sub-task
- Composability: Trained skills are composed by a high-level policy that selects which skill to execute
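The three bullets map onto a simple control loop: the high-level policy selects an option and cedes control to it until that option's termination condition fires. A toy sketch (skill names and step counts are made up for illustration):

```python
class ToySkill:
    """Illustrative option: emits actions for a fixed number of steps,
    then reports termination."""

    def __init__(self, name, duration):
        self.name = name
        self.duration = duration
        self.steps = 0

    def act(self, obs):
        self.steps += 1
        return f"{self.name}_action"

    def should_terminate(self, obs):
        return self.steps >= self.duration


def run_hierarchical(skill_sequence):
    """High-level policy as a fixed schedule: execute each option in
    order, switching only when its termination condition fires."""
    trace = []
    for skill in skill_sequence:
        while not skill.should_terminate(obs={}):
            trace.append(skill.act(obs={}))
    return trace
```

In the learned setting the fixed schedule is replaced by a high-level policy that chooses the next skill from the current observation; the temporally extended execution is the same.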
Pseudo-code:

```python
from habitat_baselines.config.default import get_config
from habitat_baselines.rl.ppo.ppo_trainer import PPOTrainer

# Train each skill independently with its own task override.
for skill_name in ["nav", "pick", "place"]:
    config = get_config(
        "rearrange/rl_skill.yaml", overrides=[f"task={skill_name}"]
    )
    trainer = PPOTrainer(config)
    trainer.train()
    # Produces checkpoint: checkpoints/{skill_name}.pth
```