Implementation:Facebookresearch Habitat lab PPOTrainer train skill
| Knowledge Sources | |
|---|---|
| Domains | Hierarchical_RL, Training |
| Last Updated | 2026-02-15 02:00 GMT |
Overview
PPOTrainer configured for per-skill training in rearrangement tasks, producing skill checkpoint files consumed by the hierarchical policy.
Description
Skill training reuses the standard PPOTrainer.train method; only the config differs, with skill-specific settings taken from rl_skill.yaml. Each skill trains on a reward function shaped for its sub-task (e.g., distance-to-object for navigation, grasp success for pick). The resulting checkpoint contains a policy that operates on a filtered subset of observations and a mapped action sub-space, which makes it compatible with NnSkillPolicy loading.
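The idea of a filtered observation subset and a mapped action sub-space can be illustrated with a minimal sketch. It is not the habitat-baselines implementation: the helper names `filter_observations` and `scatter_action`, the sensor names, and the action layout are all hypothetical, and plain Python lists stand in for tensors.

```python
from typing import Dict, List, Sequence


def filter_observations(
    observations: Dict[str, List[float]], skill_keys: Sequence[str]
) -> Dict[str, List[float]]:
    """Return only the sensors the skill policy was trained on."""
    missing = [key for key in skill_keys if key not in observations]
    if missing:
        raise KeyError(f"observations lack sensors required by the skill: {missing}")
    return {key: observations[key] for key in skill_keys}


def scatter_action(
    skill_action: Sequence[float], full_action_dim: int, start: int
) -> List[float]:
    """Place the skill's action into its slice of the full action vector.

    Entries outside the slice stay at zero.
    """
    end = start + len(skill_action)
    if start < 0 or end > full_action_dim:
        raise ValueError("skill action slice does not fit in the full action vector")
    full_action = [0.0] * full_action_dim
    full_action[start:end] = list(skill_action)
    return full_action


# Hypothetical sensor names and action layout, for illustration only.
full_obs = {
    "depth": [0.2, 0.4],
    "goal_position": [1.0, -0.5, 0.0],
    "is_holding": [0.0],
}
nav_obs = filter_observations(full_obs, ["depth", "goal_position"])
full_action = scatter_action([0.5, -0.1], full_action_dim=5, start=3)
```

In this sketch the skill sees only the two sensors it was trained on, and its two-dimensional output lands in the last two slots of a five-dimensional action vector, leaving the remaining entries untouched at zero.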
Usage
Run PPOTrainer.train() once for each skill that needs to be learned, each time with that skill's config. Then reference the resulting checkpoint path in the hierarchical policy config.
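One way to organize the per-skill runs is to assemble one training command per skill from a table of overrides. The sketch below only builds and prints the commands; it does not launch training. The script path, flags, and the single `nav_to_obj` entry are copied from the usage example on this page, while the helper `build_train_command` and the `skill_overrides` table are hypothetical names introduced for illustration. Further skills would be added as extra entries with their own overrides.

```python
import shlex
from typing import Dict, List

RUN_SCRIPT = "habitat-baselines/habitat_baselines/run.py"
SKILL_CONFIG = "habitat-baselines/habitat_baselines/config/rearrange/rl_skill.yaml"


def build_train_command(overrides: Dict[str, str]) -> List[str]:
    """Assemble the argument list for one skill-training run."""
    command = [
        "python", "-u", RUN_SCRIPT,
        "--exp-config", SKILL_CONFIG,
        "--run-type", "train",
    ]
    # Config overrides are appended as key=value pairs, in insertion order.
    command += [f"{key}={value}" for key, value in overrides.items()]
    return command


# Hypothetical per-skill override table; only the entry below comes from this page.
skill_overrides = {
    "nav_to_obj": {
        "habitat_baselines.rl.policy.name": "PointNavResNetPolicy",
        "habitat.task.type": "NavToObjTask-v0",
    },
}

commands = {
    name: build_train_command(overrides)
    for name, overrides in skill_overrides.items()
}
for name, command in commands.items():
    print(f"# {name}")
    print(shlex.join(command))
```

Keeping the overrides in one table makes it easy to see which skills have been trained and which checkpoint each hierarchical-policy entry should point to.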
Code Reference
Source Location
- Repository: habitat-lab
- File: habitat-baselines/habitat_baselines/rl/ppo/ppo_trainer.py (training loop), habitat-baselines/habitat_baselines/rl/hrl/skills/skill.py (SkillPolicy base), habitat-baselines/habitat_baselines/rl/hrl/skills/nn_skill.py (NnSkillPolicy)
- Lines: L655-801 (PPOTrainer.train), skills/skill.py:L20-74 (SkillPolicy.__init__)
Signature
# Same PPOTrainer.train() but invoked with skill config
class PPOTrainer(BaseRLTrainer):
    def train(self) -> None:
        """Train a single skill policy using PPO."""

# Skill checkpoint loaded later via:
class NnSkillPolicy(SkillPolicy):
    def __init__(
        self,
        wrap_policy,
        config,
        action_space: spaces.Space,
        filtered_obs_space: spaces.Space,
        filtered_action_space: spaces.Space,
        batch_size: int,
        pddl_domain_def=None,
        num_envs=None,
    ):
        ...
Import
from habitat_baselines.rl.ppo.ppo_trainer import PPOTrainer
from habitat_baselines.rl.hrl.skills.nn_skill import NnSkillPolicy
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| config | DictConfig | Yes | Skill-specific config (e.g., rl_skill.yaml with task overrides) |
Outputs
| Name | Type | Description |
|---|---|---|
| Skill checkpoint | .pth file | Trained skill policy checkpoint loadable by NnSkillPolicy |
Usage Examples
python -u habitat-baselines/habitat_baselines/run.py \
    --exp-config habitat-baselines/habitat_baselines/config/rearrange/rl_skill.yaml \
    --run-type train \
    habitat_baselines.rl.policy.name=PointNavResNetPolicy \
    habitat.task.type=NavToObjTask-v0