Implementation:Isaac sim IsaacGymEnvs HRLAgent

From Leeroopedia
Knowledge Sources
Domains Hierarchical_Reinforcement_Learning, Motion_Imitation
Last Updated 2026-02-15 11:00 GMT

Overview

HRLAgent is a hierarchical reinforcement learning agent that combines a high-level policy producing latent commands with a pre-trained low-level controller (LLC) that translates those latents into physical joint actions.

Description

The HRLAgent class extends CommonAgent to implement a two-level hierarchical control structure. The high-level policy operates in a latent action space, outputting latent vectors that encode desired motion styles or behaviors. A pre-trained low-level controller (LLC), typically an AMP-trained policy, then maps these latent commands to joint-level actions for the simulated character.

During initialization, the agent reads the LLC configuration from a YAML file to determine the latent dimension, then loads the LLC from a checkpoint file via _build_llc(). The LLC is built using a GenAMPBuilder network and GenAMPAgent, restored from the checkpoint, and set to evaluation mode. The high-level agent's action space is overridden to match the latent dimension rather than the full joint action space.
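The action-space override described above can be sketched in plain Python. This is only an illustration: the function name `latent_action_bounds`, the nested key path `params -> config -> latent_dim`, and the symmetric [-1, 1] bounds are assumptions made for the example, not details confirmed by the source.

```python
def latent_action_bounds(llc_config):
    """Return (latent_dim, low, high) for the high-level action space.

    `llc_config` stands for the dict obtained by parsing the LLC YAML file.
    The key path used here is an assumption made for illustration.
    """
    latent_dim = llc_config["params"]["config"]["latent_dim"]
    # Assumed bounds: one entry per latent dimension, symmetric around zero.
    low = [-1.0] * latent_dim
    high = [1.0] * latent_dim
    return latent_dim, low, high


# Hypothetical parsed LLC config with a 64-dimensional latent space.
example_config = {"params": {"config": {"latent_dim": 64}}}
latent_dim, low, high = latent_action_bounds(example_config)
```

The point is that the high-level policy's action dimension comes from the LLC configuration, not from the number of joints in the simulated character.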

The key method env_step() implements the hierarchical stepping logic: for each high-level action, it runs the LLC for _llc_steps inner environment steps. At each inner step, _compute_llc_action() normalizes the high-level latent action and passes it, together with the current observation stripped of task-specific features, through the LLC actor to produce joint actions. Rewards are averaged across the inner steps, and done flags are combined with a logical OR: a done at any inner step marks the outer step as done.
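The stepping logic above can be illustrated with a small, dependency-free sketch. It is not the HRLAgent code: `ToyEnv`, `toy_llc`, and `hierarchical_step` are hypothetical stand-ins, the L2 normalization of the latent is an assumption about what "normalizes" means, and placing the task-specific features at the end of the observation vector is likewise assumed.

```python
import math


def normalize_latent(z):
    """Scale a latent vector to unit L2 norm (assumed form of normalization)."""
    norm = math.sqrt(sum(v * v for v in z))
    return [v / max(norm, 1e-12) for v in z]


def extract_llc_obs(obs, task_obs_size):
    """Drop the trailing task-specific features (assumed observation layout)."""
    return obs[:len(obs) - task_obs_size]


def toy_llc(llc_obs, z):
    """Hypothetical LLC actor: any function of (LLC observation, latent)."""
    return [llc_obs[0] * v for v in z]


class ToyEnv:
    """Hypothetical stand-in for the vectorized simulator."""

    def __init__(self, num_envs, done_at):
        self.num_envs = num_envs
        self.done_at = done_at  # {env index: inner step at which it terminates}
        self.t = 0

    def step(self, joint_actions):
        self.t += 1
        obs = [[float(self.t), 0.0, 9.9] for _ in range(self.num_envs)]
        rewards = [float(self.t)] * self.num_envs
        dones = [1.0 if self.done_at.get(i) == self.t else 0.0
                 for i in range(self.num_envs)]
        return obs, rewards, dones, {"step": self.t}


def hierarchical_step(env, obs, latents, llc_steps, task_obs_size):
    """Run the LLC for llc_steps inner steps per high-level action."""
    reward_sum = [0.0] * env.num_envs
    done_any = [0.0] * env.num_envs
    infos = {}
    for _ in range(llc_steps):
        joint_actions = [
            toy_llc(extract_llc_obs(o, task_obs_size), normalize_latent(z))
            for o, z in zip(obs, latents)
        ]
        obs, rewards, dones, infos = env.step(joint_actions)
        reward_sum = [s + r for s, r in zip(reward_sum, rewards)]
        # Logical OR over inner steps: once done, stays done.
        done_any = [max(d, n) for d, n in zip(done_any, dones)]
    mean_rewards = [s / llc_steps for s in reward_sum]
    return obs, mean_rewards, done_any, infos


env = ToyEnv(num_envs=2, done_at={1: 2})
obs0 = [[0.0, 0.0, 9.9], [0.0, 0.0, 9.9]]
latents = [[3.0, 4.0], [0.0, 2.0]]
obs, rewards, dones, infos = hierarchical_step(
    env, obs0, latents, llc_steps=4, task_obs_size=1)
```

In this toy run the second environment terminates at inner step 2, so its done flag remains set for the outer step even though later inner steps report no termination. The returned infos come from the last inner step only.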

Usage

Use HRLAgent when training hierarchical policies where the low-level motor control is handled by a pre-trained AMP or similar controller. This is particularly relevant for complex locomotion tasks (such as HumanoidASE) where the high-level policy selects behaviors or goals and relies on the LLC for physically realistic execution. The agent requires a trained LLC checkpoint and its corresponding configuration file.

Code Reference

Source Location

Signature

class HRLAgent(common_agent.CommonAgent):
    def __init__(self, base_name, config):
    def env_step(self, actions):
    def cast_obs(self, obs):
    def preprocess_actions(self, actions):
    def _setup_action_space(self):
    def _build_llc(self, config_params, checkpoint_file):
    def _build_llc_agent_config(self, config_params, network):
    def _compute_llc_action(self, obs, actions):
    def _extract_llc_obs(self, obs):

Import

from isaacgymenvs.learning.hrl_continuous import HRLAgent

I/O Contract

Inputs

Name | Type | Required | Description
base_name | str | Yes | Base name identifier for the agent
config | dict | Yes | Configuration dictionary containing llc_config (path to LLC YAML), llc_checkpoint (path to LLC weights), and llc_steps (number of inner steps per high-level action)
actions | torch.Tensor | Yes | High-level latent action tensor passed to env_step(), shape (num_envs, latent_dim)

Outputs

Name | Type | Description
obs | dict | Observation tensors after executing inner LLC steps
rewards | torch.Tensor | Averaged rewards over the inner LLC steps, shape (num_envs, 1)
dones | torch.Tensor | Aggregated done flags (1.0 if any inner step triggered done), shape (num_envs,)
infos | dict | Info dictionary from the last inner environment step

Usage Examples

# HRLAgent is typically instantiated by the rl_games runner via config.
# The training config specifies the agent class and LLC details:

# In the training YAML config:
# params:
#   config:
#     player:
#       class_name: isaacgymenvs.learning.hrl_continuous.HRLAgent
#     llc_config: "isaacgymenvs/cfg/train/HumanoidAMPPPO.yaml"
#     llc_checkpoint: "runs/HumanoidAMP/nn/HumanoidAMP.pth"
#     llc_steps: 5

# Manual instantiation example:
from isaacgymenvs.learning.hrl_continuous import HRLAgent

config = {
    'llc_config': 'path/to/llc_config.yaml',
    'llc_checkpoint': 'path/to/llc_checkpoint.pth',
    'llc_steps': 5,
    # ... other rl_games config params
}
agent = HRLAgent('hrl_humanoid', config)

# During training, env_step is called with latent actions: a torch.Tensor
# of shape (num_envs, latent_dim) produced by the high-level policy.
obs, rewards, dones, infos = agent.env_step(latent_actions)
