Implementation:Isaac sim IsaacGymEnvs Task Design Specification

Knowledge Sources	IsaacGymEnvs, Isaac Gym Docs
Type	Pattern Doc
Domains	Design, Reinforcement_Learning
Last Updated	2026-02-15 00:00 GMT

Overview

A concrete pattern for writing a task design specification: a document that records every MDP design decision before a custom IsaacGymEnvs RL environment is implemented.

Description

A Task Design Specification is the design artifact produced before writing task code. It enumerates every component of the Markov decision process (MDP): the observation vector layout, action interpretation, reward formula, reset conditions, physics parameters, and required assets. The document serves both as a development guide and as a reference for debugging and iteration.

The specification pattern is derived from reference implementations in the IsaacGymEnvs repository, particularly Cartpole (minimal example) and Ant (complex example).

Usage

Create a task design specification before implementing any new RL environment. Refer back to it during implementation to verify that all components are correctly coded.

Code Reference

Source Location

Repository: NVIDIA-Omniverse/IsaacGymEnvs
Files: docs/framework.md (L1-225), docs/rl_examples.md (L1-563)

Design Specification Template

The following template captures all required design decisions for a new task:

Task Name: MyCustomTask
Physics Engine: PhysX | Flex

=== Dimensions ===
num_obs: <integer>         # Observation vector dimension
num_actions: <integer>     # Action vector dimension
num_envs: <integer>        # Default parallel environments

=== Observation Vector ===
Index Range | Component           | Dimension | Range        | Description
0..2        | object_position     | 3         | [-inf, inf]  | XYZ position of target object
3..5        | object_velocity     | 3         | [-inf, inf]  | Linear velocity of target
6..N-1      | joint_positions     | N-6       | [lo, hi]     | Robot joint angles (radians); N = num_obs
...

=== Action Vector ===
Index Range | Component           | Dimension | Range        | Interpretation
0..M-1      | joint_torques       | M         | [-1, 1]      | Normalized torques, scaled by max_effort; M = num_actions

=== Reward Function ===
reward = w1 * reward_component_1
       + w2 * reward_component_2
       - w3 * penalty_component_1

Where:
  reward_component_1: description and formula
  reward_component_2: description and formula
  penalty_component_1: description and formula

=== Reset Conditions ===
reset = (condition_1) OR (condition_2) OR (progress_buf >= max_episode_length)

Where:
  condition_1: description (e.g., object falls below table)
  condition_2: description (e.g., robot joint limits exceeded)

=== Required Assets ===
- asset_1.urdf: description
- asset_2.urdf: description
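
Because the index ranges in the template are inclusive, an off-by-one in a single row silently shifts every later component. The sketch below shows one way to check that the rows of an Observation Vector table tile the vector exactly. The names `ObsComponent` and `check_layout`, and the 7-joint example robot, are hypothetical and used only for illustration; they are not part of IsaacGymEnvs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ObsComponent:
    """One row of the Observation Vector table (indices are inclusive)."""
    name: str
    start: int
    end: int

    @property
    def dim(self) -> int:
        # Inclusive range, so 0..2 has dimension 3.
        return self.end - self.start + 1


def check_layout(components: List[ObsComponent], num_obs: int) -> None:
    """Raise ValueError unless the rows cover 0..num_obs-1 with no gap or overlap."""
    expected_start = 0
    for comp in components:
        if comp.start != expected_start:
            raise ValueError(
                f"{comp.name}: starts at {comp.start}, expected {expected_start}"
            )
        if comp.end < comp.start:
            raise ValueError(f"{comp.name}: empty or reversed index range")
        expected_start = comp.end + 1
    if expected_start != num_obs:
        raise ValueError(
            f"layout covers {expected_start} entries, but num_obs is {num_obs}"
        )


# Hypothetical task with 7 robot joints, so num_obs = 3 + 3 + 7 = 13.
layout = [
    ObsComponent("object_position", 0, 2),
    ObsComponent("object_velocity", 3, 5),
    ObsComponent("joint_positions", 6, 12),
]
check_layout(layout, num_obs=13)  # passes silently when the table is consistent
```

Running a check like this against the specification before writing task code catches a mismatch between the table and `num_obs` early.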

Reference: Cartpole Design Specification

Field	Value
Task Name	Cartpole
Physics Engine	PhysX
num_obs	4
num_actions	1
num_envs	512 (default)

Observation Vector

Index	Component	Range	Description
0	cart_position	[-3.0, 3.0]	Horizontal position of the cart on the rail
1	cart_velocity	[-inf, inf]	Linear velocity of the cart
2	pole_angle	[-pi, pi]	Angle of the pole from vertical
3	pole_angular_velocity	[-inf, inf]	Angular velocity of the pole

Action Vector

Index	Component	Range	Interpretation
0	cart_force	[-1.0, 1.0]	Horizontal force applied to the cart, scaled by `max_push_effort` (400.0)
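
The action row above amounts to a single scaling step. A minimal sketch follows, assuming the policy output is clipped to [-1, 1] before scaling; the function name `scale_cart_action` is hypothetical and the clipping step is an assumption of this sketch, not something this page specifies.

```python
def scale_cart_action(action: float, max_push_effort: float = 400.0) -> float:
    """Map a normalized policy action to the force applied to the cart."""
    # Assumption: out-of-range actions are clipped to [-1, 1] first.
    clipped = max(-1.0, min(1.0, action))
    return clipped * max_push_effort


print(scale_cart_action(0.5))   # 200.0
print(scale_cart_action(-2.0))  # -400.0 (clipped to -1.0 before scaling)
```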

Reward Function

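# Cartpole reward: keep the pole upright and damp cart and pole motion
reward = 1.0 - pole_angle * pole_angle            # alive bonus minus squared pole angle
reward -= 0.01 * abs(cart_velocity)               # penalize cart speed
reward -= 0.005 * abs(pole_angular_velocity)      # penalize pole angular speed
# Environments that hit a failure condition (cart out of range, pole fallen) receive reward = -2.0 instead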

Reset Conditions

reset = (abs(cart_position) > reset_dist) |     # cart too far from center
        (abs(pole_angle) > max_pole_angle) |     # pole fallen too far
        (progress_buf >= max_episode_length)      # episode timeout
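
The reset expression above can be sketched for a single environment as a plain boolean function. The function name `should_reset` is hypothetical. `reset_dist = 3.0` follows the cart position range in the observation table, while the values used here for `max_pole_angle` and `max_episode_length` are placeholder assumptions, not values stated on this page.

```python
import math


def should_reset(
    cart_position: float,
    pole_angle: float,
    progress: int,
    *,
    reset_dist: float,
    max_pole_angle: float,
    max_episode_length: int,
) -> bool:
    """Return True when any of the three reset conditions holds."""
    cart_out_of_range = abs(cart_position) > reset_dist
    pole_fallen = abs(pole_angle) > max_pole_angle
    timed_out = progress >= max_episode_length
    return cart_out_of_range or pole_fallen or timed_out


# Placeholder limits for illustration only.
limits = dict(reset_dist=3.0, max_pole_angle=math.pi / 2, max_episode_length=500)

print(should_reset(0.1, 0.05, 10, **limits))   # False: all conditions clear
print(should_reset(3.5, 0.05, 10, **limits))   # True: cart too far from center
print(should_reset(0.1, 0.05, 500, **limits))  # True: episode timeout
```

In a real task the same logic would be applied to whole tensors of environments at once; the scalar form is only meant to make the specification easy to check by hand.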

Required Assets

cartpole.urdf: Cart-pole system with 1 prismatic joint (cart) and 1 revolute joint (pole)

Reference: Ant Design Specification

Field	Value
Task Name	Ant
Physics Engine	PhysX
num_obs	60
num_actions	8
num_envs	4096 (default)

Observation Vector (60 dimensions)

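Index Range	Component	Dim	Description
0	Torso height	1	Torso position along the up axis
1..3	Torso linear velocity	3	Linear velocity expressed in the torso's local frame
4..6	Torso angular velocity	3	Angular velocity expressed in the torso's local frame, scaled
7..9	Yaw, roll, angle to target	3	Torso yaw and roll, and the heading angle to the target
10..11	Up and heading projections	2	Alignment of the torso up vector with world up, and of the torso heading with the target direction
12..19	DOF positions	8	Joint angles for the 8 DOFs, scaled to their joint limits
20..27	DOF velocities	8	Joint angular velocities, scaled
28..51	Foot contact forces	24	Force-torque sensor readings (6 values per foot x 4 feet), scaled
52..59	Previous actions	8	Actions applied in the most recent step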

Reward Function

reward = (velocity_toward_target * progress_weight     # forward progress
        + alive_bonus                                    # survival reward
        - energy_cost * energy_weight                    # penalize high torques
        - joints_at_limit_cost * limit_weight)           # penalize joint limits
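
The weighted combination above can be written as a small function so that each term is easy to inspect and tune. The function name `ant_reward` and the default weights are hypothetical placeholders chosen for this sketch; they are not the values used by the Ant task.

```python
def ant_reward(
    velocity_toward_target: float,
    energy_cost: float,
    joints_at_limit_cost: float,
    *,
    progress_weight: float = 1.0,   # placeholder weight
    alive_bonus: float = 0.5,       # placeholder bonus
    energy_weight: float = 0.25,    # placeholder weight
    limit_weight: float = 0.125,    # placeholder weight
) -> float:
    """Weighted sum of progress and survival terms minus cost terms."""
    progress = velocity_toward_target * progress_weight
    energy_penalty = energy_cost * energy_weight
    limit_penalty = joints_at_limit_cost * limit_weight
    return progress + alive_bonus - energy_penalty - limit_penalty


# 2.0 * 1.0 + 0.5 - 4.0 * 0.25 - 1.0 * 0.125
print(ant_reward(2.0, 4.0, 1.0))  # 1.375
```

Keeping the weights as named parameters mirrors how a specification would list them, so they can later be moved into the task's configuration.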

I/O Contract

Inputs

Name	Type	Required	Description
Task concept	Text	Yes	High-level description of what the agent should learn
Robot type	URDF/MJCF	Yes	Robot model with defined joints, bodies, and sensors
Objects	URDF/MJCF	No	Additional objects in the scene (targets, obstacles)

Outputs

Name	Type	Description
Design specification	Document	Complete MDP specification following the template above
num_obs	Integer	Total observation vector dimension
num_actions	Integer	Total action vector dimension
Reward formula	Mathematical expression	Weighted combination of reward and penalty components
Reset conditions	Boolean expression	Conditions triggering episode termination

Related Pages

Isaac_sim_IsaacGymEnvs_Task_Requirements_Design - implements - Principle defining the design methodology for task specification.
Isaac_sim_IsaacGymEnvs_VecTask_Subclass_Pattern - next step - Implement the designed specification as a VecTask subclass.
Isaac_sim_IsaacGymEnvs_Hydra_Task_Train_YAML - next step - Encode design parameters in YAML configuration files.
