Implementation: ARISE Initiative Robosuite GymWrapper
Metadata
| Property | Value |
|---|---|
| Sources | robosuite, Gymnasium |
| Domains | Reinforcement_Learning, API_Compatibility |
| Last Updated | 2026-02-15 12:00 GMT |
Overview
A concrete adapter, provided by the robosuite wrappers module, that exposes robosuite environments through a Gymnasium-compatible interface.
Description
The GymWrapper class provides a concrete implementation of the Gymnasium Environment Wrapping adapter pattern. It inherits from both the robosuite Wrapper base class and the gym.Env interface, creating a dual-interface object that can be used seamlessly with both robosuite and Gymnasium ecosystems.
Key implementation details:
- Observation Space Computation: The wrapper computes a Box observation_space by sampling the environment, selecting specified observation keys, flattening them into a 1D array, and creating a Box space with appropriate dimensions
- Action Space Computation: The Box action_space is derived directly from the environment's action_spec bounds
- Observation Flattening: The _flatten_obs() method concatenates selected observation keys (default: 'robot0_proprio-state' and 'object-state') into a single 1D numpy array
- Gymnasium Reset Signature: The reset() method accepts optional seed and options parameters and returns a 2-tuple (obs, info)
- Gymnasium Step Signature: The step() method returns a 5-tuple (obs, reward, terminated, truncated, info), splitting the legacy 'done' flag into separate termination conditions
The wrapper is designed to be transparent, adding minimal overhead while ensuring full compatibility with the Gymnasium API standard.
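The flattening step above can be sketched in plain NumPy. This is a simplified stand-in for `_flatten_obs`, not the actual robosuite implementation, and the observation shapes here are made up for illustration:

```python
from collections import OrderedDict
import numpy as np

def flatten_obs(obs_dict, keys):
    """Concatenate the selected observation keys into one 1D array
    (sketch of what GymWrapper._flatten_obs does)."""
    return np.concatenate([np.asarray(obs_dict[k]).flatten() for k in keys])

# Hypothetical observation dict; real shapes depend on the robot and task
obs = OrderedDict([
    ("robot0_proprio-state", np.zeros(32)),
    ("object-state", np.ones(10)),
    ("camera_image", np.zeros((84, 84, 3))),  # dropped by key selection
])

flat = flatten_obs(obs, ["robot0_proprio-state", "object-state"])
print(flat.shape)  # (42,)
```

The resulting 1D shape is what the wrapper uses to size the Box observation_space.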
Usage
Wrap any robosuite environment to use with Gymnasium-compatible RL libraries such as:
- Stable-Baselines3: For training PPO, SAC, TD3, and other standard RL algorithms
- CleanRL: For single-file RL implementations with minimal dependencies
- RLlib: For distributed RL training at scale
- Custom Training Loops: Any code that expects the standard Gymnasium interface
The wrapper handles all interface translation automatically, allowing you to focus on algorithm development rather than environment compatibility.
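As a rough sketch of the kind of translation involved, the following toy adapter converts a legacy 4-tuple `step()` into the Gymnasium 5-tuple. Both classes here are hypothetical; the real GymWrapper performs this translation internally:

```python
class LegacyEnv:
    """Toy environment with the old gym-style 4-tuple step (hypothetical)."""
    def __init__(self, horizon=3):
        self.horizon, self.t = horizon, 0
    def reset(self):
        self.t = 0
        return [0.0]
    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon  # in this toy env, purely a time limit
        return [float(self.t)], 1.0, done, {}

class GymnasiumAdapter:
    """Sketch of the 4-tuple -> 5-tuple translation a wrapper performs."""
    def __init__(self, env):
        self.env = env
    def reset(self, seed=None, options=None):
        return self.env.reset(), {}  # Gymnasium: (obs, info)
    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        truncated = done and self.env.t >= self.env.horizon  # time limit hit
        terminated = done and not truncated                  # true terminal state
        return obs, reward, terminated, truncated, info

env = GymnasiumAdapter(LegacyEnv(horizon=3))
obs, info = env.reset()
for _ in range(3):
    obs, reward, terminated, truncated, info = env.step(None)
print(terminated, truncated)  # False True
```

Because the toy episode ends only on its time limit, the adapter reports truncation rather than termination.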
Code Reference
| Property | Value |
|---|---|
| Source | robosuite |
| File | robosuite/wrappers/gym_wrapper.py |
| Lines | L26-184 |
| Import | from robosuite.wrappers import GymWrapper |
Constructor Signature
```python
class GymWrapper(Wrapper, gym.Env):
    def __init__(self, env, keys=None, flatten_obs=True):
        """
        Initialize the Gymnasium wrapper for robosuite environments.

        Args:
            env (MujocoEnv): The environment to wrap
            keys (None or list of str): Observation keys to include. If None,
                defaults to proprio-state and object-state keys.
            flatten_obs (bool): Whether to flatten the observation dict to a
                1D array. Defaults to True.
        """
```
Reset Method (L127-143)
```python
def reset(self, seed=None, options=None):
    """
    Reset the environment to its initial state.

    Args:
        seed (int, optional): Random seed for reproducibility
        options (dict, optional): Additional options for reset

    Returns:
        2-tuple: (np.array observations, dict info)
            - observations: Flattened observation array
            - info: Dictionary with auxiliary diagnostic information
    """
```
Step Method (L145-163)
```python
def step(self, action):
    """
    Execute one timestep of the environment dynamics.

    Args:
        action (np.array): Action to take in the environment

    Returns:
        5-tuple: (np.array obs, float reward, bool terminated, bool truncated, dict info)
            - obs: Flattened observation array
            - reward: Reward signal from the environment
            - terminated: True if episode ended due to a terminal state
            - truncated: True if episode ended due to a time limit
            - info: Dictionary with auxiliary diagnostic information
    """
```
Close Method (L180-184)
```python
def close(self):
    """
    Wrapper for calling the underlying environment's close() function.
    Performs any necessary cleanup of environment resources.
    """
```
I/O Contract
Constructor Inputs
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| env | MujocoEnv | Yes | N/A | The robosuite environment instance to wrap |
| keys | list of str | No | ['robot0_proprio-state', 'object-state'] | Observation keys to include in flattened observation |
| flatten_obs | bool | No | True | Whether to flatten OrderedDict observations to 1D array |
Constructor Outputs
| Attribute | Type | Description |
|---|---|---|
| observation_space | gym.spaces.Box | Box space defining valid observations (continuous, 1D) |
| action_space | gym.spaces.Box | Box space defining valid actions (continuous) |
reset() Method
Inputs:
| Parameter | Type | Required | Description |
|---|---|---|---|
| seed | int | No | Random seed for environment reproducibility |
| options | dict | No | Additional environment-specific reset options |
Outputs:
| Element | Type | Description |
|---|---|---|
| observations | np.array | Flattened observation array matching observation_space |
| info | dict | Diagnostic information (typically empty on reset) |
step() Method
Inputs:
| Parameter | Type | Required | Description |
|---|---|---|---|
| action | np.array | Yes | Action array matching action_space dimensions |
Outputs:
| Element | Type | Description |
|---|---|---|
| obs | np.array | Next observation after taking action |
| reward | float | Reward signal for the transition |
| terminated | bool | True if episode reached terminal state (success/failure) |
| truncated | bool | True if episode ended due to time/step limit |
| info | dict | Diagnostic information (may include success flag, etc.) |
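The step() contract in the table above can be sanity-checked with plain assertions. The stub environment below is hypothetical so the snippet runs without robosuite; a real GymWrapper instance satisfies the same checks:

```python
import numpy as np

class StubWrappedEnv:
    """Stand-in implementing the same step() contract as GymWrapper (hypothetical)."""
    def step(self, action):
        obs = np.zeros(42, dtype=np.float32)  # flattened observation
        return obs, 0.5, False, False, {"success": False}

env = StubWrappedEnv()
out = env.step(np.zeros(7))

# The contract: exactly five elements with these types
obs, reward, terminated, truncated, info = out
assert isinstance(obs, np.ndarray) and obs.ndim == 1
assert isinstance(reward, float)
assert isinstance(terminated, bool) and isinstance(truncated, bool)
assert isinstance(info, dict)
```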
Usage Examples
Example 1: Basic GymWrapper Usage with Random Actions
```python
import robosuite as suite
from robosuite.wrappers import GymWrapper

# Create robosuite environment
env = suite.make(
    env_name="Lift",
    robots="Panda",
    has_renderer=False,
    has_offscreen_renderer=False,
    use_camera_obs=False,
)

# Wrap with GymWrapper
gym_env = GymWrapper(env)

# Print space information
print(f"Observation space: {gym_env.observation_space}")
print(f"Action space: {gym_env.action_space}")

# Run episode with random actions
obs, info = gym_env.reset(seed=42)
print(f"Initial observation shape: {obs.shape}")

for step in range(100):
    # Sample random action
    action = gym_env.action_space.sample()
    # Execute action
    obs, reward, terminated, truncated, info = gym_env.step(action)
    # Check if episode ended
    if terminated or truncated:
        print(f"Episode ended at step {step}")
        obs, info = gym_env.reset()

gym_env.close()
```
Example 2: Using with Standard RL Training Pattern
```python
import numpy as np
import robosuite as suite
from robosuite.wrappers import GymWrapper

# Create and wrap environment
env = suite.make(
    env_name="Stack",
    robots="Panda",
    has_renderer=False,
    has_offscreen_renderer=False,
    use_camera_obs=False,
    horizon=200,
)
gym_env = GymWrapper(
    env,
    keys=["robot0_proprio-state", "object-state"],
    flatten_obs=True,
)

# Training loop compatible with any Gymnasium-based RL library
num_episodes = 10
for episode in range(num_episodes):
    obs, info = gym_env.reset(seed=episode)
    episode_reward = 0.0
    episode_length = 0
    done = False
    while not done:
        # In practice, replace with a policy network
        action = gym_env.action_space.sample()
        # Standard Gymnasium step
        obs, reward, terminated, truncated, info = gym_env.step(action)
        episode_reward += reward
        episode_length += 1
        # Episode ends if terminated OR truncated
        done = terminated or truncated
    # Check for success (if provided in info)
    if done and info.get("success", False):
        print(f"Episode {episode}: SUCCESS!")
    print(f"Episode {episode}: reward={episode_reward:.2f}, length={episode_length}")

gym_env.close()
```
Example 3: Integration with Stable-Baselines3
```python
import robosuite as suite
from robosuite.wrappers import GymWrapper
from stable_baselines3 import PPO
from stable_baselines3.common.env_checker import check_env

# Create wrapped environment
env = suite.make(
    env_name="Door",
    robots="Panda",
    has_renderer=False,
    has_offscreen_renderer=False,
    use_camera_obs=False,
)
gym_env = GymWrapper(env)

# Verify Gymnasium compatibility
check_env(gym_env)

# Train PPO agent (works seamlessly due to GymWrapper)
model = PPO("MlpPolicy", gym_env, verbose=1)
model.learn(total_timesteps=10000)

# Evaluate trained agent
obs, info = gym_env.reset()
for _ in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = gym_env.step(action)
    if terminated or truncated:
        obs, info = gym_env.reset()

gym_env.close()
```