Workflow: ARISE Initiative Robosuite Environment Setup And Simulation
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Simulation, MuJoCo |
| Last Updated | 2026-02-15 06:00 GMT |
Overview
End-to-end process for creating a robosuite manipulation environment, selecting a robot and task, and running a simulation loop with random or scripted actions.
Description
This workflow covers the fundamental usage pattern of the robosuite framework: instantiating a MuJoCo-based robot manipulation environment using the factory pattern, configuring robot embodiments and controllers, resetting the simulation, and executing an action-observation loop. The process assembles a complete simulation scene by composing an Arena (workspace), Robot (embodiment with gripper), and task Objects into a unified MuJoCo model. It supports 13+ robot models, 11+ single-arm tasks, and 4 bimanual tasks out of the box.
Usage
Execute this workflow when you need to set up a robosuite simulation for the first time, evaluate a new robot-task combination, or create a basic simulation loop for testing controllers or policies. This is the starting point for all downstream workflows such as RL training, teleoperation, and demonstration collection.
Execution Steps
Step 1: Select Environment And Robot
Choose a manipulation task environment from the registered environments (e.g., Lift, Stack, Door, PickPlace, NutAssembly, Wipe, ToolHang, or bimanual tasks like TwoArmLift). Select one or more robot models (e.g., Panda, Sawyer, UR5e, IIWA, Jaco, Kinova3) to instantiate in the environment. For bimanual tasks, select either a single bimanual robot (Baxter) or two single-arm robots in parallel/opposed configuration.
Key considerations:
- Use the interactive CLI helpers from `input_utils` for exploratory selection
- Each environment registers itself via the `EnvMeta` metaclass into `REGISTERED_ENVS`
- Bimanual environments require specifying `env_configuration` (bimanual, parallel, or opposed)
Step 2: Configure Controller
Load or specify the composite controller configuration for the selected robot. Each robot has a default JSON controller config that defines which composite controller type to use (BASIC, HYBRID_MOBILE_BASE, WHOLE_BODY_IK, or WHOLE_BODY_MINK_IK) and maps part controllers (arm, gripper, mobile base) to actuator groups.
Key considerations:
- Controller configs are JSON files loaded via `load_composite_controller_config`
- Part controllers include OSC_POSE, OSC_POSITION, JOINT_POSITION, JOINT_VELOCITY, JOINT_TORQUE, and IK_POSE
- The composite controller orchestrates all part controllers for a given robot
Step 3: Create Environment
Instantiate the environment using the `robosuite.make()` factory function, passing the environment name, robot(s), controller configuration, and rendering options. This triggers the full scene assembly pipeline: Arena creation, Robot loading with gripper and base, Object placement, and MJCF model merging into a unified MuJoCo simulation.
Key considerations:
- Set `has_renderer=True` for on-screen visualization, `has_offscreen_renderer=True` for pixel observations
- Set `use_camera_obs=True` to include camera image observations (this requires the offscreen renderer to be enabled)
- `control_freq` determines how many control steps per second (typically 20 Hz)
- `reward_shaping` enables dense reward signals for RL training
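A minimal factory call, assuming robosuite is installed; the keyword names follow the options listed above (`controller_configs` is omitted here, so the robot's default controller is used):

```python
# Sketch: assemble a state-only Lift environment with a Panda arm.
import robosuite as suite

env = suite.make(
    env_name="Lift",
    robots="Panda",
    has_renderer=False,            # True opens an on-screen viewer
    has_offscreen_renderer=False,  # True is required for pixel observations
    use_camera_obs=False,          # camera obs need the offscreen renderer
    control_freq=20,               # 20 control steps per simulated second
    reward_shaping=True,           # dense rewards instead of sparse success
)
low, high = env.action_spec
print(low.shape, high.shape)       # action bounds, used in the next step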
Step 4: Reset And Run Simulation Loop
Call `env.reset()` to initialize the simulation state (randomized object placements, robot home position). Then execute the main loop: sample or compute actions, call `env.step(action)` to advance the simulation, and collect observations, rewards, and done signals. Optionally render each frame for visualization.
Key considerations:
- `env.action_spec` returns (low, high) bounds for the action space
- `env.step()` returns (observation, reward, done, info) following the Gym convention
- Frame rate limiting ensures real-time playback when visualizing
- `env.reset()` re-randomizes object placements each episode
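The loop above can be sketched as a small helper. It assumes only the Gym-style interface described in this step (`reset()`, `action_spec`, the 4-tuple `step()` return, `render()`), so it works with any environment produced in Step 3:

```python
import numpy as np

def run_random_rollout(env, horizon=200, render=False, rng=None):
    """Drive an environment with uniform-random actions for one episode.

    Assumes the robosuite/Gym-style API: env.reset(), env.action_spec,
    env.step(a) -> (obs, reward, done, info), and env.render().
    """
    rng = rng if rng is not None else np.random.default_rng()
    obs = env.reset()
    low, high = env.action_spec          # per-dimension action bounds
    total_reward = 0.0
    for _ in range(horizon):
        action = rng.uniform(low, high)  # random action within bounds
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if render:
            env.render()                 # only valid with has_renderer=True
        if done:
            break
    return total_reward
```

Replacing `rng.uniform(low, high)` with a policy or scripted controller turns this into an evaluation loop.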
Step 5: Inspect Observations And Close
Examine the observation dictionary returned by `env.step()` which includes robot proprioception (joint positions, velocities, end-effector pose), task-specific states (object positions, goal states), and optionally camera images. When finished, call `env.close()` to clean up MuJoCo resources and rendering contexts.
Key considerations:
- Observations are returned as a flat dictionary of numpy arrays
- Camera observations (if enabled) include RGB, depth, and segmentation maps
- The Observable system supports configurable sensor corruption (noise, delay, filtering)
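A quick way to inspect the flat observation dictionary is to map each entry to its array shape. A small sketch (the helper name is an assumption, not a robosuite API; key names such as `robot0_joint_pos` follow robosuite's Observable naming convention):

```python
import numpy as np

def summarize_observations(obs):
    """Map each observation name to its array shape, for quick inspection."""
    return {name: np.asarray(value).shape for name, value in obs.items()}

# Typical usage after creating an env as in Step 3:
#   print(summarize_observations(env.reset()))
#   env.close()
```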