Workflow: ARISE Initiative Robosuite Environment Setup And Simulation
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Simulation, MuJoCo |
| Last Updated | 2026-02-15 06:00 GMT |
Overview
End-to-end process for creating a robosuite manipulation environment, selecting a robot and task, and running a simulation loop with random or scripted actions.
Description
This workflow covers the fundamental usage pattern of the robosuite framework: instantiating a MuJoCo-based robot manipulation environment using the factory pattern, configuring robot embodiments and controllers, resetting the simulation, and executing an action-observation loop. The process assembles a complete simulation scene by composing an Arena (workspace), Robot (embodiment with gripper), and task Objects into a unified MuJoCo model. It supports 13+ robot models, 11+ single-arm tasks, and 4 bimanual tasks out of the box.
Usage
Execute this workflow when you need to set up a robosuite simulation for the first time, evaluate a new robot-task combination, or create a basic simulation loop for testing controllers or policies. This is the starting point for all downstream workflows such as RL training, teleoperation, and demonstration collection.
Execution Steps
Step 1: Select Environment And Robot
Choose a manipulation task environment from the registered environments (e.g., Lift, Stack, Door, PickPlace, NutAssembly, Wipe, ToolHang, or bimanual tasks like TwoArmLift). Select one or more robot models (e.g., Panda, Sawyer, UR5e, IIWA, Jaco, Kinova3) to instantiate in the environment. For bimanual tasks, select either a single bimanual robot (Baxter) or two single-arm robots in parallel/opposed configuration.
Key considerations:
- Use the interactive CLI helpers from `input_utils` for exploratory selection
- Each environment registers itself via the `EnvMeta` metaclass into `REGISTERED_ENVS`
- Bimanual environments require specifying `env_configuration` (bimanual, parallel, or opposed)
Step 2: Configure Controller
Load or specify the composite controller configuration for the selected robot. Each robot has a default JSON controller config that defines which composite controller type to use (BASIC, HYBRID_MOBILE_BASE, WHOLE_BODY_IK, or WHOLE_BODY_MINK_IK) and maps part controllers (arm, gripper, mobile base) to actuator groups.
Key considerations:
- Controller configs are JSON files loaded via `load_composite_controller_config`
- Part controllers include OSC_POSE, OSC_POSITION, JOINT_POSITION, JOINT_VELOCITY, JOINT_TORQUE, and IK_POSE
- The composite controller orchestrates all part controllers for a given robot
Step 3: Create Environment
Instantiate the environment using the `robosuite.make()` factory function, passing the environment name, robot(s), controller configuration, and rendering options. This triggers the full scene assembly pipeline: Arena creation, Robot loading with gripper and base, Object placement, and MJCF model merging into a unified MuJoCo simulation.
Key considerations:
- Set `has_renderer=True` for on-screen visualization, `has_offscreen_renderer=True` for pixel observations
- Set `use_camera_obs=True` to include camera image observations (this requires the offscreen renderer to be enabled)
- `control_freq` determines how many control steps per second (typically 20 Hz)
- `reward_shaping` enables dense reward signals for RL training
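A minimal factory call, assuming robosuite is installed; the keyword names follow the options listed above (`controller_configs` is omitted here, so the robot's default controller is used):

```python
# Sketch: assemble a state-only Lift environment with a Panda arm.
import robosuite as suite

env = suite.make(
    env_name="Lift",
    robots="Panda",
    has_renderer=False,            # True opens an on-screen viewer
    has_offscreen_renderer=False,  # True is required for pixel observations
    use_camera_obs=False,          # camera obs need the offscreen renderer
    control_freq=20,               # 20 control steps per simulated second
    reward_shaping=True,           # dense rewards instead of sparse success
)
low, high = env.action_spec
print(low.shape, high.shape)       # action bounds, used in the next step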
Step 4: Reset And Run Simulation Loop
Call `env.reset()` to initialize the simulation state (randomized object placements, robot home position). Then execute the main loop: sample or compute actions, call `env.step(action)` to advance the simulation, and collect observations, rewards, and done signals. Optionally render each frame for visualization.
Key considerations:
- `env.action_spec` returns (low, high) bounds for the action space
- `env.step()` returns (observation, reward, done, info) following the Gym convention
- Frame rate limiting ensures real-time playback when visualizing
- `env.reset()` re-randomizes object placements each episode
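The loop above can be sketched as a small helper. It assumes only the Gym-style interface described in this step (`reset()`, `action_spec`, the 4-tuple `step()` return, `render()`), so it works with any environment produced in Step 3:

```python
import numpy as np

def run_random_rollout(env, horizon=200, render=False, rng=None):
    """Drive an environment with uniform-random actions for one episode.

    Assumes the robosuite/Gym-style API: env.reset(), env.action_spec,
    env.step(a) -> (obs, reward, done, info), and env.render().
    """
    rng = rng if rng is not None else np.random.default_rng()
    obs = env.reset()
    low, high = env.action_spec          # per-dimension action bounds
    total_reward = 0.0
    for _ in range(horizon):
        action = rng.uniform(low, high)  # random action within bounds
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if render:
            env.render()                 # only valid with has_renderer=True
        if done:
            break
    return total_reward
```

Replacing `rng.uniform(low, high)` with a policy or scripted controller turns this into an evaluation loop.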
Step 5: Inspect Observations And Close
Examine the observation dictionary returned by `env.step()` which includes robot proprioception (joint positions, velocities, end-effector pose), task-specific states (object positions, goal states), and optionally camera images. When finished, call `env.close()` to clean up MuJoCo resources and rendering contexts.
Key considerations:
- Observations are returned as a flat dictionary of numpy arrays
- Camera observations (if enabled) include RGB, depth, and segmentation maps
- The Observable system supports configurable sensor corruption (noise, delay, filtering)
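A quick way to inspect the flat observation dictionary is to map each entry to its array shape. A small sketch (the helper name is an assumption, not a robosuite API; key names such as `robot0_joint_pos` follow robosuite's Observable naming convention):

```python
import numpy as np

def summarize_observations(obs):
    """Map each observation name to its array shape, for quick inspection."""
    return {name: np.asarray(value).shape for name, value in obs.items()}

# Typical usage after creating an env as in Step 3:
#   print(summarize_observations(env.reset()))
#   env.close()
```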