Implementation:ARISE Initiative Robosuite Stack Environment

From Leeroopedia
Knowledge Sources
Domains: Robotics, Simulation, Manipulation
Last Updated: 2026-02-15 07:00 GMT

Overview

A concrete tool, provided by robosuite, for simulating a cube-stacking manipulation task.

Description

The Stack class implements a single-arm robot block stacking task on a tabletop workspace. The environment features two cubes: a smaller red cube (cubeA, 2cm sides) and a slightly larger green cube (cubeB, 2.5cm sides). The robot must pick up the red cube and stack it on top of the green cube. The cubes are initialized at random positions on the table surface using a uniform random sampler.
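The random cube placement described above can be illustrated with a small rejection sampler. This is only a sketch of the idea, not robosuite's UniformRandomSampler: the function name, the 0.08 m half-range, and the minimum separation are illustrative assumptions, not values taken from the source.

```python
import math
import random


def sample_cube_positions(rng, half_range=0.08, min_separation=0.05, max_tries=1000):
    """Sample (x, y) table positions for cubeA and cubeB uniformly at random.

    Pairs that would place the cubes closer than `min_separation` are rejected.
    All numeric defaults are assumptions chosen for illustration.
    """
    for _ in range(max_tries):
        pos_a = (rng.uniform(-half_range, half_range), rng.uniform(-half_range, half_range))
        pos_b = (rng.uniform(-half_range, half_range), rng.uniform(-half_range, half_range))
        # Accept only if the cube centers are far enough apart.
        if math.hypot(pos_a[0] - pos_b[0], pos_a[1] - pos_b[1]) >= min_separation:
            return pos_a, pos_b
    raise RuntimeError("could not place both cubes without overlap")


rng = random.Random(0)  # seeded so repeated runs give the same layout
cube_a_xy, cube_b_xy = sample_cube_positions(rng)
```

Seeding the generator mirrors the role of the seed argument in the constructor: the same seed reproduces the same initial layout.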

The reward function supports both sparse and dense modes. In sparse mode, a discrete reward of 2.0 is given when the red cube is successfully stacked on the green cube (the red cube is touching the green cube, is lifted, and the gripper is not grasping it); otherwise the reward is 0. In dense (shaped) mode, three staged reward components are computed and the maximum is taken: reaching + grasping (the reaching term lies in [0, 0.25] and increases as the gripper gets closer to cubeA; grasping adds 0.25 when successful), lifting + aligning (lifting gives 1.0 when cubeA is above the table; aligning adds up to 0.5 as the horizontal distance between the cubes shrinks), and stacking (2.0 when the stacking conditions are met). Unless reward_scale is None, the result is multiplied by reward_scale / 2.0, so with the default reward_scale of 1.0 the maximum reward is 1.0.
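The staged reward logic can be sketched as a pure function of a few task predicates. This is a minimal illustration of the description above, not the library's code: the function names and arguments are hypothetical, and the tanh shaping (including the gain of 10) is an assumption used here to produce terms within the stated ranges.

```python
import math


def staged_rewards(grip_dist, grasping, lifted, horiz_dist, touching):
    """Return (r_reach, r_lift, r_stack) for one timestep.

    grip_dist:  distance from gripper to cubeA (meters)
    horiz_dist: horizontal distance between cubeA and cubeB (meters)
    """
    # Reaching term in [0, 0.25], larger when the gripper is closer (assumed tanh shaping).
    r_reach = 0.25 * (1.0 - math.tanh(10.0 * grip_dist))
    if grasping:
        r_reach += 0.25

    # Lifting gives 1.0; aligning adds up to 0.5 as the cubes line up horizontally.
    r_lift = 0.0
    if lifted:
        r_lift = 1.0 + 0.5 * (1.0 - math.tanh(horiz_dist))

    # Stacking requires contact, a lifted cubeA, and a released gripper.
    r_stack = 2.0 if (touching and lifted and not grasping) else 0.0
    return r_reach, r_lift, r_stack


def stack_reward(grip_dist, grasping, lifted, horiz_dist, touching,
                 reward_shaping=False, reward_scale=1.0):
    """Combine the staged components into the final scalar reward."""
    r_reach, r_lift, r_stack = staged_rewards(grip_dist, grasping, lifted, horiz_dist, touching)
    if reward_shaping:
        reward = max(r_reach, r_lift, r_stack)  # dense: best stage reached so far
    else:
        reward = 2.0 if r_stack > 0 else 0.0    # sparse: success or nothing
    if reward_scale is not None:
        reward *= reward_scale / 2.0            # maps the 2.0 maximum to reward_scale
    return reward
```

For example, under these assumptions a dense-mode call with the cube grasped at zero distance but not yet lifted yields 0.25, while a completed stack yields 1.0 in either mode with the default scale.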

Success is determined by checking whether the stacking reward component is positive, which requires that cubeA is touching cubeB, cubeA has been lifted above the table, and the gripper is no longer grasping cubeA.

Usage

Use this environment for training and benchmarking manipulation policies on block stacking, which is one of the fundamental tasks in robotic manipulation research. The task combines reaching, grasping, lifting, and precise placement skills.

Code Reference

Source Location

Signature

class Stack(ManipulationEnv):
    def __init__(
        self,
        robots,
        env_configuration="default",
        controller_configs=None,
        gripper_types="default",
        base_types="default",
        initialization_noise="default",
        table_full_size=(0.8, 0.8, 0.05),
        table_friction=(1.0, 5e-3, 1e-4),
        use_camera_obs=True,
        use_object_obs=True,
        reward_scale=1.0,
        reward_shaping=False,
        placement_initializer=None,
        has_renderer=False,
        has_offscreen_renderer=True,
        render_camera="frontview",
        render_collision_mesh=False,
        render_visual_mesh=True,
        render_gpu_device_id=-1,
        control_freq=20,
        lite_physics=True,
        horizon=1000,
        ignore_done=False,
        hard_reset=True,
        camera_names="agentview",
        camera_heights=256,
        camera_widths=256,
        camera_depths=False,
        camera_segmentations=None,
        renderer="mjviewer",
        renderer_config=None,
        seed=None,
    ):

Import

from robosuite.environments.manipulation.stack import Stack

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| robots | str or list of str | Yes | Specification for a single single-arm robot (e.g., "Panda") |
| table_full_size | 3-tuple | No | (x, y, z) dimensions of the table. Default: (0.8, 0.8, 0.05) |
| table_friction | 3-tuple | No | MuJoCo friction parameters for the table. Default: (1.0, 5e-3, 1e-4) |
| reward_scale | None or float | No | Scales the normalized reward. Default: 1.0 |
| reward_shaping | bool | No | If True, uses dense staged rewards. Default: False |
| placement_initializer | ObjectPositionSampler | No | Custom placement sampler for cubes. Default: UniformRandomSampler |
| control_freq | float | No | Control signals per second. Default: 20 |
| horizon | int | No | Episode length in timesteps. Default: 1000 |
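As a quick sanity check on how control_freq and horizon interact, the snippet below converts an episode length into simulated seconds. It assumes that each timestep counted by horizon corresponds to one control signal; the helper name is hypothetical.

```python
def episode_duration_seconds(horizon=1000, control_freq=20.0):
    """Simulated duration of a full episode, assuming one control step per timestep."""
    if control_freq <= 0:
        raise ValueError("control_freq must be positive")
    return horizon / control_freq


# With the defaults from the table, a full episode spans 50 simulated seconds.
default_duration = episode_duration_seconds()
```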

Outputs

| Name | Type | Description |
|---|---|---|
| cubeA_pos | np.array (3,) | 3D position of the red cube (cubeA) |
| cubeA_quat | np.array (4,) | Quaternion orientation (xyzw) of cubeA |
| cubeB_pos | np.array (3,) | 3D position of the green cube (cubeB) |
| cubeB_quat | np.array (4,) | Quaternion orientation (xyzw) of cubeB |
| cubeA_to_cubeB | np.array (3,) | Vector from cubeA to cubeB |
| {arm}gripper_to_cubeA | np.array (3,) | Vector from gripper to cubeA |
| {arm}gripper_to_cubeB | np.array (3,) | Vector from gripper to cubeB |
| reward | float | Scalar reward value per step |
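The relative-vector observations can be derived from the absolute positions. The sketch below follows the table's wording, reading "vector from X to Y" as Y minus X; the helper names and the "robot0_" prefix standing in for {arm} are assumptions for illustration.

```python
def vec_sub(a, b):
    """Component-wise a - b for 3D points given as sequences of floats."""
    return tuple(x - y for x, y in zip(a, b))


def relative_observations(eef_pos, cubeA_pos, cubeB_pos, arm_prefix="robot0_"):
    """Build the relative-vector entries from absolute positions.

    `eef_pos` is the gripper (end-effector) position; `arm_prefix` is a
    placeholder for the {arm} part of the observation names.
    """
    return {
        "cubeA_to_cubeB": vec_sub(cubeB_pos, cubeA_pos),
        f"{arm_prefix}gripper_to_cubeA": vec_sub(cubeA_pos, eef_pos),
        f"{arm_prefix}gripper_to_cubeB": vec_sub(cubeB_pos, eef_pos),
    }


rel = relative_observations(
    eef_pos=(0.0, 0.0, 1.0),
    cubeA_pos=(0.5, 0.0, 0.0),
    cubeB_pos=(0.0, 0.5, 0.0),
)
```

A policy that consumes these vectors sees the task in a gripper-relative frame, which is independent of where the cubes happened to be placed on the table.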

Usage Examples

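import numpy as np
import robosuite as suite

# Create a Stack environment with dense rewards and low-dimensional observations only
env = suite.make(
    env_name="Stack",
    robots="Panda",
    has_renderer=False,            # no on-screen viewer
    has_offscreen_renderer=False,  # not needed because camera observations are off
    use_camera_obs=False,
    use_object_obs=True,           # include cube poses and relative vectors in obs
    reward_shaping=True,           # dense staged reward instead of sparse
    horizon=1000,
)

obs = env.reset()

# Roll out a random policy for at most one full episode
for _ in range(1000):
    # Random Gaussian action with one value per action dimension
    action = np.random.randn(env.action_dim)
    obs, reward, done, info = env.step(action)
    if done:  # set once the horizon is reached (ignore_done defaults to False)
        break

env.close()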
