Implementation:ARISE Initiative Robosuite Stack Environment

Knowledge Sources	ARISE_Initiative_Robosuite
Domains	Robotics, Simulation, Manipulation
Last Updated	2026-02-15 07:00 GMT

Overview

A concrete robosuite environment that simulates a cube-stacking manipulation task.

Description

The Stack class implements a single-arm robot block stacking task on a tabletop workspace. The environment features two cubes: a smaller red cube (cubeA, 2cm sides) and a slightly larger green cube (cubeB, 2.5cm sides). The robot must pick up the red cube and stack it on top of the green cube. The cubes are initialized at random positions on the table surface using a uniform random sampler.

The reward function supports both sparse and dense modes. In sparse mode, an un-normalized reward of 2.0 is given when the red cube is successfully stacked on the green cube (the red cube is touching the green cube, is lifted, and the gripper is not grasping it); otherwise the reward is 0. In dense (shaped) mode, three staged reward components are computed and the maximum is taken: reaching + grasping (reaching lies in [0, 0.25] and grows as the gripper approaches cubeA; grasping adds 0.25 when successful), lifting + aligning (lifting gives 1.0 when cubeA is above the table; aligning adds up to 0.5 as the horizontal distance between the cubes shrinks), and stacking (2.0 when the stacking conditions are met). In both modes, when reward_scale is not None, the result is multiplied by reward_scale / 2.0, so with the default reward_scale of 1.0 a successful stack yields a reward of 1.0.

Success is determined by checking whether the stacking reward component is positive, which requires that cubeA is touching cubeB, cubeA has been lifted above the table, and the gripper is no longer grasping cubeA.
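
The reward and success logic described above can be sketched in plain Python. This is a minimal illustration, not the robosuite source: the function names and the boolean inputs are hypothetical, and the tanh shaping terms (including the gain of 10.0) are assumptions chosen only to match the stated ranges.

```python
import math


def staged_rewards(reach_dist, grasping, cubeA_lifted, horiz_dist, touching):
    """Return (r_reach, r_lift, r_stack) for one timestep (illustrative only)."""
    # Reaching: in [0, 0.25], larger when the gripper is closer to cubeA.
    # ASSUMPTION: tanh shaping with a gain of 10.0.
    r_reach = 0.25 * (1.0 - math.tanh(10.0 * reach_dist))
    # Grasping: adds 0.25 when the gripper holds cubeA.
    if grasping:
        r_reach += 0.25

    # Lifting: 1.0 when cubeA is above the table.
    r_lift = 1.0 if cubeA_lifted else 0.0
    # Aligning: up to 0.5 more as cubeA gets horizontally closer to cubeB.
    # ASSUMPTION: tanh shaping on the horizontal distance.
    if cubeA_lifted:
        r_lift += 0.5 * (1.0 - math.tanh(horiz_dist))

    # Stacking: cubeA touches cubeB, is lifted, and has been released.
    r_stack = 2.0 if (touching and cubeA_lifted and not grasping) else 0.0
    return r_reach, r_lift, r_stack


def stack_reward(stages, reward_shaping=False, reward_scale=1.0):
    """Combine the staged components into the per-step reward."""
    r_reach, r_lift, r_stack = stages
    if reward_shaping:
        reward = max(r_reach, r_lift, r_stack)   # dense: best stage reached
    else:
        reward = 2.0 if r_stack > 0 else 0.0     # sparse: success only
    if reward_scale is not None:
        reward *= reward_scale / 2.0             # normalize to [0, reward_scale]
    return reward


def is_success(stages):
    """Success means the stacking component is positive."""
    return stages[2] > 0


stacked = staged_rewards(0.0, False, True, 0.0, True)
holding = staged_rewards(0.0, True, False, 0.3, False)
print(stack_reward(stacked), is_success(stacked))                # 1.0 True
print(stack_reward(holding, reward_shaping=True), is_success(holding))  # 0.25 False
```

Taking the maximum over stages means a later stage always dominates an earlier one (the stage maxima are 0.5, 1.5, and 2.0), so the dense reward never penalizes progress toward stacking.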

Usage

Use this environment for training and benchmarking manipulation policies on block stacking, which is one of the fundamental tasks in robotic manipulation research. The task combines reaching, grasping, lifting, and precise placement skills.

Code Reference

Source Location

Repository: ARISE_Initiative_Robosuite
File: robosuite/environments/manipulation/stack.py
Lines: 1-510

Signature

class Stack(ManipulationEnv):
    def __init__(
        self,
        robots,
        env_configuration="default",
        controller_configs=None,
        gripper_types="default",
        base_types="default",
        initialization_noise="default",
        table_full_size=(0.8, 0.8, 0.05),
        table_friction=(1.0, 5e-3, 1e-4),
        use_camera_obs=True,
        use_object_obs=True,
        reward_scale=1.0,
        reward_shaping=False,
        placement_initializer=None,
        has_renderer=False,
        has_offscreen_renderer=True,
        render_camera="frontview",
        render_collision_mesh=False,
        render_visual_mesh=True,
        render_gpu_device_id=-1,
        control_freq=20,
        lite_physics=True,
        horizon=1000,
        ignore_done=False,
        hard_reset=True,
        camera_names="agentview",
        camera_heights=256,
        camera_widths=256,
        camera_depths=False,
        camera_segmentations=None,
        renderer="mjviewer",
        renderer_config=None,
        seed=None,
    ):

Import

from robosuite.environments.manipulation.stack import Stack

I/O Contract

Inputs

Name	Type	Required	Description
robots	str or list of str	Yes	The robot to instantiate; must resolve to exactly one single-arm robot (e.g., "Panda")
table_full_size	3-tuple	No	(x, y, z) dimensions of the table. Default: (0.8, 0.8, 0.05)
table_friction	3-tuple	No	MuJoCo friction parameters for the table. Default: (1.0, 5e-3, 1e-4)
reward_scale	None or float	No	Scales the normalized reward; if None, the reward is left un-normalized. Default: 1.0
reward_shaping	bool	No	If True, uses dense staged rewards. Default: False
placement_initializer	ObjectPositionSampler	No	Custom placement sampler for cubes. Default: None, in which case a UniformRandomSampler is used
control_freq	float	No	Control signals per second. Default: 20
horizon	int	No	Episode length in timesteps. Default: 1000

Outputs

Name	Type	Description
cubeA_pos	np.array (3,)	3D position of the red cube (cubeA)
cubeA_quat	np.array (4,)	Quaternion orientation (xyzw) of cubeA
cubeB_pos	np.array (3,)	3D position of the green cube (cubeB)
cubeB_quat	np.array (4,)	Quaternion orientation (xyzw) of cubeB
cubeA_to_cubeB	np.array (3,)	Vector from cubeA to cubeB
{arm}gripper_to_cubeA	np.array (3,)	Vector from gripper to cubeA
{arm}gripper_to_cubeB	np.array (3,)	Vector from gripper to cubeB
reward	float	Scalar reward returned by step() at each timestep
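
The relative-vector entries in the table follow from the position entries. The sketch below assumes the observation keys listed above and uses plain lists in place of NumPy arrays to stay dependency-free; the helper names relative_vector and horizontal_offset are hypothetical, not part of robosuite.

```python
import math


def relative_vector(src_pos, dst_pos):
    """Vector pointing from src_pos to dst_pos (e.g. from cubeA to cubeB)."""
    return [d - s for s, d in zip(src_pos, dst_pos)]


def horizontal_offset(obs):
    """Horizontal (x, y) distance between the cubes, ignoring height."""
    dx, dy, _dz = obs["cubeA_to_cubeB"]
    return math.hypot(dx, dy)


# Stand-in for an observation dict; a real one holds NumPy arrays
# and many more keys. The positions here are made-up example values.
obs = {
    "cubeA_pos": [0.00, 0.00, 0.82],
    "cubeB_pos": [0.03, 0.04, 0.82],
}
obs["cubeA_to_cubeB"] = relative_vector(obs["cubeA_pos"], obs["cubeB_pos"])

print(horizontal_offset(obs))  # about 0.05 (a 3-4-5 triangle in centimeters)
```

A quantity like this horizontal offset is what the aligning reward term depends on, so it can be a convenient diagnostic when a policy lifts cubeA but fails to place it.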

Usage Examples

import robosuite as suite
import numpy as np

# Create a Stack environment
env = suite.make(
    env_name="Stack",
    robots="Panda",
    has_renderer=False,
    has_offscreen_renderer=False,
    use_camera_obs=False,
    use_object_obs=True,
    reward_shaping=True,
    horizon=1000,
)

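# Reset the environment to sample new cube placements and get the first observation
obs = env.reset()

# Roll out random actions until the episode ends
for i in range(1000):
    action = np.random.randn(env.action_dim)  # random Gaussian action, for illustration only
    obs, reward, done, info = env.step(action)
    if done:
        break

# Release simulation resources
env.close()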
