Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:ARISE Initiative Robosuite TwoArmHandover Environment

From Leeroopedia
Knowledge Sources
Domains Robotics, Simulation, Manipulation
Last Updated 2026-02-15 07:00 GMT

Overview

Concrete tool for simulating a two-arm object handover manipulation task provided by robosuite.

Description

The TwoArmHandover class implements a handover task where a hammer must be transferred from one robot arm (Arm 0) to another (Arm 1). The task requires Arm 0 to grasp the hammer (either from the table or from its initial grip position), lift it, bring it close to Arm 1, and then Arm 1 must grasp the hammer handle while Arm 0 releases. The environment supports both two separate robots and a single bimanual robot in "opposed" or "parallel" configurations.

The prehensile parameter controls the initial state: when True, the hammer starts on the table and Arm 0 must first pick it up; when False, the hammer starts already in Arm 0's gripper, simplifying the initial phase. The reward function supports both sparse and dense modes. In sparse mode, a discrete reward of 2.0 is provided when the handover is completed (only Arm 1 grips the handle, Arm 0 has released, and the hammer is above a height threshold). In dense mode, six staged components are computed in a max-wise fashion: Arm0 reaching (in [0, 0.25]), Arm0 grasping (in {0, 0.5}), Arm0 lifting (in {0, 1.0}), Arm0 hovering toward Arm1 (in {0, [1.0, 1.25]}), mutual grasping (in {0, 1.5}), and handover completion (in {0, 2.0}). The final reward is scaled by reward_scale / 2.0.

Success is determined when Arm 1 is gripping the hammer handle, Arm 0 is not gripping any part of the hammer, and the hammer height exceeds the table height by the defined threshold (0.1 units).

Usage

Use this environment for training and evaluating bimanual coordination policies, especially those requiring inter-arm object transfer. This task tests reaching, grasping, lifting, coordination timing, and release behaviors.

Code Reference

Source Location

Signature

class TwoArmHandover(TwoArmEnv):
    def __init__(
        self,
        robots,
        env_configuration="default",
        controller_configs=None,
        gripper_types="default",
        initialization_noise="default",
        prehensile=True,
        table_full_size=(0.8, 1.2, 0.05),
        table_friction=(1.0, 5e-3, 1e-4),
        use_camera_obs=True,
        use_object_obs=True,
        reward_scale=1.0,
        reward_shaping=False,
        placement_initializer=None,
        has_renderer=False,
        has_offscreen_renderer=True,
        render_camera="frontview",
        render_collision_mesh=False,
        render_visual_mesh=True,
        render_gpu_device_id=-1,
        control_freq=20,
        lite_physics=True,
        horizon=1000,
        ignore_done=False,
        hard_reset=True,
        camera_names="agentview",
        camera_heights=256,
        camera_widths=256,
        camera_depths=False,
        camera_segmentations=None,
        renderer="mjviewer",
        renderer_config=None,
        seed=None,
    ):

Import

from robosuite.environments.manipulation.two_arm_handover import TwoArmHandover

I/O Contract

Inputs

Name Type Required Description
robots str or list of str Yes Either 2 single-arm robots or 1 bimanual robot
env_configuration str No "opposed" or "parallel" for two-robot setups. Default: "default" (maps to "opposed")
prehensile bool No If True, hammer starts on table; if False, starts in Arm 0's gripper. Default: True
table_full_size 3-tuple No (x, y, z) dimensions of the table. Default: (0.8, 1.2, 0.05)
reward_scale None or float No Scales the normalized reward. Default: 1.0
reward_shaping bool No If True, uses dense max-wise staged rewards. Default: False
placement_initializer ObjectPositionSampler No Custom placement sampler for the hammer. Default: UniformRandomSampler
horizon int No Episode length in timesteps. Default: 1000

Outputs

Name Type Description
hammer_pos np.array (3,) 3D position of the hammer body
hammer_quat np.array (4,) Quaternion orientation (xyzw) of the hammer body
handle_xpos np.array (3,) 3D position of the hammer handle
gripper{i}_to_handle np.array (3,) Vector from each gripper to the hammer handle
reward float Scalar reward value per step

Usage Examples

import robosuite as suite
import numpy as np

# Create a TwoArmHandover environment with two Panda robots
env = suite.make(
    env_name="TwoArmHandover",
    robots=["Panda", "Panda"],
    env_configuration="opposed",
    has_renderer=False,
    has_offscreen_renderer=False,
    use_camera_obs=False,
    use_object_obs=True,
    prehensile=True,
    reward_shaping=True,
    horizon=1000,
)

obs = env.reset()

for i in range(1000):
    action = np.random.randn(env.action_dim)
    obs, reward, done, info = env.step(action)
    if done:
        break

env.close()

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment