Implementation:ARISE Initiative Robosuite TwoArmPegInHole Environment

From Leeroopedia
Knowledge Sources
Domains Robotics, Simulation, Manipulation
Last Updated 2026-02-15 07:00 GMT

Overview

Concrete tool, provided by robosuite, for simulating a two-arm peg-in-hole insertion task.

Description

The TwoArmPegInHole class implements a bimanual peg insertion task where a cylindrical peg attached to one robot's end-effector must be aligned with and inserted into a plate with a hole attached to the other robot's end-effector. Unlike most other robosuite environments, this task does not use grippers; instead the peg and hole objects are directly mounted to the robots' end-effectors. The environment uses an EmptyArena (no table) since the task is performed entirely in free space.

The reward function supports sparse and dense modes. In sparse mode, a discrete reward of 5.0 is given when the peg is correctly inserted into the hole, which requires perpendicular distance d < 0.06, parallel distance t in [-0.12, 0.14], and alignment cos(theta) > 0.95. In dense (shaped) mode, four components are summed: a reaching reward (in [0, 1], based on the distance between the peg and hole bodies), a perpendicular distance reward (in [0, 1], based on the distance d from the peg axis to the hole), a parallel distance reward (in [0, 1], based on the signed projection t along the peg axis), and an alignment reward (in [0, 1], the cosine of the angle between the peg axis and the hole normal). In both modes the result is then multiplied by reward_scale / 5.0, so with the default reward_scale of 1.0 a successful sparse step returns 1.0.
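The reward logic described above can be sketched as a standalone function. This is a hedged illustration, not the robosuite source: the names peg_in_hole_reward and is_success are hypothetical, the 1 - tanh(x) squashing is an assumed way of mapping a distance into [0, 1], and adding a success bonus of 1.0 in dense mode is an assumption chosen so that the dense maximum matches the 5.0 normalization.

```python
import numpy as np

# Success thresholds quoted in the description.
D_MAX = 0.06
T_RANGE = (-0.12, 0.14)
COS_MIN = 0.95


def is_success(t, d, cos_angle):
    """True when the peg counts as inserted under the stated thresholds."""
    return bool(d < D_MAX and T_RANGE[0] <= t <= T_RANGE[1] and cos_angle > COS_MIN)


def peg_in_hole_reward(t, d, cos_angle, peg_hole_dist,
                       reward_shaping=False, reward_scale=1.0):
    """Hypothetical re-implementation of the described reward."""
    # ASSUMPTION: success contributes 1.0 before any scaling, in both modes.
    reward = 1.0 if is_success(t, d, cos_angle) else 0.0
    if reward_shaping:
        # ASSUMPTION: 1 - tanh(x) squashes each non-negative distance into (0, 1].
        reward += 1.0 - np.tanh(peg_hole_dist)  # reaching
        reward += 1.0 - np.tanh(d)              # perpendicular distance
        reward += 1.0 - np.tanh(abs(t))         # parallel distance
        reward += cos_angle                     # alignment
    else:
        reward *= 5.0  # sparse success is worth 5.0 before normalization
    if reward_scale is not None:
        reward *= reward_scale / 5.0
    return float(reward)
```

Under these assumptions both modes top out at reward_scale, which keeps sparse and dense returns on a comparable scale.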

The peg radius is sampled uniformly between configurable bounds (default [0.015, 0.03]), and the peg length is configurable (default 0.13). The environment includes robot-type-specific offset functions for certain humanoid robots (GR1, G1) whose z-direction is flipped. The _compute_orientation helper method computes the intersection of the peg's axis line with the hole's plane to produce the parallel distance t, the perpendicular distance d, and the alignment angle (reported as a cosine) used in both reward computation and success checking.
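The three quantities that the orientation helper returns can be illustrated with plain vector algebra. The sketch below is an assumption-laden stand-in rather than the method itself: peg_hole_geometry is a hypothetical name, its inputs are taken to be world-frame 3-vectors supplied by the caller, and the absolute value on the cosine is assumed so that the alignment term stays in [0, 1].

```python
import numpy as np


def peg_hole_geometry(peg_pos, peg_axis, hole_center, hole_normal):
    """Return (t, d, cos_angle) for a peg axis line and a hole plane.

    Hypothetical helper; all arguments are 3-vectors in one shared frame.
    """
    peg_pos = np.asarray(peg_pos, dtype=float)
    hole_center = np.asarray(hole_center, dtype=float)
    v = np.asarray(peg_axis, dtype=float)
    v = v / np.linalg.norm(v)            # unit vector along the peg
    n = np.asarray(hole_normal, dtype=float)
    n = n / np.linalg.norm(n)            # unit normal of the hole plane

    offset = hole_center - peg_pos
    # Signed projection of the hole center onto the peg axis.
    t = float(offset @ v)
    # Distance from the hole center to the infinite line through the peg.
    d = float(np.linalg.norm(np.cross(v, offset)))
    # ASSUMPTION: abs() makes alignment independent of the normal's sign.
    cos_angle = float(abs(n @ v))
    return t, d, cos_angle
```

For a peg at the origin pointing along z and a hole centered at (0.03, 0, 0.1) with a z normal, this gives t = 0.1, d = 0.03, and a cosine of 1.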

Usage

Use this environment for training and evaluating bimanual coordination policies that require precise spatial alignment and insertion behaviors. The peg-in-hole task is a classic benchmark for studying contact-rich manipulation and fine position/orientation control with two arms.

Code Reference

Source Location

Signature

class TwoArmPegInHole(TwoArmEnv):
    def __init__(
        self,
        robots,
        env_configuration="default",
        controller_configs=None,
        gripper_types=None,
        initialization_noise="default",
        use_camera_obs=True,
        use_object_obs=True,
        reward_scale=1.0,
        reward_shaping=False,
        peg_radius=(0.015, 0.03),
        peg_length=0.13,
        has_renderer=False,
        has_offscreen_renderer=True,
        render_camera="frontview",
        render_collision_mesh=False,
        render_visual_mesh=True,
        render_gpu_device_id=-1,
        control_freq=20,
        lite_physics=True,
        horizon=1000,
        ignore_done=False,
        hard_reset=True,
        camera_names="agentview",
        camera_heights=256,
        camera_widths=256,
        camera_depths=False,
        camera_segmentations=None,
        renderer="mjviewer",
        renderer_config=None,
        seed=None,
    ):

Import

from robosuite.environments.manipulation.two_arm_peg_in_hole import TwoArmPegInHole

I/O Contract

Inputs

Name | Type | Required | Description
robots | str or list of str | Yes | Either 2 single-arm robots or 1 bimanual robot
env_configuration | str | No | "opposed" or "parallel" for two-robot setups. Default: "default" (maps to "opposed")
gripper_types | None | No | Must be None; grippers are not used in this environment
peg_radius | 2-tuple | No | (min, max) for uniformly sampled peg radius. Default: (0.015, 0.03)
peg_length | float | No | Length of the peg cylinder. Default: 0.13
reward_scale | None or float | No | Scales the normalized reward. Default: 1.0
reward_shaping | bool | No | If True, uses dense summed rewards. Default: False
horizon | int | No | Episode length in timesteps. Default: 1000
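The constraints in the inputs table can be checked before an environment is constructed. The helper below is a hypothetical sketch: resolve_peg_config is not part of robosuite, it covers only the two-robot configurations named in the table, and the positivity checks on the radius and length are assumptions.

```python
import numpy as np


def resolve_peg_config(env_configuration="default", gripper_types=None,
                       peg_radius=(0.015, 0.03), peg_length=0.13, rng=None):
    """Hypothetical helper: validate inputs and sample one peg radius."""
    if gripper_types is not None:
        raise ValueError("gripper_types must be None; grippers are not used here")
    # "default" maps to "opposed" for two-robot setups.
    if env_configuration == "default":
        env_configuration = "opposed"
    if env_configuration not in ("opposed", "parallel"):
        raise ValueError(f"unsupported env_configuration: {env_configuration!r}")
    lo, hi = peg_radius
    # ASSUMPTION: bounds must be positive and ordered, length must be positive.
    if not 0 < lo <= hi:
        raise ValueError("peg_radius must be (min, max) with 0 < min <= max")
    if peg_length <= 0:
        raise ValueError("peg_length must be positive")
    rng = np.random.default_rng() if rng is None else rng
    return {
        "env_configuration": env_configuration,
        "peg_radius": float(rng.uniform(lo, hi)),  # one uniform draw
        "peg_length": float(peg_length),
    }
```

Passing a seeded generator as rng makes the sampled radius reproducible, which helps when comparing runs.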

Outputs

Name | Type | Description
hole_pos | np.array (3,) | 3D position of the hole body
hole_quat | np.array (4,) | Quaternion orientation (xyzw) of the hole body
peg_to_hole | np.array (3,) | Vector from peg body to hole body
peg_quat | np.array (4,) | Quaternion orientation (xyzw) of the peg body
angle | float | Cosine of angle between peg axis and hole normal
t | float | Parallel (signed projection) distance between peg and hole
d | float | Perpendicular distance between peg axis and hole center
reward | float | Scalar reward value per step
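A policy usually consumes these object observations as one flat vector. The sketch below assumes the observation dictionary uses the names in the outputs table as its keys without any prefix; flatten_object_obs and OBJECT_OBS_KEYS are hypothetical names, and reward is left out because it is returned by step rather than stored in the observation.

```python
import numpy as np

# Key names and shapes copied from the outputs table; treated as assumptions.
OBJECT_OBS_KEYS = ("hole_pos", "hole_quat", "peg_to_hole", "peg_quat",
                   "angle", "t", "d")


def flatten_object_obs(obs):
    """Concatenate the object observations into one 1-D float array."""
    parts = [
        # atleast_1d lets scalar entries (angle, t, d) join the array entries.
        np.atleast_1d(np.asarray(obs[key], dtype=float)).ravel()
        for key in OBJECT_OBS_KEYS
    ]
    return np.concatenate(parts)
```

With the shapes listed in the table the result has 3 + 4 + 3 + 4 + 1 + 1 + 1 = 17 entries.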

Usage Examples

import robosuite as suite
import numpy as np

# Create a TwoArmPegInHole environment with two Panda robots
env = suite.make(
    env_name="TwoArmPegInHole",
    robots=["Panda", "Panda"],
    env_configuration="opposed",
    has_renderer=False,
    has_offscreen_renderer=False,
    use_camera_obs=False,
    use_object_obs=True,
    reward_shaping=True,
    peg_radius=(0.015, 0.03),
    peg_length=0.13,
    horizon=1000,
)

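# Reset to obtain the first observation dictionary
obs = env.reset()

# Step with random actions until the episode ends or 1000 steps have run
for i in range(1000):
    action = np.random.randn(env.action_dim)  # one Gaussian sample per action dimension
    obs, reward, done, info = env.step(action)
    if done:
        break

# Release simulator resources
env.close()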
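Gaussian samples can fall outside the controller's action limits, so a bounded random baseline is sometimes preferable. The sketch below assumes only that the environment exposes action_spec as a (low, high) pair along with reset and step; random_rollout is a hypothetical helper and not a robosuite function.

```python
import numpy as np


def random_rollout(env, max_steps=1000, seed=0):
    """Run one episode with uniformly sampled, in-bounds actions.

    Returns (total_reward, steps_taken).
    """
    rng = np.random.default_rng(seed)
    low, high = env.action_spec  # per-dimension action limits
    env.reset()
    total, steps = 0.0, 0
    for _ in range(max_steps):
        action = rng.uniform(low, high)  # one draw per action dimension
        _, reward, done, _ = env.step(action)
        total += float(reward)
        steps += 1
        if done:
            break
    return total, steps
```

Calling random_rollout(env) on the environment created above, before env.close(), would give a random-policy return to compare trained policies against.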
