Principle:ARISE Initiative Robosuite Two Arm Task Design

Knowledge Sources	ARISE_Initiative_Robosuite
Domains	Robotics, Reinforcement Learning, Bimanual Manipulation
Last Updated	2026-02-15 07:00 GMT

Overview

Two-arm task design defines a pattern for bimanual manipulation environments that support configurable arm arrangements, role assignment between arms, and coordination-dependent reward structures.

Description

Bimanual manipulation tasks involve two robot arms working together to accomplish goals that a single arm cannot achieve alone, such as handing objects between arms, lifting heavy or large objects cooperatively, inserting pegs into holes while one arm holds the fixture, or transporting objects between spatially separated locations. The two-arm task design principle extends the single-arm manipulation pattern with additional infrastructure for managing dual-arm configurations, inter-arm coordination, and role-based task decomposition.

The core architectural extension is support for multiple arm arrangement configurations. Two-arm tasks can be instantiated with either two separate single-arm robots or one bimanual robot. When two separate robots are used, they can be positioned in an "opposed" configuration (facing each other across the table) or a "parallel" configuration (side by side). When a single bimanual robot is used, the environment automatically selects the "single-robot" arrangement. This flexibility makes it possible to study how spatial arrangement affects bimanual coordination strategies without changing the task code.
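
The arrangement rules above can be sketched as a small validation function. This is a minimal illustration, not robosuite's own implementation: `RobotSpec`, `resolve_configuration`, and the `num_arms` field are hypothetical names introduced here, and the check that paired robots are single-arm is an assumption drawn from the description above.

```python
from dataclasses import dataclass

VALID_TWO_ROBOT_CONFIGS = {"opposed", "parallel"}


@dataclass(frozen=True)
class RobotSpec:
    """Hypothetical stand-in for a robot model (not a robosuite class)."""
    name: str
    num_arms: int


def resolve_configuration(robots, requested="opposed"):
    """Return the arrangement name for `robots`, or raise ValueError."""
    if len(robots) == 1:
        # A lone robot must be bimanual; the requested arrangement is ignored.
        if robots[0].num_arms != 2:
            raise ValueError(f"{robots[0].name} is not bimanual")
        return "single-robot"
    if len(robots) == 2:
        # Assumption: both robots in a pair are single-arm.
        if any(r.num_arms != 1 for r in robots):
            raise ValueError("a two-robot setup expects single-arm robots")
        if requested not in VALID_TWO_ROBOT_CONFIGS:
            raise ValueError(f"unknown arrangement: {requested!r}")
        return requested
    raise ValueError("expected one bimanual robot or two single-arm robots")
```

For example, `resolve_configuration([RobotSpec("Baxter", 2)])` yields "single-robot", while passing two single-arm specs with `requested="parallel"` yields "parallel".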

The base two-arm environment provides helper methods for computing distances from each gripper (gripper0 and gripper1) to target objects, abstracting over the distinction between left/right grippers on a bimanual robot and grippers on separate robots. It also exposes end-effector poses and gripper states for both arms through a uniform interface. Concrete task classes build on this foundation to implement coordination-specific rewards, such as rewarding one arm for presenting an object in the correct orientation while the other arm grasps it.
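
To make the role abstraction concrete, the sketch below resolves gripper0 and gripper1 positions from either arrangement and computes a gripper-to-target vector or scalar distance. It uses plain dictionaries and tuples rather than robosuite objects; the `eef_pos` layout, the function names, and the `return_distance` flag are assumptions made for illustration.

```python
import math


def gripper_positions(configuration, robots):
    """Return (gripper0_pos, gripper1_pos) for any arrangement.

    Hypothetical data layout (not robosuite objects):
      single-robot:     [{"eef_pos": {"right": (x, y, z), "left": (x, y, z)}}]
      opposed/parallel: [{"eef_pos": (x, y, z)}, {"eef_pos": (x, y, z)}]
    """
    if configuration == "single-robot":
        arms = robots[0]["eef_pos"]
        # gripper0 is the right arm, gripper1 the left arm.
        return arms["right"], arms["left"]
    return robots[0]["eef_pos"], robots[1]["eef_pos"]


def to_target(gripper_pos, target_pos, return_distance=False):
    """Vector from gripper to target, or its Euclidean norm if requested."""
    vec = tuple(t - g for g, t in zip(gripper_pos, target_pos))
    if return_distance:
        return math.sqrt(sum(c * c for c in vec))
    return vec
```

Because callers only ever see gripper0 and gripper1, a reward term written against `to_target` does not need to know whether the arms belong to one robot or two.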

Usage

Apply the two-arm task design pattern when creating bimanual manipulation environments that require coordination between two arms. Use the configurable arrangement system to study how robot placement affects task performance. Leverage the role-based gripper access (gripper0 and gripper1) to write reward functions that assign distinct roles to each arm without coupling to a specific robot morphology.

Theoretical Basis

Two-Arm Task Architecture:

  ManipulationEnv
    |
    TwoArmEnv                        (bimanual base class)
      |
      ConcreteTask                   (task-specific coordination logic)

  Configuration Modes:
    - "single-robot": One bimanual robot (e.g., Baxter)
        gripper0 = robot.gripper["right"]
        gripper1 = robot.gripper["left"]
    - "opposed": Two robots facing each other across table
        gripper0 = robots[0].gripper
        gripper1 = robots[1].gripper
    - "parallel": Two robots side by side
        gripper0 = robots[0].gripper
        gripper1 = robots[1].gripper

  Robot Configuration Validation:
    if len(robots) == 1:
        assert is_bimanual(robots[0])
        configuration = "single-robot"
    elif len(robots) == 2:
        assert configuration in {"opposed", "parallel"}
    else:
        raise error

  Coordination Reward Pattern (example: handover):
    Stage 1 - Arm0 reaches object:     r += reach_reward(gripper0, object)
    Stage 2 - Arm0 grasps object:      r += grasp_reward(gripper0, object)
    Stage 3 - Arm0 presents to Arm1:   r += present_reward(object, handover_zone)
    Stage 4 - Arm1 grasps from Arm0:   r += bimanual_grasp_reward(gripper1, object)
    Stage 5 - Arm0 releases:           r += release_reward(gripper0, object)

  Gripper-to-Target Utilities:
    _gripper0_to_target(target) -> distance vector or scalar
    _gripper1_to_target(target) -> distance vector or scalar
    _eef0_xpos, _eef1_xpos    -> end-effector positions
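
The handover pattern above could be realized as a gated, staged reward along the following lines. This is only one possible formulation: the `state` dictionary keys, the tanh shaping, and the per-stage weights are all illustrative assumptions, not values taken from any concrete task.

```python
import math


def shaping(dist, scale=10.0):
    """Smooth term in (0, 1]; equals 1 at zero distance and decays with distance."""
    return 1.0 - math.tanh(scale * dist)


def handover_reward(state):
    """Staged reward; `state` is a hypothetical dict of precomputed quantities."""
    g0 = state["gripper0_grasping"]
    g1 = state["gripper1_grasping"]
    if g1 and not g0:
        # Stage 5: Arm1 holds the object and Arm0 has released it.
        return 5.0
    if not g0:
        # Stage 1: Arm0 is still reaching for the object.
        return shaping(state["gripper0_to_object"])
    # Stages 1-2 complete: Arm0 has reached and grasped the object.
    reward = 2.0
    # Stage 3: Arm0 carries the object toward the handover zone.
    reward += shaping(state["object_to_handover_zone"])
    if g1:
        # Stage 4: Arm1 has also grasped while Arm0 still holds.
        reward += 1.0
    return reward
```

Gating each stage on the previous one keeps the reward non-decreasing along the intended sequence, so releasing the object before Arm1 has grasped it is never rewarded.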

Key design decisions:

Configuration flexibility: Supporting single bimanual robots and pairs of single-arm robots through the same task code maximizes experimental versatility.
Role abstraction (gripper0/gripper1): Abstracting arm roles from physical configuration allows reward functions to be written once and applied across all arrangements.
Backward compatibility: Legacy configuration names ("single-arm-opposed", "single-arm-parallel") are automatically mapped to the current naming convention.
Coordination rewards: Multi-stage rewards that explicitly model the coordination sequence (reach, grasp, present, transfer, release) provide an intermediate learning signal at each step, which eases the difficult credit assignment problem in bimanual tasks.
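
The backward-compatibility mapping can be expressed as a simple lookup. The sketch below covers only the two legacy names listed above; `LEGACY_CONFIG_NAMES` and `normalize_configuration` are hypothetical names, and the real mapping may live elsewhere or cover more cases.

```python
# Legacy arrangement names mapped to the current convention.
LEGACY_CONFIG_NAMES = {
    "single-arm-opposed": "opposed",
    "single-arm-parallel": "parallel",
}


def normalize_configuration(name):
    """Translate a legacy name to its current form; pass other names through."""
    return LEGACY_CONFIG_NAMES.get(name, name)
```

Normalizing once at construction time means the rest of the task code only has to handle the current names.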
