Implementation:Haosulab ManiSkill PushT

From Leeroopedia
Knowledge Sources
Domains Robotics, Simulation, Tabletop_Manipulation
Last Updated 2026-02-15 08:00 GMT

Overview

Concrete implementation of the Push-T task environment in ManiSkill, where a robot must push a T-shaped block to a target pose.

Description

The PushTEnv places a T-shaped block on a white table and requires a robot with a stick end-effector to push the block to match a goal pose (both position and rotation). The environment uses a custom white table scene builder.

The environment is registered as PushT-v1 with max_episode_steps=100. The only supported robot is panda_stick (PandaStick). The available reward modes are "normalized_dense", "dense", "sparse", and "none".

The T-block is built from two overlapping box shapes. Each episode randomizes the T-block's xy position and z-rotation, as well as the goal T pose. The goal is visualized with a green outline. An episode counts as successful when the T-block's position is within 0.01 m of the goal position and its rotation is within a threshold of the goal rotation.
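The success criterion described above can be sketched in plain Python. This is an illustrative sketch, not the ManiSkill source: the function names are hypothetical, the planar pose is assumed to be an (x, y) position plus a z-rotation in radians, and the 5-degree rotation tolerance is an assumed placeholder because this page does not state the actual threshold.

```python
import math


def wrap_angle(theta: float) -> float:
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def pose_errors(block_xy, block_theta, goal_xy, goal_theta):
    """Return (position error in meters, absolute rotation error in radians)."""
    pos_err = math.hypot(block_xy[0] - goal_xy[0], block_xy[1] - goal_xy[1])
    # Wrap the difference so that angles on either side of +/-pi compare as close.
    rot_err = abs(wrap_angle(block_theta - goal_theta))
    return pos_err, rot_err


def is_success(block_xy, block_theta, goal_xy, goal_theta,
               pos_tol=0.01,                   # 0.01 m, as stated on this page
               rot_tol=math.radians(5.0)):     # ASSUMPTION: placeholder value
    """Check both the position and the rotation tolerance."""
    pos_err, rot_err = pose_errors(block_xy, block_theta, goal_xy, goal_theta)
    return pos_err <= pos_tol and rot_err <= rot_tol
```

Wrapping the angle difference matters because a block at just under +pi and a goal at just over -pi are nearly aligned even though their raw difference is close to 2*pi.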

Usage

Use this environment for non-prehensile manipulation research. The Push-T task is a well-known benchmark for testing pushing and planar manipulation policies, particularly with diffusion-based and behavior cloning approaches.

Code Reference

Source Location

Signature

@register_env("PushT-v1", max_episode_steps=100)
class PushTEnv(BaseEnv):
    SUPPORTED_ROBOTS = ["panda_stick"]
    agent: PandaStick

Import

import gymnasium as gym
import mani_skill.envs  # importing this module registers the ManiSkill environments (including PushT-v1) with Gymnasium
env = gym.make("PushT-v1")

I/O Contract

Inputs

Name | Type | Required | Description
obs_mode | str | No | Observation mode
reward_mode | str | No | Reward mode: "normalized_dense", "dense", "sparse", or "none"
control_mode | str | No | Control mode for the panda_stick robot
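The optional inputs above are passed as keyword arguments to gym.make. The following is a small sketch of one way to assemble them; build_push_t_kwargs and make_push_t are hypothetical helper names, not part of ManiSkill, and the validation covers only the four reward modes listed on this page.

```python
# Reward modes listed on this page for PushT-v1.
REWARD_MODES = ("normalized_dense", "dense", "sparse", "none")


def build_push_t_kwargs(obs_mode=None, reward_mode=None, control_mode=None):
    """Collect the optional inputs, omitting any that were not provided."""
    if reward_mode is not None and reward_mode not in REWARD_MODES:
        raise ValueError(
            f"unknown reward_mode {reward_mode!r}; expected one of {REWARD_MODES}"
        )
    options = {
        "obs_mode": obs_mode,
        "reward_mode": reward_mode,
        "control_mode": control_mode,
    }
    # Leaving a key out lets the environment fall back to its own default.
    return {key: value for key, value in options.items() if value is not None}


def make_push_t(**options):
    """Create the environment; imports are deferred until this is called."""
    import gymnasium as gym
    import mani_skill.envs  # noqa: F401  (registers PushT-v1)

    return gym.make("PushT-v1", **build_push_t_kwargs(**options))
```

For example, make_push_t(obs_mode="state", reward_mode="sparse") would forward only those two arguments and leave control_mode at the environment default.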

Outputs

Name | Type | Description
obs | dict/array | Observation including T-block pose, goal pose, and TCP pose
reward | float | Reward based on position and rotation alignment with the goal
terminated | bool | Whether the episode ended by success or failure
truncated | bool | Whether the episode hit the maximum number of steps (100)
info | dict | Contains the success flag, position error, and rotation error

Usage Examples

Basic Usage

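import gymnasium as gym
import mani_skill.envs  # registers PushT-v1 with Gymnasium

# Create the environment with state observations and off-screen rendering.
env = gym.make("PushT-v1", obs_mode="state", render_mode="rgb_array")
obs, info = env.reset()
for _ in range(100):
    # Sample a random action; replace this with a policy's output.
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    # Start a new episode once the current one terminates or is truncated.
    if terminated or truncated:
        obs, info = env.reset()
env.close()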
