Implementation:Haosulab ManiSkill ToleranceReward

From Leeroopedia
Knowledge Sources
Domains: Robotics, Simulation, Reward Shaping
Last Updated: 2026-02-15 08:00 GMT

Overview

A utility for computing tolerance-based reward values that transition smoothly from 1 (in bounds) toward 0 (out of bounds) according to a configurable sigmoid function.

Description

The tolerance() function implements a reward shaping utility adapted from DeepMind Control Suite's reward utilities. It returns 1.0 when the input x falls within the specified [lower, upper] bounds, and smoothly decays toward 0 as x moves outside the bounds, controlled by a margin parameter and choice of sigmoid function.

Supported sigmoid types:

  • gaussian -- Gaussian decay (default)
  • hyperbolic -- Logistic/hyperbolic sigmoid
  • quadratic -- Quadratic decay (allows value_at_margin == 0)
  • linear -- Linear decay (allows value_at_margin == 0)
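The four decay shapes can be sketched in plain Python. This is an illustrative scalar version that assumes the formulas used by DeepMind Control Suite's reward utilities, from which the function is adapted. The name decay and its argument z (distance outside the bounds divided by margin) are hypothetical, and the ManiSkill implementation operates on tensors and may differ in detail.

```python
import math


def decay(z, sigmoid="gaussian", value_at_margin=0.1):
    """Hypothetical scalar sketch: decay value at normalized distance z >= 0.

    z is (distance outside the bounds) / margin, so z == 1 means the input
    sits exactly one margin away from the nearest bound.
    """
    v = value_at_margin
    if sigmoid == "gaussian":
        # Scale chosen so that the Gaussian equals v at z == 1.
        scale = math.sqrt(-2.0 * math.log(v))
        return math.exp(-0.5 * (z * scale) ** 2)
    if sigmoid == "hyperbolic":
        # 1 / cosh(z * scale), with scale chosen so the value is v at z == 1.
        scale = math.acosh(1.0 / v)
        return 1.0 / math.cosh(z * scale)
    if sigmoid == "quadratic":
        scaled = z * math.sqrt(1.0 - v)
        return 1.0 - scaled**2 if abs(scaled) < 1.0 else 0.0
    if sigmoid == "linear":
        scaled = z * (1.0 - v)
        return 1.0 - scaled if abs(scaled) < 1.0 else 0.0
    raise ValueError(f"Unknown sigmoid type: {sigmoid!r}")


for name in ("gaussian", "hyperbolic", "quadratic", "linear"):
    # Print the value at the bound, halfway into the margin, and at the margin.
    print(name, [round(decay(z, name), 3) for z in (0.0, 0.5, 1.0)])
```

Under these assumptions every shape equals 1 at z = 0 and value_at_margin at z = 1. Only the linear and quadratic shapes reach exactly 0, which is why they accept value_at_margin == 0, while the gaussian and hyperbolic forms would require the logarithm or inverse of zero.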

Parameters:

  • x -- Input tensor, for example of shape (B, 3). The function is applied elementwise, so the output has the same shape as x.
  • lower / upper -- Inclusive target interval bounds (can be infinite for unbounded intervals, or equal for exact targets).
  • margin -- Distance outside the bounds over which the output decays to value_at_margin; a larger margin gives a gentler decay. If 0, output is binary (1 in bounds, 0 outside).
  • value_at_margin -- Output value when distance from nearest bound equals margin (default: 0.1).
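How the bounds, margin, and value_at_margin interact can be illustrated with a scalar sketch. The helper tolerance_scalar below is hypothetical and not part of ManiSkill. It covers only the gaussian case, written as value_at_margin ** z**2, which is one way to express a Gaussian that equals value_at_margin when the distance from the nearest bound equals margin.

```python
import math


def tolerance_scalar(x, lower=0.0, upper=0.0, margin=0.0, value_at_margin=0.1):
    """Hypothetical scalar illustration of the parameter semantics (gaussian only)."""
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if margin < 0:
        raise ValueError("margin must be non-negative")

    # Inclusive bounds: anything inside [lower, upper] gets the full reward.
    if lower <= x <= upper:
        return 1.0

    # With no margin there is no decay region, so the output is binary.
    if margin == 0:
        return 0.0

    # Distance to the nearest bound, expressed in units of margin.
    distance = (lower - x) if x < lower else (x - upper)
    z = distance / margin

    # Gaussian decay that passes through value_at_margin at z == 1.
    return value_at_margin ** (z * z)


print(tolerance_scalar(0.005, upper=0.01, margin=0.1))         # inside the bounds
print(tolerance_scalar(0.11, upper=0.01, margin=0.1))          # one margin outside
print(tolerance_scalar(0.5, upper=0.01, margin=0.0))           # binary mode
print(tolerance_scalar(5.0, lower=1.0, upper=math.inf))        # unbounded above
```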

The function operates entirely with PyTorch tensors, supporting batched computation for GPU-parallel environments.

Usage

tolerance() is used in reward functions for ManiSkill tasks to produce smooth, differentiable reward signals based on distance to a target configuration. It is common in manipulation tasks where the reward depends on gripper-to-object distance or on joint position targets.
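As a rough illustration of the gripper-to-object case, the sketch below shapes a batched reaching reward from a Euclidean distance. It does not import ManiSkill: gaussian_tolerance is a hypothetical local stand-in that assumes the Gaussian decay passes through value_at_margin at one margin beyond the upper bound, and reach_reward, tcp_pos, and obj_pos are illustrative names rather than part of any task definition.

```python
import math

import torch


def gaussian_tolerance(x, upper, margin, value_at_margin=0.1):
    """Hypothetical stand-in for a gaussian tolerance on non-negative x with lower=0."""
    # Normalized distance beyond the upper bound; zero when x is in bounds.
    excess = torch.clamp(x - upper, min=0.0) / margin
    # Equals 1 in bounds and value_at_margin when excess == 1.
    return torch.exp(excess**2 * math.log(value_at_margin))


def reach_reward(tcp_pos, obj_pos):
    """Reward in [0, 1] per environment from positions of shape (B, 3)."""
    dist = torch.linalg.norm(tcp_pos - obj_pos, dim=-1)  # shape (B,)
    return gaussian_tolerance(dist, upper=0.01, margin=0.1)


# Three parallel environments at distances 0.005, 0.11, and 0.5 from the object.
tcp_pos = torch.tensor([[0.0, 0.0, 0.005], [0.0, 0.0, 0.11], [0.0, 0.3, 0.4]])
obj_pos = torch.zeros(3, 3)
reward = reach_reward(tcp_pos, obj_pos)
print(reward)
```

In an actual task the stand-in would be replaced by the imported tolerance function; the point here is only that the distance is reduced to one value per environment before the tolerance is applied.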

Code Reference

Source Location

Signature

def tolerance(
    x,
    lower=0.0,
    upper=0.0,
    margin=0.0,
    sigmoid="gaussian",
    value_at_margin=0.1,
) -> torch.Tensor: ...

Import

from mani_skill.envs.utils.rewards.common import tolerance

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| x | torch.Tensor | Yes | Input values to evaluate |
| lower | float | No | Lower bound of target interval (default: 0.0) |
| upper | float | No | Upper bound of target interval (default: 0.0) |
| margin | float | No | Decay margin parameter (default: 0.0 for binary) |
| sigmoid | str | No | Sigmoid type: "gaussian", "hyperbolic", "quadratic", "linear" (default: "gaussian") |
| value_at_margin | float | No | Output when distance equals margin (default: 0.1) |

Outputs

| Name | Type | Description |
|---|---|---|
| value | torch.Tensor | Reward values between 0.0 and 1.0, same shape as x |

Usage Examples

Basic Usage

import torch
from mani_skill.envs.utils.rewards.common import tolerance

# Distance-based reward: 1.0 when distance <= 0.01, Gaussian decay outside the bound with margin 0.1
distance = torch.tensor([0.005, 0.05, 0.2])
reward = tolerance(distance, lower=0.0, upper=0.01, margin=0.1, sigmoid="gaussian")
# Approximate result, assuming the Gaussian equals value_at_margin (0.1) at distance == margin:
# reward ~ [1.0, 0.69, 0.0]
