Implementation:ARISE Initiative Robomimic Custom Observation Modality Extension

Knowledge Sources
Domains: Robotics, Perception, Extensibility
Last Updated: 2026-02-15 08:00 GMT

Overview

Concrete tool for extending the robomimic observation framework with custom sensor modalities, encoder networks, and data augmentation randomizers.

Description

This example module demonstrates the full pattern for adding a new observation modality to robomimic. It defines three extensibility points: CustomImageModality (a subclass of Modality that registers a new "custom_image" type with custom processing/unprocessing functions), CustomImageEncoderCore (a subclass of EncoderCore that defines how to encode observations of the custom modality), and CustomImageRandomizer (a subclass of Randomizer that provides data augmentation by creating noisy copies of input images and averaging the outputs). It also shows how to override the default processor on an existing modality (ScanModality).

Usage

Use this pattern when adapting robomimic to a custom robot setup with non-standard sensors (e.g., tactile arrays, thermal cameras, custom depth sensors). Import and subclass the base classes, then register the new modality in your config's observation encoder and modality sections.

Code Reference

Source Location

Signature

class CustomImageModality(Modality):
    """
    Custom modality for single-frame images with raw shape (H, W) in range [0, 255].
    """
    name = "custom_image"

    @classmethod
    def _default_obs_processor(cls, obs):
        """Normalize to [-1, 1] range."""

    @classmethod
    def _default_obs_unprocessor(cls, obs):
        """Reverse normalization back to [0, 255]."""


class CustomImageEncoderCore(EncoderCore):
    """
    Custom encoder core for processing custom image modality observations.
    """
    def __init__(self, input_shape, welcome_str):
        """
        Args:
            input_shape (tuple): shape of input, inferred automatically at runtime
            welcome_str (str): arbitrary custom argument
        """

    def output_shape(self, input_shape=None):
        """Returns output shape given input shape."""

    def forward(self, inputs):
        """Forward pass through the encoder."""


class CustomImageRandomizer(Randomizer):
    """
    Data augmentation randomizer that creates N noisy copies of each image
    and pools outputs by averaging.
    """
    def __init__(self, input_shape, num_rand=1, noise_scale=0.01):
        """
        Args:
            input_shape (tuple): shape of input (C, H, W)
            num_rand (int): number of random copies per input
            noise_scale (float): magnitude of uniform noise
        """

    def forward_in(self, inputs):
        """Create N noisy copies, reshape to (B*N, C, H, W)."""

    def forward_out(self, inputs):
        """Split (B*N, ...) -> (B, N, ...) and average across N."""
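The signatures above state the randomizer's shape bookkeeping only in docstrings. The sketch below shows one way that bookkeeping could look in plain PyTorch. It is not robomimic's implementation: the free functions `noisy_copies` and `pool_copies` are hypothetical stand-ins for `forward_in` and `forward_out`, and the noise is assumed to be drawn uniformly from [0, noise_scale).

```python
import torch


def noisy_copies(inputs, num_rand=1, noise_scale=0.01):
    """Hypothetical stand-in for forward_in: (B, C, H, W) -> (B*N, C, H, W)."""
    b, c, h, w = inputs.shape
    # Repeat each sample N times along a new axis: (B, N, C, H, W).
    out = inputs.unsqueeze(1).expand(b, num_rand, c, h, w)
    # Assumption: uniform noise in [0, noise_scale), independent per copy.
    out = out + torch.rand(out.shape) * noise_scale
    # Fold the copy axis into the batch axis.
    return out.reshape(b * num_rand, c, h, w)


def pool_copies(outputs, num_rand=1):
    """Hypothetical stand-in for forward_out: (B*N, ...) -> (B, ...)."""
    batch_size = outputs.shape[0] // num_rand
    # Unfold the copy axis, then average across it.
    out = outputs.reshape(batch_size, num_rand, *outputs.shape[1:])
    return out.mean(dim=1)


images = torch.zeros(2, 3, 4, 4)  # B=2, C=3, H=W=4
augmented = noisy_copies(images, num_rand=3, noise_scale=0.05)
pooled = pool_copies(augmented, num_rand=3)
print(augmented.shape, pooled.shape)
```

Folding the copies into the batch axis means whatever network sits between the two calls needs no knowledge of the augmentation; it simply sees a batch that is N times larger.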

Import

from robomimic.models import EncoderCore, Randomizer
from robomimic.utils.obs_utils import Modality, ScanModality

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| input_shape | tuple | Yes | Shape of the observation tensor (without batch dimension) |
| welcome_str | str | Yes | Custom argument for CustomImageEncoderCore (arbitrary kwargs) |
| num_rand | int | No | Number of random augmented copies (default: 1) |
| noise_scale | float | No | Magnitude of uniform noise for augmentation (default: 0.01) |

Outputs

| Name | Type | Description |
|---|---|---|
| CustomImageModality._default_obs_processor | np.ndarray or torch.Tensor | Normalized observation in [-1, 1] range |
| CustomImageEncoderCore.forward | torch.Tensor | Encoded observation (same shape as input for pass-through) |
| CustomImageRandomizer.forward_in | torch.Tensor | Augmented copies reshaped to (B*N, C, H, W) |
| CustomImageRandomizer.forward_out | torch.Tensor | Pooled output averaged across N copies, shape (B, ...) |
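The encoder's contract (same shape in and out for a pass-through) can be illustrated without robomimic installed. The sketch below uses `torch.nn.Module` as a stand-in base class; `PassThroughCore` is a hypothetical name, and a real implementation would subclass `EncoderCore` instead.

```python
import torch
import torch.nn as nn


class PassThroughCore(nn.Module):
    """Hypothetical stand-in for an EncoderCore subclass; not robomimic code."""

    def __init__(self, input_shape, welcome_str):
        super().__init__()
        self.input_shape = tuple(input_shape)  # shape without the batch dimension
        self.welcome_str = welcome_str  # arbitrary custom argument, stored only

    def output_shape(self, input_shape=None):
        # A pass-through encoder emits exactly the shape it receives.
        return list(self.input_shape)

    def forward(self, inputs):
        # No learned transformation: return the batch unchanged.
        return inputs


core = PassThroughCore(input_shape=(3, 4, 4), welcome_str="hi there!")
batch = torch.zeros(2, 3, 4, 4)
encoded = core(batch)
```

Here `output_shape` excludes the batch dimension, matching the description of `input_shape` in the inputs table.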

Usage Examples

Defining a Custom Modality

from robomimic.utils.obs_utils import Modality

class CustomImageModality(Modality):
    name = "custom_image"

    @classmethod
    def _default_obs_processor(cls, obs):
        # Normalize from [0, 255] to [-1, 1]
        return (obs / 255.0 - 0.5) * 2

    @classmethod
    def _default_obs_unprocessor(cls, obs):
        # Reverse: from [-1, 1] back to [0, 255]
        return ((obs / 2) + 0.5) * 255.0
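The processor and unprocessor are meant to be inverses, which is easy to check in isolation. A minimal sketch, assuming NumPy input; `process` and `unprocess` are hypothetical free-function copies of the two class methods above, and the sample values are made up for illustration.

```python
import numpy as np


def process(obs):
    # Same arithmetic as _default_obs_processor: [0, 255] -> [-1, 1].
    return (obs / 255.0 - 0.5) * 2


def unprocess(obs):
    # Same arithmetic as _default_obs_unprocessor: [-1, 1] -> [0, 255].
    return ((obs / 2) + 0.5) * 255.0


raw = np.array([[0, 51, 102], [153, 204, 255]], dtype=np.uint8)
normalized = process(raw)
restored = unprocess(normalized)
```

Note that `restored` is a floating-point array even though `raw` was `uint8`; cast it back explicitly if a downstream consumer expects integer pixels.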

Overriding an Existing Modality Processor

import numpy as np
import torch
from robomimic.utils.obs_utils import ScanModality

def custom_scan_processor(obs):
    # Trim the padded first and last entries from the scan data
    return obs[1:-1]

def custom_scan_unprocessor(obs):
    # Re-add one zero at each end, matching the input's array type
    if isinstance(obs, np.ndarray):
        pad = np.zeros(1, dtype=obs.dtype)
        return np.concatenate([pad, obs, pad])
    # new_zeros keeps the tensor's dtype and device
    pad = obs.new_zeros(1)
    return torch.cat([pad, obs, pad])

ScanModality.set_obs_processor(processor=custom_scan_processor)
ScanModality.set_obs_unprocessor(unprocessor=custom_scan_unprocessor)
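Before registering an override, it can help to confirm that the pair round-trips on representative data. A minimal NumPy-only sketch; `trim_ends` and `pad_ends` are hypothetical copies of the processor and the NumPy branch of the unprocessor, and the scan values are invented.

```python
import numpy as np


def trim_ends(obs):
    # Same slicing as custom_scan_processor: drop the first and last entries.
    return obs[1:-1]


def pad_ends(obs):
    # Same idea as the NumPy branch of custom_scan_unprocessor:
    # put one zero back at each end.
    return np.concatenate([np.zeros(1), obs, np.zeros(1)])


scan = np.array([0.0, 1.5, 2.5, 3.5, 0.0])  # hypothetical scan with zero-padded ends
trimmed = trim_ends(scan)
restored = pad_ends(trimmed)
```

The round trip is exact only when the trimmed entries were zeros to begin with, and the slicing acts on the first axis, so this pair as written suits a single un-batched 1-D scan.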

Registering Custom Modality in Config

from robomimic.config.bc_config import BCConfig

config = BCConfig()

# Set custom encoder for the new modality
config.observation.encoder.custom_image.core_class = "CustomImageEncoderCore"
config.observation.encoder.custom_image.core_kwargs.welcome_str = "hi there!"
config.observation.encoder.custom_image.obs_randomizer_class = "CustomImageRandomizer"
config.observation.encoder.custom_image.obs_randomizer_kwargs.num_rand = 3
config.observation.encoder.custom_image.obs_randomizer_kwargs.noise_scale = 0.05

# Associate observation keys with the custom modality
config.observation.modalities.obs.custom_image = ["my_image1", "my_image2"]
config.observation.modalities.goal.custom_image = ["my_image2", "my_image3"]
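For orientation, the assignments above can be pictured as a nested mapping. The sketch below is a plain-dict mirror, assuming each dotted path corresponds one-to-one to a nested key; it is illustrative only and omits every other field a real config contains.

```python
import json

# Plain-dict mirror of the assignments above (illustrative, not a full config).
observation = {
    "encoder": {
        "custom_image": {
            "core_class": "CustomImageEncoderCore",
            "core_kwargs": {"welcome_str": "hi there!"},
            "obs_randomizer_class": "CustomImageRandomizer",
            "obs_randomizer_kwargs": {"num_rand": 3, "noise_scale": 0.05},
        }
    },
    "modalities": {
        "obs": {"custom_image": ["my_image1", "my_image2"]},
        "goal": {"custom_image": ["my_image2", "my_image3"]},
    },
}

# Every value is a JSON primitive or container, so the fragment serializes cleanly.
serialized = json.dumps({"observation": observation}, indent=2)

# Observation keys that appear in both the obs and goal groups.
shared = sorted(
    set(observation["modalities"]["obs"]["custom_image"])
    & set(observation["modalities"]["goal"]["custom_image"])
)
```

Classes are referenced by string name and keyword arguments are kept to primitive values, which is what lets the whole structure be written out as JSON.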

Related Pages

Implements Principle

Related Implementations
