Implementation:Haosulab ManiSkill AssetData

From Leeroopedia
Knowledge Sources
Domains: Robotics, Simulation, Asset Management
Last Updated: 2026-02-15 08:00 GMT

Overview

This page documents a concrete tool for managing asset data source definitions, download URLs, and hierarchical grouping for ManiSkill environments.

Description

The data.py module defines the asset data registry that maps asset identifiers to their download sources and organizes them into groups for batch management.

DataSource dataclass: represents a single downloadable asset, with the following fields:

  • source_type -- Category: "task_assets", "objects", "scene", or "robot".
  • url -- Direct download URL (HTTP/HTTPS).
  • hf_repo_id -- HuggingFace dataset repository ID (alternative to URL).
  • target_path -- Local path where the asset will be stored.
  • checksum -- SHA-256 checksum for verification.
  • output_dir -- Base output directory (defaults to ASSET_DIR).
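The fields above can be illustrated with a minimal, self-contained sketch. It does not import ManiSkill: DemoDataSource is a local stand-in that mirrors the documented fields, ASSET_DIR here is a placeholder temporary path, and sha256_of_file and matches_checksum are hypothetical helpers showing one way a SHA-256 checksum could be verified. The library's own verification logic may differ.

```python
import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

# Placeholder for the library's ASSET_DIR (assumption: the real value is set by ManiSkill).
ASSET_DIR = Path(tempfile.gettempdir()) / "maniskill_assets_demo"


@dataclass
class DemoDataSource:
    """Local stand-in mirroring the documented DataSource fields."""
    source_type: str
    url: Optional[str] = None
    hf_repo_id: Optional[str] = None
    target_path: Optional[str] = None
    checksum: Optional[str] = None
    output_dir: str = str(ASSET_DIR)


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(source: DemoDataSource, archive: Path) -> bool:
    """Hypothetical helper: a source without a checksum is treated as unverifiable but accepted."""
    if source.checksum is None:
        return True
    return sha256_of_file(archive) == source.checksum.lower()


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "example.zip"
    archive.write_bytes(b"placeholder archive contents")

    # A source whose checksum matches the file, and one whose checksum cannot match.
    good = DemoDataSource(
        source_type="objects",
        url="https://example.com/example.zip",  # illustrative URL, not a real asset
        target_path="assets/example",
        checksum=sha256_of_file(archive),
    )
    bad = DemoDataSource(source_type="objects", checksum="0" * 64)

    good_ok = matches_checksum(good, archive)
    bad_ok = matches_checksum(bad, archive)

print(good_ok, bad_ok)
```

Hashing in fixed-size chunks keeps memory use flat regardless of archive size, which matters for multi-gigabyte scene datasets.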

Global registries:

  • DATA_SOURCES -- Dictionary mapping source IDs to DataSource objects.
  • DATA_GROUPS -- Dictionary mapping group IDs (often environment IDs) to lists of source IDs, supporting hierarchical grouping.
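As a rough illustration of how the two registries relate, the sketch below builds small stand-in dictionaries and checks that every group member resolves to either a source ID or another group ID. DEMO_SOURCES, DEMO_GROUPS, the IDs, the URLs, and find_dangling_members are all hypothetical names introduced for this example; they are not part of ManiSkill.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DemoDataSource:
    """Reduced stand-in for DataSource; only the fields needed here."""
    source_type: str
    url: Optional[str] = None


# Stand-ins for DATA_SOURCES and DATA_GROUPS (illustrative contents only).
DEMO_SOURCES: dict[str, DemoDataSource] = {
    "demo_objects": DemoDataSource("objects", url="https://example.com/objects.zip"),
    "demo_robot": DemoDataSource("robot", url="https://example.com/robot.zip"),
}
DEMO_GROUPS: dict[str, list[str]] = {
    # A group keyed by an environment ID, listing the sources it needs.
    "DemoEnv-v1": ["demo_objects", "demo_robot"],
    # A group that contains another group, giving the hierarchy.
    "all_demo": ["DemoEnv-v1"],
}


def find_dangling_members(
    sources: dict[str, DemoDataSource], groups: dict[str, list[str]]
) -> list[tuple[str, str]]:
    """Return (group_id, member_id) pairs whose member is neither a source nor a group."""
    dangling = []
    for group_id, members in groups.items():
        for member in members:
            if member not in sources and member not in groups:
                dangling.append((group_id, member))
    return dangling


print(find_dangling_members(DEMO_SOURCES, DEMO_GROUPS))
```

A check like this is cheap to run once after the registries are populated and catches typos in group definitions before a download is attempted.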

initialize_data_sources(): Populates both registries with all known assets including:

  • Task assets: YCB objects, assembling kits, obstacle configs, bridge_v2, OakInk-v2.
  • Scene datasets: ReplicaCAD, AI2THOR, RoboCasa.
  • Robots: UR10e, ANYmal C, Unitree H1/G1/Go2, Stompy, WidowX, Google Robot, Robotiq 2F, XArm6, and others.
  • PartNet-Mobility objects: Organized by category (cabinet, chair, bucket, faucet).

expand_data_group_into_individual_data_source_ids(): Recursively expands a data group into a flat list of individual data source IDs.
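The recursive expansion can be sketched as follows. This is not the library's implementation: expand_group is a hypothetical function, the registries are small stand-ins, and the order-preserving deduplication and the cycle check are design choices made for this example.

```python
from typing import Optional


def expand_group(
    group_id: str,
    sources: set[str],
    groups: dict[str, list[str]],
    _active: Optional[set[str]] = None,
) -> list[str]:
    """Flatten a group into individual source IDs, following nested groups."""
    active = set() if _active is None else _active
    if group_id in active:
        # The group is already being expanded further up the call stack.
        raise ValueError(f"cyclic data group: {group_id}")
    active.add(group_id)

    expanded: list[str] = []
    for member in groups[group_id]:
        if member in groups:
            # Nested group: recurse and splice its sources in place.
            expanded.extend(expand_group(member, sources, groups, active))
        elif member in sources:
            expanded.append(member)
        else:
            raise KeyError(f"unknown member {member!r} in group {group_id!r}")

    active.discard(group_id)
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(expanded))


# Illustrative registries; the IDs are made up for this example.
demo_sources = {"demo_objects", "demo_arm", "demo_gripper"}
demo_groups = {
    "demo_robots": ["demo_arm", "demo_gripper"],
    "DemoEnv-v1": ["demo_objects", "demo_robots"],
    "all_demo": ["DemoEnv-v1", "demo_robots"],
}

flat = expand_group("all_demo", demo_sources, demo_groups)
print(flat)  # ['demo_objects', 'demo_arm', 'demo_gripper']
```

Because "demo_robots" is reachable both directly and through "DemoEnv-v1", its sources would appear twice without the final deduplication step.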

Usage

This module is used by the download_asset CLI tool and internally by environments to check whether their required assets have been downloaded. The module initializes itself on import, so both registries are populated as soon as it is loaded.
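A pre-flight check of the kind an environment might run can be sketched as below. It assumes that "downloaded" means the target path exists under the output directory, which may not match how is_data_source_downloaded actually decides; DemoDataSource, is_downloaded, missing_sources, and all IDs and paths are hypothetical.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class DemoDataSource:
    """Reduced stand-in for DataSource; only the path-related fields."""
    source_type: str
    target_path: str
    output_dir: str


def is_downloaded(source: DemoDataSource) -> bool:
    """Assumption: an asset counts as downloaded if its target path exists on disk."""
    return (Path(source.output_dir) / source.target_path).exists()


def missing_sources(source_ids: list[str], sources: dict[str, DemoDataSource]) -> list[str]:
    """Return the IDs from source_ids whose assets are not present locally."""
    return [sid for sid in source_ids if not is_downloaded(sources[sid])]


with tempfile.TemporaryDirectory() as asset_dir:
    sources = {
        "demo_objects": DemoDataSource("objects", "assets/demo_objects", asset_dir),
        "demo_robot": DemoDataSource("robot", "robots/demo_robot", asset_dir),
    }
    # Simulate that only the object assets have been downloaded.
    (Path(asset_dir) / "assets/demo_objects").mkdir(parents=True)

    missing = missing_sources(["demo_objects", "demo_robot"], sources)

print(missing)  # ['demo_robot']
```

Reporting every missing ID at once, rather than failing on the first, lets a user fetch all required assets in a single pass.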

Code Reference

Source Location

Signature

@dataclass
class DataSource:
    source_type: str
    url: Optional[str] = None
    hf_repo_id: Optional[str] = None
    target_path: Optional[str] = None
    checksum: Optional[str] = None
    output_dir: str = ASSET_DIR

DATA_SOURCES: dict[str, DataSource]
DATA_GROUPS: dict[str, list[str]]

def is_data_source_downloaded(data_source_id: str) -> bool: ...
def initialize_data_sources() -> None: ...
def expand_data_group_into_individual_data_source_ids(data_group_id: str) -> list[str]: ...

Import

from mani_skill.utils.assets.data import DATA_SOURCES, DATA_GROUPS, DataSource

I/O Contract

Inputs

Name | Type | Required | Description
data_source_id | str | Yes | Identifier for a data source (argument to is_data_source_downloaded)
data_group_id | str | Yes | Group identifier to expand (argument to expand_data_group_into_individual_data_source_ids)

Outputs

Name | Type | Description
DATA_SOURCES | dict[str, DataSource] | Global registry of all data sources
DATA_GROUPS | dict[str, list[str]] | Global registry of data groups
uids | list[str] | Expanded list of individual data source IDs

Usage Examples

Basic Usage

from mani_skill.utils.assets.data import (
    DATA_GROUPS,
    DATA_SOURCES,
    is_data_source_downloaded,
)

# List the first few registered data source IDs
print(list(DATA_SOURCES.keys())[:5])  # e.g. ['ycb', 'pick_clutter_ycb_configs', ...]

# List the registered data group IDs
print(list(DATA_GROUPS.keys()))  # e.g. ['partnet_mobility_cabinet', ...]

# Check whether a specific asset has already been downloaded
if not is_data_source_downloaded("ycb"):
    print("YCB assets need to be downloaded")
