Implementation:Haosulab ManiSkill AssetData
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Simulation, Asset Management |
| Last Updated | 2026-02-15 08:00 GMT |
Overview
Concrete tool for managing asset data source definitions, download URLs, and hierarchical grouping for ManiSkill environments.
Description
The data.py module defines the asset data registry that maps asset identifiers to their download sources and organizes them into groups for batch management.
DataSource dataclass: Represents a single downloadable asset:
- source_type -- Category: "task_assets", "objects", "scene", or "robot".
- url -- Direct download URL (HTTP/HTTPS).
- hf_repo_id -- HuggingFace dataset repository ID (alternative to URL).
- target_path -- Local path where the asset will be stored.
- checksum -- SHA-256 checksum for verification.
- output_dir -- Base output directory (defaults to ASSET_DIR).
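To illustrate what the checksum field is for, here is a minimal sketch of verifying a downloaded file against a SHA-256 digest using only the standard library. The helper names sha256_of_file and matches_checksum are hypothetical and are not part of data.py; how ManiSkill itself performs the verification is not shown in this page.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Path, expected: Optional[str]) -> bool:
    """Hypothetical helper: a missing checksum means there is nothing to verify."""
    if expected is None:
        return True
    # Compare case-insensitively, since hex digests may be written in either case.
    return sha256_of_file(path) == expected.lower()
```

Treating a checksum of None as a pass mirrors the fact that the field is optional in the dataclass; a stricter policy could reject such sources instead.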
Global registries:
- DATA_SOURCES -- Dictionary mapping source IDs to DataSource objects.
- DATA_GROUPS -- Dictionary mapping group IDs (often environment IDs) to lists of source IDs, supporting hierarchical grouping.
initialize_data_sources(): Populates both registries with all known assets including:
- Task assets: YCB objects, assembling kits, obstacle configs, bridge_v2, OakInk-v2.
- Scene datasets: ReplicaCAD, AI2THOR, RoboCasa.
- Robots: UR10e, ANYmal C, Unitree H1/G1/Go2, Stompy, WidowX, Google Robot, Robotiq 2F, XArm6, and others.
- PartNet-Mobility objects: Organized by category (cabinet, chair, bucket, faucet).
expand_data_group_into_individual_data_source_ids(): Recursively expands a data group into a flat list of individual data source IDs.
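The recursive expansion can be sketched as follows. This is a simplified illustration, not the module's actual code: the dictionary below is a stand-in with made-up IDs ("DemoEnv-v1", "demo_robots", "all_demo", "robot_a", "scene_x"), and expand_group is a hypothetical name. It assumes that any member that is not itself a group ID is an individual source ID, and it does not guard against cyclic groups or remove duplicates.

```python
# Stand-in group registry for illustration only; the real one is DATA_GROUPS in data.py.
DEMO_GROUPS: dict[str, list[str]] = {
    "DemoEnv-v1": ["ycb", "demo_robots"],   # a group may contain sources and other groups
    "demo_robots": ["robot_a"],
    "all_demo": ["DemoEnv-v1", "scene_x"],
}


def expand_group(group_id: str) -> list[str]:
    """Flatten a (possibly nested) group into individual source IDs."""
    uids: list[str] = []
    for member in DEMO_GROUPS[group_id]:
        if member in DEMO_GROUPS:
            # The member is itself a group: recurse and splice in its sources.
            uids.extend(expand_group(member))
        else:
            # Assumption: anything that is not a group ID is an individual source ID.
            uids.append(member)
    return uids


print(expand_group("all_demo"))  # ['ycb', 'robot_a', 'scene_x']
```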
Usage
Used by the download_asset CLI tool and internally by environments to check whether required assets are downloaded. The module auto-initializes on import.
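As a rough sketch of the kind of check an environment might run before loading, the snippet below lists which sources in a set are not yet present on disk. Everything here is illustrative: DemoSource, is_downloaded, and missing_sources are hypothetical names, and the sketch assumes that "downloaded" means the target path exists under the output directory, which may differ from what is_data_source_downloaded actually does.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class DemoSource:
    """Stand-in for DataSource, keeping only the fields this sketch needs."""
    target_path: str
    output_dir: Path


def is_downloaded(source: DemoSource) -> bool:
    # Assumption: an asset counts as downloaded if its target path exists on disk.
    return (source.output_dir / source.target_path).exists()


def missing_sources(source_ids: list[str], registry: dict[str, DemoSource]) -> list[str]:
    """Return the IDs whose assets are not yet present, preserving input order."""
    return [sid for sid in source_ids if not is_downloaded(registry[sid])]
```

A caller could pass the flat list produced by group expansion as source_ids and prompt the user to run the download tool for whatever comes back.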
Code Reference
Source Location
- Repository: Haosulab_ManiSkill
- File: mani_skill/utils/assets/data.py
Signature
@dataclass
class DataSource:
    source_type: str
    url: Optional[str] = None
    hf_repo_id: Optional[str] = None
    target_path: Optional[str] = None
    checksum: Optional[str] = None
    output_dir: str = ASSET_DIR

DATA_SOURCES: dict[str, DataSource]
DATA_GROUPS: dict[str, list[str]]

def is_data_source_downloaded(data_source_id: str) -> bool: ...
def initialize_data_sources() -> None: ...
def expand_data_group_into_individual_data_source_ids(data_group_id: str) -> list[str]: ...
Import
from mani_skill.utils.assets.data import DATA_SOURCES, DATA_GROUPS, DataSource
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| data_source_id | str | Yes | Identifier for a data source (for is_data_source_downloaded) |
| data_group_id | str | Yes | Group identifier to expand (for expand_data_group...) |
Outputs
| Name | Type | Description |
|---|---|---|
| DATA_SOURCES | dict[str, DataSource] | Global registry of all data sources |
| DATA_GROUPS | dict[str, list[str]] | Global registry of data groups |
| uids | list[str] | Expanded list of individual data source IDs |
Usage Examples
Basic Usage
from mani_skill.utils.assets.data import (
    DATA_GROUPS,
    DATA_SOURCES,
    is_data_source_downloaded,
)

# Check available data sources
print(list(DATA_SOURCES.keys())[:5])  # ['ycb', 'pick_clutter_ycb_configs', ...]

# Check data groups
print(list(DATA_GROUPS.keys()))  # ['partnet_mobility_cabinet', ...]

# Check if an asset is downloaded
if not is_data_source_downloaded("ycb"):
    print("YCB assets need to be downloaded")