
Implementation:Haosulab ManiSkill VisualizationUtils

Knowledge Sources
Domains: Robotics, Simulation, Visualization
Last Updated: 2026-02-15 08:00 GMT

Overview

A concrete set of utilities for creating videos from image sequences, tiling multiple images into grids, and overlaying text on images.

Description

The misc.py module in the mani_skill.utils.visualization package provides image and video manipulation utilities for recording and displaying simulation outputs.

images_to_video():

  • Converts a list of RGB image arrays (HxWx3) into an MP4 video using imageio/FFMPEG.
  • Configurable FPS and quality (variable bitrate, 0-10 scale).
  • Creates the output directory if it does not exist.
  • Shows a tqdm progress bar during encoding when verbose is True.
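
The directory handling and file naming described above can be sketched with the standard library alone. This is a minimal illustration, not the ManiSkill source: `prepare_video_path` is a hypothetical helper name, and the encoding step itself (imageio/FFMPEG) is deliberately left out.

```python
import os


def prepare_video_path(output_dir: str, video_name: str) -> str:
    """Hypothetical helper: ensure output_dir exists and build the .mp4 path."""
    # Create the directory (and any missing parents); do nothing if it exists.
    os.makedirs(output_dir, exist_ok=True)
    # The video is written to output_dir/video_name.mp4.
    return os.path.join(output_dir, video_name + ".mp4")
```

Calling `prepare_video_path("output/", "episode")` would create `output/` if needed and return the path under which the video is expected to be written.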

tile_images():

  • Combines multiple images into a single tiled image with configurable rows.
  • Supports both regular images (H, W, C) and batched images (B, H, W, C).
  • With nrows=1, images are sorted by height and arranged in columns.
  • With nrows>1, all images must have the same dimensions.
  • Works with both numpy arrays and torch tensors.
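
To make the nrows>1 case concrete, the following NumPy sketch tiles equally sized images into a grid. It is an illustration under stated assumptions rather than the library's implementation: `tile_same_size` is a hypothetical name, and the row-major fill order and black padding of unused cells are choices made here, not behavior confirmed by the source.

```python
import numpy as np


def tile_same_size(images, nrows=1):
    """Hypothetical sketch: tile equally sized (H, W, C) images into nrows rows."""
    if not images:
        raise ValueError("need at least one image")
    shape = images[0].shape
    if any(img.shape != shape for img in images):
        raise ValueError("all images must share the same (H, W, C) shape")
    ncols = -(-len(images) // nrows)  # ceiling division
    # Assumption: unused grid cells are filled with black images.
    blank = np.zeros_like(images[0])
    padded = list(images) + [blank] * (nrows * ncols - len(images))
    # Assumption: images fill the grid row by row, left to right.
    rows = [
        np.concatenate(padded[r * ncols:(r + 1) * ncols], axis=1)
        for r in range(nrows)
    ]
    return np.concatenate(rows, axis=0)
```

With five 4x6 images and nrows=2, this sketch produces an 8x18 grid: three images in the first row, two images plus one blank cell in the second.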

put_text_on_image():

  • Renders text lines onto an image using PIL.
  • Uses the Ubuntu Sans Mono font (shipped with the module).
  • Text is drawn in green with a 10 px left margin.
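
The overlay described above can be approximated with a few PIL calls. This is a rough sketch, not the module's code: `draw_green_lines` is a hypothetical name, PIL's built-in default font stands in for the bundled Ubuntu Sans Mono font, and the 20 px line spacing is an assumption.

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFont


def draw_green_lines(image, lines, left_margin=10, line_height=20):
    """Hypothetical sketch: draw text lines in green on a uint8 (H, W, 3) image."""
    canvas = Image.fromarray(image)
    draw = ImageDraw.Draw(canvas)
    # Assumption: PIL's default font replaces the font shipped with the module.
    font = ImageFont.load_default()
    y = 0
    for line in lines:
        # Green text, offset 10 px from the left edge by default.
        draw.text((left_margin, y), line, fill=(0, 255, 0), font=font)
        y += line_height  # Assumption: fixed vertical spacing between lines.
    return np.array(canvas)
```

The returned array has the same shape and dtype as the input, so it can be appended directly to a list of frames.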

put_info_on_image():

  • Convenience wrapper that formats a dict of metrics as "key: value" lines and renders them on the image.
  • Supports extra text lines appended after the metric lines.
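
The line formatting step can be illustrated in plain Python. This is a minimal sketch assuming default string conversion of values; `format_info_lines` is a hypothetical name, and the real function may round or format numbers differently.

```python
def format_info_lines(info, extras=None):
    """Hypothetical sketch: turn a metrics dict into "key: value" text lines."""
    # One line per metric, in dict insertion order.
    lines = [f"{key}: {value}" for key, value in info.items()]
    # Extra lines, if given, are appended after the metric lines.
    if extras is not None:
        lines.extend(extras)
    return lines
```

The resulting list of strings is the kind of input a text-rendering function such as put_text_on_image expects for its lines argument.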

Usage

Used for recording evaluation episodes, creating visualization videos, and overlaying debug information on rendered images. Commonly used in evaluation scripts and the ManiSkill viewer.

Code Reference

Source Location

Signature

def images_to_video(
    images: list[Array],
    output_dir: str,
    video_name: str,
    fps: int = 10,
    quality: Optional[float] = 5,
    verbose: bool = True,
    **kwargs,
) -> None: ...

def tile_images(images: list[Array], nrows=1) -> Array: ...
def put_text_on_image(image: np.ndarray, lines: list[str]) -> np.ndarray: ...
def put_info_on_image(image, info: dict[str, float], extras=None, overlay=True) -> np.ndarray: ...

Import

from mani_skill.utils.visualization.misc import images_to_video, tile_images, put_info_on_image

I/O Contract

Inputs

Name        Type         Required  Description
images      list[Array]  Yes       List of HxWx3 RGB images (uint8 or float)
output_dir  str          Yes       Directory to save the video
video_name  str          Yes       Name for the output video file
fps         int          No        Frames per second (default: 10)
nrows       int          No        Number of rows for tiling (default: 1)

Outputs

Name             Type        Description
(video file)     .mp4        Written to output_dir/video_name.mp4
tiled_image      Array       Single image combining all input images
annotated_image  np.ndarray  Image with text overlay

Usage Examples

Basic Usage

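from mani_skill.utils.visualization.misc import images_to_video, tile_images

# Assumes `env` is an existing Gymnasium-style environment whose render()
# returns RGB frames; creating the environment is not shown here.

# Record a video from rendered frames
frames = []
for _ in range(100):
    action = env.action_space.sample()  # placeholder policy: random actions
    obs, _, _, _, _ = env.step(action)
    frame = env.render()
    frames.append(frame)
# Writes output/episode.mp4 at 30 frames per second
images_to_video(frames, "output/", "episode", fps=30)

# Tile multiple camera views
# (cam1_img, cam2_img, cam3_img are assumed to be image arrays captured elsewhere)
tiled = tile_images([cam1_img, cam2_img, cam3_img], nrows=1)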
