Implementation:Haosulab ManiSkill VisualizationUtils
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Simulation, Visualization |
| Last Updated | 2026-02-15 08:00 GMT |
Overview
Concrete tool for creating videos from image sequences, tiling multiple images into grids, and overlaying text on images.
Description
The misc.py module in the visualization package provides image and video manipulation utilities for recording and displaying simulation outputs.
images_to_video():
- Converts a list of RGB image arrays (HxWx3) into an MP4 video using imageio/FFMPEG.
- Configurable FPS and quality (variable bitrate, 0-10 scale).
- Creates the output directory if it does not exist.
- Shows a tqdm progress bar during encoding when verbose is True.
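The steps above can be sketched roughly as follows. This is a minimal illustration and not the module's actual code: `prepare_video_path` and `write_video` are hypothetical helper names, the quality range check is an assumption based on the 0-10 scale mentioned above, and the progress bar is omitted. The imageio import is deferred so that the path helper works without imageio installed.

```python
import os


def prepare_video_path(output_dir, video_name):
    """Create output_dir if needed and return the path of the .mp4 file."""
    os.makedirs(output_dir, exist_ok=True)
    return os.path.join(output_dir, video_name + ".mp4")


def write_video(frames, output_dir, video_name, fps=10, quality=5):
    """Hypothetical sketch: encode HxWx3 frames into an MP4 with imageio/FFMPEG."""
    # Assumption: quality follows the 0-10 scale described in this page.
    if quality is not None and not 0 <= quality <= 10:
        raise ValueError("quality must be None or between 0 and 10")
    import imageio.v2 as imageio  # deferred import; needs imageio with ffmpeg support

    path = prepare_video_path(output_dir, video_name)
    writer = imageio.get_writer(path, fps=fps, quality=quality)
    try:
        for frame in frames:
            writer.append_data(frame)  # one RGB frame per call
    finally:
        writer.close()  # always finalize the file, even on error
    return path
```

Closing the writer in a `finally` block is a design choice of this sketch; it keeps a partially written file from being left open if a frame fails to encode.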
tile_images():
- Combines multiple images into a single tiled image with a configurable number of rows.
- Supports both regular images (H, W, C) and batched images (B, H, W, C).
- With nrows=1, images may differ in size; they are sorted by height and arranged in columns from left to right.
- With nrows>1, all images must have the same dimensions.
- Works with both numpy arrays and torch tensors.
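To make the tiling idea concrete, here is a simplified sketch for the same-size case only. It is not the library's implementation: `tile_grid` is a hypothetical name, it handles only unbatched numpy images of identical shape, and filling unused cells with zeros is an assumption of this sketch.

```python
import math

import numpy as np


def tile_grid(images, nrows=1):
    """Hypothetical sketch: tile same-size (H, W, C) images into nrows rows."""
    h, w, c = images[0].shape
    if any(im.shape != (h, w, c) for im in images):
        raise ValueError("this sketch requires all images to share one shape")
    ncols = math.ceil(len(images) / nrows)
    # Unused grid cells stay zero (black) in this sketch.
    canvas = np.zeros((nrows * h, ncols * w, c), dtype=images[0].dtype)
    for i, im in enumerate(images):
        col, row = divmod(i, nrows)  # fill each column top to bottom
        canvas[row * h:(row + 1) * h, col * w:(col + 1) * w] = im
    return canvas
```

For example, four 2x3 images with `nrows=2` produce a 4x6 canvas with the first two images stacked in the left column and the last two in the right column.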
put_text_on_image():
- Renders text lines onto an image using PIL.
- Uses the Ubuntu Sans Mono font (shipped with the module).
- Text is drawn in green with a 10 px left margin.
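A rough sketch of this kind of PIL text overlay is shown below. It is an approximation under stated assumptions: `draw_lines` is a hypothetical name, Pillow's built-in default font stands in for the bundled Ubuntu Sans Mono font, and the 12 px line spacing is chosen for illustration only. The input is assumed to be a uint8 HxWx3 array.

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFont


def draw_lines(image, lines, left_margin=10, line_height=12):
    """Hypothetical sketch: draw green text lines onto a uint8 HxWx3 image."""
    pil_image = Image.fromarray(image)
    draw = ImageDraw.Draw(pil_image)
    # Assumption: the default Pillow font replaces the module's bundled font.
    font = ImageFont.load_default()
    y = 0
    for line in lines:
        draw.text((left_margin, y), line, fill=(0, 255, 0), font=font)
        y += line_height  # illustrative spacing, not the library's value
    return np.array(pil_image)
```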
put_info_on_image():
- Convenience wrapper that formats a dict of metrics as "key: value" lines and renders them on the image.
- Supports extra text lines appended after the metric lines.
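The line formatting described above can be illustrated with a small standalone helper. `format_info_lines` is a hypothetical name, and the plain string conversion of values is an assumption; the actual function may round or format numbers differently.

```python
def format_info_lines(info, extras=None):
    """Hypothetical sketch: build "key: value" lines, then append extra lines."""
    lines = [f"{key}: {value}" for key, value in info.items()]
    if extras is not None:
        lines.extend(extras)  # extra lines come after the metric lines
    return lines
```

The resulting list of strings is the kind of input that put_text_on_image() accepts as its lines argument.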
Usage
Used for recording evaluation episodes, creating visualization videos, and overlaying debug information on rendered images. Commonly used in evaluation scripts and the ManiSkill viewer.
Code Reference
Source Location
- Repository: Haosulab_ManiSkill
- File: mani_skill/utils/visualization/misc.py
Signature
def images_to_video(
    images: list[Array],
    output_dir: str,
    video_name: str,
    fps: int = 10,
    quality: Optional[float] = 5,
    verbose: bool = True,
    **kwargs,
) -> None: ...
def tile_images(images: list[Array], nrows=1) -> Array: ...
def put_text_on_image(image: np.ndarray, lines: list[str]) -> np.ndarray: ...
def put_info_on_image(image, info: dict[str, float], extras=None, overlay=True) -> np.ndarray: ...
Import
from mani_skill.utils.visualization.misc import images_to_video, tile_images, put_info_on_image
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| images | list[Array] | Yes | List of HxWx3 RGB images (uint8 or float) |
| output_dir | str | Yes | Directory to save the video |
| video_name | str | Yes | Name for the output video file |
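| fps | int | No | Frames per second for images_to_video (default: 10) |
| quality | Optional[float] | No | Video quality on a 0-10 scale for images_to_video (default: 5) |
| verbose | bool | No | Show a tqdm progress bar during encoding (default: True) |
| nrows | int | No | Number of rows for tile_images (default: 1) |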
Outputs
| Name | Type | Description |
|---|---|---|
| (video file) | .mp4 | Written to output_dir/video_name.mp4 |
| tiled_image | Array | Single image combining all input images |
| annotated_image | np.ndarray | Image with text overlay |
Usage Examples
Basic Usage
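from mani_skill.utils.visualization.misc import images_to_video, tile_images

# Record a video from rendered frames.
# `env` is assumed to be an existing environment whose render() returns an HxWx3 RGB frame.
frames = []
for _ in range(100):
    action = env.action_space.sample()  # placeholder policy: random actions
    obs, _, _, _, _ = env.step(action)
    frame = env.render()
    frames.append(frame)
images_to_video(frames, "output/", "episode", fps=30)  # writes output/episode.mp4

# Tile multiple camera views into one row.
# cam1_img, cam2_img, and cam3_img are assumed to be existing image arrays.
tiled = tile_images([cam1_img, cam2_img, cam3_img], nrows=1)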