Workflow:Obss Sahi COCO Dataset Slicing

Knowledge Sources	SAHI, SAHI Slicing Docs, SAHI COCO Docs
Domains	Computer_Vision, Data_Engineering, Object_Detection
Last Updated	2026-02-08 12:00 GMT

Overview

End-to-end process for slicing a COCO-annotated dataset of large images into smaller tiles with properly adjusted annotations, producing a new COCO dataset suitable for training small-object detection models.

Description

This workflow takes a COCO-format dataset (images plus annotation JSON) and systematically slices each image into overlapping tiles. For each tile, the corresponding annotations are clipped to the tile boundaries and filtered by a minimum area ratio to discard annotations that become too small after cropping. The output is a new set of sliced images and a new COCO annotation JSON file where all coordinates are relative to the individual tiles. This is essential for training detection models on datasets where objects are small relative to the image resolution.

Usage

Execute this workflow when you have a COCO-annotated dataset of high-resolution images and need to prepare training data for small-object detection. Typical scenarios include satellite or aerial imagery datasets, microscopy datasets, or any collection where the training images are significantly larger than the model's input resolution. The input is a COCO annotation JSON file and an image directory. The output is a directory of sliced images and a corresponding COCO annotation JSON file.
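A minimal sketch of how the whole workflow might be invoked through SAHI. The wrapper name `slice_dataset` and the output name `"sliced"` are hypothetical. The keyword arguments follow the `slice_coco` function in `sahi.slicing` as I understand it, so check them against the installed SAHI version before relying on them.

```python
def slice_dataset(annotation_path, image_dir, output_dir,
                  slice_size=512, overlap_ratio=0.2, min_area_ratio=0.1):
    """Hypothetical convenience wrapper around SAHI's slice_coco."""
    # Imported lazily so this sketch can be loaded without SAHI installed.
    from sahi.slicing import slice_coco

    coco_dict, coco_path = slice_coco(
        coco_annotation_file_path=annotation_path,
        image_dir=image_dir,
        output_coco_annotation_file_name="sliced",  # assumed name, pick your own
        output_dir=output_dir,
        slice_height=slice_size,
        slice_width=slice_size,
        overlap_height_ratio=overlap_ratio,
        overlap_width_ratio=overlap_ratio,
        min_area_ratio=min_area_ratio,
        ignore_negative_samples=False,  # keep tiles without annotations
    )
    # coco_dict is the sliced dataset as a dictionary; coco_path is where
    # the JSON was written (only meaningful when output_dir is given).
    return coco_dict, coco_path
```

The steps below describe what happens inside such a call.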

Execution Steps

Step 1: Load COCO Dataset

Parse the COCO annotation JSON file and construct an internal representation of the dataset. Each image entry is mapped to its associated annotations (bounding boxes, segmentation masks, category IDs). The Coco class provides the structured access needed to iterate over images and their annotations.

Key considerations:

The COCO JSON must follow standard COCO format with images, annotations, and categories sections
Both bounding box and segmentation polygon annotations are supported
The image directory path is separate from the annotation file and must contain all referenced images
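The loading step can be illustrated with the standard library alone. This is a simplified sketch, not SAHI's Coco class: the function names `load_coco` and `missing_images` are assumptions made for illustration, and only the grouping of annotations by image is shown.

```python
import json
from collections import defaultdict
from pathlib import Path


def load_coco(annotation_path, image_dir):
    """Parse a COCO JSON file and attach each image's annotations to it."""
    with open(annotation_path, encoding="utf-8") as f:
        coco = json.load(f)

    # The three top-level sections are required by the COCO format.
    for key in ("images", "annotations", "categories"):
        if key not in coco:
            raise ValueError(f"COCO file is missing the '{key}' section")

    # Group annotations by the image they belong to.
    annotations_by_image = defaultdict(list)
    for annotation in coco["annotations"]:
        annotations_by_image[annotation["image_id"]].append(annotation)

    records = []
    for image in coco["images"]:
        records.append({
            "image": image,
            "path": Path(image_dir) / image["file_name"],
            "annotations": annotations_by_image.get(image["id"], []),
        })
    return coco["categories"], records


def missing_images(records):
    """Return the paths of referenced images that are not on disk."""
    return [record["path"] for record in records if not record["path"].exists()]
```

Running `missing_images` before slicing is a cheap way to confirm that the image directory really contains every file the annotation JSON refers to.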

Step 2: Configure Slicing Parameters

Define the tile dimensions (slice height and width), the overlap ratios, and the filtering threshold. Overlap between neighboring tiles means that an object lying on a tile boundary, if it is smaller than the overlap region, appears whole in at least one tile. The minimum area ratio threshold discards annotations that retain too little of their original area after being clipped to a tile.

Key considerations:

Default slice size is 512x512 pixels, with an overlap ratio of 0.2 along both height and width
Multiple slice sizes can be specified to generate datasets at different scales
Minimum area ratio of 0.1 (default) removes annotations cropped to less than 10% of their original area
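Output image extension can be specified; if it is not, SAHI picks one automatically (its documentation describes the default as following the source image's suffix, not a fixed JPG)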
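One way to keep these parameters together is a small validated configuration object. The class `SliceConfig` is hypothetical and not part of SAHI. Its field names and defaults mirror the values listed above, and the stride formula assumes the overlap in pixels is the ratio times the slice size, truncated to an integer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliceConfig:
    """Hypothetical container for the slicing parameters described above."""
    slice_height: int = 512
    slice_width: int = 512
    overlap_height_ratio: float = 0.2
    overlap_width_ratio: float = 0.2
    min_area_ratio: float = 0.1
    ignore_negative_samples: bool = False

    def __post_init__(self):
        if self.slice_height <= 0 or self.slice_width <= 0:
            raise ValueError("slice dimensions must be positive")
        for name in ("overlap_height_ratio", "overlap_width_ratio"):
            ratio = getattr(self, name)
            # A ratio of 1.0 would give a zero stride, so it is rejected.
            if not 0.0 <= ratio < 1.0:
                raise ValueError(f"{name} must be in [0, 1)")
        if not 0.0 <= self.min_area_ratio <= 1.0:
            raise ValueError("min_area_ratio must be in [0, 1]")

    def stride(self):
        """Return the (vertical, horizontal) step in pixels between tiles."""
        y_overlap = int(self.overlap_height_ratio * self.slice_height)
        x_overlap = int(self.overlap_width_ratio * self.slice_width)
        return self.slice_height - y_overlap, self.slice_width - x_overlap
```

With the defaults, the overlap is 102 pixels per axis, so consecutive tiles start 410 pixels apart.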

Step 3: Slice Images and Annotations

Iterate over each image in the dataset. For each image, calculate the tile grid based on the configured slice parameters. Extract each tile as a separate image array. For each annotation belonging to the source image, check whether it intersects with the current tile. If it does, clip the annotation geometry to the tile boundaries and compute the area ratio. Annotations passing the area ratio threshold are added to the tile's annotation list with coordinates adjusted to be tile-relative.
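The tile grid calculation can be sketched as follows. This is an illustrative reimplementation under the assumptions stated in the comments, not SAHI's own code, and the function name `tile_grid` is hypothetical. Tiles are returned as (left, top, right, bottom) pixel boxes, and a tile that would overrun the image is shifted back so that it stays inside.

```python
def tile_grid(image_width, image_height,
              slice_width=512, slice_height=512,
              overlap_width_ratio=0.2, overlap_height_ratio=0.2):
    """Return tile boxes (left, top, right, bottom) covering the image."""
    # Assumption: overlap in pixels is the ratio times the slice size, truncated.
    x_overlap = int(overlap_width_ratio * slice_width)
    y_overlap = int(overlap_height_ratio * slice_height)

    tiles = []
    y_min = y_max = 0
    while y_max < image_height:
        y_max = y_min + slice_height
        x_min = x_max = 0
        while x_max < image_width:
            x_max = x_min + slice_width
            # Clamp to the image, then shift the tile back so it keeps its
            # full size whenever the image is large enough to allow that.
            right = min(x_max, image_width)
            bottom = min(y_max, image_height)
            left = max(0, right - slice_width)
            top = max(0, bottom - slice_height)
            tiles.append((left, top, right, bottom))
            x_min = x_max - x_overlap
        y_min = y_max - y_overlap
    return tiles
```

For a 1000x1000 image with the default parameters this yields a 3x3 grid whose last tile is (488, 488, 1000, 1000), so the final row and column overlap their neighbors by more than the nominal 102 pixels.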

Key considerations:

Tiles at image edges are adjusted to prevent exceeding image boundaries
Polygon segmentation annotations are clipped using Shapely geometry operations
Annotations with topological errors are skipped with a warning
Negative samples (tiles without annotations) are included by default but can be optionally excluded
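The clipping and filtering of annotations can be illustrated for the bounding-box case. This sketch handles only COCO-style `[x, y, width, height]` boxes with plain arithmetic; polygon clipping with Shapely is not shown. The function names are hypothetical, and treating a ratio exactly equal to the threshold as a keep is an assumption.

```python
def clip_bbox_to_tile(bbox, tile, min_area_ratio=0.1):
    """Clip a COCO bbox to a tile; return a tile-relative bbox or None."""
    x, y, w, h = bbox
    left, top, right, bottom = tile

    # Intersection of the box with the tile, in source-image coordinates.
    ix_min, iy_min = max(x, left), max(y, top)
    ix_max, iy_max = min(x + w, right), min(y + h, bottom)
    if ix_max <= ix_min or iy_max <= iy_min:
        return None  # the box does not touch this tile

    original_area = w * h
    if original_area <= 0:
        return None  # degenerate annotation
    clipped_area = (ix_max - ix_min) * (iy_max - iy_min)
    if clipped_area / original_area < min_area_ratio:
        return None  # too little of the object survives the crop

    # Shift the origin so coordinates are relative to the tile.
    return [ix_min - left, iy_min - top, ix_max - ix_min, iy_max - iy_min]


def slice_annotations(annotations, tile, min_area_ratio=0.1):
    """Return copies of the annotations that survive clipping to the tile."""
    kept = []
    for annotation in annotations:
        clipped = clip_bbox_to_tile(annotation["bbox"], tile, min_area_ratio)
        if clipped is not None:
            kept.append({**annotation, "bbox": clipped})
    return kept
```

A tile for which `slice_annotations` returns an empty list is a negative sample; whether it is written out depends on the negative-sample setting described above.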

Step 4: Export Sliced Dataset

Write the sliced tile images to the output directory. Construct a new COCO annotation dictionary with the sliced images and their adjusted annotations. Each tile image gets a unique filename derived from the original image name plus the tile coordinates. Save the assembled COCO JSON to the output directory.

Key considerations:

Output filenames encode the tile coordinates for traceability (e.g., original_0_0_512_512.png)
The output COCO JSON preserves the original category definitions
Image export uses multi-threaded I/O for performance
The exported dataset is immediately usable for training with any COCO-compatible training framework
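The export step can be sketched with the standard library, leaving out the image writing itself. The helper names and the shape of the `tiles` input are assumptions made for illustration. Only bounding-box annotations are handled, so `area` is computed as the box area.

```python
import json
from pathlib import Path


def tile_file_name(stem, tile, ext=".png"):
    """Build a file name that encodes the tile's corner coordinates."""
    left, top, right, bottom = tile
    return f"{stem}_{left}_{top}_{right}_{bottom}{ext}"


def build_sliced_coco(categories, tiles):
    """Assemble a COCO dictionary from per-tile records.

    Each item of `tiles` is assumed to be a dict with the keys
    file_name, width, height and annotations (tile-relative bboxes).
    """
    images, annotations = [], []
    next_annotation_id = 1
    for image_id, tile in enumerate(tiles, start=1):
        images.append({
            "id": image_id,
            "file_name": tile["file_name"],
            "width": tile["width"],
            "height": tile["height"],
        })
        for annotation in tile["annotations"]:
            x, y, w, h = annotation["bbox"]
            annotations.append({
                "id": next_annotation_id,
                "image_id": image_id,
                "category_id": annotation["category_id"],
                "bbox": [x, y, w, h],
                "area": w * h,  # bbox area; polygons would need their own area
                "iscrowd": annotation.get("iscrowd", 0),
            })
            next_annotation_id += 1
    # Category definitions are carried over unchanged.
    return {"images": images, "annotations": annotations,
            "categories": list(categories)}


def save_coco(coco_dict, output_dir, name="sliced_coco.json"):
    """Write the COCO dictionary to output_dir and return the file path."""
    output_path = Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)
    json_path = output_path / name
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(coco_dict, f)
    return json_path
```

Because every tile becomes its own image entry with fresh image and annotation IDs, the result is an ordinary COCO file with no link back to the source images other than the coordinates encoded in the file names.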

GitHub URL

Workflow Repository