Workflow:Obss Sahi COCO Dataset Slicing

From Leeroopedia


Knowledge Sources
Domains: Computer_Vision, Data_Engineering, Object_Detection
Last Updated: 2026-02-08 12:00 GMT

Overview

End-to-end process for slicing a COCO-annotated dataset of large images into smaller tiles with properly adjusted annotations, producing a new COCO dataset suitable for training small-object detection models.

Description

This workflow takes a COCO-format dataset (images plus annotation JSON) and systematically slices each image into overlapping tiles. For each tile, the corresponding annotations are clipped to the tile boundaries and filtered by a minimum area ratio to discard annotations that become too small after cropping. The output is a new set of sliced images and a new COCO annotation JSON file where all coordinates are relative to the individual tiles. This is essential for training detection models on datasets where objects are small relative to the image resolution.

Usage

Execute this workflow when you have a COCO-annotated dataset of high-resolution images and need to prepare training data for small-object detection. Typical scenarios include satellite or aerial imagery datasets, microscopy datasets, or any collection where the training images are significantly larger than the model's input resolution. The input is a COCO annotation JSON file and an image directory. The output is a directory of sliced images and a corresponding COCO annotation JSON file.

Execution Steps

Step 1: Load COCO Dataset

Parse the COCO annotation JSON file and construct an internal representation of the dataset. Each image entry is mapped to its associated annotations (bounding boxes, segmentation masks, category IDs). SAHI's Coco class provides the structured access needed to iterate over images and their annotations.

Key considerations:

  • The COCO JSON must follow standard COCO format with images, annotations, and categories sections
  • Both bounding box and segmentation polygon annotations are supported
  • The image directory path is separate from the annotation file and must contain all referenced images
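The mapping from images to annotations described above can be illustrated with the standard library alone. This is a minimal sketch, not SAHI's Coco class: the helper names index_coco and load_coco and the sample data are hypothetical and exist only for illustration.

```python
import json
from collections import defaultdict


def index_coco(coco: dict) -> dict:
    """Map each image id to its image record and its list of annotations."""
    # Hypothetical helper: checks the three sections named in the text.
    for key in ("images", "annotations", "categories"):
        if key not in coco:
            raise KeyError(f"COCO dict is missing the '{key}' section")

    anns_by_image = defaultdict(list)
    for ann in coco["annotations"]:
        anns_by_image[ann["image_id"]].append(ann)

    return {
        img["id"]: {"image": img, "annotations": anns_by_image.get(img["id"], [])}
        for img in coco["images"]
    }


def load_coco(path: str) -> dict:
    """Read a COCO annotation JSON file and index it by image id."""
    with open(path, encoding="utf-8") as f:
        return index_coco(json.load(f))


# Made-up sample: image 2 has no annotations (a potential negative sample).
sample = {
    "images": [
        {"id": 1, "file_name": "a.jpg", "width": 1024, "height": 1024},
        {"id": 2, "file_name": "b.jpg", "width": 1024, "height": 1024},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [500, 100, 40, 40]},
    ],
    "categories": [{"id": 1, "name": "vehicle"}],
}
index = index_coco(sample)
print(len(index[1]["annotations"]), len(index[2]["annotations"]))
```

Images without annotations stay in the index with an empty list, so later steps can decide whether to keep them as negative samples.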

Step 2: Configure Slicing Parameters

Define the tile dimensions (slice height and width), overlap ratios, and filtering thresholds. Overlap between adjacent tiles means that an object cut by one tile boundary can still appear whole in a neighboring tile, provided the object is smaller than the overlap region. The minimum area ratio threshold filters out annotations that become too small after clipping to a tile boundary.

Key considerations:

  • Default slice size is 512x512 pixels with 0.2 overlap ratio
  • Multiple slice sizes can be specified to generate datasets at different scales
  • Minimum area ratio of 0.1 (default) removes annotations cropped to less than 10% of their original area
  • Output image format can be specified (default exports as JPG)
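To make the relationship between slice size and overlap ratio concrete, the following sketch groups the parameters in a small dataclass and derives the step between successive tile origins. The SliceConfig class and its field names are assumptions chosen for illustration, as is the truncation of the overlap to whole pixels; they are not presented as the library's own configuration object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliceConfig:
    """Hypothetical container for the slicing parameters described above."""

    slice_height: int = 512
    slice_width: int = 512
    overlap_height_ratio: float = 0.2
    overlap_width_ratio: float = 0.2
    min_area_ratio: float = 0.1
    ignore_negative_samples: bool = False

    def __post_init__(self):
        if self.slice_height <= 0 or self.slice_width <= 0:
            raise ValueError("slice dimensions must be positive")
        for name in ("overlap_height_ratio", "overlap_width_ratio"):
            value = getattr(self, name)
            # A ratio of 1.0 would give a zero step and never advance.
            if not 0.0 <= value < 1.0:
                raise ValueError(f"{name} must be in [0, 1), got {value}")

    @property
    def stride(self):
        """(y_step, x_step): pixels between successive tile origins."""
        y_overlap = int(self.overlap_height_ratio * self.slice_height)
        x_overlap = int(self.overlap_width_ratio * self.slice_width)
        return self.slice_height - y_overlap, self.slice_width - x_overlap


cfg = SliceConfig()
print(cfg.stride)  # (410, 410): 512 minus a 102-pixel overlap
```

With the defaults, neighboring tiles share a band of about 102 pixels, so objects narrower than that band can appear uncut in at least one tile.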

Step 3: Slice Images and Annotations

Iterate over each image in the dataset. For each image, calculate the tile grid based on the configured slice parameters. Extract each tile as a separate image array. For each annotation belonging to the source image, check whether it intersects with the current tile. If it does, clip the annotation geometry to the tile boundaries and compute the area ratio (the clipped area divided by the annotation's original area). Annotations that meet the minimum area ratio are added to the tile's annotation list with coordinates adjusted to be tile-relative.

Key considerations:

  • Tiles at image edges are adjusted to prevent exceeding image boundaries
  • Polygon segmentation annotations are clipped using Shapely geometry operations
  • Annotations whose geometry raises a topological error during clipping (for example, an invalid polygon) are skipped with a warning
  • Negative samples (tiles without annotations) are included by default but can be excluded
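The grid calculation and the clip-and-filter logic for this step can be sketched for bounding boxes as follows. This is an illustrative sketch under stated assumptions, not the library's implementation: the function names tile_boxes and clip_bbox_to_tile are hypothetical, edge tiles are assumed to be shifted back so they keep their full size, and polygon clipping with Shapely is left out.

```python
def tile_boxes(image_w, image_h, slice_w=512, slice_h=512,
               overlap_w=0.2, overlap_h=0.2):
    """Return tiles as [x_min, y_min, x_max, y_max] covering the image."""
    step_x = slice_w - int(slice_w * overlap_w)
    step_y = slice_h - int(slice_h * overlap_h)
    boxes = []
    y = 0
    while True:
        # Clamp to the image, then shift back to keep the tile full-size.
        y_max = min(y + slice_h, image_h)
        y_min = max(0, y_max - slice_h)
        x = 0
        while True:
            x_max = min(x + slice_w, image_w)
            x_min = max(0, x_max - slice_w)
            boxes.append([x_min, y_min, x_max, y_max])
            if x_max >= image_w:
                break
            x += step_x
        if y_max >= image_h:
            break
        y += step_y
    return boxes


def clip_bbox_to_tile(bbox, tile, min_area_ratio=0.1):
    """Clip a COCO [x, y, w, h] box to a tile.

    Returns the tile-relative [x, y, w, h] box, or None if the box misses
    the tile or keeps less than min_area_ratio of its original area.
    """
    x, y, w, h = bbox
    if w <= 0 or h <= 0:
        return None
    ix_min, iy_min = max(x, tile[0]), max(y, tile[1])
    ix_max, iy_max = min(x + w, tile[2]), min(y + h, tile[3])
    iw, ih = ix_max - ix_min, iy_max - iy_min
    if iw <= 0 or ih <= 0:
        return None  # no intersection with this tile
    if (iw * ih) / (w * h) < min_area_ratio:
        return None  # too little of the object survives the crop
    return [ix_min - tile[0], iy_min - tile[1], iw, ih]


tiles = tile_boxes(1024, 1024)
box = [500, 100, 40, 40]  # straddles the right edge of the first tile
for tile in tiles[:2]:
    print(tile, clip_bbox_to_tile(box, tile))
```

In this example the box is cut to 12 pixels wide in the first tile but appears whole in the second, which is the effect the overlap is meant to achieve.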

Step 4: Export Sliced Dataset

Write the sliced tile images to the output directory. Construct a new COCO annotation dictionary with the sliced images and their adjusted annotations. Each tile image gets a unique filename derived from the original image name plus the tile coordinates. Save the assembled COCO JSON to the output directory.

Key considerations:

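  • Output filenames encode the tile coordinates for traceability (e.g., original_0_0_512_512.png, where the numbers give the tile's top-left and bottom-right corners in the source image and the extension follows the configured output format)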
  • The output COCO JSON preserves the original category definitions
  • Image export uses multi-threaded I/O for performance
  • The exported dataset is immediately usable for training with any COCO-compatible training framework
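The naming scheme and the assembly of the output annotation dictionary might look like the sketch below. The helpers tile_file_name and build_sliced_coco, the sequential id assignment, and the input tuple layout are assumptions made for illustration; image writing and multi-threaded I/O are omitted.

```python
import json
from pathlib import Path


def tile_file_name(original_name, tile, ext=".png"):
    """Build a name such as original_0_0_512_512.png from tile coordinates."""
    x_min, y_min, x_max, y_max = tile
    return f"{Path(original_name).stem}_{x_min}_{y_min}_{x_max}_{y_max}{ext}"


def build_sliced_coco(categories, tiles):
    """Assemble a COCO dict from sliced tiles.

    tiles: iterable of (file_name, width, height, annotations), where each
    annotation holds a tile-relative COCO bbox and a category_id.
    """
    # Category definitions are carried over unchanged.
    coco = {"images": [], "annotations": [], "categories": list(categories)}
    ann_id = 1
    for image_id, (file_name, width, height, anns) in enumerate(tiles, start=1):
        coco["images"].append(
            {"id": image_id, "file_name": file_name, "width": width, "height": height}
        )
        for ann in anns:
            x, y, w, h = ann["bbox"]
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": ann["category_id"],
                "bbox": [x, y, w, h],
                "area": w * h,
                "iscrowd": ann.get("iscrowd", 0),
            })
            ann_id += 1
    return coco


categories = [{"id": 1, "name": "vehicle"}]
name = tile_file_name("original.jpg", [0, 0, 512, 512])
sliced = build_sliced_coco(
    categories,
    [
        (name, 512, 512, [{"bbox": [500, 100, 12, 40], "category_id": 1}]),
        (tile_file_name("original.jpg", [410, 0, 922, 512]), 512, 512, []),
    ],
)
print(json.dumps(sliced["images"], indent=2))
```

The second tile has no annotations, so it is kept as a negative sample with an image entry only; writing the result to disk is a single json.dump call on the returned dictionary.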
