Workflow:Obss Sahi COCO Dataset Slicing
| Knowledge Sources | |
|---|---|
| Domains | Computer_Vision, Data_Engineering, Object_Detection |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
End-to-end process for slicing a COCO-annotated dataset of large images into smaller tiles with properly adjusted annotations, producing a new COCO dataset suitable for training small-object detection models.
Description
This workflow takes a COCO-format dataset (images plus annotation JSON) and systematically slices each image into overlapping tiles. For each tile, the corresponding annotations are clipped to the tile boundaries and filtered by a minimum area ratio to discard annotations that become too small after cropping. The output is a new set of sliced images and a new COCO annotation JSON file in which all coordinates are relative to the individual tiles. This matters for datasets where objects are small relative to the image resolution: downscaling a full image to the model's input size can shrink such objects to a few pixels, whereas tiles keep them at their native resolution.
Usage
Execute this workflow when you have a COCO-annotated dataset of high-resolution images and need to prepare training data for small-object detection. Typical scenarios include satellite or aerial imagery datasets, microscopy datasets, or any collection where the training images are significantly larger than the model's input resolution. The input is a COCO annotation JSON file and an image directory. The output is a directory of sliced images and a corresponding COCO annotation JSON file.
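For orientation, the whole workflow can be driven from Python through SAHI's slice_coco function. The sketch below assumes an installed SAHI release in which sahi.slicing.slice_coco accepts these keyword arguments and returns the sliced COCO dictionary together with the path of the saved JSON. The wrapper name slice_dataset, the output name "sliced", and the paths in the usage comment are hypothetical.

```python
def slice_dataset(annotation_path, image_dir, output_dir,
                  slice_size=512, overlap_ratio=0.2):
    """Slice a COCO dataset into square tiles (hypothetical thin wrapper)."""
    # Imported lazily so this module still loads where SAHI is not installed.
    from sahi.slicing import slice_coco

    coco_dict, coco_path = slice_coco(
        coco_annotation_file_path=annotation_path,
        image_dir=image_dir,
        output_coco_annotation_file_name="sliced",  # hypothetical output name
        output_dir=output_dir,
        slice_height=slice_size,
        slice_width=slice_size,
        overlap_height_ratio=overlap_ratio,
        overlap_width_ratio=overlap_ratio,
        min_area_ratio=0.1,
        ignore_negative_samples=False,
    )
    return coco_dict, coco_path


# Example call with placeholder paths:
# coco_dict, coco_path = slice_dataset("annotations.json", "images/", "sliced/")
```

The steps below describe what happens inside such a call.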
Execution Steps
Step 1: Load COCO Dataset
Parse the COCO annotation JSON file and construct an internal representation of the dataset. Each image entry is mapped to its associated annotations (bounding boxes, segmentation masks, category IDs). SAHI's Coco class provides the structured access needed to iterate over images and their annotations.
Key considerations:
- The COCO JSON must follow standard COCO format with images, annotations, and categories sections
- Both bounding box and segmentation polygon annotations are supported
- The image directory path is separate from the annotation file and must contain all referenced images
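The image-to-annotation mapping described in this step can be sketched with the standard library alone. This is an illustration of the idea, not SAHI's Coco class; the function names load_coco and group_annotations_by_image and the sample data are hypothetical.

```python
import json
from collections import defaultdict


def load_coco(path):
    """Read a COCO annotation file into a plain dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def group_annotations_by_image(coco):
    """Map each image id to the list of its annotation dicts."""
    for key in ("images", "annotations", "categories"):
        if key not in coco:
            raise ValueError(f"missing COCO section: {key}")
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        grouped[ann["image_id"]].append(ann)
    # Images without annotations still get an (empty) entry.
    return {img["id"]: grouped.get(img["id"], []) for img in coco["images"]}


# Small in-memory dataset standing in for a real annotation file.
example = {
    "images": [
        {"id": 1, "file_name": "a.jpg", "width": 1000, "height": 1000},
        {"id": 2, "file_name": "b.jpg", "width": 1000, "height": 1000},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [500, 100, 40, 40]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [20, 30, 15, 15]},
    ],
    "categories": [{"id": 1, "name": "vehicle"}],
}
annotations_by_image = group_annotations_by_image(example)
```

Keeping an empty list for images without annotations makes it possible to treat them as negative samples later in the workflow.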
Step 2: Configure Slicing Parameters
Define the tile dimensions (slice height and width), overlap ratios, and filtering thresholds. The overlap ratio controls how much neighboring tiles share: an object that is no larger than the overlap region is fully contained in at least one tile, even if a tile boundary cuts through it in another. The minimum area ratio threshold filters out annotations that become too small after clipping to a tile boundary.
Key considerations:
- The default slice size is 512x512 pixels with an overlap ratio of 0.2
- Multiple slice sizes can be specified to generate datasets at different scales
- The default minimum area ratio of 0.1 removes annotations whose clipped area is less than 10% of their original area
- Output image format can be specified (default exports as JPG)
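As a minimal sketch, the parameters above can be collected in one validated object. The SliceConfig class is hypothetical and not part of SAHI. Its defaults mirror the values listed in this step, and the overlap in pixels is assumed to be the ratio times the slice size, truncated to an integer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliceConfig:
    """Hypothetical container for the slicing parameters described above."""
    slice_height: int = 512
    slice_width: int = 512
    overlap_height_ratio: float = 0.2
    overlap_width_ratio: float = 0.2
    min_area_ratio: float = 0.1
    ignore_negative_samples: bool = False
    out_ext: str = ".jpg"

    def __post_init__(self):
        if self.slice_height <= 0 or self.slice_width <= 0:
            raise ValueError("slice dimensions must be positive")
        for ratio in (self.overlap_height_ratio, self.overlap_width_ratio):
            if not 0.0 <= ratio < 1.0:
                raise ValueError("overlap ratios must lie in [0, 1)")
        if not 0.0 <= self.min_area_ratio <= 1.0:
            raise ValueError("min_area_ratio must lie in [0, 1]")

    @property
    def overlap_pixels(self):
        """Overlap between neighboring tiles as (height, width) in pixels."""
        return (int(self.overlap_height_ratio * self.slice_height),
                int(self.overlap_width_ratio * self.slice_width))

    @property
    def stride(self):
        """Step between tile origins as (vertical, horizontal) in pixels."""
        overlap_h, overlap_w = self.overlap_pixels
        return (self.slice_height - overlap_h, self.slice_width - overlap_w)


config = SliceConfig()
```

With the defaults, neighboring tiles share 102 pixels and tile origins advance by 410 pixels, so objects up to roughly 100 pixels across fit entirely inside at least one tile.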
Step 3: Slice Images and Annotations
Iterate over each image in the dataset. For each image, calculate the tile grid based on the configured slice parameters. Extract each tile as a separate image array. For each annotation belonging to the source image, check whether it intersects with the current tile. If it does, clip the annotation geometry to the tile boundaries and compute the area ratio. Annotations passing the area ratio threshold are added to the tile's annotation list with coordinates adjusted to be tile-relative.
Key considerations:
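- Tiles that would extend past the right or bottom edge are shifted back inside the image, so edge tiles keep the full slice size where the image is large enough and overlap their neighbors by more than the configured ratio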
- Polygon segmentation annotations are clipped using Shapely geometry operations
- Annotations with topological errors are skipped with a warning
- Negative samples (tiles without annotations) are included by default but can be optionally excluded
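The grid calculation and the clip-and-filter rule for bounding boxes can be sketched in plain Python. This is a simplified illustration under stated assumptions, not SAHI's implementation: it handles only axis-aligned boxes in COCO [x, y, width, height] form (polygon clipping would need a geometry library such as Shapely), and the function names tile_grid and clip_bbox_to_tile are hypothetical.

```python
def tile_grid(image_height, image_width, slice_height=512, slice_width=512,
              overlap_height_ratio=0.2, overlap_width_ratio=0.2):
    """Return tiles as [xmin, ymin, xmax, ymax], shifting edge tiles inside the image."""
    y_overlap = int(overlap_height_ratio * slice_height)
    x_overlap = int(overlap_width_ratio * slice_width)
    if y_overlap >= slice_height or x_overlap >= slice_width:
        raise ValueError("overlap must be smaller than the slice size")
    tiles = []
    y_min = 0
    while True:
        y_max = min(y_min + slice_height, image_height)
        y_start = max(0, y_max - slice_height)  # shift back at the bottom edge
        x_min = 0
        while True:
            x_max = min(x_min + slice_width, image_width)
            x_start = max(0, x_max - slice_width)  # shift back at the right edge
            tiles.append([x_start, y_start, x_max, y_max])
            if x_max >= image_width:
                break
            x_min = x_max - x_overlap
        if y_max >= image_height:
            break
        y_min = y_max - y_overlap
    return tiles


def clip_bbox_to_tile(bbox, tile, min_area_ratio=0.1):
    """Clip a COCO [x, y, w, h] box to a tile; return a tile-relative box or None."""
    x, y, w, h = bbox
    xmin, ymin, xmax, ymax = tile
    ix1, iy1 = max(x, xmin), max(y, ymin)
    ix2, iy2 = min(x + w, xmax), min(y + h, ymax)
    iw, ih = ix2 - ix1, iy2 - iy1
    if w <= 0 or h <= 0 or iw <= 0 or ih <= 0:
        return None  # degenerate box or no intersection with this tile
    if (iw * ih) / (w * h) < min_area_ratio:
        return None  # too little of the object survives the crop
    return [ix1 - xmin, iy1 - ymin, iw, ih]


tiles = tile_grid(1000, 1000)
kept = clip_bbox_to_tile([500, 100, 40, 40], tiles[0])
dropped = clip_bbox_to_tile([510, 100, 40, 40], tiles[0])
```

For a 1000x1000 image with the default parameters this yields a 3x3 grid. A box that keeps 30% of its area in the first tile is retained with clipped coordinates, while one that keeps only 5% is dropped there and picked up whole by the neighboring tile.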
Step 4: Export Sliced Dataset
Write the sliced tile images to the output directory. Construct a new COCO annotation dictionary with the sliced images and their adjusted annotations. Each tile image gets a unique filename derived from the original image name plus the tile coordinates. Save the assembled COCO JSON to the output directory.
Key considerations:
- Output filenames encode the tile coordinates for traceability (e.g., original_0_0_512_512.png, where the four numbers are the tile's xmin, ymin, xmax, and ymax in the source image)
- The output COCO JSON preserves the original category definitions
- Image export uses multi-threaded I/O for performance
- The exported dataset is immediately usable for training with any COCO-compatible training framework
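The naming scheme and the assembly of the output annotation dictionary can be sketched as follows. The helper names tile_file_name and assemble_coco and the sample tiles are hypothetical, and the area field is assumed to be the bounding-box area, which is a simplification for polygon annotations.

```python
import json
from pathlib import Path


def tile_file_name(original_name, tile, ext=".png"):
    """Build a tile filename such as original_0_0_512_512.png."""
    xmin, ymin, xmax, ymax = tile
    return f"{Path(original_name).stem}_{xmin}_{ymin}_{xmax}_{ymax}{ext}"


def assemble_coco(tiles, categories):
    """Build a COCO dict from tiles, assigning fresh image and annotation ids.

    Each tile is a dict with "file_name", "width", "height" and "annotations",
    where every annotation carries a tile-relative "bbox" and a "category_id".
    """
    images, annotations = [], []
    next_ann_id = 1
    for image_id, tile in enumerate(tiles, start=1):
        images.append({"id": image_id, "file_name": tile["file_name"],
                       "width": tile["width"], "height": tile["height"]})
        for ann in tile["annotations"]:
            _, _, w, h = ann["bbox"]
            annotations.append({"id": next_ann_id, "image_id": image_id,
                                "category_id": ann["category_id"],
                                "bbox": ann["bbox"], "area": w * h, "iscrowd": 0})
            next_ann_id += 1
    # Category definitions are copied over unchanged.
    return {"images": images, "annotations": annotations,
            "categories": list(categories)}


categories = [{"id": 1, "name": "vehicle"}]
tiles = [
    {"file_name": tile_file_name("original.jpg", [0, 0, 512, 512]),
     "width": 512, "height": 512,
     "annotations": [{"bbox": [500, 100, 12, 40], "category_id": 1}]},
    {"file_name": tile_file_name("original.jpg", [410, 0, 922, 512]),
     "width": 512, "height": 512, "annotations": []},  # negative sample
]
sliced_coco = assemble_coco(tiles, categories)
sliced_json = json.dumps(sliced_coco, indent=2)  # text to write to the output JSON file
```

Assigning fresh image and annotation ids avoids collisions, since a single source image contributes many tiles and a single source annotation may appear in several of them.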