
Principle:Obss Sahi COCO Dataset Slicing

From Leeroopedia


Knowledge Sources
Domains Object_Detection, Data_Engineering, Image_Processing
Last Updated 2026-02-08 12:00 GMT

Overview

A dataset preparation technique that tiles COCO-annotated images and their corresponding annotations into smaller patches, producing a new COCO dataset optimized for training small object detection models.

Description

While slicing images at inference time improves small object detection, training the model on sliced data improves it further. COCO dataset slicing applies the same tiling strategy to the training set, producing a new dataset where:

  • Each original image is divided into overlapping tiles
  • Annotations (bounding boxes and segmentation masks) are clipped to each tile boundary
  • Annotations that fall below a minimum area ratio after clipping are discarded
  • The sliced images and adjusted annotations are exported as a new COCO-format dataset
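The first step in the list above is computing the tile grid. A minimal sketch of overlapping-tile coordinate generation (illustrative code, not SAHI's actual implementation; SAHI's own slicing utilities live in `sahi.slicing`):

```python
def compute_slice_bboxes(image_h, image_w, slice_h=512, slice_w=512,
                         overlap_ratio=0.2):
    """Return (x1, y1, x2, y2) tile boxes covering the image with overlap."""
    step_h = int(slice_h * (1 - overlap_ratio))
    step_w = int(slice_w * (1 - overlap_ratio))
    bboxes = []
    y = 0
    while True:
        # Clamp the tile to the image; shift it back so it keeps full size.
        y2 = min(y + slice_h, image_h)
        y1 = max(0, y2 - slice_h)
        x = 0
        while True:
            x2 = min(x + slice_w, image_w)
            x1 = max(0, x2 - slice_w)
            bboxes.append((x1, y1, x2, y2))
            if x2 >= image_w:
                break
            x += step_w
        if y2 >= image_h:
            break
        y += step_h
    return bboxes
```

With this scheme, a 1024x1024 image sliced into 512-pixel tiles at 20% overlap yields a 3x3 grid of nine tiles; tiles at the right and bottom edges are shifted inward so every tile keeps the full slice size.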

This principle ensures that the model trains on the same data distribution it will see during sliced inference, improving detection accuracy for small objects.

Key challenges addressed:

  • Annotation clipping: Bounding boxes and segmentation polygons must be geometrically intersected with the slice boundary
  • Area filtering: Heavily clipped annotations (where most of the object falls outside the slice) are removed to avoid training on incomplete objects
  • Coordinate adjustment: Annotation coordinates are transformed from full-image space to slice-local space
  • Multi-scale support: Multiple slice sizes can be specified to create a multi-scale training dataset
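The first three challenges can be sketched in one self-contained function. This is an illustrative reimplementation, not SAHI's code; annotations are assumed to be plain dicts carrying a COCO-style `[x, y, w, h]` bbox, and segmentation-polygon clipping is omitted:

```python
def slice_annotations(annotations, slice_bbox, min_area_ratio=0.1):
    """Clip COCO-style annotations to one slice and shift to local coords."""
    sx1, sy1, sx2, sy2 = slice_bbox
    sliced = []
    for ann in annotations:
        x, y, w, h = ann["bbox"]
        # Intersect the annotation box with the slice boundary.
        ix1, iy1 = max(x, sx1), max(y, sy1)
        ix2, iy2 = min(x + w, sx2), min(y + h, sy2)
        if ix1 >= ix2 or iy1 >= iy2:
            continue  # annotation does not overlap this slice
        if (ix2 - ix1) * (iy2 - iy1) / (w * h) < min_area_ratio:
            continue  # mostly outside the slice: discard the fragment
        # Shift coordinates from full-image space to slice-local space.
        sliced.append({**ann,
                       "bbox": [ix1 - sx1, iy1 - sy1, ix2 - ix1, iy2 - iy1]})
    return sliced
```

Boxes entirely outside the slice are skipped, heavily clipped boxes are filtered by area ratio, and surviving boxes are re-expressed relative to the slice origin.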

Usage

Use COCO dataset slicing when preparing training data for small object detection tasks. This is typically done offline as a preprocessing step before training. Common use cases include:

  • Aerial/satellite imagery where objects are very small
  • Surveillance datasets with distant subjects
  • Any dataset where the target objects are small relative to image resolution

Theoretical Basis

The annotation slicing process for each tile:

# Pseudocode for annotation slicing
def slice_annotations(annotations, slice_bbox, min_area_ratio):
    sliced = []
    for ann in annotations:
        if not overlaps(ann.bbox, slice_bbox):
            continue
        clipped_ann = clip_to_boundary(ann, slice_bbox)
        if clipped_ann.area / ann.area >= min_area_ratio:
            # Shift coordinates to slice-local space
            clipped_ann.bbox -= [slice_bbox.x, slice_bbox.y, 0, 0]
            sliced.append(clipped_ann)
    return sliced

The min_area_ratio parameter (default 0.1) controls the aggressiveness of filtering. A value of 0.1 means annotations must retain at least 10% of their original area after clipping to be included. This prevents training on tiny fragments of objects that happen to barely overlap with a slice boundary.
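To make the threshold concrete, the retained-area ratio can be computed directly (a toy helper for illustration, not a library function):

```python
def retained_area_ratio(bbox, slice_bbox):
    """Fraction of a COCO [x, y, w, h] box that survives clipping to a slice."""
    x, y, w, h = bbox
    sx1, sy1, sx2, sy2 = slice_bbox
    inter_w = max(0, min(x + w, sx2) - max(x, sx1))
    inter_h = max(0, min(y + h, sy2) - max(y, sy1))
    return (inter_w * inter_h) / (w * h)
```

For example, a 100x100 box whose 20x20 corner falls inside a (0, 0, 256, 256) slice retains only 4% of its area, so it is discarded at the default min_area_ratio of 0.1 but would survive a threshold of 0.04 or lower.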
