Principle: OBSS SAHI COCO Dataset Slicing
| Knowledge Sources | |
|---|---|
| Domains | Object_Detection, Data_Engineering, Image_Processing |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
A dataset preparation technique that tiles COCO-annotated images and their corresponding annotations into smaller patches, producing a new COCO dataset optimized for training small object detection models.
Description
While slicing images at inference time improves small object detection, training models on sliced data can improve performance further. COCO dataset slicing applies the same tiling strategy to the training set, producing a new dataset where:
- Each original image is divided into overlapping tiles
- Annotations (bounding boxes and segmentation masks) are clipped to each tile boundary
- Annotations that fall below a minimum area ratio after clipping are discarded
- The sliced images and adjusted annotations are exported as a new COCO-format dataset
This principle ensures that the model trains on the same data distribution it will see during sliced inference, improving detection accuracy for small objects.
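The bounding-box part of these steps can be sketched as a small self-contained function. This is a minimal illustration, not the SAHI implementation: the function name and the `[xmin, ymin, xmax, ymax]` slice convention are assumptions for the example; COCO bboxes are `[x, y, w, h]`.

```python
def clip_bbox_to_slice(bbox, slice_box, min_area_ratio=0.1):
    """Clip a COCO [x, y, w, h] bbox to a slice region.

    Returns the bbox in slice-local coordinates, or None if the
    annotation does not overlap the slice or is clipped below
    min_area_ratio of its original area.
    """
    x, y, w, h = bbox
    sx1, sy1, sx2, sy2 = slice_box  # slice in [xmin, ymin, xmax, ymax]
    # Geometrically intersect the annotation with the slice boundary
    ix1, iy1 = max(x, sx1), max(y, sy1)
    ix2, iy2 = min(x + w, sx2), min(y + h, sy2)
    iw, ih = ix2 - ix1, iy2 - iy1
    if iw <= 0 or ih <= 0:
        return None  # no overlap with this slice
    # Discard heavily clipped annotations
    if (iw * ih) / (w * h) < min_area_ratio:
        return None
    # Shift from full-image space to slice-local space
    return [ix1 - sx1, iy1 - sy1, iw, ih]
```

For example, a 50x50 box at (100, 100) clipped to a slice ending at x = 120 keeps a 20x50 sliver only if that sliver retains at least `min_area_ratio` of the original area; segmentation polygons require an analogous polygon intersection.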
Key challenges addressed:
- Annotation clipping: Bounding boxes and segmentation polygons must be geometrically intersected with the slice boundary
- Area filtering: Heavily clipped annotations (where most of the object falls outside the slice) are removed to avoid training on incomplete objects
- Coordinate adjustment: Annotation coordinates are transformed from full-image space to slice-local space
- Multi-scale support: Multiple slice sizes can be specified to create a multi-scale training dataset
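The overlapping tile grid itself can be computed with a simple sweep. This is an illustrative sketch under the assumption of a fixed overlap ratio and edge tiles truncated at the image border; SAHI's own grid logic may handle edges differently. Multi-scale datasets come from running it once per slice size.

```python
def compute_slice_boxes(image_w, image_h, slice_size, overlap_ratio=0.2):
    """Return [xmin, ymin, xmax, ymax] tiles covering the image.

    Consecutive tiles overlap by overlap_ratio of the slice size;
    tiles at the right/bottom edge are truncated at the image border.
    """
    step = int(slice_size * (1 - overlap_ratio))
    boxes = []
    y = 0
    while True:
        ymax = min(y + slice_size, image_h)
        x = 0
        while True:
            xmax = min(x + slice_size, image_w)
            boxes.append([x, y, xmax, ymax])
            if xmax >= image_w:
                break
            x += step
        if ymax >= image_h:
            break
        y += step
    return boxes
```

A 150x100 image with 100-pixel slices and 0.2 overlap yields two tiles whose x-ranges (0-100 and 80-150) share a 20-pixel strip, so objects cut by one tile boundary appear whole in the neighboring tile.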
Usage
Use COCO dataset slicing when preparing training data for small object detection tasks. This is typically done offline as a preprocessing step before training. Common use cases include:
- Aerial/satellite imagery where objects are very small
- Surveillance datasets with distant subjects
- Any dataset where the target objects are small relative to image resolution
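In practice this preprocessing is typically a single offline call to SAHI's `slice_coco`. The paths and parameter values below are placeholders; consult the SAHI documentation for the full signature, as defaults may differ across versions.

```python
from sahi.slicing import slice_coco

# Slice a COCO dataset into 512x512 tiles with 20% overlap,
# dropping annotations clipped below 10% of their original area.
coco_dict, coco_path = slice_coco(
    coco_annotation_file_path="annotations/train.json",  # placeholder path
    image_dir="images/train/",                           # placeholder path
    output_coco_annotation_file_name="train_sliced",
    output_dir="sliced/",
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
    min_area_ratio=0.1,
)
```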
Theoretical Basis
The annotation slicing process for each tile:
```python
# Pseudocode for annotation slicing
def slice_annotations(annotations, slice_bbox, min_area_ratio):
    sliced = []
    for ann in annotations:
        # Skip annotations that do not overlap this slice at all
        if not overlaps(ann.bbox, slice_bbox):
            continue
        clipped_ann = clip_to_boundary(ann, slice_bbox)
        # Keep only annotations retaining enough of their original area
        if clipped_ann.area / ann.area >= min_area_ratio:
            # Shift coordinates to slice-local space
            clipped_ann.bbox -= [slice_bbox.x, slice_bbox.y, 0, 0]
            sliced.append(clipped_ann)
    return sliced
```
The min_area_ratio parameter (default 0.1) controls the aggressiveness of filtering. A value of 0.1 means annotations must retain at least 10% of their original area after clipping to be included. This prevents training on tiny fragments of objects that happen to barely overlap with a slice boundary.
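A worked example of the filter, with illustrative numbers:

```python
# A 40x40 px annotation (area 1600) is clipped at a slice boundary
# down to a 10x12 px fragment (area 120).
original_area = 40 * 40
clipped_area = 10 * 12
ratio = clipped_area / original_area  # 0.075
keep = ratio >= 0.1
# ratio is 0.075, below the 0.1 threshold, so the fragment is discarded
```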