Implementation:Obss Sahi Slice Coco

From Leeroopedia


Knowledge Sources
Domains Object_Detection, Data_Engineering, Image_Processing
Last Updated 2026-02-08 12:00 GMT

Overview

A concrete tool, provided by the SAHI library, for slicing COCO-annotated image datasets into tiled sub-datasets with adjusted annotations.

Description

slice_coco() processes an entire COCO dataset, slicing each image and its annotations into overlapping tiles. It proceeds as follows:

  1. Loads the COCO annotation file and builds the Coco object
  2. Iterates over each CocoImage with a progress bar (tqdm)
  3. Calls slice_image() per image, passing its CocoAnnotation list for annotation slicing
  4. Collects all sliced CocoImage objects from SliceImageResult.coco_images
  5. Assembles the final COCO dict via create_coco_dict() and saves it via save_json()
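The overlapping tiling that the steps above rely on can be illustrated with a minimal, standard-library sketch. This is not SAHI's implementation: the function name tile_boxes is hypothetical, and shifting the last tile of each row or column back so that it stays inside the image is an assumption about how edge tiles are handled.

```python
def tile_boxes(image_width, image_height, slice_width=512, slice_height=512,
               overlap_width_ratio=0.2, overlap_height_ratio=0.2):
    """Return (x_min, y_min, x_max, y_max) tiles covering the image.

    Hypothetical helper for illustration only; assumes overlap ratios < 1.
    """
    # Distance between the starts of two neighbouring tiles.
    x_step = slice_width - int(slice_width * overlap_width_ratio)
    y_step = slice_height - int(slice_height * overlap_height_ratio)

    boxes = []
    y_start = 0
    while True:
        # Clamp the tile to the image, then shift it back to keep its size.
        y_max = min(y_start + slice_height, image_height)
        y_min = max(0, y_max - slice_height)
        x_start = 0
        while True:
            x_max = min(x_start + slice_width, image_width)
            x_min = max(0, x_max - slice_width)
            boxes.append((x_min, y_min, x_max, y_max))
            if x_max >= image_width:
                break
            x_start += x_step
        if y_max >= image_height:
            break
        y_start += y_step
    return boxes


# A 1024x512 image with 512x512 tiles and 20% overlap yields three tiles:
# [(0, 0, 512, 512), (410, 0, 922, 512), (512, 0, 1024, 512)]
tiles = tile_boxes(1024, 512)
```

Under these assumptions every tile except those of an image smaller than the slice size has the full slice dimensions, which keeps the sliced dataset uniform.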

Annotation slicing is handled by process_coco_annotations(), which clips each annotation to the slice boundary and discards clipped annotations whose remaining area falls below min_area_ratio of the original annotation's area. Invalid geometries (a TopologicalError raised by Shapely) are skipped with a warning.
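The clip-and-filter idea can be sketched for axis-aligned boxes using only the standard library. This is a simplification: SAHI works with Shapely geometries, whereas the hypothetical helper clip_bbox_to_slice below handles only COCO-style [x, y, width, height] boxes, and it assumes the ratio is measured against the original annotation's area.

```python
def clip_bbox_to_slice(bbox, slice_box, min_area_ratio=0.1):
    """Clip a COCO bbox to a slice and express it in slice coordinates.

    bbox: [x, y, width, height] in full-image coordinates.
    slice_box: (x_min, y_min, x_max, y_max) of the tile.
    Returns the clipped [x, y, width, height] relative to the tile origin,
    or None when the box misses the tile or too little of it remains.
    Hypothetical helper for illustration only.
    """
    x, y, w, h = bbox
    sx_min, sy_min, sx_max, sy_max = slice_box
    if w <= 0 or h <= 0:
        return None  # degenerate annotation

    # Intersection of the annotation with the tile.
    ix_min, iy_min = max(x, sx_min), max(y, sy_min)
    ix_max, iy_max = min(x + w, sx_max), min(y + h, sy_max)
    iw, ih = ix_max - ix_min, iy_max - iy_min
    if iw <= 0 or ih <= 0:
        return None  # no overlap with this tile

    # Drop fragments that keep less than min_area_ratio of the original area.
    if (iw * ih) / (w * h) < min_area_ratio:
        return None

    # Shift into the tile's own coordinate system.
    return [ix_min - sx_min, iy_min - sy_min, iw, ih]


kept = clip_bbox_to_slice([500, 100, 40, 40], (0, 0, 512, 512))     # 30% of the box survives
dropped = clip_bbox_to_slice([510, 100, 40, 40], (0, 0, 512, 512))  # only 5% survives
```

With the default min_area_ratio of 0.1, the first box is kept as a 12x40 fragment, while the second is discarded because only a 2-pixel-wide sliver falls inside the tile.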

Sliced images are exported to disk in parallel using ThreadPoolExecutor (within slice_image()).
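The parallel export pattern can be sketched as follows. This is an illustration of the ThreadPoolExecutor technique, not the code inside slice_image(): the helper export_tiles is hypothetical, and raw bytes stand in for encoded image tiles.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def export_tiles(tiles, output_dir, max_workers=4):
    """Write each (file name -> bytes) entry to output_dir using a thread pool.

    Hypothetical helper for illustration only; disk writes are I/O-bound,
    so threads can overlap them despite the GIL.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    def _write(item):
        name, payload = item
        path = out / name
        path.write_bytes(payload)
        return path

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, even if writes finish out of order.
        return list(pool.map(_write, tiles.items()))


# Stand-in payloads; a real pipeline would pass encoded image data.
fake_tiles = {f"tile_{i}.bin": bytes([i]) * 16 for i in range(3)}
with tempfile.TemporaryDirectory() as tmp:
    written = export_tiles(fake_tiles, tmp)
    written_names = [p.name for p in written]
    all_exist = all(p.exists() for p in written)
```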

Usage

Use this function to prepare training datasets for small object detection. It can be invoked programmatically or via the CLI command sahi coco slice.

Code Reference

Source Location

  • Repository: sahi
  • File: sahi/slicing.py
  • Lines: L418-508

Signature

def slice_coco(
    coco_annotation_file_path: str,
    image_dir: str,
    output_coco_annotation_file_name: str,
    output_dir: str | None = None,
    ignore_negative_samples: bool | None = False,
    slice_height: int | None = 512,
    slice_width: int | None = 512,
    overlap_height_ratio: float | None = 0.2,
    overlap_width_ratio: float | None = 0.2,
    min_area_ratio: float | None = 0.1,
    out_ext: str | None = None,
    verbose: bool | None = False,
    exif_fix: bool = True,
) -> list[dict | str]:
    """Slice COCO dataset images and annotations into tiles.

    Args:
        coco_annotation_file_path: Path to COCO annotation JSON
        image_dir: Base directory containing images
        output_coco_annotation_file_name: Output COCO JSON filename
        output_dir: Output directory for sliced images and JSON
        ignore_negative_samples: Skip images without annotations
        slice_height: Tile height (default 512)
        slice_width: Tile width (default 512)
        overlap_height_ratio: Vertical overlap fraction (default 0.2)
        overlap_width_ratio: Horizontal overlap fraction (default 0.2)
        min_area_ratio: Min annotation area ratio to retain (default 0.1)
        out_ext: Extension for saved images
        verbose: Print progress info
        exif_fix: Apply EXIF orientation fix

    Returns:
        Tuple of (coco_dict, save_path)
    """

Import

from sahi.slicing import slice_coco

I/O Contract

Inputs

Name                              Type   Required  Description
coco_annotation_file_path         str    Yes       Path to COCO annotation JSON file
image_dir                         str    Yes       Directory containing the dataset images
output_coco_annotation_file_name  str    Yes       Filename for the output COCO JSON
output_dir                        str    No        Directory for sliced images and JSON output (default None)
slice_height                      int    No        Tile height in pixels (default 512)
slice_width                       int    No        Tile width in pixels (default 512)
overlap_height_ratio              float  No        Vertical overlap fraction (default 0.2)
overlap_width_ratio               float  No        Horizontal overlap fraction (default 0.2)
min_area_ratio                    float  No        Min annotation area ratio to retain (default 0.1)
ignore_negative_samples           bool   No        Skip images without annotations (default False)
out_ext                           str    No        Extension for saved images (default None)
verbose                           bool   No        Print progress info (default False)
exif_fix                          bool   No        Apply EXIF orientation fix (default True)

Outputs

Name       Type  Description
coco_dict  dict  COCO-format dict with sliced images and adjusted annotations
save_path  str   Path where the COCO JSON was saved
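To show what can be read from the returned coco_dict, here is a small standard-library sketch that counts tiles and annotations and lists tiles without any annotation (the negative samples that ignore_negative_samples is meant to skip). The helper summarize_coco is hypothetical, and the sample dict is hand-written stand-in data that assumes the standard COCO "images" and "annotations" keys.

```python
from collections import Counter


def summarize_coco(coco_dict):
    """Summarize a COCO-format dict. Hypothetical helper for illustration."""
    images = coco_dict.get("images", [])
    annotations = coco_dict.get("annotations", [])
    # Number of annotations attached to each image id.
    per_image = Counter(ann["image_id"] for ann in annotations)
    negatives = [img["id"] for img in images if per_image[img["id"]] == 0]
    return {
        "num_images": len(images),
        "num_annotations": len(annotations),
        "negative_image_ids": negatives,
    }


# Hand-written stand-in for a sliced dataset: tile 2 has no annotations.
sample = {
    "images": [
        {"id": 1, "file_name": "tile_a.jpg", "width": 512, "height": 512},
        {"id": 2, "file_name": "tile_b.jpg", "width": 512, "height": 512},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 0, "bbox": [10, 20, 30, 40]},
        {"id": 11, "image_id": 1, "category_id": 0, "bbox": [100, 120, 30, 40]},
    ],
}
summary = summarize_coco(sample)
```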

Usage Examples

Basic Dataset Slicing

from sahi.slicing import slice_coco

coco_dict, save_path = slice_coco(
    coco_annotation_file_path="train.json",
    image_dir="images/train/",
    output_coco_annotation_file_name="sliced_train",
    output_dir="sliced_dataset/",
    slice_height=640,
    slice_width=640,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
    min_area_ratio=0.1,
)

print(f"Sliced dataset saved to: {save_path}")
print(f"Total sliced images: {len(coco_dict['images'])}")
print(f"Total annotations: {len(coco_dict['annotations'])}")
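As a follow-up sanity check, the saved JSON can be reloaded to confirm that every adjusted box lies inside its tile. The sketch below uses only the standard library; find_out_of_bounds is a hypothetical helper, and the demo writes a hand-made file rather than real slice_coco() output. In practice save_path from the example above would be passed instead.

```python
import json
import tempfile
from pathlib import Path


def find_out_of_bounds(coco_json_path, tolerance=1e-6):
    """Return ids of annotations whose bbox extends past their image.

    Hypothetical helper for illustration only.
    """
    coco = json.loads(Path(coco_json_path).read_text())
    sizes = {img["id"]: (img["width"], img["height"]) for img in coco["images"]}
    bad_ids = []
    for ann in coco["annotations"]:
        width, height = sizes[ann["image_id"]]
        x, y, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        inside = (
            x >= -tolerance
            and y >= -tolerance
            and x + w <= width + tolerance
            and y + h <= height + tolerance
        )
        if not inside:
            bad_ids.append(ann["id"])
    return bad_ids


# Hand-written stand-in file: annotation 11 sticks out of its 640x640 tile.
sample = {
    "images": [{"id": 1, "file_name": "tile_0.jpg", "width": 640, "height": 640}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 0, "bbox": [600, 10, 40, 40]},
        {"id": 11, "image_id": 1, "category_id": 0, "bbox": [620, 10, 40, 40]},
    ],
}
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample_coco.json"
    sample_path.write_text(json.dumps(sample))
    out_of_bounds = find_out_of_bounds(sample_path)
```

An empty result suggests the clipping behaved as expected; any ids returned point at annotations worth inspecting by hand.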

CLI Usage

sahi coco slice \
    --image_dir images/train/ \
    --dataset_json_path train.json \
    --slice_size 512 \
    --overlap_ratio 0.2 \
    --output_dir sliced_dataset/
