Implementation:Obss Sahi Create Coco Dict

From Leeroopedia


Knowledge Sources
Domains Object_Detection, Data_Engineering, COCO_Format
Last Updated 2026-02-08 12:00 GMT

Overview

Concrete tool, provided by the SAHI library, for assembling in-memory COCO image and annotation objects into a standard COCO-format dictionary.

Description

create_coco_dict() takes a list of CocoImage objects (each containing CocoAnnotation objects) and a category list, and produces a standard COCO-format dictionary with images, annotations, and categories fields.

It handles:

  • Auto image IDs: Sequential assignment starting from 1 (when image_id_setting="auto")
  • Manual image IDs: Uses CocoImage.id directly (when image_id_setting="manual")
  • Negative sample filtering: Skips images without annotations when ignore_negative_samples=True
  • Annotation ID assignment: Sequential unique IDs across all annotations
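The behaviors listed above can be illustrated with a minimal, self-contained sketch. It is not SAHI's source code: SketchImage and assemble_coco_dict are hypothetical stand-ins, annotations are simplified to plain dicts, and the rule that a skipped negative sample does not consume an auto ID is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchImage:
    """Hypothetical stand-in for CocoImage (not the SAHI class)."""
    file_name: str
    height: int
    width: int
    id: Optional[int] = None
    annotations: list = field(default_factory=list)  # simplified: plain dicts


def assemble_coco_dict(images, categories, ignore_negative_samples=False,
                       image_id_setting="auto"):
    """Sketch of the assembly logic described above (assumed, not SAHI's code)."""
    if image_id_setting not in ("auto", "manual"):
        raise ValueError("image_id_setting should be 'auto' or 'manual'")

    next_image_id = 1       # auto image IDs start from 1
    next_annotation_id = 1  # annotation IDs are unique across all images
    coco_dict = {"images": [], "annotations": [], "categories": categories}

    for image in images:
        # Negative sample filtering: skip images that have no annotations
        if ignore_negative_samples and not image.annotations:
            continue

        if image_id_setting == "auto":
            image_id = next_image_id
            next_image_id += 1
        else:
            if image.id is None:
                raise ValueError("image.id must be set in 'manual' mode")
            image_id = image.id

        coco_dict["images"].append({
            "id": image_id,
            "file_name": image.file_name,
            "height": image.height,
            "width": image.width,
        })
        for ann in image.annotations:
            coco_dict["annotations"].append(
                {**ann, "image_id": image_id, "id": next_annotation_id}
            )
            next_annotation_id += 1

    return coco_dict


images = [
    SketchImage("a.jpg", 100, 200, id=10, annotations=[
        {"bbox": [0, 0, 10, 10], "category_id": 1},
        {"bbox": [5, 5, 20, 20], "category_id": 1},
    ]),
    SketchImage("b.jpg", 100, 200, id=20),  # negative sample, no annotations
    SketchImage("c.jpg", 100, 200, id=30, annotations=[
        {"bbox": [1, 1, 5, 5], "category_id": 1},
    ]),
]
categories = [{"id": 1, "name": "car"}]

auto = assemble_coco_dict(images, categories, ignore_negative_samples=True)
manual = assemble_coco_dict(images, categories, image_id_setting="manual")
```

In this sketch, auto mode with negative filtering keeps two images, while manual mode keeps all three and reuses the IDs stored on each image object.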

The resulting dict is typically written to disk with save_json(), which serializes numpy types through a custom NumpyEncoder.
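To show why a custom encoder is needed at all, here is a hedged sketch of the general technique: a json.JSONEncoder subclass whose default() converts non-built-in values to plain Python types. It is not SAHI's NumpyEncoder. DuckTypedEncoder and FakeScalar are hypothetical names, and the stdlib array type stands in for a numpy array so the example runs without numpy installed.

```python
import json
from array import array


class DuckTypedEncoder(json.JSONEncoder):
    """Hypothetical encoder (not SAHI's NumpyEncoder).

    Relies on duck typing: numpy arrays expose tolist() and numpy scalars
    expose item(), so any object offering those methods is converted.
    """

    def default(self, obj):
        if hasattr(obj, "tolist"):
            return obj.tolist()
        if hasattr(obj, "item"):
            return obj.item()
        # Fall back to the base class, which raises TypeError
        return super().default(obj)


class FakeScalar:
    """Hypothetical stand-in for a numpy scalar."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


annotation = {"bbox": array("d", [1.5, 2.0, 3.0, 4.0]), "area": FakeScalar(12)}

# The default encoder rejects these values
try:
    json.dumps(annotation)
    plain_failed = False
except TypeError:
    plain_failed = True

encoded = json.dumps(annotation, cls=DuckTypedEncoder)
```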

Usage

Use this function as the final assembly step after any COCO dataset manipulation (slicing, merging, filtering). It is called internally by slice_coco() and can be used standalone for custom dataset construction.

Code Reference

Source Location

  • Repository: sahi
  • File: sahi/utils/coco.py
  • Lines: L1920-1991

Signature

def create_coco_dict(
    images: list,           # List of CocoImage
    categories: list[dict], # COCO category dicts
    ignore_negative_samples: bool = False,
    image_id_setting: str = "auto",
) -> dict:
    """Create COCO dict from CocoImage objects.

    Args:
        images: List of CocoImage with annotations
        categories: COCO category list
        ignore_negative_samples: Skip images without annotations
        image_id_setting: "auto" (sequential) or "manual" (use CocoImage.id)

    Returns:
        COCO-format dict with "images", "annotations", "categories"
    """

Import

from sahi.utils.coco import create_coco_dict

I/O Contract

Inputs

Name | Type | Required | Description
images | list[CocoImage] | Yes | List of CocoImage objects, each with .annotations list
categories | list[dict] | Yes | COCO category dicts from original dataset
ignore_negative_samples | bool | No | Skip images with no annotations (default False)
image_id_setting | str | No | "auto" for sequential IDs, "manual" for CocoImage.id (default "auto")

Outputs

Name | Type | Description
return | dict | COCO-format dict with "images" (list), "annotations" (list), "categories" (list) fields
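Before saving, it can be useful to sanity-check the returned dict against this contract. The helper below is a hypothetical sketch and not part of SAHI. It checks that the three fields are lists, that IDs are unique, and that each annotation points at an existing image and category. Duplicate image IDs are most plausible when IDs are supplied by the caller.

```python
def check_coco_dict(coco_dict):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    for key in ("images", "annotations", "categories"):
        if not isinstance(coco_dict.get(key), list):
            problems.append(f"missing or non-list field: {key}")
    if problems:
        return problems

    image_ids = [img["id"] for img in coco_dict["images"]]
    if len(image_ids) != len(set(image_ids)):
        problems.append("duplicate image ids")

    annotation_ids = [ann["id"] for ann in coco_dict["annotations"]]
    if len(annotation_ids) != len(set(annotation_ids)):
        problems.append("duplicate annotation ids")

    known_images = set(image_ids)
    known_categories = {cat["id"] for cat in coco_dict["categories"]}
    for ann in coco_dict["annotations"]:
        if ann["image_id"] not in known_images:
            problems.append(
                f"annotation {ann['id']} references unknown image_id {ann['image_id']}"
            )
        if ann["category_id"] not in known_categories:
            problems.append(
                f"annotation {ann['id']} references unknown category_id {ann['category_id']}"
            )
    return problems


good = {
    "images": [{"id": 1, "file_name": "a.jpg", "height": 100, "width": 200}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 1, "bbox": [0, 0, 5, 5]}],
    "categories": [{"id": 1, "name": "car"}],
}
bad = {
    "images": [{"id": 1, "file_name": "a.jpg", "height": 100, "width": 200}],
    "annotations": [{"id": 7, "image_id": 2, "category_id": 1, "bbox": [0, 0, 5, 5]}],
    "categories": [{"id": 1, "name": "car"}],
}
good_problems = check_coco_dict(good)
bad_problems = check_coco_dict(bad)
```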

Usage Examples

Basic Export

from sahi.utils.coco import Coco, create_coco_dict
from sahi.utils.file import save_json

# Load the dataset (modify coco.images here as needed before export)
coco = Coco.from_coco_dict_or_path("train.json")

# Create COCO dict from modified images
coco_dict = create_coco_dict(
    images=coco.images,
    categories=coco.json_categories,
    ignore_negative_samples=False,
)

# Save to disk
save_json(coco_dict, "modified_train.json")
print(f"Exported {len(coco_dict['images'])} images, {len(coco_dict['annotations'])} annotations")

Export After Slicing

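from sahi.slicing import slice_image
from sahi.utils.coco import Coco, create_coco_dict
from sahi.utils.file import save_json

# Load the source dataset so that this example runs on its own
coco = Coco.from_coco_dict_or_path("train.json")

# Collect sliced images from multiple source images
all_sliced_images = []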
for coco_image in coco.images:
    result = slice_image(
        image=f"images/{coco_image.file_name}",
        coco_annotation_list=coco_image.annotations,
        slice_height=512,
        slice_width=512,
    )
    all_sliced_images.extend(result.coco_images)

# Assemble and export
coco_dict = create_coco_dict(
    images=all_sliced_images,
    categories=coco.json_categories,
    ignore_negative_samples=True,
)
save_json(coco_dict, "sliced_train.json")
