Implementation:Obss Sahi Create Coco Dict
| Knowledge Sources | |
|---|---|
| Domains | Object_Detection, Data_Engineering, COCO_Format |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
Concrete tool, provided by the SAHI library, for assembling in-memory COCO image and annotation objects into a standard COCO-format dictionary.
Description
create_coco_dict() takes a list of CocoImage objects (each containing CocoAnnotation objects) and a category list, and produces a standard COCO-format dictionary with images, annotations, and categories fields.
It handles:
- Auto image IDs: Sequential assignment starting from 1 (when image_id_setting="auto")
- Manual image IDs: Uses CocoImage.id directly (when image_id_setting="manual")
- Negative sample filtering: Skips images without annotations when ignore_negative_samples=True
- Annotation ID assignment: Sequential unique IDs across all annotations
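The behaviors listed above can be sketched in plain Python. This is a minimal illustration, not SAHI's source: `Img`, `Ann`, and `assemble` are hypothetical stand-ins for `CocoImage`, `CocoAnnotation`, and `create_coco_dict()`, and the sketch assumes that annotation IDs start at 1 and that skipped negative samples do not consume an auto-assigned image ID.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Ann:
    """Hypothetical stand-in for CocoAnnotation (only the fields used here)."""
    bbox: list
    category_id: int


@dataclass
class Img:
    """Hypothetical stand-in for CocoImage (only the fields used here)."""
    file_name: str
    height: int
    width: int
    id: Optional[int] = None
    annotations: list = field(default_factory=list)


def assemble(images, categories, ignore_negative_samples=False, image_id_setting="auto"):
    """Sketch of the assembly loop: one pass over images, two running counters."""
    if image_id_setting not in ("auto", "manual"):
        raise ValueError("image_id_setting must be 'auto' or 'manual'")
    out = {"images": [], "annotations": [], "categories": categories}
    next_image_id = 1       # used only in "auto" mode
    next_annotation_id = 1  # assumption: annotation IDs start at 1
    for img in images:
        # Negative sample filtering: drop images that carry no annotations.
        if ignore_negative_samples and not img.annotations:
            continue
        if image_id_setting == "auto":
            image_id = next_image_id
            next_image_id += 1
        else:
            image_id = img.id  # "manual": trust the ID stored on the image
        out["images"].append(
            {"id": image_id, "file_name": img.file_name,
             "height": img.height, "width": img.width}
        )
        for ann in img.annotations:
            # Annotation IDs are unique across the whole dataset, not per image.
            out["annotations"].append(
                {"id": next_annotation_id, "image_id": image_id,
                 "bbox": ann.bbox, "category_id": ann.category_id}
            )
            next_annotation_id += 1
    return out


demo_images = [
    Img("a.jpg", 100, 100, id=10,
        annotations=[Ann([0, 0, 10, 10], 1), Ann([5, 5, 20, 20], 1)]),
    Img("b.jpg", 100, 100, id=20),  # negative sample: no annotations
    Img("c.jpg", 100, 100, id=30, annotations=[Ann([1, 1, 8, 8], 1)]),
]
demo_categories = [{"id": 1, "name": "object"}]

auto_dict = assemble(demo_images, demo_categories, ignore_negative_samples=True)
manual_dict = assemble(demo_images, demo_categories, image_id_setting="manual")
```

In the sketch, `auto_dict` keeps two images and renumbers them from 1, while `manual_dict` keeps all three with their original IDs. In both cases every annotation's `image_id` matches the ID written for its parent image.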
The resulting dict is typically written to disk with save_json(), which serializes numpy types through a custom NumpyEncoder.
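To illustrate why a custom encoder is needed: the standard `json` module rejects array-like values, so bounding boxes or areas computed with numpy would fail to serialize. The sketch below is an assumption-laden stand-in, not SAHI's `NumpyEncoder`; `ArrayLikeEncoder`, `dumps_coco`, and `FakeArray` are hypothetical names. It relies only on the fact that numpy scalars and arrays expose a `tolist()` method, and uses a small fake class so that it runs without numpy installed.

```python
import json


class ArrayLikeEncoder(json.JSONEncoder):
    """Hypothetical encoder; a sketch of the idea, not SAHI's NumpyEncoder."""

    def default(self, obj):
        # json calls default() only for objects it cannot serialize itself.
        # NumPy scalars and arrays both expose tolist(), which returns plain
        # Python numbers or (nested) lists.
        if hasattr(obj, "tolist"):
            return obj.tolist()
        return super().default(obj)


def dumps_coco(coco_dict):
    """Serialize a COCO-style dict, converting array-like values on the way."""
    return json.dumps(coco_dict, cls=ArrayLikeEncoder)


class FakeArray:
    """Stand-in for a numpy array so the demo has no third-party dependency."""

    def __init__(self, values):
        self._values = list(values)

    def tolist(self):
        return list(self._values)


demo = {"annotations": [{"id": 1, "bbox": FakeArray([0, 0, 10, 10])}]}
encoded = dumps_coco(demo)
```

With real data, replacing `FakeArray` by a numpy array should behave the same way, since the encoder only checks for `tolist()`.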
Usage
Use this function as the final assembly step after any COCO dataset manipulation (slicing, merging, filtering). It is called internally by slice_coco() and can be used standalone for custom dataset construction.
Code Reference
Source Location
- Repository: sahi
- File: sahi/utils/coco.py
- Lines: L1920-1991
Signature
def create_coco_dict(
    images: list,  # List of CocoImage
    categories: list[dict],  # COCO category dicts
    ignore_negative_samples: bool = False,
    image_id_setting: str = "auto",
) -> dict:
    """Create COCO dict from CocoImage objects.

    Args:
        images: List of CocoImage with annotations
        categories: COCO category list
        ignore_negative_samples: Skip images without annotations
        image_id_setting: "auto" (sequential) or "manual" (use CocoImage.id)

    Returns:
        COCO-format dict with "images", "annotations", "categories"
    """
Import
from sahi.utils.coco import create_coco_dict
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| images | list[CocoImage] | Yes | List of CocoImage objects, each with .annotations list |
| categories | list[dict] | Yes | COCO category dicts from original dataset |
| ignore_negative_samples | bool | No | Skip images with no annotations (default False) |
| image_id_setting | str | No | "auto" for sequential IDs, "manual" for CocoImage.id (default "auto") |
Outputs
| Name | Type | Description |
|---|---|---|
| return | dict | COCO-format dict with "images" (list), "annotations" (list), "categories" (list) fields |
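A quick consistency check on a dict of this shape can catch problems before it is written to disk, for example duplicate image IDs when `image_id_setting="manual"` is used with IDs that were never made unique. The helper below is a hypothetical sketch (`check_coco_dict` is not part of SAHI) and assumes only the three fields named in the output contract plus the standard COCO `id` and `image_id` keys.

```python
def check_coco_dict(coco_dict):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    for key in ("images", "annotations", "categories"):
        if not isinstance(coco_dict.get(key), list):
            problems.append(f"missing or non-list field: {key}")
    if problems:
        return problems  # the remaining checks need all three lists

    image_ids = [img["id"] for img in coco_dict["images"]]
    if len(image_ids) != len(set(image_ids)):
        problems.append("duplicate image ids")

    annotation_ids = [ann["id"] for ann in coco_dict["annotations"]]
    if len(annotation_ids) != len(set(annotation_ids)):
        problems.append("duplicate annotation ids")

    # Every annotation must point at an image that is present in the dict.
    known_images = set(image_ids)
    for ann in coco_dict["annotations"]:
        if ann["image_id"] not in known_images:
            problems.append(
                f"annotation {ann['id']} references unknown image {ann['image_id']}"
            )
    return problems


good = {
    "images": [{"id": 1, "file_name": "a.jpg", "height": 100, "width": 100}],
    "annotations": [{"id": 1, "image_id": 1, "bbox": [0, 0, 10, 10], "category_id": 1}],
    "categories": [{"id": 1, "name": "object"}],
}
bad = {
    "images": [{"id": 7}, {"id": 7}],
    "annotations": [{"id": 1, "image_id": 9}],
    "categories": [],
}
good_problems = check_coco_dict(good)
bad_problems = check_coco_dict(bad)
```

Returning a list of messages instead of raising on the first failure makes it easier to report every issue in one pass.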
Usage Examples
Basic Export
from sahi.utils.coco import Coco, create_coco_dict
from sahi.utils.file import save_json

# Load and manipulate dataset
coco = Coco.from_coco_dict_or_path("train.json")

# Create COCO dict from modified images
coco_dict = create_coco_dict(
    images=coco.images,
    categories=coco.json_categories,
    ignore_negative_samples=False,
)

# Save to disk
save_json(coco_dict, "modified_train.json")
print(f"Exported {len(coco_dict['images'])} images, {len(coco_dict['annotations'])} annotations")
Export After Slicing
from sahi.slicing import slice_image
from sahi.utils.coco import Coco, create_coco_dict
from sahi.utils.file import save_json

# Load the source dataset whose images will be sliced
coco = Coco.from_coco_dict_or_path("train.json")

# Collect sliced images from multiple source images
all_sliced_images = []
for coco_image in coco.images:
    result = slice_image(
        image=f"images/{coco_image.file_name}",
        coco_annotation_list=coco_image.annotations,
        slice_height=512,
        slice_width=512,
    )
    # Each slice becomes its own CocoImage with remapped annotations
    all_sliced_images.extend(result.coco_images)

# Assemble and export, dropping slices that contain no annotations
coco_dict = create_coco_dict(
    images=all_sliced_images,
    categories=coco.json_categories,
    ignore_negative_samples=True,
)
save_json(coco_dict, "sliced_train.json")