Implementation:Obss Sahi Coco From Coco Dict Or Path
| Knowledge Sources | |
|---|---|
| Domains | Object_Detection, Data_Engineering, COCO_Format |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
A concrete tool, provided by the SAHI library, for parsing COCO-format annotation files into structured Python objects.
Description
Coco.from_coco_dict_or_path() is a class method that constructs a Coco object from either a COCO-format Python dict or a file path to a COCO JSON file. It:
- Loads the JSON file (if a path is given) via load_json()
- Extracts and registers categories via add_categories_from_coco_category_list()
- Builds an image_id-to-annotation-list mapping for O(1) lookup
- Iterates over each image dict, creating CocoImage objects
- For each image, creates CocoAnnotation objects from the annotation dicts, applying optional category remapping
- Handles duplicate image IDs gracefully (warns and skips)
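The indexing and duplicate-handling steps above can be sketched in plain Python. This is a minimal illustration, not SAHI's actual code: the helper names `index_annotations` and `unique_images` and the sample data are assumptions made for this example.

```python
from collections import defaultdict
import warnings


def index_annotations(annotations):
    """Group annotation dicts by image_id so each image's list is one dict lookup away."""
    by_image = defaultdict(list)
    for ann in annotations:
        by_image[ann["image_id"]].append(ann)
    return by_image


def unique_images(images):
    """Yield image dicts in order, warning about and skipping repeated ids."""
    seen = set()
    for img in images:
        if img["id"] in seen:
            warnings.warn(f"duplicate image id {img['id']} skipped")
            continue
        seen.add(img["id"])
        yield img


# Hypothetical sample data: image id 1 appears twice.
coco_dict = {
    "images": [
        {"id": 1, "file_name": "a.jpg", "height": 100, "width": 200},
        {"id": 2, "file_name": "b.jpg", "height": 100, "width": 200},
        {"id": 1, "file_name": "a_copy.jpg", "height": 100, "width": 200},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [0, 0, 10, 10]},
        {"id": 11, "image_id": 1, "category_id": 2, "bbox": [5, 5, 20, 20]},
    ],
    "categories": [{"id": 1, "name": "car"}, {"id": 2, "name": "person"}],
}

ann_index = index_annotations(coco_dict["annotations"])

# Record warnings instead of printing them so the skip can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    kept = list(unique_images(coco_dict["images"]))

# Annotation count per kept image; images without annotations get 0.
per_image = {img["file_name"]: len(ann_index.get(img["id"], [])) for img in kept}
```

Building the index once avoids rescanning the full annotation list for every image, which matters when a dataset holds many annotations.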
The method supports optional multithreading (use_threads=True) for large datasets, splitting the image list into chunks processed in parallel.
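The chunk-and-parallelize idea can be sketched with the standard library. This is an assumption-laden illustration rather than SAHI's implementation: `split_into_chunks`, `process_chunk`, and `process_in_parallel` are hypothetical names, and the worker only collects file names where the real method would build `CocoImage` objects.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, num_chunks):
    """Split items into at most num_chunks contiguous, near-equal chunks."""
    num_chunks = max(1, min(num_chunks, len(items)))
    size, extra = divmod(len(items), num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(image_dicts):
    """Hypothetical per-chunk worker; here it only collects file names."""
    return [img["file_name"] for img in image_dicts]


def process_in_parallel(image_dicts, num_threads=10):
    """Process chunks on a thread pool and flatten the results in input order."""
    chunks = split_into_chunks(image_dicts, num_threads)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map returns results in the order the chunks were submitted.
        results = pool.map(process_chunk, chunks)
        return [name for chunk_result in results for name in chunk_result]


images = [{"id": i, "file_name": f"img_{i}.jpg"} for i in range(25)]
chunk_sizes = [len(chunk) for chunk in split_into_chunks(images, 4)]
names = process_in_parallel(images, num_threads=4)
```

Contiguous chunks keep the output order identical to the input order, so a threaded load can be compared directly against a single-threaded one.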
Usage
Use this as the first step in any SAHI COCO dataset workflow: slicing, merging, splitting, evaluation, or conversion. Pass either a file path string or an already-loaded Python dict.
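The two accepted input forms can be prepared as follows. This is a sketch under stated assumptions: the sample dataset is invented, `load_coco` is a hypothetical wrapper, and the SAHI import is deferred so the dict and JSON preparation runs even where SAHI is not installed.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical minimal COCO-format dict with one image and one annotation.
coco_dict = {
    "images": [{"id": 1, "file_name": "a.jpg", "height": 100, "width": 200}],
    "annotations": [
        {
            "id": 10,
            "image_id": 1,
            "category_id": 1,
            "bbox": [0, 0, 10, 10],  # COCO boxes are [x, y, width, height]
            "area": 100,
            "iscrowd": 0,
        }
    ],
    "categories": [{"id": 1, "name": "car"}],
}

# Write the same content to disk to obtain the file-path form.
json_path = Path(tempfile.mkdtemp()) / "train.json"
json_path.write_text(json.dumps(coco_dict))


def load_coco(source):
    """Pass a COCO dict or a JSON path string straight through to SAHI."""
    from sahi.utils.coco import Coco  # deferred import; requires sahi

    return Coco.from_coco_dict_or_path(source)
```

With SAHI installed, `load_coco(coco_dict)` and `load_coco(str(json_path))` should describe the same dataset, since the path form simply loads the JSON before parsing.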
Code Reference
Source Location
- Repository: sahi
- File: sahi/utils/coco.py
- Lines: L952-1100
Signature
class Coco:
    @classmethod
    def from_coco_dict_or_path(
        cls,
        coco_dict_or_path: dict | str,
        image_dir: str | None = None,
        remapping_dict: dict | None = None,
        ignore_negative_samples: bool = False,
        clip_bboxes_to_img_dims: bool = False,
        use_threads: bool = False,
        num_threads: int = 10,
    ) -> "Coco":
        """Create Coco object from COCO dict or JSON path.
        Args:
            coco_dict_or_path: COCO dict or path to JSON file
            image_dir: Base directory containing images
            remapping_dict: Category ID remapping {old_id: new_id}
            ignore_negative_samples: Skip images without annotations
            clip_bboxes_to_img_dims: Clip bboxes to image boundaries
            use_threads: Enable multithreaded loading
            num_threads: Number of threads for parallel loading
        """
Import
from sahi.utils.coco import Coco
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| coco_dict_or_path | dict or str | Yes | COCO-format dict or path to COCO JSON file |
| image_dir | str | No | Base directory containing dataset images |
| remapping_dict | dict | No | Maps old category IDs to new IDs |
| ignore_negative_samples | bool | No | Skip images without annotations (default False) |
| clip_bboxes_to_img_dims | bool | No | Clip bounding boxes to image dimensions (default False) |
| use_threads | bool | No | Use multithreaded loading (default False) |
| num_threads | int | No | Number of threads when use_threads=True (default 10) |
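The intent of `clip_bboxes_to_img_dims` and `ignore_negative_samples` can be illustrated with two small helpers. These are hypothetical functions written for this page, assuming COCO's `[x, y, width, height]` box convention; SAHI's own clipping and filtering logic may differ in detail.

```python
def clip_bbox(bbox, img_width, img_height):
    """Clip a COCO [x, y, width, height] box so it lies inside the image."""
    x, y, w, h = bbox
    x_min = min(max(x, 0), img_width)
    y_min = min(max(y, 0), img_height)
    x_max = min(max(x + w, 0), img_width)
    y_max = min(max(y + h, 0), img_height)
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def drop_negative_samples(images, annotations):
    """Keep only images that have at least one annotation."""
    annotated_ids = {ann["image_id"] for ann in annotations}
    return [img for img in images if img["id"] in annotated_ids]


# A box that overhangs the left and bottom edges of a 200x100 image.
clipped = clip_bbox([-5, 90, 30, 20], img_width=200, img_height=100)

# Image 2 has no annotations, so it counts as a negative sample.
images = [{"id": 1}, {"id": 2}]
annotations = [{"image_id": 1}]
positives = drop_negative_samples(images, annotations)
```

Here `clipped` becomes `[0, 90, 25, 10]` and `positives` keeps only image 1.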
Outputs
| Name | Type | Description |
|---|---|---|
| return | Coco | Coco object with .images (list of CocoImage), .categories, .category_mapping (dict of id to name) |
Usage Examples
Load from File Path
from sahi.utils.coco import Coco

# Load COCO dataset from JSON file
coco = Coco.from_coco_dict_or_path(
    coco_dict_or_path="path/to/annotations/train.json",
    image_dir="path/to/images/",
)
print(f"Loaded {len(coco.images)} images")
print(f"Categories: {coco.category_mapping}")

# Access individual images and their annotations
for coco_image in coco.images[:5]:
    print(f"Image: {coco_image.file_name}, Annotations: {len(coco_image.annotations)}")
Load with Category Remapping
from sahi.utils.coco import Coco

# Remap category IDs (useful when merging datasets)
coco = Coco.from_coco_dict_or_path(
    coco_dict_or_path="dataset.json",
    remapping_dict={1: 0, 2: 1, 3: 2},  # remap IDs
    ignore_negative_samples=True,  # skip empty images
)
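When merging datasets, a `remapping_dict` like the one above is often derived by matching category names rather than written by hand. The sketch below shows one way to do that; `build_remapping` and both category lists are hypothetical and not part of SAHI.

```python
def build_remapping(source_categories, target_name_to_id):
    """Map each source category id to the target id that shares its name."""
    remapping = {}
    for cat in source_categories:
        # Categories absent from the target scheme are left unmapped.
        if cat["name"] in target_name_to_id:
            remapping[cat["id"]] = target_name_to_id[cat["name"]]
    return remapping


# Hypothetical source categories (1-based ids) and target scheme (0-based ids).
source_categories = [
    {"id": 1, "name": "car"},
    {"id": 2, "name": "person"},
    {"id": 3, "name": "truck"},
]
target_name_to_id = {"car": 0, "person": 1, "truck": 2}

remapping_dict = build_remapping(source_categories, target_name_to_id)
```

The resulting `{1: 0, 2: 1, 3: 2}` matches the literal in the example above and can be passed as `remapping_dict`.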