Implementation:Obss Sahi Coco2yolo
| Knowledge Sources | |
|---|---|
| Domains | Dataset, Format_Conversion |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
CLI script that converts COCO-formatted annotation datasets to YOLO format with a configurable train/val split.
Description
The coco2yolo script provides a command-line interface for converting a COCO JSON annotation file into YOLO-formatted label files. It loads the dataset with Coco.from_coco_dict_or_path(), then exports it to YOLO format with coco.export_as_yolo(). Images are split into train and val directories according to train_split, and results are saved under project/name with an auto-incrementing experiment name.
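As a rough illustration of what the format conversion involves, the sketch below maps one COCO bounding box (absolute pixels, `[x_min, y_min, width, height]`) to a YOLO label line (class index followed by a normalized center and size). The helper names `coco_bbox_to_yolo` and `yolo_label_line` are hypothetical and are not part of sahi; the library's own export may differ in rounding and in how category ids are mapped to class indices.

```python
from typing import List, Tuple


def coco_bbox_to_yolo(
    bbox: List[float], image_width: int, image_height: int
) -> Tuple[float, float, float, float]:
    """Convert a COCO box [x_min, y_min, width, height] in pixels to
    YOLO (x_center, y_center, width, height), each normalized to [0, 1]."""
    x_min, y_min, box_w, box_h = bbox
    x_center = (x_min + box_w / 2) / image_width
    y_center = (y_min + box_h / 2) / image_height
    return x_center, y_center, box_w / image_width, box_h / image_height


def yolo_label_line(
    class_index: int, bbox: List[float], image_width: int, image_height: int
) -> str:
    """Format one annotation as a single line of a YOLO label .txt file."""
    xc, yc, w, h = coco_bbox_to_yolo(bbox, image_width, image_height)
    return f"{class_index} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


# Hypothetical example: a 200x100 box at (100, 50) in a 400x200 image.
print(yolo_label_line(0, [100, 50, 200, 100], 400, 200))
# prints "0 0.500000 0.500000 0.500000 0.500000"
```

Each image gets one such label file, with one line per annotated object.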
Usage
Run this script when you need to convert COCO-formatted datasets into YOLO format for training with YOLOv5, Ultralytics, or other YOLO-family detectors.
Code Reference
Source Location
- Repository: Obss_Sahi
- File: sahi/scripts/coco2yolo.py
- Lines: 1-48
Signature
def main(
    image_dir: str,
    dataset_json_path: str,
    train_split: int | float = 0.9,
    project: str = "runs/coco2yolo",
    name: str = "exp",
    seed: int = 1,
    disable_symlink=False,
) -> None:
    """
    Args:
        image_dir (str): directory for coco images
        dataset_json_path (str): file path for the coco json file to be converted
        train_split (float or int): set the training split ratio
        project (str): save results to project/name
        name (str): save results to project/name
        seed (int): fix the seed for reproducibility
        disable_symlink (bool): required in google colab env
    """
Import
from sahi.scripts.coco2yolo import main
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| image_dir | str | Yes | Directory containing COCO images |
| dataset_json_path | str | Yes | Path to COCO annotation JSON file |
| train_split | float or int | No (default 0.9) | Train/val split ratio; with the default, 90% of images go to train and the rest to val |
| project | str | No (default "runs/coco2yolo") | Output project directory |
| name | str | No (default "exp") | Experiment name within project |
| seed | int | No (default 1) | Random seed for split reproducibility |
| disable_symlink | bool | No (default False) | Use file copy instead of symlinks (for Colab) |
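The interaction between train_split and seed can be pictured with the minimal sketch below: a seeded shuffle followed by a cut at the requested ratio. This is an assumption-laden illustration using only the standard library; `split_train_val` is a hypothetical helper, and the shuffling that sahi performs internally may use a different random number generator and produce a different ordering for the same seed.

```python
import random
from typing import List, Sequence, Tuple


def split_train_val(
    items: Sequence[str], train_split: float = 0.9, seed: int = 1
) -> Tuple[List[str], List[str]]:
    """Shuffle items with a fixed seed, then split them into train and val lists."""
    if not 0 <= train_split <= 1:
        raise ValueError("train_split must be between 0 and 1")
    shuffled = list(items)
    # A local RNG keeps the split reproducible without touching global state.
    random.Random(seed).shuffle(shuffled)
    num_train = int(len(shuffled) * train_split)
    return shuffled[:num_train], shuffled[num_train:]


# Hypothetical file names, for illustration only.
images = [f"img_{i:03d}.jpg" for i in range(10)]
train, val = split_train_val(images, train_split=0.8, seed=42)
print(len(train), len(val))  # prints "8 2"
```

Calling the helper twice with the same seed returns the same split, which is the property the seed argument is meant to provide.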
Outputs
| Name | Type | Description |
|---|---|---|
| YOLO dataset | Directory | YOLO-formatted labels and images in project/name/ |
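The auto-incrementing experiment name can be sketched as follows. The sketch assumes YOLOv5-style suffixing (exp, exp2, exp3, ...), which is not spelled out on this page; `next_run_dir` is a hypothetical helper and not the function the script itself calls.

```python
import re
from pathlib import Path


def next_run_dir(project: str, name: str = "exp") -> Path:
    """Return project/name if it is unused, else project/name2, project/name3, ..."""
    base = Path(project) / name
    if not base.exists():
        return base
    # Collect numeric suffixes of existing siblings such as exp2, exp3.
    pattern = re.compile(rf"^{re.escape(name)}(\d+)$")
    indices = [
        int(match.group(1))
        for path in Path(project).iterdir()
        if (match := pattern.match(path.name))
    ]
    # With no numbered sibling yet, the first increment is name2.
    return Path(project) / f"{name}{max(indices, default=1) + 1}"
```

The function only computes a path and does not create the directory, so it can be called safely before deciding where to write.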
Usage Examples
CLI Usage
python -m sahi.scripts.coco2yolo \
    --image_dir /data/coco/images \
    --dataset_json_path /data/coco/annotations.json \
    --train_split 0.8 \
    --project runs/coco2yolo \
    --name my_dataset \
    --seed 42
Programmatic Usage
from sahi.scripts.coco2yolo import main
main(
    image_dir="/data/coco/images",
    dataset_json_path="/data/coco/annotations.json",
    train_split=0.9,
    project="runs/coco2yolo",
    name="exp",
    seed=1,
)
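For a quick trial run on a small input, a minimal COCO annotation file can be generated with the standard library, as sketched below. The file name, image entry, and category are made-up placeholders, and `write_coco_json` is a hypothetical helper; only the top-level COCO keys (images, annotations, categories) and the `[x_min, y_min, width, height]` box layout follow the COCO detection format.

```python
import json
from pathlib import Path

# One image, one box, one category: placeholder values for illustration only.
minimal_coco = {
    "images": [
        {"id": 1, "file_name": "img_001.jpg", "width": 400, "height": 200}
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [100, 50, 200, 100],  # x_min, y_min, width, height in pixels
            "area": 200 * 100,
            "iscrowd": 0,
        }
    ],
    "categories": [{"id": 1, "name": "car"}],
}


def write_coco_json(coco: dict, path: Path) -> Path:
    """Write a COCO dictionary to path as JSON and return the path."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(coco, indent=2))
    return path
```

The returned path can then be passed as dataset_json_path, with image_dir pointing at a directory that contains the referenced image file.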