Implementation:Roboflow Rf detr Build Dataset
| Knowledge Sources | |
|---|---|
| Domains | Object_Detection, Data_Engineering |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool, provided by the RF-DETR library, for loading and building object detection datasets.
Description
build_dataset is a factory function that creates PyTorch Dataset objects based on the specified format. It delegates to format-specific builders: build_coco for COCO JSON, build_roboflow_from_yolo for YOLO, and build_roboflow for auto-detected Roboflow datasets. Helper functions is_valid_coco_dataset and is_valid_yolo_dataset validate directory structure before loading.
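The delegation described above can be pictured as a lookup from format name to builder. The following is only a minimal sketch of that pattern, not the library's code: `build_dataset_sketch`, `BUILDERS`, and `_stub_builder` are hypothetical names, the stub builders return strings instead of datasets, and the resolution value is purely illustrative.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict


# Hypothetical stand-ins for the real builders; each only reports what it would load.
def _stub_builder(label: str) -> Callable[[str, Any, int], str]:
    def build(image_set: str, args: Any, resolution: int) -> str:
        return f"{label}:{image_set}:{args.dataset_dir}:{resolution}"
    return build


# Format name -> builder, mirroring the delegation described in the text.
BUILDERS: Dict[str, Callable[[str, Any, int], str]] = {
    "coco": _stub_builder("build_coco"),
    "yolo": _stub_builder("build_roboflow_from_yolo"),
    "roboflow": _stub_builder("build_roboflow"),
}


def build_dataset_sketch(image_set: str, args: Any, resolution: int) -> str:
    """Pick a builder from args.dataset_file and delegate to it."""
    try:
        builder = BUILDERS[args.dataset_file]
    except KeyError:
        # Unknown formats fail loudly rather than silently falling back.
        raise ValueError(f"dataset {args.dataset_file!r} not supported") from None
    return builder(image_set, args, resolution)


args = SimpleNamespace(dataset_file="yolo", dataset_dir="/path/to/my_dataset")
print(build_dataset_sketch("train", args, 560))
```

Keeping the mapping in one place means a new format needs only one extra entry, while an unsupported format name surfaces as an immediate error.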
Usage
build_dataset is called internally by the training pipeline. Users specify the dataset format and directory path via training configuration parameters.
Code Reference
Source Location
- Repository: rf-detr
- File: rfdetr/datasets/__init__.py
- Lines: L86-95 (build_dataset)
- File: rfdetr/datasets/coco.py
- Lines: L33-34 (is_valid_coco_dataset)
- File: rfdetr/datasets/yolo.py
- Lines: L27-47 (is_valid_yolo_dataset)
Signature
def build_dataset(
    image_set: str,
    args: Any,
    resolution: int
) -> torch.utils.data.Dataset:
    """
    Build a dataset for the given split.

    Args:
        image_set: Split name ("train", "val", "test")
        args: Namespace with dataset_file and dataset_dir
        resolution: Input resolution for transforms

    Returns:
        PyTorch Dataset with images and annotations
    """

def is_valid_coco_dataset(dataset_dir: str) -> bool:
    """Check if directory contains valid COCO format annotations."""

def is_valid_yolo_dataset(dataset_dir: str) -> bool:
    """Check if directory contains valid YOLO format annotations."""
Import
from rfdetr.datasets import build_dataset
from rfdetr.datasets.coco import is_valid_coco_dataset
from rfdetr.datasets.yolo import is_valid_yolo_dataset
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| image_set | str | Yes | Split name: "train", "val", or "test" |
| args.dataset_file | str | Yes | Format: "coco", "roboflow", or "yolo" |
| args.dataset_dir | str | Yes | Root directory path of the dataset |
| resolution | int | Yes | Input resolution for transforms |
Outputs
| Name | Type | Description |
|---|---|---|
| dataset | torch.utils.data.Dataset | PyTorch Dataset with images, targets (boxes, labels), and transforms applied |
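The input contract above can be checked before any dataset is built. The helper below is a hypothetical sketch, not part of RF-DETR: the name `check_build_inputs` is invented for illustration, and the requirement that `resolution` be positive is an assumption rather than something the table states.

```python
from types import SimpleNamespace
from typing import Any

VALID_SPLITS = ("train", "val", "test")
VALID_FORMATS = ("coco", "roboflow", "yolo")


def check_build_inputs(image_set: str, args: Any, resolution: int) -> None:
    """Raise ValueError if the inputs do not match the input table (hypothetical helper)."""
    if image_set not in VALID_SPLITS:
        raise ValueError(f"image_set must be one of {VALID_SPLITS}, got {image_set!r}")

    dataset_file = getattr(args, "dataset_file", None)
    if dataset_file not in VALID_FORMATS:
        raise ValueError(f"args.dataset_file must be one of {VALID_FORMATS}, got {dataset_file!r}")

    dataset_dir = getattr(args, "dataset_dir", None)
    if not isinstance(dataset_dir, str) or not dataset_dir:
        raise ValueError("args.dataset_dir must be a non-empty string")

    # bool is a subclass of int, so it is rejected explicitly.
    # Assumption: a usable resolution is a positive integer.
    if isinstance(resolution, bool) or not isinstance(resolution, int) or resolution <= 0:
        raise ValueError(f"resolution must be a positive int, got {resolution!r}")


args = SimpleNamespace(dataset_file="coco", dataset_dir="/path/to/my_dataset")
check_build_inputs("train", args, 560)  # passes silently when the contract holds
```

Failing early with a specific message is usually easier to debug than an error raised deep inside a format-specific builder.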
Usage Examples
Validate Dataset Format
from rfdetr.datasets.coco import is_valid_coco_dataset
from rfdetr.datasets.yolo import is_valid_yolo_dataset
dataset_dir = "/path/to/my_dataset"
if is_valid_coco_dataset(dataset_dir):
    print("COCO format detected")
elif is_valid_yolo_dataset(dataset_dir):
    print("YOLO format detected")
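To make the idea of a directory-structure check concrete, here is a self-contained sketch that runs without RF-DETR installed. The exact rules the library applies are not stated on this page, so the layout tested here is an assumption modeled on common Roboflow exports: a `train/_annotations.coco.json` file for COCO, and a `data.yaml` (or `data.yml`) plus `train/images` and `train/labels` folders for YOLO. The function names `looks_like_coco` and `looks_like_yolo` are hypothetical.

```python
import os
import tempfile


def looks_like_coco(dataset_dir: str) -> bool:
    """Assumed rule: a COCO annotation file exists for the training split."""
    return os.path.isfile(os.path.join(dataset_dir, "train", "_annotations.coco.json"))


def looks_like_yolo(dataset_dir: str) -> bool:
    """Assumed rule: a data.yaml/data.yml file plus train/images and train/labels."""
    has_yaml = any(
        os.path.isfile(os.path.join(dataset_dir, name))
        for name in ("data.yaml", "data.yml")
    )
    has_dirs = all(
        os.path.isdir(os.path.join(dataset_dir, "train", sub))
        for sub in ("images", "labels")
    )
    return has_yaml and has_dirs


# Build a throwaway COCO-style directory and run both checks against it.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "train"))
    open(os.path.join(root, "train", "_annotations.coco.json"), "w").close()
    print(looks_like_coco(root), looks_like_yolo(root))
```

Checks of this kind only inspect file and folder names; they do not confirm that the annotation contents are well formed.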