Implementation:Roboflow Rf detr Build Dataset

From Leeroopedia


Knowledge Sources
Domains Object_Detection, Data_Engineering
Last Updated 2026-02-08 15:00 GMT

Overview

Concrete tool, provided by the RF-DETR library, for loading and building object detection datasets.

Description

build_dataset is a factory function that creates a PyTorch Dataset object for the format named in args.dataset_file. It delegates to a format-specific builder: build_coco for COCO JSON, build_roboflow_from_yolo for YOLO, and build_roboflow for Roboflow datasets whose format is auto-detected. The helper functions is_valid_coco_dataset and is_valid_yolo_dataset validate the directory structure before loading.
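The delegation described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: build_dataset_sketch, BUILDERS, and _fake_builder are hypothetical names, the builders are stand-ins that only record which real builder would be chosen, the error raised for an unknown format is an assumption, and the resolution value 560 is arbitrary.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict


def _fake_builder(name: str) -> Callable[[str, Any, int], dict]:
    """Hypothetical stand-in for a real builder; returns a record, not a Dataset."""
    def build(image_set: str, args: Any, resolution: int) -> dict:
        return {
            "builder": name,
            "split": image_set,
            "root": args.dataset_dir,
            "resolution": resolution,
        }
    return build


# Assumed mapping from args.dataset_file to the builder named in the description.
BUILDERS: Dict[str, Callable[[str, Any, int], dict]] = {
    "coco": _fake_builder("build_coco"),
    "roboflow": _fake_builder("build_roboflow"),
    "yolo": _fake_builder("build_roboflow_from_yolo"),
}


def build_dataset_sketch(image_set: str, args: Any, resolution: int) -> dict:
    """Pick a builder by format name and forward the arguments unchanged."""
    try:
        builder = BUILDERS[args.dataset_file]
    except KeyError:
        # Assumption: an unknown format is treated as an error.
        raise ValueError(f"dataset format {args.dataset_file!r} not supported") from None
    return builder(image_set, args, resolution)


args = SimpleNamespace(dataset_file="yolo", dataset_dir="/path/to/my_dataset")
result = build_dataset_sketch("train", args, 560)
print(result["builder"])
```

A dictionary lookup keeps the format names and their builders in one place; the real function may equally well use a chain of if/elif branches.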

Usage

build_dataset is called internally by the training pipeline. Users specify the dataset format and the dataset directory path through training configuration parameters.

Code Reference

Source Location

  • Repository: rf-detr
  • File: rfdetr/datasets/__init__.py, lines 86-95 (build_dataset)
  • File: rfdetr/datasets/coco.py, lines 33-34 (is_valid_coco_dataset)
  • File: rfdetr/datasets/yolo.py, lines 27-47 (is_valid_yolo_dataset)

Signature

def build_dataset(
    image_set: str,
    args: Any,
    resolution: int
) -> torch.utils.data.Dataset:
    """
    Build a dataset for the given split.

    Args:
        image_set: Split name ("train", "val", "test")
        args: Namespace with dataset_file and dataset_dir
        resolution: Input resolution for transforms

    Returns:
        PyTorch Dataset with images and annotations
    """

def is_valid_coco_dataset(dataset_dir: str) -> bool:
    """Check if directory contains valid COCO format annotations."""

def is_valid_yolo_dataset(dataset_dir: str) -> bool:
    """Check if directory contains valid YOLO format annotations."""

Import

from rfdetr.datasets import build_dataset
from rfdetr.datasets.coco import is_valid_coco_dataset
from rfdetr.datasets.yolo import is_valid_yolo_dataset

I/O Contract

Inputs

Name | Type | Required | Description
image_set | str | Yes | Split name: "train", "val", or "test"
args.dataset_file | str | Yes | Format: "coco", "roboflow", or "yolo"
args.dataset_dir | str | Yes | Root directory path of the dataset
resolution | int | Yes | Input resolution for transforms

Outputs

Name | Type | Description
dataset | torch.utils.data.Dataset | PyTorch Dataset with images, targets (boxes, labels), and transforms applied
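As a rough sketch of the input contract above, the following hypothetical helper (make_build_dataset_inputs is not part of RF-DETR) checks the documented values and packs the two args fields into a namespace. The real args object is assumed to carry further fields that the format-specific builders read, so this is illustrative only; the resolution 560 is an arbitrary example value.

```python
from types import SimpleNamespace
from typing import Tuple

VALID_SPLITS = ("train", "val", "test")
VALID_FORMATS = ("coco", "roboflow", "yolo")


def make_build_dataset_inputs(
    image_set: str, dataset_file: str, dataset_dir: str, resolution: int
) -> Tuple[str, SimpleNamespace, int]:
    """Hypothetical helper: validate the documented inputs and bundle them."""
    if image_set not in VALID_SPLITS:
        raise ValueError(f"image_set must be one of {VALID_SPLITS}, got {image_set!r}")
    if dataset_file not in VALID_FORMATS:
        raise ValueError(f"dataset_file must be one of {VALID_FORMATS}, got {dataset_file!r}")
    if not isinstance(resolution, int) or resolution <= 0:
        raise ValueError(f"resolution must be a positive int, got {resolution!r}")
    # Only the two fields listed in the I/O contract are set here.
    args = SimpleNamespace(dataset_file=dataset_file, dataset_dir=str(dataset_dir))
    return image_set, args, resolution


image_set, args, resolution = make_build_dataset_inputs(
    "val", "coco", "/path/to/my_dataset", 560
)
print(image_set, args.dataset_file, args.dataset_dir, resolution)
```

With the library installed, the returned triple matches the positional parameters in the signature, so it could be passed on as build_dataset(image_set, args, resolution), provided args also contains whatever other fields the chosen builder expects.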

Usage Examples

Validate Dataset Format

from rfdetr.datasets.coco import is_valid_coco_dataset
from rfdetr.datasets.yolo import is_valid_yolo_dataset

dataset_dir = "/path/to/my_dataset"

if is_valid_coco_dataset(dataset_dir):
    print("COCO format detected")
elif is_valid_yolo_dataset(dataset_dir):
    print("YOLO format detected")
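The same check can be wrapped in a small helper that returns a format name suitable for args.dataset_file. This is a sketch under assumptions: detect_dataset_format is a hypothetical function, and the validators are passed in as plain callables so that the block runs without the library; the stub validators below only inspect the path string and do not reflect the real directory checks.

```python
from typing import Callable, Mapping, Optional


def detect_dataset_format(
    dataset_dir: str, validators: Mapping[str, Callable[[str], bool]]
) -> Optional[str]:
    """Return the name of the first validator that accepts dataset_dir, else None."""
    for format_name, is_valid in validators.items():
        if is_valid(dataset_dir):
            return format_name
    return None


# Stub validators for demonstration only; they stand in for
# is_valid_coco_dataset and is_valid_yolo_dataset.
stub_validators = {
    "coco": lambda path: path.endswith("_coco"),
    "yolo": lambda path: path.endswith("_yolo"),
}

print(detect_dataset_format("/path/to/my_dataset_yolo", stub_validators))
print(detect_dataset_format("/path/to/unknown", stub_validators))
```

With RF-DETR installed, the mapping would be {"coco": is_valid_coco_dataset, "yolo": is_valid_yolo_dataset}. Dictionaries preserve insertion order, so COCO is tried first, as in the example above, and a None result signals that neither layout was recognized.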
