Implementation:NVIDIA DALI Ops Util Input Readers
| Knowledge Sources | |
|---|---|
| Domains | Object_Detection, GPU_Computing |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete input reader functions, provided by the utility module of the DALI EfficientDet example, that load object detection data from TFRecord and COCO formats.
Description
The ops_util input readers consist of two functions, input_tfrecord and input_coco, that abstract dataset-specific parsing into a common 5-tuple output format. Both functions return (images, bboxes, classes, widths, heights) regardless of the source format.
input_tfrecord uses dali.fn.readers.tfrecord to read serialized TFRecord files with a feature schema that includes encoded image bytes, per-coordinate bounding box arrays (xmin, ymin, xmax, ymax), class labels, and image dimensions. It stacks the four coordinate arrays with dali.fn.stack and transposes the result with dali.fn.transpose to obtain an [N, 4] tensor, decodes images via dali.fn.decoders.image, and casts the remaining fields so that they match the types listed under Outputs (int32 class labels, float32 widths and heights).
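The stack-then-transpose step described above can be illustrated without DALI. The sketch below is only an analogy: plain Python lists stand in for DALI tensors, and the helper name stack_and_transpose is hypothetical, not part of ops_util.

```python
def stack_and_transpose(xmin, ymin, xmax, ymax):
    """Turn four length-N coordinate lists into N rows of [xmin, ymin, xmax, ymax]."""
    # "Stack": four rows of N values each, i.e. shape [4, N].
    stacked = [list(xmin), list(ymin), list(xmax), list(ymax)]
    # "Transpose": swap the two axes to get shape [N, 4], one row per box.
    return [list(row) for row in zip(*stacked)]


# Two boxes, stored per coordinate as in the TFRecord schema.
xmin = [0.10, 0.50]
ymin = [0.20, 0.60]
xmax = [0.30, 0.70]
ymax = [0.40, 0.80]

boxes = stack_and_transpose(xmin, ymin, xmax, ymax)
```

Each row of boxes now describes one object, which is the layout the bboxes output uses.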
input_coco uses dali.fn.readers.coco with ratio=True (normalized coordinates) and ltrb=True (left-top-right-bottom format). It reads directly from an image directory and a JSON annotations file. Image dimensions are obtained via dali.fn.peek_image_shape on the encoded bytes before decoding.
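To make the effect of ratio=True and ltrb=True concrete, the following sketch converts one COCO annotation box, which is stored in the JSON file as [x, y, width, height] in pixels, into normalized left-top-right-bottom form. It is a plain-Python illustration of the coordinate convention, not the reader's actual implementation, and the function name is hypothetical.

```python
def coco_box_to_normalized_ltrb(box_xywh, image_width, image_height):
    """Convert a COCO [x, y, w, h] pixel box to normalized [left, top, right, bottom]."""
    x, y, w, h = box_xywh
    left = x / image_width
    top = y / image_height
    right = (x + w) / image_width
    bottom = (y + h) / image_height
    return [left, top, right, bottom]


# A 128x96 pixel box whose top-left corner is at (64, 48) in a 640x480 image.
example = coco_box_to_normalized_ltrb([64.0, 48.0, 128.0, 96.0], 640.0, 480.0)
```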
Both readers support sharded reading (shard_id, num_shards) for distributed training and optional random shuffling.
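As a rough mental model of sharded reading, the sketch below splits a dataset of a given size into contiguous, non-overlapping index ranges, one per shard. The floor-based split is an assumption taken from the way DALI's sharding documentation describes shard boundaries; the readers' real behaviour also depends on options not covered here, and shard_range is a hypothetical helper.

```python
def shard_range(shard_id, num_shards, dataset_size):
    """Return the half-open index range [start, end) assigned to one shard."""
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must satisfy 0 <= shard_id < num_shards")
    # Integer division floors the boundaries, so shard sizes differ by at most one sample.
    start = shard_id * dataset_size // num_shards
    end = (shard_id + 1) * dataset_size // num_shards
    return start, end


# Ten samples split across four workers.
ranges = [shard_range(i, 4, 10) for i in range(4)]
```

In a distributed job each worker would typically pass its own rank as shard_id and the world size as num_shards.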
Usage
Call input_tfrecord or input_coco inside a DALI pipeline definition function. The returned tuple feeds directly into downstream augmentation functions.
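A minimal sketch of such a pipeline definition is shown below, assuming DALI is installed and the code runs from the EfficientDet example directory so that pipeline.dali is importable. The builder name build_coco_pipeline and its default batch_size, num_threads, and device_id values are illustrative assumptions, not part of the example module.

```python
def build_coco_pipeline(images_path, annotations_path,
                        batch_size=8, num_threads=4, device_id=0):
    """Build (but do not run) a DALI pipeline that wraps ops_util.input_coco."""
    # Local imports keep this sketch importable on machines without DALI.
    from nvidia.dali import pipeline_def
    from pipeline.dali import ops_util  # assumes the EfficientDet example layout

    @pipeline_def
    def coco_pipeline():
        images, bboxes, classes, widths, heights = ops_util.input_coco(
            images_path=images_path,
            annotations_path=annotations_path,
            device="gpu",
            shard_id=0,
            num_shards=1,
            random_shuffle=True,
        )
        # Augmentation operators would normally be applied here.
        return images, bboxes, classes, widths, heights

    return coco_pipeline(
        batch_size=batch_size, num_threads=num_threads, device_id=device_id
    )
```

The returned object would then be built and run in the usual DALI way; the same pattern applies to input_tfrecord.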
Code Reference
Source Location
- Repository: NVIDIA DALI
- File: docs/examples/use_cases/tensorflow/efficientdet/pipeline/dali/ops_util.py
Signature
def input_tfrecord(
    tfrecord_files, tfrecord_idxs, device, shard_id, num_shards, random_shuffle=True
):
    ...

def input_coco(
    images_path, annotations_path, device, shard_id, num_shards, random_shuffle=True
):
    ...
Import
from pipeline.dali import ops_util
# Called as:
images, bboxes, classes, widths, heights = ops_util.input_tfrecord(...)
images, bboxes, classes, widths, heights = ops_util.input_coco(...)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| tfrecord_files | list[str] | Yes (TFRecord) | List of TFRecord file paths. |
| tfrecord_idxs | list[str] | Yes (TFRecord) | List of TFRecord index file paths (each is the corresponding TFRecord path with _idx suffix). |
| images_path | str | Yes (COCO) | Root directory containing COCO images. |
| annotations_path | str | Yes (COCO) | Path to the COCO JSON annotations file. |
| device | str | Yes | Target device for image decoding: "cpu" or "gpu". With "gpu", images are decoded with the "mixed" decoder backend. |
| shard_id | int or None | Yes | Index of the current data shard for distributed reading. None for single-device mode. |
| num_shards | int | Yes | Total number of data shards. |
| random_shuffle | bool | No | Whether to randomly shuffle the data. Defaults to True. |
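The device argument maps onto the decoder backend as described in the table. A minimal sketch of that mapping follows; the helper name decoder_backend and the explicit validation are assumptions added for illustration.

```python
def decoder_backend(device):
    """Map the reader's device argument to an image decoder backend name."""
    if device not in ("cpu", "gpu"):
        raise ValueError('device must be "cpu" or "gpu"')
    # "mixed" takes encoded bytes on the CPU and produces decoded images on the GPU.
    return "mixed" if device == "gpu" else "cpu"


gpu_backend = decoder_backend("gpu")
cpu_backend = decoder_backend("cpu")
```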
Outputs
| Name | Type | Description |
|---|---|---|
| images | DALI TensorList | Decoded RGB images as uint8 tensors of shape [H, W, 3]. |
| bboxes | DALI TensorList | Bounding boxes as float32 tensors of shape [N, 4] in ltrb format (xmin, ymin, xmax, ymax), normalized to [0, 1]. |
| classes | DALI TensorList | Class labels as int32 tensors of shape [N]. |
| widths | DALI TensorList | Original image widths as float32 scalars. |
| heights | DALI TensorList | Original image heights as float32 scalars. |
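Because bboxes are normalized while widths and heights carry the original image size, pixel coordinates can be recovered by scaling. The sketch below shows the arithmetic on plain Python values for a single sample; to_pixel_boxes is a hypothetical helper, not part of ops_util.

```python
def to_pixel_boxes(bboxes, width, height):
    """Scale normalized [left, top, right, bottom] boxes to pixel coordinates."""
    return [
        [left * width, top * height, right * width, bottom * height]
        for left, top, right, bottom in bboxes
    ]


# One box covering the lower-right part of a 640x480 image.
pixel_boxes = to_pixel_boxes([[0.25, 0.5, 0.75, 1.0]], 640.0, 480.0)
```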
Usage Examples
Reading TFRecord Data
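from glob import glob

from pipeline.dali import ops_util

# Collect the TFRecord files; sorting keeps the file order stable across runs and workers.
tfrecord_files = sorted(glob("/data/coco/train-*.tfrecord"))
# Each index file is the corresponding TFRecord path with an "_idx" suffix.
tfrecord_idxs = [f + "_idx" for f in tfrecord_files]

# Call this inside a DALI pipeline definition function.
images, bboxes, classes, widths, heights = ops_util.input_tfrecord(
    tfrecord_files,
    tfrecord_idxs,
    device="gpu",
    shard_id=0,
    num_shards=1,
    random_shuffle=True,
)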
Reading COCO Data
from pipeline.dali import ops_util

# Call this inside a DALI pipeline definition function.
images, bboxes, classes, widths, heights = ops_util.input_coco(
    images_path="/data/coco/train2017",
    annotations_path="/data/coco/annotations/instances_train2017.json",
    device="gpu",
    shard_id=0,
    num_shards=1,
    random_shuffle=True,
)