Implementation: NVIDIA DALI EfficientDetPipeline
| Knowledge Sources | |
|---|---|
| Domains | Object_Detection, GPU_Computing |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
A concrete pipeline class for end-to-end EfficientDet data loading and preprocessing, provided by the NVIDIA DALI EfficientDet example.
Description
EfficientDetPipeline is a self-contained class that constructs and manages a complete DALI pipeline for the EfficientDet object detection architecture. It reads images and annotations from either TFRecord or COCO format, applies configurable augmentations (GridMask, random horizontal flip, random crop-and-resize), normalizes pixel values, encodes ground-truth bounding boxes against pre-computed multi-scale anchors, and reshapes the encoded targets into per-level feature map outputs.
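To make one of the augmentations above concrete, here is a minimal pure-Python sketch of the GridMask idea: zeroing out a regular grid of square regions in the image. The parameter names `d` (tile size) and `ratio` (visible fraction of each tile) are illustrative only; DALI's GPU implementation differs in both interface and details (rotation, random offsets).

```python
def grid_mask(image, d=4, ratio=0.5):
    """Sketch of GridMask: zero out a square region in every d x d tile.

    image: H x W nested list of pixel values (a stand-in for a real tensor).
    ratio: fraction of each tile's side that stays visible (hypothetical
           parameter name, not DALI's).
    """
    h, w = len(image), len(image[0])
    keep = int(d * ratio)  # visible band within each d x d tile
    out = [row[:] for row in image]  # copy so the input is untouched
    for y in range(h):
        for x in range(w):
            # Mask the bottom-right (d - keep) x (d - keep) corner of each tile.
            if (y % d) >= keep and (x % d) >= keep:
                out[y][x] = 0
    return out
```

With `d=4, ratio=0.5`, one quarter of each 4x4 tile is masked, so a quarter of the pixels are dropped overall.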
The constructor accepts a params dictionary (from the EfficientDet configuration), training arguments, and hardware placement options. Internally, it:
- Resolves input files based on the chosen input type (TFRecord glob patterns or COCO directory paths).
- Creates an Anchors object to pre-compute multi-scale anchor boxes normalized to [0, 1] in ltrb format.
- Defines the DALI pipeline graph via the @pipeline_def decorator.
- Exposes the pipeline as a tf.data.Dataset through the get_dataset() method using dali_tf.DALIDataset.
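The anchor pre-computation in the second step can be sketched in plain Python. The level range, the single square anchor per grid cell, and the `anchor_scale` value below are illustrative assumptions, not values read from the DALI example (the real Anchors object also enumerates multiple scales and aspect ratios per cell):

```python
def make_anchors(image_size, min_level=3, max_level=7, anchor_scale=4.0):
    """Sketch: one square anchor per feature-map cell, per pyramid level,
    with coordinates divided by the image size and emitted in ltrb order
    (left, top, right, bottom). Border anchors may extend outside [0, 1]."""
    h, w = image_size
    anchors = []
    for level in range(min_level, max_level + 1):
        stride = 2 ** level              # feature-map cell size in pixels
        size = anchor_scale * stride     # square base anchor for this level
        for cy in range(stride // 2, h, stride):
            for cx in range(stride // 2, w, stride):
                anchors.append((
                    (cx - size / 2) / w,  # left
                    (cy - size / 2) / h,  # top
                    (cx + size / 2) / w,  # right
                    (cy + size / 2) / h,  # bottom
                ))
    return anchors
```

For a 128x128 input this yields 256 + 64 + 16 + 4 + 1 = 341 anchors across levels 3-7, matching the usual FPN cell counts.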
Usage
Instantiate EfficientDetPipeline with the model configuration and call get_dataset() to obtain a TensorFlow dataset. The returned dataset can be passed directly to model.fit() or iterated manually.
```python
from pipeline.dali.efficientdet_pipeline import EfficientDetPipeline

pipeline = EfficientDetPipeline(
    params=params,
    batch_size=8,
    args=args,
    is_training=True,
    num_shards=1,
    device_id=0,
    cpu_only=False,
)
dataset = pipeline.get_dataset()
```
Code Reference
Source Location
- Repository: NVIDIA DALI
- File: docs/examples/use_cases/tensorflow/efficientdet/pipeline/dali/efficientdet_pipeline.py
Signature
```python
class EfficientDetPipeline:
    def __init__(
        self,
        params,
        batch_size,
        args,
        is_training=True,
        num_shards=1,
        device_id=0,
        cpu_only=False,
    ):
```
Import
```python
from pipeline.dali.efficientdet_pipeline import EfficientDetPipeline
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| params | dict | Yes | EfficientDet configuration dictionary containing image_size, grid_mask, max_instances_per_image, and seed. |
| batch_size | int | Yes | Number of samples per batch. |
| args | namedtuple | Yes | Training arguments including input_type, file patterns (train_file_pattern, eval_file_pattern), or COCO paths (images_path, annotations_path). |
| is_training | bool | No | Whether to apply training augmentations and random shuffling. Defaults to True. |
| num_shards | int | No | Total number of data-parallel shards for distributed training. Defaults to 1. |
| device_id | int | No | GPU device index for pipeline placement. Defaults to 0. |
| cpu_only | bool | No | If True, forces all operations to CPU. Defaults to False. |
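A minimal stand-in for the `args` namedtuple can be built from the fields named in the table above. The namedtuple name, the `"tfrecord"` value, and the file paths below are hypothetical; the real example's argument parser may expect additional fields.

```python
from collections import namedtuple

# Field names follow the I/O contract table; anything else the real
# example expects is omitted here.
TrainArgs = namedtuple(
    "TrainArgs",
    ["input_type", "train_file_pattern", "eval_file_pattern",
     "images_path", "annotations_path"],
)

args = TrainArgs(
    input_type="tfrecord",                          # or "coco" (assumed values)
    train_file_pattern="/data/coco/train-*.tfrecord",  # hypothetical path
    eval_file_pattern="/data/coco/val-*.tfrecord",     # hypothetical path
    images_path=None,        # only used for the COCO input type
    annotations_path=None,
)
```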
Outputs
| Name | Type | Description |
|---|---|---|
| EfficientDetPipeline instance | EfficientDetPipeline | Object with get_dataset(), build(), and run() methods. |
| get_dataset() return | tf.data.Dataset | TensorFlow dataset yielding (images, num_positives, bboxes, classes, *enc_layers) per batch. |
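The flat output tuple can be consumed with star-unpacking. The batch below is mocked with placeholder strings; with TensorFlow each element would be a `tf.Tensor`, and the number of trailing encoded-target layers (five here, one per level 3-7) is an assumption that depends on the configured level range.

```python
# Mock of one element yielded by get_dataset(): placeholder values stand in
# for the real tensors.
batch = ("images", "num_positives", "bboxes", "classes",
         "enc_l3", "enc_l4", "enc_l5", "enc_l6", "enc_l7")

# Fixed leading outputs, then one encoded-target tensor per pyramid level.
images, num_positives, bboxes, classes, *enc_layers = batch
```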
Usage Examples
Single-GPU Training Pipeline
```python
import tensorflow as tf
from pipeline.dali.efficientdet_pipeline import EfficientDetPipeline

params = {
    "image_size": (512, 512),
    "grid_mask": True,
    "max_instances_per_image": 100,
    "seed": 42,
}

# args is a namedtuple with input_type, train_file_pattern, etc.
pipeline = EfficientDetPipeline(
    params=params,
    batch_size=16,
    args=args,
    is_training=True,
    num_shards=1,
    device_id=0,
)
train_dataset = pipeline.get_dataset()

# model is a compiled EfficientDet Keras model.
model.fit(train_dataset, epochs=300, steps_per_epoch=2000)
```
Multi-GPU with MirroredStrategy
```python
import tensorflow as tf
from pipeline.dali.efficientdet_pipeline import EfficientDetPipeline

# params, batch_size (the global batch size), and args are defined as in
# the single-GPU example above.
def dali_dataset_fn(input_context):
    device_id = input_context.input_pipeline_id
    num_shards = input_context.num_input_pipelines
    with tf.device(f"/gpu:{device_id}"):
        return EfficientDetPipeline(
            params, batch_size // num_shards, args,
            is_training=True, num_shards=num_shards, device_id=device_id,
        ).get_dataset()

strategy = tf.distribute.MirroredStrategy()
dataset = strategy.distribute_datasets_from_function(dali_dataset_fn)
```