Implementation:NVIDIA DALI Ops Util Normalize Flip
| Knowledge Sources | |
|---|---|
| Domains | Object_Detection, GPU_Computing |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete functions for normalizing images with coordinated horizontal flipping and random crop-resize for object detection, provided by the DALI EfficientDet example utility module.
Description
The ops_util normalize_flip and random_crop_resize functions implement the spatial augmentation stage of the EfficientDet DALI pipeline.
normalize_flip performs three operations in a single step:
- Generates a random coin flip with probability p (0.5 for training, 0.0 for evaluation).
- Applies dali.fn.crop_mirror_normalize to subtract ImageNet channel means, divide by standard deviations, and conditionally mirror the image horizontally.
- Applies dali.fn.bb_flip with the same flip flag to transform bounding box coordinates consistently.
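The coordination between the image mirror and the box flip can be illustrated without DALI. The following is a minimal pure-Python sketch, not the ops_util code: the function name normalize_flip_reference, the rng parameter, and the exact mean and standard deviation constants are assumptions made for illustration. It shows one coin flip driving both the image mirror and the ltrb box transform.

```python
import random

# Assumed ImageNet channel statistics on the 0-255 scale; the exact
# constants used by ops_util are not stated in this page.
MEAN = (0.485 * 255, 0.456 * 255, 0.406 * 255)
STD = (0.229 * 255, 0.224 * 255, 0.225 * 255)


def normalize_flip_reference(image, bboxes, p=0.5, rng=random):
    """Hypothetical reference for the normalize-and-flip step.

    image:  H x W x 3 nested lists of pixel values in [0, 255].
    bboxes: list of [left, top, right, bottom], normalized to [0, 1].
    """
    # A single coin flip is shared by the image and its boxes.
    flip = rng.random() < p

    out_image = []
    for row in image:
        norm_row = [
            [(px[c] - MEAN[c]) / STD[c] for c in range(3)] for px in row
        ]
        # Mirroring horizontally means reversing each row of pixels.
        out_image.append(norm_row[::-1] if flip else norm_row)

    if flip:
        # Left and right swap roles: new_left = 1 - right, new_right = 1 - left.
        out_boxes = [[1.0 - r, t, 1.0 - l, b] for l, t, r, b in bboxes]
    else:
        out_boxes = [list(box) for box in bboxes]
    return out_image, out_boxes, flip
```

With p=0.0 the sketch never flips and with p=1.0 it always flips, which matches the training and evaluation settings described above.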
random_crop_resize performs scale-aware cropping:
- Samples a random scale factor from the given range (e.g., [0.1, 2.0] for training). For evaluation the range is None, which fixes the factor at 1.0.
- Computes the resize dimensions to fit the scaled image proportionally to the target output size.
- Resizes the image using dali.fn.resize.
- Applies dali.fn.random_bbox_crop to choose a crop window and to transform the bounding boxes (xyXY layout) and class labels to that window, discarding boxes that fall outside it.
- Extracts the crop region with dali.fn.slice using out-of-bounds padding.
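The resize-dimension step can be sketched in plain Python. This is an illustration under stated assumptions rather than the ops_util implementation: the function name scaled_size, the rng parameter, and the round-half-up rule are hypothetical, and the sketch assumes the image is scaled to fit inside the target size multiplied by the sampled factor while preserving aspect ratio.

```python
import math
import random


def scaled_size(width, height, output_size, scaling=(0.1, 2.0), rng=random):
    """Hypothetical helper: return (new_height, new_width) for the resize.

    output_size is (height, width), as in the I/O contract.
    scaling=None disables random scaling (factor = 1.0).
    """
    out_h, out_w = output_size
    factor = 1.0 if scaling is None else rng.uniform(scaling[0], scaling[1])

    # Use the smaller ratio so the scaled image fits within the scaled
    # target in both dimensions.
    scale = min(factor * out_w / width, factor * out_h / height)

    # Round to the nearest integer pixel count (assumed rounding rule).
    new_h = math.floor(height * scale + 0.5)
    new_w = math.floor(width * scale + 0.5)
    return new_h, new_w
```

For a 640x480 image and a 512x512 target with scaling=None, the width is the limiting dimension, so the sketch yields a 384x512 (height x width) image. The later crop then pads the remaining rows to reach the target height.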
Usage
These functions are called sequentially inside the DALI pipeline definition, after input reading and optional GridMask augmentation.
Code Reference
Source Location
- Repository: NVIDIA DALI
- File: docs/examples/use_cases/tensorflow/efficientdet/pipeline/dali/ops_util.py
Signature
def normalize_flip(images, bboxes, p=0.5):
    ...

def random_crop_resize(
    images, bboxes, classes, widths, heights, output_size, scaling=[0.1, 2.0]
):
    ...
Import
from pipeline.dali import ops_util
# Called as:
images, bboxes = ops_util.normalize_flip(images, bboxes, p=0.5)
images, bboxes, classes = ops_util.random_crop_resize(
    images, bboxes, classes, widths, heights, output_size, scaling
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| images | DALI TensorList | Yes | Decoded RGB images (uint8 or float32). |
| bboxes | DALI TensorList | Yes | Bounding boxes in normalized ltrb format [N, 4]. |
| p | float | No | Probability of horizontal flip. Defaults to 0.5. Set to 0.0 for evaluation. |
| classes | DALI TensorList | Yes (crop_resize) | Class labels [N] for filtering during crop. |
| widths | DALI TensorList | Yes (crop_resize) | Original image widths as float scalars. |
| heights | DALI TensorList | Yes (crop_resize) | Original image heights as float scalars. |
| output_size | tuple(int, int) | Yes (crop_resize) | Target (height, width) for the output image. |
| scaling | list[float] or None | No | Range [min_scale, max_scale] for random scale sampling. None means no scaling (factor = 1.0). |
Outputs
| Name | Type | Description |
|---|---|---|
| images (normalize_flip) | DALI TensorList | Normalized float32 images in NHWC layout, optionally flipped. |
| bboxes (normalize_flip) | DALI TensorList | Bounding boxes with horizontal flip applied, [N, 4] in ltrb format. |
| images (crop_resize) | DALI TensorList | Cropped and resized float32 images of shape [output_size[0], output_size[1], 3]. |
| bboxes (crop_resize) | DALI TensorList | Bounding boxes transformed to the crop window, [M, 4] where M <= N. |
| classes (crop_resize) | DALI TensorList | Filtered class labels [M] corresponding to retained boxes. |
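The relationship M <= N in the table follows from boxes being dropped when they fall outside the crop window. A minimal sketch of that filtering is given here, with loudly stated assumptions: the function crop_boxes is hypothetical, and the rule of keeping a box when its centre lies inside the window is an assumed criterion for illustration, not a statement of how dali.fn.random_bbox_crop is configured in ops_util.

```python
def crop_boxes(bboxes, classes, anchor, shape):
    """Hypothetical helper: re-express boxes relative to a crop window.

    bboxes:  list of [left, top, right, bottom], normalized to the input image.
    classes: list of labels, one per box.
    anchor:  (x0, y0) top-left corner of the window, normalized.
    shape:   (w, h) size of the window, normalized.
    """
    x0, y0 = anchor
    w, h = shape
    kept_boxes, kept_classes = [], []
    for (l, t, r, b), cls in zip(bboxes, classes):
        # Assumed rule: keep the box only if its centre is inside the window.
        cx, cy = (l + r) / 2.0, (t + b) / 2.0
        if not (x0 <= cx <= x0 + w and y0 <= cy <= y0 + h):
            continue
        # Clip to the window, then rescale so the window spans [0, 1].
        kept_boxes.append([
            (max(l, x0) - x0) / w,
            (max(t, y0) - y0) / h,
            (min(r, x0 + w) - x0) / w,
            (min(b, y0 + h) - y0) / h,
        ])
        kept_classes.append(cls)
    return kept_boxes, kept_classes
```

Because boxes and labels are filtered in the same loop, the returned lists stay aligned, which is the property the bboxes and classes outputs rely on.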
Usage Examples
Training Augmentation
# Inside a @pipeline_def function:
images, bboxes, classes, widths, heights = ops_util.input_coco(...)
# Normalize and randomly flip
images, bboxes = ops_util.normalize_flip(images, bboxes, p=0.5)
# Random crop and resize to 512x512
images, bboxes, classes = ops_util.random_crop_resize(
    images, bboxes, classes, widths, heights,
    output_size=(512, 512),
    scaling=[0.1, 2.0],
)
Evaluation (No Augmentation)
# No flip, no random scaling
images, bboxes = ops_util.normalize_flip(images, bboxes, p=0.0)
images, bboxes, classes = ops_util.random_crop_resize(
    images, bboxes, classes, widths, heights,
    output_size=(512, 512),
    scaling=None,
)