Implementation:NVIDIA DALI Ops Util Normalize Flip

From Leeroopedia
Revision as of 15:54, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/NVIDIA_DALI_Ops_Util_Normalize_Flip.md)


Knowledge Sources
Domains Object_Detection, GPU_Computing
Last Updated 2026-02-08 00:00 GMT

Overview

Concrete functions for normalizing images with coordinated horizontal flipping and random crop-resize for object detection, provided by the DALI EfficientDet example utility module.

Description

The ops_util normalize_flip and random_crop_resize functions implement the spatial augmentation stage of the EfficientDet DALI pipeline.

normalize_flip performs three operations in a single call:

  1. Generates a random coin flip with probability p (0.5 for training, 0.0 for evaluation).
  2. Applies dali.fn.crop_mirror_normalize to subtract ImageNet channel means, divide by standard deviations, and conditionally mirror the image horizontally.
  3. Applies dali.fn.bb_flip with the same flip flag to transform bounding box coordinates consistently.
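The three steps above can be illustrated without DALI. The following is a minimal pure-Python sketch, not the ops_util implementation: the function name normalize_flip_reference, the nested-list image representation, and the scaling of the ImageNet statistics to the 0-255 pixel range are all assumptions made for illustration. It shows why a single flip flag must drive both the image mirror and the box transform.

```python
import random

# ImageNet channel statistics, scaled to the 0-255 pixel range.
# Assumption: input pixels are in [0, 255]; the scaling is illustrative.
MEAN = [0.485 * 255, 0.456 * 255, 0.406 * 255]
STD = [0.229 * 255, 0.224 * 255, 0.225 * 255]


def normalize_flip_reference(image, bboxes, p=0.5, rng=random):
    """Hypothetical reference for the normalize + flip stage.

    image  : H x W x 3 nested lists of pixel values
    bboxes : list of [left, top, right, bottom], normalized to [0, 1]
    p      : probability of a horizontal flip
    """
    # Step 1: one coin flip, shared by the image and its boxes.
    flip = rng.random() < p

    # Step 2: per-channel normalization, then an optional horizontal mirror.
    out_image = []
    for row in image:
        new_row = [[(px[c] - MEAN[c]) / STD[c] for c in range(3)] for px in row]
        if flip:
            new_row.reverse()  # reverse the width axis
        out_image.append(new_row)

    # Step 3: mirror the boxes with the same flag.
    out_boxes = []
    for left, top, right, bottom in bboxes:
        if flip:
            # The mirrored right edge becomes the new left edge, and vice versa.
            left, right = 1.0 - right, 1.0 - left
        out_boxes.append([left, top, right, bottom])

    return out_image, out_boxes
```

Swapping the edges in the same assignment keeps left <= right after the flip. Using separate random draws for the image and the boxes would leave labels misaligned in roughly half of the flipped samples.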

random_crop_resize performs scale-aware cropping:

  1. Samples a random scale factor from the given range (e.g., [0.1, 2.0] for training). When scaling is None, as in evaluation, the factor is fixed at 1.0.
  2. Computes resize dimensions that preserve the image's aspect ratio while fitting it to the target output size multiplied by the scale factor.
  3. Resizes the image using dali.fn.resize.
  4. Applies dali.fn.random_bbox_crop to jointly crop the image region and transform bounding boxes with the xyXY layout, discarding boxes that fall outside.
  5. Extracts the crop region with dali.fn.slice using out-of-bounds padding.
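Steps 1 and 2 come down to a small amount of arithmetic. The sketch below is an illustration under assumptions, not the ops_util code: the helper name scaled_size, the round-half-up rounding, and the pairing of output width with image width are choices made here for clarity.

```python
import math
import random


def scaled_size(width, height, output_size, scaling=(0.1, 2.0), rng=random):
    """Hypothetical helper: return (new_height, new_width) for the resize step.

    output_size is (height, width), as in the I/O contract.
    scaling is a (min_scale, max_scale) range, or None for a fixed factor of 1.0.
    """
    # Step 1: sample the scale factor, or fix it at 1.0 when scaling is None.
    if scaling is None:
        scale_factor = 1.0
    else:
        scale_factor = rng.uniform(scaling[0], scaling[1])

    out_h, out_w = output_size

    # Step 2: the largest uniform scale at which the image still fits inside
    # the target size multiplied by the scale factor.
    image_scale = min(scale_factor * out_w / width, scale_factor * out_h / height)

    # Round to the nearest integer pixel count (assumed rounding rule).
    new_h = math.floor(height * image_scale + 0.5)
    new_w = math.floor(width * image_scale + 0.5)
    return new_h, new_w
```

For a 640x480 image and output_size=(512, 512) with scaling=None, the limiting ratio is 512 / 640 = 0.8, so the image is resized to 512 wide by 384 high. The subsequent crop with out-of-bounds padding then fills the remaining rows.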

Usage

These functions are called sequentially inside the DALI pipeline definition, after input reading and optional GridMask augmentation.

Code Reference

Source Location

  • Repository: NVIDIA DALI
  • File: docs/examples/use_cases/tensorflow/efficientdet/pipeline/dali/ops_util.py

Signature

def normalize_flip(images, bboxes, p=0.5):
    ...

def random_crop_resize(
    images, bboxes, classes, widths, heights, output_size, scaling=[0.1, 2.0]
):
    ...

Import

from pipeline.dali import ops_util

# Called as:
images, bboxes = ops_util.normalize_flip(images, bboxes, p=0.5)
images, bboxes, classes = ops_util.random_crop_resize(
    images, bboxes, classes, widths, heights, output_size, scaling
)

I/O Contract

Inputs

Name | Type | Required | Description
images | DALI TensorList | Yes | Decoded RGB images (uint8 or float32).
bboxes | DALI TensorList | Yes | Bounding boxes in normalized ltrb format [N, 4].
p | float | No | Probability of horizontal flip. Defaults to 0.5. Set to 0.0 for evaluation.
classes | DALI TensorList | Yes (crop_resize) | Class labels [N] for filtering during crop.
widths | DALI TensorList | Yes (crop_resize) | Original image widths as float scalars.
heights | DALI TensorList | Yes (crop_resize) | Original image heights as float scalars.
output_size | tuple(int, int) | Yes (crop_resize) | Target (height, width) for the output image.
scaling | list[float] or None | No | Range [min_scale, max_scale] for random scale sampling. None means no scaling (factor = 1.0).

Outputs

Name | Type | Description
images (normalize_flip) | DALI TensorList | Normalized float32 images in NHWC layout, optionally flipped.
bboxes (normalize_flip) | DALI TensorList | Bounding boxes with horizontal flip applied, [N, 4] in ltrb format.
images (crop_resize) | DALI TensorList | Cropped and resized float32 images of shape [output_size[0], output_size[1], 3].
bboxes (crop_resize) | DALI TensorList | Bounding boxes transformed to the crop window, [M, 4] where M <= N.
classes (crop_resize) | DALI TensorList | Filtered class labels [M] corresponding to retained boxes.
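The M <= N relationship in the crop_resize outputs follows from boxes being dropped during the crop. The sketch below illustrates one way this can happen. It is a simplified assumption, not the behavior of dali.fn.random_bbox_crop: the function name crop_boxes, the use of pixel coordinates, and the rule of keeping a box only when its center lies inside the crop window are chosen for illustration.

```python
def crop_boxes(bboxes, classes, crop_x, crop_y, crop_w, crop_h):
    """Hypothetical helper: map xyXY boxes into a crop window.

    bboxes  : list of [x1, y1, x2, y2] in pixel coordinates
    classes : list of class labels, one per box
    Returns the retained boxes (in crop-relative pixels) and their labels.
    """
    kept_boxes, kept_classes = [], []
    for (x1, y1, x2, y2), cls in zip(bboxes, classes):
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2

        # Assumed retention rule: keep a box only if its center is in the crop.
        inside_x = crop_x <= cx <= crop_x + crop_w
        inside_y = crop_y <= cy <= crop_y + crop_h
        if not (inside_x and inside_y):
            continue

        # Shift into crop coordinates, then clip to the window.
        kept_boxes.append([
            min(max(x1 - crop_x, 0.0), crop_w),
            min(max(y1 - crop_y, 0.0), crop_h),
            min(max(x2 - crop_x, 0.0), crop_w),
            min(max(y2 - crop_y, 0.0), crop_h),
        ])
        # Labels are filtered with the same mask so they stay aligned.
        kept_classes.append(cls)

    return kept_boxes, kept_classes
```

Filtering classes alongside bboxes is what keeps the two outputs the same length M, which is why random_crop_resize takes and returns both.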

Usage Examples

Training Augmentation

# Inside a @pipeline_def function:
images, bboxes, classes, widths, heights = ops_util.input_coco(...)

# Normalize and randomly flip
images, bboxes = ops_util.normalize_flip(images, bboxes, p=0.5)

# Random crop and resize to 512x512
images, bboxes, classes = ops_util.random_crop_resize(
    images, bboxes, classes, widths, heights,
    output_size=(512, 512),
    scaling=[0.1, 2.0],
)

Evaluation (No Augmentation)

# No flip, no random scaling
images, bboxes = ops_util.normalize_flip(images, bboxes, p=0.0)
images, bboxes, classes = ops_util.random_crop_resize(
    images, bboxes, classes, widths, heights,
    output_size=(512, 512),
    scaling=None,
)

Related Pages

Implements Principle

Requires Environment
