Implementation:NVIDIA DALI Ops Util Gridmask

Knowledge Sources	NVIDIA DALI
Domains	Object_Detection, GPU_Computing
Last Updated	2026-02-08 00:00 GMT

Overview

Concrete function for applying GridMask augmentation to images within a DALI pipeline, provided by the DALI EfficientDet example utility module.

Description

The ops_util.gridmask function applies the GridMask regularization technique to decoded images inside a DALI pipeline graph. It uses several DALI random number generators and the built-in dali.fn.grid_mask operator.
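
For intuition about what the ratio and tile parameters control, the following is a minimal pure-Python sketch of an unrotated grid mask. It is not DALI's implementation: the helper name grid_mask_keep, the placement of the masked square at the top-left corner of each tile, and the omission of rotation and shifts are assumptions made for illustration. It only follows the general idea that ratio is the side of the masked square relative to the tile length.

```python
def grid_mask_keep(width, height, tile, ratio):
    """Return a height x width grid of 1 (keep) / 0 (masked), without rotation.

    Hypothetical helper for illustration only; not part of DALI or ops_util.
    """
    square = ratio * tile  # side length of the masked square inside each tile
    return [
        [0 if (x % tile) < square and (y % tile) < square else 1
         for x in range(width)]
        for y in range(height)
    ]


# ratio = 0.4 masks a 4x4 square in every 10x10 tile; ratio = 0.0 masks nothing.
masked = grid_mask_keep(20, 20, tile=10, ratio=0.4)
untouched = grid_mask_keep(20, 20, tile=10, ratio=0.0)
```

With a ratio of 0.0 the masked square has zero size and every pixel is kept, which is why a ratio multiplied by a 0-or-1 value can switch the augmentation off for a sample.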

The function operates as follows:

A random coin flip (dali.fn.random.coin_flip) determines whether the mask is applied to each sample. The mask ratio is set to 0.4 * p where p is the coin flip result (0 or 1), effectively toggling the augmentation on or off per sample.
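The rotation angle is sampled from a normal distribution (dali.fn.random.normal) with mean=-1 and stddev=1, then multiplied by 10 degrees expressed in radians (10 * pi / 180). The resulting angle therefore has a mean of about -10 degrees and a standard deviation of about 10 degrees.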
The tile size is computed as a random uniform value between bounds derived from the image dimensions: lower = min(0.5 * height, 0.3 * width) and upper = max(0.5 * height, 0.3 * width). The result is cast to int32.
The dali.fn.grid_mask operator is called with the computed ratio, angle, and tile parameters.
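
The steps above can be imitated outside DALI to see what values the parameters take. The sketch below uses only the Python standard library. sample_gridmask_params is a hypothetical helper, not part of ops_util, and int() truncation stands in for the int32 cast, whose exact rounding behavior in DALI may differ.

```python
import math
import random


def sample_gridmask_params(width, height, rng=None):
    """Imitate the per-sample ratio, angle and tile values in plain Python.

    Hypothetical helper for illustration; the real function builds DALI
    graph nodes rather than returning concrete numbers.
    """
    rng = rng or random.Random()

    # Coin flip: 0 disables the mask (ratio 0.0), 1 enables it (ratio 0.4).
    p = rng.randint(0, 1)
    ratio = 0.4 * p

    # Normal(mean=-1, stddev=1) scaled by 10 degrees, expressed in radians.
    angle = rng.gauss(-1.0, 1.0) * 10.0 * (math.pi / 180.0)

    # Tile size: uniform between the two dimension-derived bounds.
    lower = min(0.5 * height, 0.3 * width)
    upper = max(0.5 * height, 0.3 * width)
    tile = int(rng.random() * (upper - lower) + lower)  # assumption: truncation

    return ratio, angle, tile


ratio, angle, tile = sample_gridmask_params(width=640, height=480)
```

For a 640x480 image the bounds are min(240, 192) = 192 and max(240, 192) = 240, so the tile size falls between 192 and 240 pixels, and the angle is centered near -10 degrees (about -0.17 rad).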

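Note: In the source code, the function computes the grid mask but returns the original images variable rather than the gridmask result. This appears to be a bug in the example code. The dali.fn.grid_mask node is created when the graph is defined, but its output is never returned, so the images leaving the function are unmodified. Because DALI can remove operators whose outputs are unused (unless the preserve argument is set), the call may not execute at all. Returning the gridmask result instead of images would propagate the augmentation.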

Usage

Call ops_util.gridmask inside a DALI pipeline definition, typically after input reading and before normalization. It is only applied during training when the grid_mask parameter is enabled.

Code Reference

Source Location

Repository: NVIDIA DALI
File: docs/examples/use_cases/tensorflow/efficientdet/pipeline/dali/ops_util.py

Signature

def gridmask(images, widths, heights):
    ...

Import

from pipeline.dali import ops_util

# Called as:
images = ops_util.gridmask(images, widths, heights)

I/O Contract

Inputs

Name	Type	Required	Description
images	DALI TensorList	Yes	Decoded RGB images (uint8 or float32) to augment.
widths	DALI TensorList	Yes	Original image widths as float32 scalars, used to compute tile size bounds.
heights	DALI TensorList	Yes	Original image heights as float32 scalars, used to compute tile size bounds.

Outputs

Name	Type	Description
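images	DALI TensorList	Intended: images with GridMask augmentation applied (or the original images if the coin flip is 0), with the same shape and dtype as the input. As written in the source, the function returns the input images unchanged (see the note in Description).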

Usage Examples

Applying GridMask in a Training Pipeline

# Inside a @pipeline_def function:
images, bboxes, classes, widths, heights = ops_util.input_coco(...)

# Apply GridMask augmentation (training only)
if is_training and params["grid_mask"]:
    images = ops_util.gridmask(images, widths, heights)

# Continue with normalization and cropping
images, bboxes = ops_util.normalize_flip(images, bboxes, p=0.5)

Understanding the Random Parameters

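import math

import nvidia.dali as dali

# Fragment of a graph definition: images, widths and heights are DataNodes
# produced earlier in the pipeline, so this must run inside a pipeline.

# Per-sample on/off switch for the mask.
p = dali.fn.random.coin_flip()          # 0 or 1
ratio = 0.4 * p                          # 0.0 or 0.4

# Normal(mean=-1, stddev=1) scaled by 10 degrees, expressed in radians.
angle = dali.fn.random.normal(mean=-1, stddev=1) * 10.0 * (math.pi / 180.0)

# Tile size: uniform between two bounds derived from the image dimensions.
# (The source names these bounds l and r.)
lower = dali.math.min(0.5 * heights, 0.3 * widths)
upper = dali.math.max(0.5 * heights, 0.3 * widths)
tile = dali.fn.cast(
    (dali.fn.random.uniform(range=[0.0, 1.0]) * (upper - lower) + lower),
    dtype=dali.types.INT32,
)

# The source function returns `images` after this call; returning `result`
# instead would propagate the augmentation.
result = dali.fn.grid_mask(images, ratio=ratio, angle=angle, tile=tile)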

Related Pages

Implements Principle

Principle:NVIDIA_DALI_GridMask_Augmentation

Requires Environment

Environment:NVIDIA_DALI_CUDA_GPU_Environment
