Implementation:NVIDIA DALI Ops Util Gridmask

From Leeroopedia


Knowledge Sources
Domains: Object_Detection, GPU_Computing
Last Updated: 2026-02-08 00:00 GMT

Overview

Concrete function for applying GridMask augmentation to images within a DALI pipeline, provided by the DALI EfficientDet example utility module.

Description

The ops_util.gridmask function applies the GridMask regularization technique to decoded images inside a DALI pipeline graph. It uses several DALI random number generators and the built-in dali.fn.grid_mask operator.
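To make the effect of the operator concrete, here is a minimal pure-Python sketch of the mask pattern for the unrotated case. It is not DALI code: the helper name grid_mask_keep is hypothetical, angle and shift are assumed to be 0, and placing the black square at the start of each tile is an illustrative assumption. It only follows the operator's documented meaning of tile (the period of the pattern) and ratio (black square side divided by tile side).

```python
def grid_mask_keep(height, width, tile, ratio):
    """Return a height x width grid of 1 (keep) / 0 (masked) values.

    Illustrative only: assumes angle=0 and shift=0, with the black
    square anchored at the start of every tile.
    """
    square = ratio * tile  # side length of the black square in each tile
    return [
        [0 if (x % tile) < square and (y % tile) < square else 1
         for x in range(width)]
        for y in range(height)
    ]


# A 4x4 image with 2-pixel tiles and ratio 0.5 masks one pixel per tile.
for row in grid_mask_keep(height=4, width=4, tile=2, ratio=0.5):
    print(row)
# [0, 1, 0, 1]
# [1, 1, 1, 1]
# [0, 1, 0, 1]
# [1, 1, 1, 1]
```

With ratio=0.0 the square has zero size and every pixel is kept, which is how a coin flip of 0 disables the augmentation for a sample.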

The function operates as follows:

  1. A random coin flip (dali.fn.random.coin_flip) determines whether the mask is applied to each sample. The mask ratio is set to 0.4 * p where p is the coin flip result (0 or 1), effectively toggling the augmentation on or off per sample.
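  2. The rotation angle is sampled from a normal distribution (dali.fn.random.normal) with mean=-1 and stddev=1, then multiplied by 10 and converted from degrees to radians. The angle is therefore centered on -10 degrees with a standard deviation of 10 degrees.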
  3. The tile size is computed as a random uniform value between bounds derived from the image dimensions: lower = min(0.5 * height, 0.3 * width) and upper = max(0.5 * height, 0.3 * width). The result is cast to int32.
  4. The dali.fn.grid_mask operator is called with the computed ratio, angle, and tile parameters.
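The steps above can be mirrored for a single image in plain Python to see what values the parameters take. This is a sketch, not the DALI implementation: sample_gridmask_params is a hypothetical name, Python's random module stands in for the DALI generators, a fair coin is assumed, and int() truncation is assumed for the cast (DALI's float-to-int conversion may round instead).

```python
import math
import random


def sample_gridmask_params(width, height, rng=None):
    """Scalar stand-in for steps 1-3 of ops_util.gridmask (hypothetical helper)."""
    rng = rng or random.Random()

    # Step 1: the coin flip toggles the mask; ratio is 0.0 (off) or 0.4 (on).
    p = rng.randint(0, 1)
    ratio = 0.4 * p

    # Step 2: normal(mean=-1, stddev=1), times 10 degrees, expressed in radians.
    angle = rng.gauss(-1.0, 1.0) * 10.0 * (math.pi / 180.0)

    # Step 3: uniform tile size between the two dimension-derived bounds.
    lower = min(0.5 * height, 0.3 * width)
    upper = max(0.5 * height, 0.3 * width)
    tile = int(rng.random() * (upper - lower) + lower)  # assumption: truncation

    return ratio, angle, tile


ratio, angle, tile = sample_gridmask_params(width=640, height=480)
print(ratio, math.degrees(angle), tile)
```

For a 640x480 image the bounds are min(240, 192) = 192 and max(240, 192) = 240, so every sampled tile lies between 192 and 240 pixels regardless of the coin flip.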

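Note: In the source code, the function assigns the dali.fn.grid_mask result to a local variable but returns the original images argument. This appears to be a bug in the example code: because the operator's output is never returned, the augmentation has no effect on the images the pipeline produces, and DALI may prune the unused operator from the graph altogether. Returning the grid_mask output instead of images would propagate the augmentation.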

Usage

Call ops_util.gridmask inside a DALI pipeline definition, typically after input reading and before normalization. It is only applied during training when the grid_mask parameter is enabled.

Code Reference

Source Location

  • Repository: NVIDIA DALI
  • File: docs/examples/use_cases/tensorflow/efficientdet/pipeline/dali/ops_util.py

Signature

def gridmask(images, widths, heights):
    ...

Import

from pipeline.dali import ops_util

# Called as:
images = ops_util.gridmask(images, widths, heights)

I/O Contract

Inputs

Name | Type | Required | Description
images | DALI TensorList | Yes | Decoded RGB images (uint8 or float32) to augment.
widths | DALI TensorList | Yes | Original image widths as float32 scalars, used to compute tile size bounds.
heights | DALI TensorList | Yes | Original image heights as float32 scalars, used to compute tile size bounds.

Outputs

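Name | Type | Description
images | DALI TensorList | Intended behavior: images with GridMask applied (or unchanged when the coin flip is 0), with the same shape and dtype as the input. As written in the source, the function returns the input images unmodified (see the note in Description).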

Usage Examples

Applying GridMask in a Training Pipeline

# Inside a @pipeline_def function:
images, bboxes, classes, widths, heights = ops_util.input_coco(...)

# Apply GridMask augmentation (training only)
if is_training and params["grid_mask"]:
    images = ops_util.gridmask(images, widths, heights)

# Continue with normalization and random flipping
images, bboxes = ops_util.normalize_flip(images, bboxes, p=0.5)

Understanding the Random Parameters

import nvidia.dali as dali
import math

# The internal computation for each sample.
# images, widths, and heights are the function's arguments (DALI graph nodes).
p = dali.fn.random.coin_flip()          # 0 or 1
ratio = 0.4 * p                          # 0.0 or 0.4

angle = dali.fn.random.normal(mean=-1, stddev=1) * 10.0 * (math.pi / 180.0)

l = dali.math.min(0.5 * heights, 0.3 * widths)
r = dali.math.max(0.5 * heights, 0.3 * widths)
tile = dali.fn.cast(
    (dali.fn.random.uniform(range=[0.0, 1.0]) * (r - l) + l),
    dtype=dali.types.INT32,
)

result = dali.fn.grid_mask(images, ratio=ratio, angle=angle, tile=tile)
