Implementation:NVIDIA DALI Ops Util Gridmask
| Knowledge Sources | |
|---|---|
| Domains | Object_Detection, GPU_Computing |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete function for applying GridMask augmentation to images within a DALI pipeline, provided by the DALI EfficientDet example utility module.
Description
The ops_util.gridmask function applies the GridMask regularization technique to decoded images inside a DALI pipeline graph. It uses several DALI random number generators and the built-in dali.fn.grid_mask operator.
The function operates as follows:
- A random coin flip (dali.fn.random.coin_flip) determines whether the mask is applied to each sample. The mask ratio is set to 0.4 * p where p is the coin flip result (0 or 1), effectively toggling the augmentation on or off per sample.
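- The rotation angle is sampled from a normal distribution (dali.fn.random.normal) with mean=-1 and stddev=1, then multiplied by 10 degrees expressed in radians. The resulting angle therefore has a mean of -10 degrees and a standard deviation of 10 degrees.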
- The tile size is computed as a random uniform value between bounds derived from the image dimensions: lower = min(0.5 * height, 0.3 * width) and upper = max(0.5 * height, 0.3 * width). The result is cast to int32.
- The dali.fn.grid_mask operator is called with the computed ratio, angle, and tile parameters.
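The per-sample parameter sampling described above can be sketched outside DALI with the Python standard library. This is a minimal illustration, not the example's code: the name sample_gridmask_params and the rng argument are hypothetical, and int() truncates, which may not match the rounding behavior of dali.fn.cast exactly.

```python
import math
import random


def sample_gridmask_params(width, height, rng=random):
    """Plain-Python sketch of the GridMask parameter sampling (hypothetical helper)."""
    # Coin flip toggles the augmentation: ratio is either 0.0 (off) or 0.4 (on).
    p = rng.randint(0, 1)
    ratio = 0.4 * p

    # Normal(mean=-1, stddev=1) scaled by 10 degrees, expressed in radians.
    angle = rng.gauss(-1.0, 1.0) * 10.0 * (math.pi / 180.0)

    # Tile size is uniform between bounds derived from the image dimensions.
    lower = min(0.5 * height, 0.3 * width)
    upper = max(0.5 * height, 0.3 * width)
    # Assumption: truncation stands in for the int32 cast done in the pipeline.
    tile = int(rng.random() * (upper - lower) + lower)

    return ratio, angle, tile


if __name__ == "__main__":
    print(sample_gridmask_params(640, 480, random.Random(0)))
```

For a 640x480 image the bounds are min(240, 192) = 192 and max(240, 192) = 240, so the tile size falls between 192 and 240 pixels.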
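Note: In the source code, the function assigns the dali.fn.grid_mask output to a local variable but returns the original images argument. This appears to be a bug in the example code: the augmented output is never propagated, so callers receive unaugmented images. Because that output is not connected to any pipeline output, DALI will likely prune the operator from the graph rather than execute it, since operators with unused outputs are removed unless they are marked with preserve=True.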
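One possible correction is to return the operator output instead of the input. The sketch below assumes that this is the intended behavior. The name gridmask_fixed is hypothetical and is not part of the DALI example, and the local variables are renamed to lower and upper for readability.

```python
import math


def gridmask_fixed(images, widths, heights):
    """Hypothetical corrected variant of ops_util.gridmask (not the upstream code)."""
    # Imported inside the function so the sketch can be defined without DALI installed.
    import nvidia.dali as dali

    p = dali.fn.random.coin_flip()  # 0 or 1 per sample
    ratio = 0.4 * p                 # 0.0 disables the mask, 0.4 enables it
    angle = dali.fn.random.normal(mean=-1, stddev=1) * 10.0 * (math.pi / 180.0)

    lower = dali.math.min(0.5 * heights, 0.3 * widths)
    upper = dali.math.max(0.5 * heights, 0.3 * widths)
    tile = dali.fn.cast(
        dali.fn.random.uniform(range=[0.0, 1.0]) * (upper - lower) + lower,
        dtype=dali.types.INT32,
    )

    # The only behavioral change: return the operator output, not the input.
    return dali.fn.grid_mask(images, ratio=ratio, angle=angle, tile=tile)
```

Like the original, this function must be called inside a DALI pipeline definition, where images, widths, and heights are pipeline graph nodes rather than concrete arrays.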
Usage
Call ops_util.gridmask inside a DALI pipeline definition, typically after input reading and before normalization. It is only applied during training when the grid_mask parameter is enabled.
Code Reference
Source Location
- Repository: NVIDIA DALI
- File: docs/examples/use_cases/tensorflow/efficientdet/pipeline/dali/ops_util.py
Signature
def gridmask(images, widths, heights):
    ...
Import
from pipeline.dali import ops_util
# Called as:
images = ops_util.gridmask(images, widths, heights)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| images | DALI TensorList | Yes | Decoded RGB images (uint8 or float32) to augment. |
| widths | DALI TensorList | Yes | Original image widths as float32 scalars, used to compute tile size bounds. |
| heights | DALI TensorList | Yes | Original image heights as float32 scalars, used to compute tile size bounds. |
Outputs
| Name | Type | Description |
|---|---|---|
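| images | DALI TensorList | Intended: images with GridMask augmentation applied (or the original images if the coin flip is 0). As written in the source, the input images are returned unmodified (see the note in Description). Same shape and dtype as input. |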
Usage Examples
Applying GridMask in a Training Pipeline
# Inside a @pipeline_def function:
images, bboxes, classes, widths, heights = ops_util.input_coco(...)
# Apply GridMask augmentation (training only)
if is_training and params["grid_mask"]:
    images = ops_util.gridmask(images, widths, heights)
# Continue with normalization and cropping
images, bboxes = ops_util.normalize_flip(images, bboxes, p=0.5)
Understanding the Random Parameters
import nvidia.dali as dali
import math
# The internal computation for each sample:
p = dali.fn.random.coin_flip() # 0 or 1
ratio = 0.4 * p # 0.0 or 0.4
angle = dali.fn.random.normal(mean=-1, stddev=1) * 10.0 * (math.pi / 180.0)
l = dali.math.min(0.5 * heights, 0.3 * widths)
r = dali.math.max(0.5 * heights, 0.3 * widths)
tile = dali.fn.cast(
    (dali.fn.random.uniform(range=[0.0, 1.0]) * (r - l) + l),
    dtype=dali.types.INT32,
)
result = dali.fn.grid_mask(images, ratio=ratio, angle=angle, tile=tile)
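To make the ratio, angle, and tile arguments more concrete, the following plain-Python sketch builds a keep/zero mask from them. It follows the documented meaning of the parameters (tile is the period of the grid, ratio * tile is the edge length of each zeroed square, angle rotates the grid around the origin), but it is only an approximation: the function name grid_mask_keep is hypothetical, the rotation direction is an assumption, and this is not DALI's kernel.

```python
import math


def grid_mask_keep(width, height, ratio, angle, tile):
    """Return a height x width grid of booleans; False marks a zeroed pixel.

    tile must be a positive number.
    """
    side = ratio * tile  # edge length of each zeroed square
    c, s = math.cos(angle), math.sin(angle)
    keep = []
    for y in range(height):
        row = []
        for x in range(width):
            # Rotate the pixel coordinate around the origin into grid space
            # (the sign convention here is an assumption).
            u = c * x + s * y
            v = -s * x + c * y
            # A pixel is zeroed when it falls inside the square part of its tile.
            in_square = (u % tile) < side and (v % tile) < side
            row.append(not in_square)
        keep.append(row)
    return keep


if __name__ == "__main__":
    for row in grid_mask_keep(8, 8, ratio=0.5, angle=0.0, tile=4):
        print("".join("." if k else "#" for k in row))
```

With ratio set to 0.0 no pixel is zeroed, which is why multiplying the ratio by the coin flip result switches the augmentation off for a sample.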