Implementation:NVIDIA DALI Fn Resize
| Knowledge Sources | |
|---|---|
| Domains | Data_Pipeline, GPU_Computing, Image_Augmentation |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
The fn.resize and fn.flip operators and the nvidia.dali.auto_aug module in NVIDIA DALI perform GPU-accelerated image resizing to target dimensions, stochastic horizontal flipping, and automatic augmentation (AutoAugment, TrivialAugment) as part of the training data preprocessing pipeline.
Description
This implementation covers the spatial transformation and augmentation stage of a DALI pipeline, which consists of three related components:
fn.resize rescales decoded image tensors to target dimensions. Three calling conventions are used in the DALI examples:
- Training mode (ResNet50): fn.resize(images, resize_x=crop, resize_y=crop, interp_type=types.INTERP_TRIANGULAR) resizes directly to the exact crop dimensions (e.g., 224x224), since the preceding decode-random-crop has already selected a region.
- Training mode (EfficientNet): fn.resize(images, size=[image_size, image_size], interp_type=interpolation, antialias=False) resizes to the target image size using the size parameter.
- Validation mode: fn.resize(images, size=size, mode="not_smaller", interp_type=types.INTERP_TRIANGULAR) resizes the shorter side to the target size while preserving aspect ratio, and is followed by a center crop.
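The output shape of the aspect-preserving validation resize can be reasoned about with a little arithmetic. The following is a minimal pure-Python sketch, not DALI code: the helper name not_smaller_shape is hypothetical, and rounding to the nearest pixel is an assumption that may differ from DALI's internal rounding by one pixel.

```python
def not_smaller_shape(height, width, target_h, target_w):
    """Return (out_h, out_w) for an aspect-preserving 'not_smaller' style resize."""
    # Take the larger of the two per-axis scale factors so that neither
    # output extent ends up below its target.
    scale = max(target_h / height, target_w / width)
    # Rounding to the nearest pixel is an assumption of this sketch.
    return round(height * scale), round(width * scale)


# A 480x640 (H x W) image with a target of 256 on both axes:
# the shorter side (height) becomes 256 and the width scales with it.
print(not_smaller_shape(480, 640, 256, 256))

# A 480x640 image with a 720x1280 (H x W) target.
print(not_smaller_shape(480, 640, 720, 1280))
```

Because the result is generally not square, the validation path needs the subsequent center crop to reach a fixed input size.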
fn.flip applies horizontal mirroring controlled by a random coin flip: fn.flip(images, horizontal=fn.random.coin_flip(probability=0.5)). In the ResNet50 example, the flip is instead applied within fn.crop_mirror_normalize via the mirror parameter.
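To make the coin-flip semantics concrete, here is a small pure-Python sketch of the idea, assuming images are represented as nested lists of rows. The helpers coin_flip and flip_horizontal are illustrative stand-ins, not the DALI operators; the point is that one independent flag is drawn per sample and only flagged samples are mirrored.

```python
import random


def coin_flip(rng, probability=0.5):
    """Return 1 with the given probability, otherwise 0."""
    return 1 if rng.random() < probability else 0


def flip_horizontal(image, flag):
    """Mirror every row of an H x W image (nested lists) when flag is truthy."""
    return [row[::-1] for row in image] if flag else image


rng = random.Random(0)  # seeded only to make the sketch reproducible
batch = [[[1, 2, 3], [4, 5, 6]] for _ in range(4)]

# One independent draw per sample, then a per-sample conditional flip.
flags = [coin_flip(rng) for _ in batch]
flipped = [flip_horizontal(img, flag) for img, flag in zip(batch, flags)]
```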
nvidia.dali.auto_aug provides learned or randomized augmentation policies. The EfficientNet example supports auto_augment.auto_augment_image_net(images, shape=[image_size, image_size]) and trivial_augment.trivial_augment_wide(images, shape=[image_size, image_size]), applying sequences of geometric and photometric transformations.
Usage
Use fn.resize after image decoding to bring all images to a uniform spatial resolution. Use fn.flip or the mirror parameter for random horizontal augmentation during training. Use auto_aug policies for more aggressive augmentation strategies that improve model generalization, especially for EfficientNet-family architectures.
Code Reference
Source Location
- Repository: NVIDIA DALI
- File: docs/examples/use_cases/pytorch/resnet50/main.py (lines 134-140)
- File: docs/examples/use_cases/pytorch/efficientnet/image_classification/dali.py (lines 43-70)
Signature (ResNet50 Training)
images = fn.resize(
    images,
    resize_x=crop,
    resize_y=crop,
    interp_type=types.INTERP_TRIANGULAR,
)
mirror = fn.random.coin_flip(probability=0.5)
Signature (EfficientNet Training)
images = fn.resize(
    images,
    size=[image_size, image_size],
    interp_type=interpolation,
    antialias=False,
)
# Make sure that from this point we are processing on GPU
images = images.gpu()
rng = fn.random.coin_flip(probability=0.5)
images = fn.flip(images, horizontal=rng)
# Automatic augmentation (optional)
if automatic_augmentation == "autoaugment":
    output = auto_augment.auto_augment_image_net(
        images, shape=[image_size, image_size]
    )
elif automatic_augmentation == "trivialaugment":
    output = trivial_augment.trivial_augment_wide(
        images, shape=[image_size, image_size]
    )
else:
    output = images
Signature (ResNet50 Validation)
images = fn.resize(
    images,
    size=size,
    mode="not_smaller",
    interp_type=types.INTERP_TRIANGULAR,
)
mirror = False
Import
import nvidia.dali.fn as fn
import nvidia.dali.types as types
from nvidia.dali.auto_aug import auto_augment, trivial_augment
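As a rough illustration of how these imports and the validation signature fit into a complete pipeline, the sketch below wraps them in a pipeline definition. The function name make_validation_pipeline, its parameters, and the reader and decoder settings are assumptions made for this example rather than a copy of the referenced files. DALI is imported lazily inside the function, so defining the helper does not require DALI, but calling it does.

```python
def make_validation_pipeline(data_dir, batch_size=32, num_threads=4,
                             device_id=0, resize_to=256, crop=224):
    """Return a DALI validation pipeline object (illustrative helper)."""
    # Lazy imports: only needed when the pipeline is actually constructed.
    import nvidia.dali.fn as fn
    import nvidia.dali.types as types
    from nvidia.dali import pipeline_def

    @pipeline_def(batch_size=batch_size, num_threads=num_threads,
                  device_id=device_id)
    def validation_pipeline():
        jpegs, labels = fn.readers.file(file_root=data_dir, name="Reader")
        # "mixed" decodes on the GPU; device="cpu" is the CPU alternative.
        images = fn.decoders.image(jpegs, device="mixed",
                                   output_type=types.RGB)
        # Aspect-preserving resize: the shorter side becomes resize_to.
        images = fn.resize(images, size=resize_to, mode="not_smaller",
                           interp_type=types.INTERP_TRIANGULAR)
        # Center crop, normalize, and convert to CHW; no mirroring here.
        images = fn.crop_mirror_normalize(
            images,
            dtype=types.FLOAT,
            output_layout="CHW",
            crop=(crop, crop),
            mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
            std=[0.229 * 255, 0.224 * 255, 0.225 * 255],
            mirror=False,
        )
        return images, labels

    return validation_pipeline()
```

The returned object would then be built and run, or handed to a framework iterator; that part is omitted here.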
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| images | DataNode | Yes | Decoded image tensor [H, W, 3] from the decoder stage |
| resize_x | int/float | No | Target width in pixels (used with resize_y for exact dimensions) |
| resize_y | int/float | No | Target height in pixels (used with resize_x for exact dimensions) |
| size | int/list[int] | No | Target size; a list sets both dimensions, while a single value combined with mode="not_smaller" sets the length of the shorter side |
| mode | str | No | Resize mode: "not_smaller" preserves aspect ratio by matching the shorter side |
| interp_type | types.DALIInterpType | No | Interpolation method: INTERP_TRIANGULAR (bilinear with antialiasing), INTERP_LINEAR, INTERP_CUBIC |
| antialias | bool | No | Enable antialiasing filter during downscaling (default True) |
| horizontal | DataNode/bool | No | For fn.flip: whether to apply horizontal flip; typically from fn.random.coin_flip |
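| automatic_augmentation | str | No | Pipeline-level option (not an operator argument) selecting the augmentation policy: "autoaugment" or "trivialaugment"; any other value, such as "disabled", leaves the images unchanged |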
Outputs
| Name | Type | Description |
|---|---|---|
| images | DataNode | Resized and augmented image tensor; shape [crop, crop, 3] for training or [H', W', 3] for validation before center crop |
| mirror | DataNode/bool | Random coin flip result (0 or 1) to be passed to fn.crop_mirror_normalize (ResNet50 pattern) |
Usage Examples
ResNet50 Training Resize and Flip
# After fn.decoders.image_random_crop, resize to exact crop dimensions:
images = fn.resize(
    images,
    resize_x=224,
    resize_y=224,
    interp_type=types.INTERP_TRIANGULAR,
)
mirror = fn.random.coin_flip(probability=0.5)
# mirror is passed to fn.crop_mirror_normalize later
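For context, the following sketch shows one way the resize and the mirror flag might be wired into a full training pipeline, with the flag consumed by fn.crop_mirror_normalize. The helper name make_training_pipeline, its parameters, and the random-crop ranges are assumptions chosen for illustration, not the exact contents of the referenced main.py. DALI is imported lazily, so the helper can be defined without DALI installed.

```python
def make_training_pipeline(data_dir, batch_size=32, num_threads=4,
                           device_id=0, crop=224):
    """Return a DALI training pipeline object (illustrative helper)."""
    # Lazy imports: only needed when the pipeline is actually constructed.
    import nvidia.dali.fn as fn
    import nvidia.dali.types as types
    from nvidia.dali import pipeline_def

    @pipeline_def(batch_size=batch_size, num_threads=num_threads,
                  device_id=device_id)
    def training_pipeline():
        jpegs, labels = fn.readers.file(file_root=data_dir,
                                        random_shuffle=True, name="Reader")
        # Decode only a randomly chosen region of each image.
        images = fn.decoders.image_random_crop(
            jpegs,
            device="mixed",
            output_type=types.RGB,
            random_aspect_ratio=[0.8, 1.25],  # assumed range
            random_area=[0.1, 1.0],           # assumed range
            num_attempts=100,
        )
        # The region is already selected, so resize straight to crop x crop.
        images = fn.resize(images, resize_x=crop, resize_y=crop,
                           interp_type=types.INTERP_TRIANGULAR)
        # One flip decision per sample, applied during normalization.
        mirror = fn.random.coin_flip(probability=0.5)
        images = fn.crop_mirror_normalize(
            images,
            dtype=types.FLOAT,
            output_layout="CHW",
            crop=(crop, crop),
            mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
            std=[0.229 * 255, 0.224 * 255, 0.225 * 255],
            mirror=mirror,
        )
        return images, labels

    return training_pipeline()
```

Folding the flip into the normalization step avoids a separate fn.flip pass over the image data.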
EfficientNet Training with AutoAugment
from nvidia.dali.auto_aug import auto_augment
images = fn.resize(
    images,
    size=[224, 224],
    interp_type=types.INTERP_LINEAR,
    antialias=False,
)
images = images.gpu()
rng = fn.random.coin_flip(probability=0.5)
images = fn.flip(images, horizontal=rng)
images = auto_augment.auto_augment_image_net(images, shape=[224, 224])
Validation Resize (Aspect-Preserving)
# Resize shorter side to 256, then center crop to 224 in next step:
images = fn.resize(
    images,
    size=256,
    mode="not_smaller",
    interp_type=types.INTERP_TRIANGULAR,
)
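The center crop that follows this resize can be sketched as simple window arithmetic. This is a pure-Python illustration with a hypothetical helper name, center_crop_window; it uses integer floor division for the offsets, which is an assumption and may differ from DALI's own anchor rounding by one pixel.

```python
def center_crop_window(height, width, crop_h, crop_w):
    """Return (top, left, bottom, right) of a centered crop window."""
    if crop_h > height or crop_w > width:
        raise ValueError("crop window is larger than the image")
    # Split the leftover pixels evenly; floor division is an assumption.
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return top, left, top + crop_h, left + crop_w


# A 480x640 image resized with not_smaller to 256 is roughly 256x341 (H x W);
# a 224x224 crop is then taken from its center.
print(center_crop_window(256, 341, 224, 224))
```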