Implementation:NVIDIA DALI Fn Decoders Image Random Crop
| Knowledge Sources | |
|---|---|
| Domains | Data_Pipeline, GPU_Computing, Image_Processing |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
fn.decoders.image_random_crop is the NVIDIA DALI operator that fuses JPEG decoding with random-area cropping. In mixed (CPU+GPU) mode it uses the nvJPEG library and, where possible, decodes only the region covered by the randomly sampled crop.
Description
fn.decoders.image_random_crop combines JPEG decoding with RandomResizedCrop-style random cropping in a single fused operation. Because the crop region is known before decoding begins, the operator can, where the backend supports it, skip JPEG blocks that fall entirely outside the crop window. This reduces both compute and memory bandwidth compared with decoding the full image and cropping it afterwards.
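To give a feel for how much work a known crop window can save, the sketch below counts how many fixed-size blocks of an image intersect a crop rectangle. It is only a back-of-the-envelope model: the function name `blocks_touched` is hypothetical, the 8x8 block size is an assumption (real JPEG files may use larger units with chroma subsampling), and it does not describe how nvJPEG or DALI implement region-of-interest decoding internally.

```python
def blocks_touched(img_w, img_h, crop_x, crop_y, crop_w, crop_h, block=8):
    """Return (blocks_in_crop, blocks_total) for a grid of block x block tiles."""
    # Number of tiles needed to cover the whole image (ceiling division).
    total_cols = -(-img_w // block)
    total_rows = -(-img_h // block)
    # Indices of the first and last tile column/row that intersect the crop.
    first_col = crop_x // block
    last_col = (crop_x + crop_w - 1) // block
    first_row = crop_y // block
    last_row = (crop_y + crop_h - 1) // block
    in_crop = (last_col - first_col + 1) * (last_row - first_row + 1)
    return in_crop, total_cols * total_rows


# Illustrative numbers only: a 640x480 crop inside a 1920x1080 image.
in_crop, total = blocks_touched(1920, 1080, crop_x=400, crop_y=200,
                                crop_w=640, crop_h=480)
print(f"{in_crop} of {total} blocks intersect the crop ({in_crop / total:.1%})")
```

In this made-up case only a small share of the blocks overlaps the crop, which is the kind of saving the fused operator is designed to exploit.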
The operator supports a mixed device mode, the usual choice for GPU training, in which entropy decoding runs on the CPU and the IDCT and color conversion run on the GPU via nvJPEG. Splitting the work this way keeps both processors busy instead of leaving the whole decode on one of them. A cpu device mode, which is what the operator uses when device is not specified, is also available for systems without GPU JPEG decoding support.
The random crop is controlled by random_aspect_ratio (range of allowed width/height ratios) and random_area (range of allowed crop area as a fraction of the original image). The operator attempts up to num_attempts random crops satisfying both constraints before falling back to a center crop.
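The sampling loop described above can be sketched in plain Python. This is not DALI's implementation: the function name `sample_random_crop`, the log-uniform draw of the aspect ratio, and the exact shape of the center-crop fallback are assumptions borrowed from the common RandomResizedCrop recipe, used here only to show how the three parameters interact.

```python
import math
import random


def sample_random_crop(width, height, random_area=(0.08, 1.0),
                       random_aspect_ratio=(0.75, 4.0 / 3.0),
                       num_attempts=10, rng=None):
    """Return (x, y, crop_w, crop_h) for a RandomResizedCrop-style window."""
    if rng is None:
        rng = random.Random()
    log_min, log_max = (math.log(r) for r in random_aspect_ratio)
    for _ in range(num_attempts):
        # Draw a target area (in pixels) and a width/height ratio.
        area = rng.uniform(*random_area) * width * height
        ratio = math.exp(rng.uniform(log_min, log_max))
        crop_w = int(round(math.sqrt(area * ratio)))
        crop_h = int(round(math.sqrt(area / ratio)))
        # Accept the first candidate that fits inside the image.
        if 0 < crop_w <= width and 0 < crop_h <= height:
            x = rng.randint(0, width - crop_w)
            y = rng.randint(0, height - crop_h)
            return x, y, crop_w, crop_h
    # Assumed fallback: a centered window with the ratio clamped into range.
    min_ratio, max_ratio = random_aspect_ratio
    ratio = min(max(width / height, min_ratio), max_ratio)
    crop_w = min(width, int(round(height * ratio)))
    crop_h = min(height, int(round(crop_w / ratio)))
    return (width - crop_w) // 2, (height - crop_h) // 2, crop_w, crop_h


# A seeded generator makes the illustration reproducible.
print(sample_random_crop(640, 480, rng=random.Random(0)))
```

Raising num_attempts makes the fallback less likely for images with extreme aspect ratios, at the cost of a few extra random draws per sample.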
Memory preallocation hints (preallocate_width_hint and preallocate_height_hint) let the mixed decoder reserve GPU memory up front, which reduces runtime reallocations when processing datasets with variable image dimensions. These hints should be set to the maximum expected image width and height in the dataset; a value of 0 leaves preallocation disabled.
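One way to derive the hints is to take the largest width and the largest height seen in the dataset, and to pass 0 when the decoder is not running in mixed mode. The helper below is a minimal sketch under that assumption; `preallocation_hints` is a hypothetical name, and the list of sizes is made-up sample data standing in for a scan of real image headers.

```python
def preallocation_hints(image_sizes, decoder_device):
    """Return (width_hint, height_hint) for a list of (width, height) pairs."""
    # The hints are only useful for the "mixed" backend; 0 disables them.
    if decoder_device != "mixed" or not image_sizes:
        return 0, 0
    # Width and height maxima are taken independently of each other.
    max_w = max(w for w, _ in image_sizes)
    max_h = max(h for _, h in image_sizes)
    return max_w, max_h


# Made-up sizes; in practice these would come from the image headers.
sizes = [(500, 375), (5980, 4000), (1024, 6430)]
width_hint, height_hint = preallocation_hints(sizes, "mixed")
print(width_hint, height_hint)
```

The two values can then be passed as preallocate_width_hint and preallocate_height_hint.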
Usage
Use this operator inside a DALI pipeline definition to decode and randomly crop JPEG images during training. For validation, use the non-cropping variant fn.decoders.image instead, since validation typically relies on a deterministic resize and center crop applied after decoding.
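For context, a training pipeline built around this operator might look roughly like the sketch below. The wrapper function `build_train_pipeline`, its parameter names, the directory layout expected by fn.readers.file, and the final fn.resize step are all illustrative assumptions rather than code from the referenced examples. DALI is imported inside the function so that the module can be loaded on machines where DALI is not installed.

```python
def build_train_pipeline(data_dir, batch_size=64, num_threads=4, device_id=0,
                         decoder_device="mixed", crop_size=224):
    """Build a DALI pipeline that reads, decodes, randomly crops, and resizes."""
    # Imported lazily: requires the nvidia-dali package at call time.
    from nvidia.dali import pipeline_def, fn, types

    @pipeline_def
    def train_pipeline():
        # fn.readers.file expects one sub-directory per class under data_dir.
        jpegs, labels = fn.readers.file(
            file_root=data_dir, random_shuffle=True, name="Reader"
        )
        # Fused decode + random crop; output is on the GPU when device="mixed".
        images = fn.decoders.image_random_crop(
            jpegs,
            device=decoder_device,
            output_type=types.RGB,
            random_aspect_ratio=[0.75, 4.0 / 3.0],
            random_area=[0.08, 1.0],
            num_attempts=10,
        )
        # Bring every crop to a fixed size so samples can be batched.
        images = fn.resize(images, resize_x=crop_size, resize_y=crop_size)
        return images, labels

    return train_pipeline(
        batch_size=batch_size, num_threads=num_threads, device_id=device_id
    )
```

Calling build_train_pipeline with a dataset directory returns a pipeline object that can be built and run in the usual way; with decoder_device="mixed" this requires a CUDA-capable GPU.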
Code Reference
Source Location
- Repository: NVIDIA DALI
- File: docs/examples/use_cases/pytorch/resnet50/main.py (lines 124-133)
- File: docs/examples/use_cases/pytorch/efficientnet/image_classification/dali.py (lines 35-41)
Signature (ResNet50 example)
images = fn.decoders.image_random_crop(
    images,
    device=decoder_device,
    output_type=types.RGB,
    preallocate_width_hint=preallocate_width_hint,
    preallocate_height_hint=preallocate_height_hint,
    random_aspect_ratio=[0.8, 1.25],
    random_area=[0.1, 1.0],
    num_attempts=100,
)
Signature (EfficientNet example)
images = fn.decoders.image_random_crop(
    jpegs_input,
    device=decoder_device,
    output_type=types.RGB,
    random_aspect_ratio=[0.75, 4.0 / 3.0],
    random_area=[0.08, 1.0],
)
Import
import nvidia.dali.fn as fn
import nvidia.dali.types as types
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| images | DataNode (CPU) | Yes | Encoded JPEG byte buffers from fn.readers.file or similar source operator |
| device | str | No | Device for decoding: "mixed" for hybrid CPU+GPU (the usual choice for GPU training), "cpu" for CPU-only (used when device is not specified) |
| output_type | types.DALIImageType | No | Output color space; typically types.RGB for classification models |
| random_aspect_ratio | list[float] | No | [min, max] range for the random crop's aspect ratio (default [0.75, 1.333]) |
| random_area | list[float] | No | [min, max] range for the crop area as a fraction of the original image area (default [0.08, 1.0]) |
| num_attempts | int | No | Maximum number of attempts to find a valid random crop before falling back to center crop (default 10) |
| preallocate_width_hint | int | No | Expected maximum image width for GPU memory preallocation (0 disables preallocation) |
| preallocate_height_hint | int | No | Expected maximum image height for GPU memory preallocation (0 disables preallocation) |
Outputs
| Name | Type | Description |
|---|---|---|
| images | DataNode (GPU or CPU) | Decoded RGB image tensor with random crop applied; shape [H, W, 3] where H and W are the crop dimensions. On GPU when device="mixed". |
Usage Examples
Training Decoder with Preallocation Hints
# From ResNet50 main.py (lines 120-133):
# decoder_device is either "mixed" or "cpu" and is chosen earlier in the script.
# The hints only apply to the mixed decoder, so they are set to 0 otherwise.
preallocate_width_hint = 5980 if decoder_device == "mixed" else 0
preallocate_height_hint = 6430 if decoder_device == "mixed" else 0
images = fn.decoders.image_random_crop(
    images,
    device=decoder_device,
    output_type=types.RGB,
    preallocate_width_hint=preallocate_width_hint,
    preallocate_height_hint=preallocate_height_hint,
    random_aspect_ratio=[0.8, 1.25],
    random_area=[0.1, 1.0],
    num_attempts=100,
)
EfficientNet Decoder with Wider Aspect Ratios
# From EfficientNet dali.py (lines 35-41):
images = fn.decoders.image_random_crop(
    jpegs_input,
    device=decoder_device,
    output_type=types.RGB,
    random_aspect_ratio=[0.75, 4.0 / 3.0],
    random_area=[0.08, 1.0],
)
Validation Decoder (Non-Cropping Variant)
# For validation, use fn.decoders.image instead:
images = fn.decoders.image(
    images,
    device=decoder_device,
    output_type=types.RGB,
)