Implementation:NVIDIA DALI Fn External Source
| Knowledge Sources | |
|---|---|
| Domains | Image_Processing, GPU_Computing, Data_Ingestion |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete operator for injecting externally-provided data into a DALI pipeline graph, provided by the nvidia.dali.fn module.
Description
fn.external_source creates a named input node in the DALI pipeline graph that receives data from the host program at each iteration. The operator supports both CPU and GPU placement, zero-copy data transfer, blocking semantics for synchronization with dynamic execution, and explicit data type constraints.
In dynamic execution mode, data is fed to the pipeline through keyword arguments to pipe.run(), as in pipe.run(name=data). With the DALI Proxy pattern, the external source is fed implicitly by the DALIServer, which intercepts data from a PyTorch Dataset.__getitem__ call.
Key behaviors:
- name identifies the external source and is used as the keyword argument name when feeding data via pipe.run().
- no_copy=False causes DALI to copy the input data into its own buffer, which is safer when the source buffer may be reused.
- no_copy=True avoids the copy for better performance but requires the caller to guarantee buffer lifetime.
- blocking=True makes the operator wait for data, which is required in dynamic execution mode to prevent the scheduler from advancing past the source node.
- dtype specifies the expected data type for validation and correct interpretation.
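The trade-off between the two no_copy settings can be illustrated without DALI. The sketch below is only an analogy built on NumPy: CopyingSink and AliasingSink are hypothetical classes invented for this illustration, not part of the DALI API. It shows what happens when the caller reuses its buffer while a consumer still holds the data.

```python
import numpy as np


class CopyingSink:
    """Hypothetical stand-in for no_copy=False: stores a private copy."""

    def feed(self, array):
        self.held = np.array(array, copy=True)


class AliasingSink:
    """Hypothetical stand-in for no_copy=True: keeps the caller's buffer."""

    def feed(self, array):
        self.held = array  # no copy; shares memory with the caller


buffer = np.array([1, 2, 3], dtype=np.uint8)

copying = CopyingSink()
aliasing = AliasingSink()
copying.feed(buffer)
aliasing.feed(buffer)

# The caller reuses the buffer for the next sample before the consumer is done.
buffer[:] = 9

copy_is_intact = copying.held.tolist() == [1, 2, 3]
alias_was_overwritten = aliasing.held.tolist() == [9, 9, 9]
shares_memory = np.shares_memory(aliasing.held, buffer)
```

The copying sink still sees the original values, while the aliasing sink sees the overwritten ones. This is the hazard the buffer-lifetime requirement for no_copy=True guards against.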
Usage
Place fn.external_source at the beginning of a pipeline graph wherever data must be supplied from outside DALI. Feed data at runtime either via pipe.run(source_name=numpy_array) in dynamic mode or via the proxy mechanism in the DALIServer integration.
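Before feeding, the host program has to turn raw encoded files into arrays. The following is a minimal sketch of that preparation step, assuming a batch is supplied as a list of one-dimensional uint8 arrays of varying length. The helper make_encoded_batch and its max_batch_size parameter are hypothetical names used for illustration, and the byte strings are placeholders rather than real images.

```python
import numpy as np


def make_encoded_batch(blobs, max_batch_size):
    """Turn raw encoded files (bytes) into a list of 1-D uint8 arrays.

    Hypothetical helper for illustration; not part of DALI.
    """
    if not 0 < len(blobs) <= max_batch_size:
        raise ValueError(
            f"batch has {len(blobs)} samples, expected 1..{max_batch_size}"
        )
    # np.frombuffer returns a read-only view of the bytes object; .copy()
    # produces a writable array that owns its memory.
    return [np.frombuffer(blob, dtype=np.uint8).copy() for blob in blobs]


# Placeholder byte strings standing in for encoded files of different sizes.
blobs = [b"\xff\xd8\xff\xe0" + bytes(10), b"\xff\xd8\xff\xe0" + bytes(25)]
batch = make_encoded_batch(blobs, max_batch_size=4)
```

A list built this way could then be passed as the keyword argument whose name matches the external source, for example pipe.run(encoded_img=batch) for a source named "encoded_img".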
Code Reference
Source Location
- Repository: NVIDIA DALI
- File: docs/examples/zoo/images/decode.py (lines 33-39)
- File: docs/examples/zoo/images/decode_and_transform_pytorch.py (line 75)
Signature
fn.external_source(
    device="cpu",
    name=source_name,
    no_copy=False,
    blocking=True,
    dtype=types.UINT8,
)
Import
import nvidia.dali.fn as fn
import nvidia.dali.types as types
# or
from nvidia.dali import fn, types
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
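| name | str | Yes (for named feeding) | Identifier used as the keyword argument when feeding data via pipe.run(). The operator itself accepts an unnamed source, but feeding by keyword requires a name |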
| device | str | No | Device placement for the output: "cpu" (default) or "gpu" |
| no_copy | bool | No | If True, avoids copying input data (caller must guarantee buffer lifetime). Default: False |
| blocking | bool | No | If True, the operator blocks until data is available. Required for dynamic execution. Default: False, so the examples on this page set it explicitly |
| dtype | types.DALIDataType | No | Expected data type of the input (e.g., types.UINT8). Default: None (inferred) |
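Because dtype constrains what the operator accepts, it can help to check samples on the host before feeding them, so that a mismatch is reported with the offending sample index. The sketch below assumes the input is a list of NumPy arrays. check_samples is a hypothetical helper that does not replace DALI's own validation.

```python
import numpy as np


def check_samples(batch, expected_dtype=np.uint8):
    """Raise TypeError naming the first sample whose type or dtype is wrong."""
    for index, sample in enumerate(batch):
        if not isinstance(sample, np.ndarray):
            raise TypeError(
                f"sample {index} is {type(sample).__name__}, not ndarray"
            )
        if sample.dtype != expected_dtype:
            raise TypeError(
                f"sample {index} has dtype {sample.dtype}, "
                f"expected {np.dtype(expected_dtype)}"
            )
    return batch


good = [np.zeros(8, dtype=np.uint8), np.zeros(3, dtype=np.uint8)]
bad = [np.zeros(8, dtype=np.uint8), np.zeros(3, dtype=np.float32)]

check_samples(good)  # passes silently

try:
    check_samples(bad)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Here error_message names the second sample (index 1) as the one with the unexpected float32 dtype.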
Outputs
| Name | Type | Description |
|---|---|---|
| output | DataNode | DALI DataNode containing the externally-provided data, ready for downstream operators in the pipeline graph |
Usage Examples
Example: Dynamic Mode with Named Feed
import numpy as np
from nvidia.dali.pipeline import pipeline_def
import nvidia.dali.fn as fn
import nvidia.dali.types as types


@pipeline_def(batch_size=4, num_threads=4, device_id=0, exec_dynamic=True)
def decode_pipeline(source_name):
    # Named input node; data arrives from the host on every pipe.run() call
    inputs = fn.external_source(
        device="cpu",
        name=source_name,
        no_copy=False,
        blocking=True,
        dtype=types.UINT8,
    )
    # "mixed" takes CPU input and produces decoded images on the GPU
    decoded = fn.decoders.image(inputs, device="mixed", output_type=types.RGB)
    return decoded


pipe = decode_pipeline("encoded_img", prefetch_queue_depth=1)
pipe.build()

# Feed one batch of encoded image bytes per iteration
encoded = np.fromfile("image.jpg", dtype=np.uint8)
result = pipe.run(encoded_img=np.expand_dims(encoded, axis=0))
Example: Zero-Copy External Source with Proxy
from nvidia.dali import pipeline_def, fn, types


@pipeline_def
def image_pipe(img_hw=(320, 200)):
    # no_copy=True: the caller must keep the fed buffers alive
    encoded_images = fn.external_source(name="images", no_copy=True)
    decoded = fn.decoders.image(
        encoded_images,
        device="mixed",
        output_type=types.RGB,
    )
    images = fn.resize(decoded, size=img_hw, interp_type=types.INTERP_LINEAR)
    return images