Implementation:NVIDIA NeMo Curator AudioBatch

Knowledge Sources	NVIDIA NeMo Curator
Domains	Data Curation, Audio Processing, Pipeline Tasks
Last Updated	2026-02-14 00:00 GMT

Overview

The AudioBatch class defines the task type for processing batches of audio data in the NeMo Curator pipeline, storing audio records as a list of dictionaries with built-in file existence validation.

Description

AudioBatch extends Task[dict] and represents a batch of audio items, where each item is a dictionary containing audio metadata and file references. The class provides several key behaviors:

Data normalization: The constructor accepts a single dictionary, a list of dictionaries, or None. A single dictionary is automatically wrapped into a list for uniform handling.

File path validation: An optional filepath_key parameter specifies which dictionary key holds the audio file path. When it is set, validate_item() checks that the referenced file exists on disk using os.path.exists(). If the file is missing, a warning is logged via loguru and the item is considered invalid. The validate() method applies this check to every item and returns True only if all of them pass (it combines the per-item results with all()).

Item counting: The num_items property returns the number of audio items in the batch, computed as len(self.data).
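
The three behaviors above can be tried without NeMo Curator installed. The following is a minimal stand-in sketch, not the library's implementation: AudioBatchSketch and demo are hypothetical names, the standard logging module is swapped in for loguru, and three details are assumptions not confirmed by this page (None becomes an empty list, validation passes when filepath_key is unset, and a missing key is treated like a missing file).

```python
import logging
import os
import tempfile

logger = logging.getLogger("audio_batch_sketch")


class AudioBatchSketch:
    """Hypothetical stand-in mirroring the described behaviors; not the NeMo Curator class."""

    def __init__(self, data=None, filepath_key=None):
        # Data normalization: a single dict is wrapped into a one-element list.
        if isinstance(data, dict):
            data = [data]
        # Assumption: None is treated as an empty batch.
        self.data = data if data is not None else []
        self.filepath_key = filepath_key

    @property
    def num_items(self):
        # Item counting: the batch size is the length of the list.
        return len(self.data)

    def validate_item(self, item):
        # Assumption: with no filepath_key there is nothing to check.
        if self.filepath_key is None:
            return True
        path = item.get(self.filepath_key)
        # Assumption: a missing key is treated like a missing file.
        if path is None or not os.path.exists(path):
            logger.warning("File %s does not exist", path)
            return False
        return True

    def validate(self):
        # The batch is valid only if every item is valid.
        return all(self.validate_item(item) for item in self.data)


def demo():
    """Build one valid and one invalid batch around a real temporary file."""
    with tempfile.NamedTemporaryFile(suffix=".wav") as tmp:
        ok = AudioBatchSketch({"audio_path": tmp.name}, filepath_key="audio_path")
        bad = AudioBatchSketch(
            [{"audio_path": tmp.name}, {"audio_path": tmp.name + ".missing"}],
            filepath_key="audio_path",
        )
        return ok.num_items, ok.validate(), bad.validate()
```

Calling demo() returns the item count of the single-dict batch followed by the validation result of each batch. The second batch fails because one of its two paths does not exist, which shows that a single missing file invalidates the whole batch.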

The class inherits the standard Task infrastructure including task_id, dataset_name, _stage_perf performance tracking, _metadata, and _uuid fields.

Usage

Use AudioBatch when building audio data curation workflows. It serves as the data container for audio processing stages: stages receive and produce batches of audio records, and file existence validation is available whenever filepath_key is set.

Code Reference

Source Location

Repository: NeMo-Curator
File: nemo_curator/tasks/audio_batch.py
Lines: 1-57

Signature

@dataclass
class AudioBatch(Task[dict]):
    def __init__(
        self,
        data: dict | list[dict] | None = None,
        filepath_key: str | None = None,
        task_id: str = "",
        dataset_name: str = "",
        **kwargs,
    ): ...

    @property
    def num_items(self) -> int: ...
    def validate_item(self, item: dict) -> bool: ...
    def validate(self) -> bool: ...

Import

from nemo_curator.tasks.audio_batch import AudioBatch
# or
from nemo_curator.tasks import AudioBatch

I/O Contract

Inputs

Name	Type	Required	Description
data	dict, list[dict], or None	No	Audio item data; a single dict is normalized to a list
filepath_key	str or None	No	Dictionary key holding the audio file path for validation
task_id	str	No	Unique identifier for this task (default: empty string)
dataset_name	str	No	Name of the dataset this task belongs to (default: empty string)

Outputs

Name	Type	Description
data	list[dict]	List of audio item dictionaries
num_items	int	Number of audio items in the batch
validate()	bool	Whether all audio items pass file existence validation

Usage Examples

Basic Usage

from nemo_curator.tasks import AudioBatch

# Create a batch from a list of audio records
batch = AudioBatch(
    data=[
        {"audio_path": "/data/audio/clip1.wav", "duration": 5.2},
        {"audio_path": "/data/audio/clip2.wav", "duration": 3.1},
    ],
    filepath_key="audio_path",
    task_id="audio_task_001",
    dataset_name="speech_dataset",
)

print(batch.num_items)  # 2

Single Item Normalization

from nemo_curator.tasks import AudioBatch

# A single dict is automatically wrapped into a list
batch = AudioBatch(
    data={"audio_path": "/data/audio/clip1.wav", "duration": 5.2},
    filepath_key="audio_path",
)

print(batch.num_items)  # 1
print(type(batch.data))  # <class 'list'>

Related Pages

Environment:NVIDIA_NeMo_Curator_Python_Linux_Base
NVIDIA_NeMo_Curator_Task_Base - Abstract base class that AudioBatch extends
NVIDIA_NeMo_Curator_DocumentBatch - Analogous task type for text documents
NVIDIA_NeMo_Curator_ImageBatch - Analogous task type for images
