Implementation:NVIDIA NeMo Curator AudioBatch

From Leeroopedia
Revision as of 13:19, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/NVIDIA_NeMo_Curator_AudioBatch.md)
Knowledge Sources
Domains: Data Curation, Audio Processing, Pipeline Tasks
Last Updated: 2026-02-14 00:00 GMT

Overview

The AudioBatch class defines the task type for processing batches of audio data in the NeMo Curator pipeline, storing audio records as a list of dictionaries with built-in file existence validation.

Description

AudioBatch extends Task[dict] and represents a batch of audio items, where each item is a dictionary containing audio metadata and file references. The class provides several key behaviors:

Data normalization: The constructor accepts a single dictionary, a list of dictionaries, or None. A single dictionary is automatically wrapped into a list for uniform handling.

File path validation: An optional filepath_key parameter specifies which dictionary key holds the audio file path. When set, the validate_item() method checks that the referenced file exists on disk using os.path.exists(). If a file is missing, a warning is logged via loguru and the item is considered invalid. The validate() method runs this check across all items using all().

Item counting: The num_items property returns len(self.data), the number of audio items in the batch.

The class inherits the standard Task infrastructure including task_id, dataset_name, _stage_perf performance tracking, _metadata, and _uuid fields.
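
The three behaviors described above can be sketched in plain Python. This is a minimal stand-in written for illustration, not the NeMo Curator source: the class name AudioBatchSketch is hypothetical, the Task base class and its fields are omitted, and the standard logging module is swapped in for loguru so the block has no third-party dependencies. Treating None as an empty list, treating every item as valid when filepath_key is unset, and letting a missing key raise KeyError are assumptions rather than documented behavior.

```python
import logging
import os

logger = logging.getLogger(__name__)


class AudioBatchSketch:
    """Illustrative stand-in only; not the real AudioBatch class."""

    def __init__(self, data=None, filepath_key=None):
        # Normalization: wrap a single dict in a list.
        # Assumption: None becomes an empty list.
        if data is None:
            data = []
        elif isinstance(data, dict):
            data = [data]
        self.data = data
        self.filepath_key = filepath_key

    @property
    def num_items(self):
        # Item counting: one entry per audio record.
        return len(self.data)

    def validate_item(self, item):
        # Assumption: with no filepath_key there is nothing to check.
        if self.filepath_key is None:
            return True
        # Assumption: a record lacking the key raises KeyError here.
        path = item[self.filepath_key]
        if not os.path.exists(path):
            logger.warning("File %s does not exist", path)
            return False
        return True

    def validate(self):
        # True only if every item passes; an empty batch also yields True,
        # because all() of an empty iterable is True.
        return all(self.validate_item(item) for item in self.data)
```

Under these assumptions, AudioBatchSketch({"audio_path": "clip.wav"}, filepath_key="audio_path").validate() returns True only when clip.wav exists relative to the current working directory.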

Usage

Use AudioBatch when building audio data curation workflows. It serves as the data container for audio processing stages, enabling pipeline stages to receive and produce batches of audio records with automatic file existence validation.

Code Reference

Source Location

  • Repository: NeMo-Curator
  • File: nemo_curator/tasks/audio_batch.py
  • Lines: 1-57

Signature

@dataclass
class AudioBatch(Task[dict]):
    def __init__(
        self,
        data: dict | list[dict] | None = None,
        filepath_key: str | None = None,
        task_id: str = "",
        dataset_name: str = "",
        **kwargs,
    ): ...

    @property
    def num_items(self) -> int: ...
    def validate_item(self, item: dict) -> bool: ...
    def validate(self) -> bool: ...

Import

from nemo_curator.tasks.audio_batch import AudioBatch
# or
from nemo_curator.tasks import AudioBatch

I/O Contract

Inputs

| Name | Type | Required | Description |
|------|------|----------|-------------|
| data | dict, list[dict], or None | No | Audio item data; a single dict is normalized to a list |
| filepath_key | str or None | No | Dictionary key holding the audio file path for validation |
| task_id | str | No | Unique identifier for this task (default: empty string) |
| dataset_name | str | No | Name of the dataset this task belongs to (default: empty string) |

Outputs

| Name | Type | Description |
|------|------|-------------|
| data | list[dict] | List of audio item dictionaries |
| num_items | int | Number of audio items in the batch |
| validate() | bool | Whether all audio items pass file existence validation |
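
Because validate() returns a single boolean, it does not identify which records failed. One possible way to list every missing path is to apply the same os.path.exists() check per record. The helper below is a hypothetical sketch, not part of NeMo Curator; the name find_missing_files is invented for illustration, and it assumes every record contains the given key.

```python
import os


def find_missing_files(records, filepath_key):
    """Return the paths stored under filepath_key that do not exist on disk.

    Hypothetical helper for illustration; assumes each record has filepath_key.
    """
    return [
        record[filepath_key]
        for record in records
        if not os.path.exists(record[filepath_key])
    ]
```

With a batch built as in the usage examples, a call such as find_missing_files(batch.data, "audio_path") would return the subset of paths that fail the existence check.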

Usage Examples

Basic Usage

from nemo_curator.tasks import AudioBatch

# Create a batch from a list of audio records
batch = AudioBatch(
    data=[
        {"audio_path": "/data/audio/clip1.wav", "duration": 5.2},
        {"audio_path": "/data/audio/clip2.wav", "duration": 3.1},
    ],
    filepath_key="audio_path",
    task_id="audio_task_001",
    dataset_name="speech_dataset",
)

print(batch.num_items)  # 2

Single Item Normalization

from nemo_curator.tasks import AudioBatch

# A single dict is automatically wrapped into a list
batch = AudioBatch(
    data={"audio_path": "/data/audio/clip1.wav", "duration": 5.2},
    filepath_key="audio_path",
)

print(batch.num_items)  # 1
print(type(batch.data))  # <class 'list'>
