Implementation:NVIDIA NeMo Curator ImageBatch
| Knowledge Sources | |
|---|---|
| Domains | Data Curation, Image Processing, Pipeline Tasks |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
The ImageBatch and ImageObject classes define the data structures for image processing tasks in the NeMo Curator pipeline, carrying images and their accumulated annotations through pipeline stages.
Description
This module defines two complementary dataclasses:
ImageObject represents a single image with its associated metadata and computed attributes:
- image_path (str): Path to the image file on disk.
- image_id (str): Unique identifier for the image.
- metadata (dict[str, Any]): Arbitrary metadata dictionary.
- image_data (np.ndarray or None): Raw pixel data as a numpy array in HWC RGB format (Height x Width x Channels).
- embedding (np.ndarray or None): Image embedding vector as a numpy array, typically produced by stages like CLIP embedding.
- aesthetic_score (float or None): Aesthetic quality score.
- nsfw_score (float or None): NSFW probability score.
ImageBatch extends Task with a list of ImageObject instances as its data. It provides:
- data (list[ImageObject]): The batch of image objects, defaulting to an empty list.
- num_items property: Returns the number of images via len(self.data).
- validate(): Currently a placeholder that always returns True (marked with a TODO for future implementation of image path existence checks).
The ImageObject fields accumulate annotations as images pass through pipeline stages -- for example, the embedding field is populated by an embedding stage, and aesthetic_score and nsfw_score are populated by classification stages.
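The accumulation pattern described above can be sketched roughly as follows. This is an illustration only: the ImageObject here is a trimmed stand-in that mirrors the documented fields (the array fields are left out), and fake_aesthetic_stage and fake_nsfw_stage are hypothetical functions that write constant placeholder values, not NeMo Curator stages.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


# Stand-in mirroring the documented ImageObject fields (array fields omitted).
# In real code, import ImageObject from nemo_curator.tasks.image instead.
@dataclass
class ImageObject:
    image_path: str = ""
    image_id: str = ""
    metadata: dict[str, Any] = field(default_factory=dict)
    aesthetic_score: float | None = None
    nsfw_score: float | None = None


def fake_aesthetic_stage(images: list[ImageObject]) -> list[ImageObject]:
    """Hypothetical stage: writes a constant placeholder into aesthetic_score."""
    for img in images:
        img.aesthetic_score = 0.5
    return images


def fake_nsfw_stage(images: list[ImageObject]) -> list[ImageObject]:
    """Hypothetical stage: writes a constant placeholder into nsfw_score."""
    for img in images:
        img.nsfw_score = 0.01
    return images


images = [ImageObject(image_path="/data/images/photo1.jpg", image_id="img_001")]

# Before any stage runs, the annotation fields hold their None defaults.
assert images[0].aesthetic_score is None and images[0].nsfw_score is None

# Each stage fills in its own field and leaves the others untouched.
images = fake_nsfw_stage(fake_aesthetic_stage(images))
print(images[0].aesthetic_score, images[0].nsfw_score)  # 0.5 0.01
```

The embedding field would follow the same pattern, with an embedding stage assigning a numpy array instead of a float.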
Usage
Use ImageBatch and ImageObject when building image curation workflows. ImageBatch is the task type consumed and produced by image processing stages such as CLIP embedding, aesthetic scoring, NSFW filtering, and deduplication.
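As a rough sketch of what "consumed and produced by a stage" means, the snippet below takes a batch in and returns a new batch. Everything in it is an assumption made for illustration: the two dataclasses are simplified stand-ins (the stand-in ImageBatch does not inherit from Task), drop_high_nsfw is a hypothetical function, and the 0.5 threshold and the choice to drop unscored images are arbitrary.

```python
from __future__ import annotations

from dataclasses import dataclass, field


# Simplified stand-ins for the documented classes; not the library definitions.
@dataclass
class ImageObject:
    image_id: str = ""
    nsfw_score: float | None = None


@dataclass
class ImageBatch:
    task_id: str
    dataset_name: str
    data: list[ImageObject] = field(default_factory=list)

    @property
    def num_items(self) -> int:
        return len(self.data)


def drop_high_nsfw(batch: ImageBatch, threshold: float = 0.5) -> ImageBatch:
    """Hypothetical filter: keep images whose nsfw_score is set and below threshold.

    Images without a score are dropped; that policy is an arbitrary choice here.
    """
    kept = [
        img
        for img in batch.data
        if img.nsfw_score is not None and img.nsfw_score < threshold
    ]
    # Return a new batch so the input batch is left unchanged.
    return ImageBatch(task_id=batch.task_id, dataset_name=batch.dataset_name, data=kept)


batch = ImageBatch(
    task_id="image_task_001",
    dataset_name="demo",
    data=[
        ImageObject(image_id="a", nsfw_score=0.1),
        ImageObject(image_id="b", nsfw_score=0.9),
        ImageObject(image_id="c"),  # never scored
    ],
)
filtered = drop_high_nsfw(batch)
print(filtered.num_items, [img.image_id for img in filtered.data])  # 1 ['a']
```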
Code Reference
Source Location
- Repository: NeMo-Curator
- File: nemo_curator/tasks/image.py
- Lines: 1-69
Signature
@dataclass
class ImageObject:
    image_path: str = ""
    image_id: str = ""
    metadata: dict[str, Any] = field(default_factory=dict)
    image_data: np.ndarray | None = None
    embedding: np.ndarray | None = None
    aesthetic_score: float | None = None
    nsfw_score: float | None = None

@dataclass
class ImageBatch(Task):
    data: list[ImageObject] = field(default_factory=list)

    def validate(self) -> bool: ...

    @property
    def num_items(self) -> int: ...
Import
from nemo_curator.tasks.image import ImageBatch, ImageObject
# or
from nemo_curator.tasks import ImageBatch, ImageObject
I/O Contract
ImageObject Fields
| Name | Type | Required | Description |
|---|---|---|---|
| image_path | str | No | Path to the image file on disk (default: empty string) |
| image_id | str | No | Unique identifier for the image (default: empty string) |
| metadata | dict[str, Any] | No | Arbitrary metadata dictionary (default: empty dict) |
| image_data | np.ndarray or None | No | Raw pixel data in HWC RGB format |
| embedding | np.ndarray or None | No | Embedding vector produced by embedding stages |
| aesthetic_score | float or None | No | Aesthetic quality score from classification stages |
| nsfw_score | float or None | No | NSFW probability score from classification stages |
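To make the HWC RGB layout of image_data concrete, here is a small numpy-only sketch. The uint8 dtype and the tiny 4x6 image are assumptions chosen for illustration; the source does not state a dtype. The transpose to channels-first (CHW) is shown only for contrast with HWC.

```python
import numpy as np

height, width = 4, 6

# HWC layout: axis 0 = rows (height), axis 1 = columns (width), axis 2 = R, G, B.
hwc = np.zeros((height, width, 3), dtype=np.uint8)
hwc[0, 0] = (255, 0, 0)  # top-left pixel set to pure red

# Channels-first (CHW) layout, obtained by moving the channel axis to the front.
chw = np.transpose(hwc, (2, 0, 1))

# Moving the channel axis back recovers the original HWC array.
restored = np.transpose(chw, (1, 2, 0))

print(hwc.shape, chw.shape)  # (4, 6, 3) (3, 4, 6)
```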
ImageBatch Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| data | list[ImageObject] | No | List of image objects (default: empty list) |
| task_id | str | Yes | Unique identifier for this task (inherited from Task) |
| dataset_name | str | Yes | Name of the dataset this task belongs to (inherited from Task) |
ImageBatch Outputs
| Name | Type | Description |
|---|---|---|
| data | list[ImageObject] | The batch of image objects with accumulated annotations |
| num_items | int | Number of images in the batch |
| validate() | bool | Currently always returns True (placeholder) |
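Because validate() is currently a placeholder, callers who need the path existence check mentioned in the TODO have to do it themselves. One possible shape for such a check is sketched below; missing_image_paths is a hypothetical helper and the ImageObject is a trimmed stand-in, so this is not how the library implements or plans to implement validate().

```python
from __future__ import annotations

import os
import tempfile
from dataclasses import dataclass


# Trimmed stand-in with only the fields this check needs.
@dataclass
class ImageObject:
    image_path: str = ""
    image_id: str = ""


def missing_image_paths(images: list[ImageObject]) -> list[str]:
    """Hypothetical helper: return image_ids whose image_path is empty or not a file."""
    return [
        img.image_id
        for img in images
        if not (img.image_path and os.path.isfile(img.image_path))
    ]


with tempfile.TemporaryDirectory() as tmp:
    real_path = os.path.join(tmp, "photo1.jpg")
    with open(real_path, "wb") as fh:
        fh.write(b"")  # empty file; only its existence matters for this check

    images = [
        ImageObject(image_path=real_path, image_id="img_001"),
        ImageObject(image_path=os.path.join(tmp, "gone.jpg"), image_id="img_002"),
    ]
    missing = missing_image_paths(images)

print(missing)  # ['img_002']
```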
Usage Examples
Creating an ImageBatch
from nemo_curator.tasks.image import ImageBatch, ImageObject

# Create individual image objects
img1 = ImageObject(
    image_path="/data/images/photo1.jpg",
    image_id="img_001",
    metadata={"source": "flickr", "resolution": "1024x768"},
)
img2 = ImageObject(
    image_path="/data/images/photo2.jpg",
    image_id="img_002",
    metadata={"source": "flickr", "resolution": "800x600"},
)

# Create a batch
batch = ImageBatch(
    task_id="image_task_001",
    dataset_name="flickr_dataset",
    data=[img1, img2],
)
print(batch.num_items)  # 2
Accessing Image Annotations
# After pipeline stages have populated scores and embeddings
for img in batch.data:
    if img.aesthetic_score is not None:
        print(f"{img.image_id}: aesthetic={img.aesthetic_score:.2f}")
    if img.nsfw_score is not None:
        print(f"{img.image_id}: nsfw={img.nsfw_score:.2f}")
Related Pages
- Environment:NVIDIA_NeMo_Curator_Python_Linux_Base
- NVIDIA_NeMo_Curator_Task_Base - Abstract base class that ImageBatch extends
- NVIDIA_NeMo_Curator_DocumentBatch - Analogous task type for text documents
- NVIDIA_NeMo_Curator_AudioBatch - Analogous task type for audio data
- NVIDIA_NeMo_Curator_ImageEmbeddingStage - Stage that populates image embeddings
- NVIDIA_NeMo_Curator_ImageAestheticFilterStage - Stage that uses aesthetic scores
- NVIDIA_NeMo_Curator_ImageNSFWFilterStage - Stage that uses NSFW scores