Implementation:NVIDIA NeMo Curator AestheticScorer

From Leeroopedia
Knowledge Sources
Domains Machine Learning, Computer Vision, Content Scoring
Last Updated 2026-02-14 00:00 GMT

Overview

Provides an aesthetic quality scorer that predicts visual quality scores from CLIP embeddings using a pre-trained MLP model.

Description

The aesthetics module contains two classes:

MLP is a feedforward neural network with five linear layers (768 -> 1024 -> 128 -> 64 -> 16 -> 1) and a dropout layer (p = 0.2, 0.2, 0.1 respectively) after each of the first three linear layers. It takes 768-dimensional CLIP embedding vectors as input and outputs a single aesthetic score per sample. The forward pass runs under torch.no_grad(), so no gradients are tracked during inference.
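The layer stack described above can be sketched in PyTorch as follows. This is an illustrative re-creation built from the layer list on this page, not the source file itself: the class name MLPSketch and the attribute name layers are assumptions, and no activation functions are included because none are listed.

```python
import torch
from torch import nn


class MLPSketch(nn.Module):
    """Illustrative stand-in for the MLP described on this page (names are assumed)."""

    def __init__(self) -> None:
        super().__init__()
        # Five linear layers; dropout follows each of the first three.
        self.layers = nn.Sequential(
            nn.Linear(768, 1024),
            nn.Dropout(0.2),
            nn.Linear(1024, 128),
            nn.Dropout(0.2),
            nn.Linear(128, 64),
            nn.Dropout(0.1),
            nn.Linear(64, 16),
            nn.Linear(16, 1),
        )

    @torch.no_grad()  # inference only: gradients are not tracked
    def forward(self, embed: torch.Tensor) -> torch.Tensor:
        return self.layers(embed)


model = MLPSketch().eval()  # eval() turns dropout off
out = model(torch.randn(4, 768))  # random stand-in for CLIP embeddings
print(out.shape)  # torch.Size([4, 1])
```

The raw network output has a trailing dimension of size 1; the (batch_size,) shape documented for AestheticScorer implies that dimension is removed before scores are returned.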

AestheticScorer implements ModelInterface and provides the public interface for aesthetic scoring. It loads pre-trained weights from HuggingFace (ttj/sac-logos-ava1-l14-linearMSE, revision 1e77fa0) in safetensors format. The model automatically selects CUDA if available, falling back to CPU. On __call__, it accepts embeddings as either a torch.Tensor or numpy.ndarray, converts numpy arrays to tensors, moves them to the appropriate device, and returns per-sample aesthetic scores.
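The input handling described above (device selection, NumPy conversion, device transfer) might look roughly like the sketch below. The helper name to_device_tensor is hypothetical and is not part of NeMo Curator; it only illustrates the described behaviour.

```python
import numpy as np
import torch


def to_device_tensor(embeddings, device: torch.device) -> torch.Tensor:
    """Hypothetical helper: accept a tensor or ndarray, return a tensor on `device`."""
    if isinstance(embeddings, np.ndarray):
        # from_numpy keeps the array's dtype (float32 here) and shares its memory
        embeddings = torch.from_numpy(embeddings)
    return embeddings.to(device)


# Prefer CUDA when it is available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

batch = to_device_tensor(np.zeros((5, 768), dtype=np.float32), device)
print(batch.shape, batch.dtype, batch.device.type)
```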

Usage

Use AestheticScorer when you need to evaluate the visual quality of images or video frames in a curation pipeline. It is typically chained with CLIP embedding extraction (e.g., via CLIPAestheticScorer in clip.py) to enable automated filtering of content by visual quality.
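As a minimal sketch of the filtering step, the snippet below keeps only the samples whose score meets a threshold. The helper keep_indices, the threshold of 5.0, and the score values are all illustrative assumptions; in a real pipeline the scores would come from calling an AestheticScorer on CLIP embeddings.

```python
import torch


def keep_indices(scores: torch.Tensor, min_score: float) -> list[int]:
    """Hypothetical helper: indices of samples whose score is at least min_score."""
    mask = scores >= min_score
    return torch.nonzero(mask, as_tuple=True)[0].tolist()


# Made-up scores standing in for the scorer's per-sample output.
scores = torch.tensor([3.2, 5.8, 4.9, 6.4])
kept = keep_indices(scores, min_score=5.0)
print(kept)  # [1, 3]
```

A suitable threshold depends on the dataset and should be chosen by inspecting the score distribution rather than taken from this example.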

Code Reference

Source Location

  • Repository: NeMo-Curator
  • File: nemo_curator/models/aesthetics.py
  • Lines: 1-139

Signature

class MLP(nn.Module):
    def __init__(self) -> None: ...
    def forward(self, embed: torch.Tensor) -> torch.Tensor: ...

class AestheticScorer(ModelInterface):
    def __init__(self, model_dir: str) -> None: ...
    @property
    def model_id_names(self) -> list[str]: ...
    def setup(self) -> None: ...
    def get_weights_path(self) -> str: ...
    def __call__(self, embeddings: torch.Tensor | npt.NDArray[np.float32]) -> torch.Tensor: ...
    @classmethod
    def download_weights_on_node(cls, model_dir: str) -> None: ...

Import

from nemo_curator.models.aesthetics import AestheticScorer

I/O Contract

Inputs (Constructor)

Name       Type  Required  Description
model_dir  str   Yes       Path to the directory where model weights are stored or will be downloaded

Inputs (__call__)

Name        Type                           Required  Description
embeddings  torch.Tensor or numpy.ndarray  Yes       CLIP embeddings with shape (batch_size, 768)

Outputs

Name    Type          Description
scores  torch.Tensor  Per-sample aesthetic scores with shape (batch_size,)

Model Architecture

Layer    Configuration
Linear   768 -> 1024
Dropout  p=0.2
Linear   1024 -> 128
Dropout  p=0.2
Linear   128 -> 64
Dropout  p=0.1
Linear   64 -> 16
Linear   16 -> 1
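For a sense of model size, the layer widths above give the following parameter count. This is a back-of-the-envelope sketch that assumes every linear layer carries a bias term (the PyTorch default); dropout layers have no parameters.

```python
# Layer widths taken from the table above.
layer_sizes = [768, 1024, 128, 64, 16, 1]

# Each linear layer has n_in * n_out weights plus n_out biases (bias assumed).
per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total = sum(per_layer)

print(per_layer)  # [787456, 131200, 8256, 1040, 17]
print(total)      # 927969
```

Under that assumption the network has just under one million parameters, most of them in the first layer.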

Pre-trained model: ttj/sac-logos-ava1-l14-linearMSE (HuggingFace, safetensors format)

Usage Examples

Basic Usage

from nemo_curator.models.aesthetics import AestheticScorer
import torch

# Download weights first
AestheticScorer.download_weights_on_node("/path/to/models")

# Initialize and setup
scorer = AestheticScorer(model_dir="/path/to/models")
scorer.setup()

# Score CLIP embeddings
embeddings = torch.randn(10, 768)  # batch of 10 CLIP embeddings
scores = scorer(embeddings)
print(scores.shape)  # torch.Size([10])

Usage with NumPy Arrays

import numpy as np
from nemo_curator.models.aesthetics import AestheticScorer

scorer = AestheticScorer(model_dir="/path/to/models")
scorer.setup()

# Also accepts numpy arrays
embeddings_np = np.random.randn(5, 768).astype(np.float32)
scores = scorer(embeddings_np)
