Implementation:Datajuicer Data juicer VideoAestheticsFilter

From Leeroopedia
Revision as of 12:23, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Datajuicer_Data_juicer_VideoAestheticsFilter.md)
Knowledge Sources
Domains: Data_Quality, Filtering
Last Updated: 2026-02-14 16:00 GMT

Overview

Concrete tool, provided by Data-Juicer, for filtering data samples based on the aesthetics scores of their video frames.

Description

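VideoAestheticsFilter is a filter operator that keeps samples whose sampled video frames have aesthetics scores between min_score and max_score. It extends Filter and follows the two-phase compute_stats/process pattern: compute_stats scores the frames and stores the result, and process uses the stored result to decide whether the sample is kept. Frames are extracted from each video by either uniform sampling or keyframe extraction, and each frame is scored with a HuggingFace aesthetics predictor model (default: shunk031/aesthetics-predictor-v2-sac-logos-ava1-l14-linearMSE). The frame scores of each video are reduced to a single per-video score with 'avg', 'max', or 'min'. Results are cached under the video_frames_aesthetics_score stat. When a sample contains several videos, the 'any'/'all' strategy decides whether one in-range video is enough to keep the sample or every video must be in range. The operator also supports CUDA acceleration and operator fusion, so frame sampling and video loading can be shared with other operators.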
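The reduction and keep decision described above can be sketched in plain Python. This is an illustrative sketch, not Data-Juicer's internal code: the helper names reduce_frame_scores and keep_sample are hypothetical, the range check is assumed to be inclusive, and samples without videos are not handled.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical helpers for illustration; not part of Data-Juicer.
REDUCERS: Dict[str, Callable[[List[float]], float]] = {
    "avg": mean,
    "max": max,
    "min": min,
}


def reduce_frame_scores(frame_scores: List[float], reduce_mode: str = "avg") -> float:
    """Collapse the per-frame scores of one video into a single score."""
    if reduce_mode not in REDUCERS:
        raise ValueError(f"unknown reduce_mode: {reduce_mode!r}")
    return float(REDUCERS[reduce_mode](frame_scores))


def keep_sample(
    video_scores: List[float],
    min_score: float = 0.4,
    max_score: float = 1.0,
    any_or_all: str = "any",
) -> bool:
    """Decide whether a sample is kept, given one reduced score per video."""
    # Assumption: both bounds are inclusive.
    in_range = [min_score <= score <= max_score for score in video_scores]
    if any_or_all == "any":
        return any(in_range)
    if any_or_all == "all":
        return all(in_range)
    raise ValueError(f"unknown any_or_all: {any_or_all!r}")


# One sample with two videos, three sampled frames each (made-up scores).
frames_per_video = [[0.30, 0.50, 0.70], [0.10, 0.20, 0.30]]
video_scores = [reduce_frame_scores(frames, "avg") for frames in frames_per_video]

print(keep_sample(video_scores, any_or_all="any"))  # first video is in range
print(keep_sample(video_scores, any_or_all="all"))  # second video is below min_score
```

With these made-up scores the first video averages about 0.5 and the second about 0.2, so the sample passes under 'any' and fails under 'all'.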

Usage

Import this operator when a pipeline needs to filter samples by the aesthetic quality of their videos. It can be configured in a YAML recipe or instantiated directly in Python.

Code Reference

Source Location

Signature

@OPERATORS.register_module("video_aesthetics_filter")
class VideoAestheticsFilter(Filter):
    def __init__(
        self,
        hf_scorer_model: str = "",
        trust_remote_code: bool = False,
        min_score: float = 0.4,
        max_score: float = 1.0,
        frame_field: Optional[str] = None,
        frame_sampling_method: str = "uniform",
        frame_num: PositiveInt = 3,
        any_or_all: str = "any",
        reduce_mode: str = "avg",
        *args,
        **kwargs,
    ):

Import

from data_juicer.ops.filter.video_aesthetics_filter import VideoAestheticsFilter

I/O Contract

Inputs

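Name | Type | Required | Description
hf_scorer_model | str | No | HuggingFace aesthetics model name (default: "", which selects shunk031/aesthetics-predictor-v2-sac-logos-ava1-l14-linearMSE)
trust_remote_code | bool | No | Whether to trust remote code when loading the HuggingFace model (default: False)
min_score | float | No | Minimum aesthetics score (default: 0.4)
max_score | float | No | Maximum aesthetics score (default: 1.0)
frame_field | Optional[str] | No | Optional name of the sample field that holds video frames (default: None)
frame_sampling_method | str | No | Frame sampling method: "all_keyframes" or "uniform" (default: "uniform")
frame_num | PositiveInt | No | Number of frames to extract uniformly; used only when frame_sampling_method is "uniform" (default: 3)
any_or_all | str | No | Keep strategy across the videos of a sample: "any" or "all" (default: "any")
reduce_mode | str | No | Reduction of frame scores to one score per video: "avg", "max", or "min" (default: "avg")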
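To make the frame_num parameter concrete, the following sketch picks evenly spaced frame indices from a video. It is a simplified, index-based illustration under stated assumptions: the function uniform_frame_indices is hypothetical, and the exact frames that Data-Juicer's own sampler selects are not specified on this page.

```python
from typing import List


def uniform_frame_indices(total_frames: int, frame_num: int = 3) -> List[int]:
    """Pick `frame_num` evenly spaced frame indices out of `total_frames`.

    Hypothetical helper for illustration; not Data-Juicer's sampler.
    """
    if total_frames <= 0 or frame_num <= 0:
        raise ValueError("total_frames and frame_num must be positive")
    if frame_num == 1:
        # Assumption: a single requested frame is taken from the middle.
        return [total_frames // 2]
    if frame_num >= total_frames:
        # Fewer frames available than requested: return all of them.
        return list(range(total_frames))
    last = total_frames - 1
    step = last / (frame_num - 1)
    # First and last frames are always included; the rest are spread evenly.
    return [round(i * step) for i in range(frame_num)]


print(uniform_frame_indices(101, 3))
print(uniform_frame_indices(10, 2))
```

For a 101-frame video and frame_num=3, this sketch selects the first, middle, and last frames.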

Outputs

Name | Type | Description
samples | Dict | Filtered samples with the video_frames_aesthetics_score stat computed

Usage Examples

YAML Configuration

process:
  - video_aesthetics_filter:
      min_score: 0.4
      max_score: 1.0
      frame_sampling_method: uniform
      frame_num: 3

Python API

from data_juicer.ops.filter.video_aesthetics_filter import VideoAestheticsFilter
op = VideoAestheticsFilter(min_score=0.4, max_score=1.0)
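Constructing the operator does not filter anything by itself; filtering happens through the two-phase compute_stats/process pattern mentioned in the Description. The toy class below mimics that pattern without depending on Data-Juicer or a model download. Everything in it is an assumption made for illustration: ToyAestheticsFilter, the "stats" and "videos" keys, and the fake scorer are hypothetical and do not reflect Data-Juicer's real sample layout or method signatures.

```python
from typing import Any, Callable, Dict, List

STATS_KEY = "stats"  # hypothetical key; Data-Juicer uses its own stats field
SCORE_KEY = "video_frames_aesthetics_score"


class ToyAestheticsFilter:
    """Stand-in showing the compute_stats/process split; not the real operator."""

    def __init__(self, scorer: Callable[[str], float],
                 min_score: float = 0.4, max_score: float = 1.0,
                 any_or_all: str = "any") -> None:
        self.scorer = scorer
        self.min_score = min_score
        self.max_score = max_score
        self.any_or_all = any_or_all

    def compute_stats(self, sample: Dict[str, Any]) -> Dict[str, Any]:
        """Phase 1: score each video once and cache the result in the sample."""
        stats = sample.setdefault(STATS_KEY, {})
        if SCORE_KEY not in stats:  # reuse cached scores if present
            stats[SCORE_KEY] = [self.scorer(v) for v in sample.get("videos", [])]
        return sample

    def process(self, sample: Dict[str, Any]) -> bool:
        """Phase 2: decide from the cached scores whether to keep the sample."""
        scores = sample[STATS_KEY][SCORE_KEY]
        flags = [self.min_score <= s <= self.max_score for s in scores]
        return any(flags) if self.any_or_all == "any" else all(flags)


# Fake per-video scores stand in for the HuggingFace predictor.
fake_scores = {"a.mp4": 0.62, "b.mp4": 0.15}
calls: List[str] = []


def fake_scorer(path: str) -> float:
    calls.append(path)
    return fake_scores[path]


op = ToyAestheticsFilter(scorer=fake_scorer, min_score=0.4, max_score=1.0)
samples = [{"videos": ["a.mp4"]}, {"videos": ["b.mp4"]}]
kept = [s for s in (op.compute_stats(s) for s in samples) if op.process(s)]
print([s["videos"] for s in kept])
```

Separating scoring from the keep decision means the expensive model call runs once per sample, and the cached stat can be inspected or reused with different thresholds.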
