Environment:Cleanlab Cleanlab Image Quality Dependencies

From Leeroopedia


Knowledge Sources
Domains Computer_Vision, Data_Centric_AI
Last Updated 2026-02-09 19:30 GMT

Overview

Optional dependency environment extending the Datalab install with CleanVision for automated image quality checks within cleanlab dataset audits.

Description

The image quality integration allows Datalab to detect low-quality images (blurry, dark, duplicated, etc.) alongside other data quality issues. It requires the CleanVision library (>= 0.3.6), an independent image quality package. This functionality is available only when the data is passed as a HuggingFace `datasets.Dataset` object (not a plain DataFrame or dict).

Usage

Use this environment when running Datalab on image datasets where you want automated image quality checks (blur, darkness, duplicates). Install via the `[image]` extras group. This extends the Datalab dependency environment.
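
As a rough illustration of the usage described above, the sketch below wraps a Datalab run with image checks enabled. It assumes that the `Datalab` constructor accepts an `image_key` argument and that `find_issues` and `get_issue_summary` behave as in cleanlab's published tutorials. The helper name `run_image_audit` and the default column names `"label"` and `"image"` are hypothetical placeholders.

```python
def run_image_audit(dataset, label_name="label", image_key="image",
                    pred_probs=None, features=None):
    """Run Datalab with CleanVision image checks on a HuggingFace Dataset.

    Hypothetical helper: the column names are placeholders for your own schema.
    """
    # Imported lazily so this module still loads without the [image] extras.
    from cleanlab import Datalab

    # Passing image_key is what asks Datalab to involve CleanVision.
    lab = Datalab(data=dataset, label_name=label_name, image_key=image_key)

    # pred_probs / features are optional inputs used by other Datalab checks
    # (for example label or outlier checks); image checks do not need them.
    lab.find_issues(pred_probs=pred_probs, features=features)
    return lab.get_issue_summary()


# Example call (assumes `my_dataset` is a datasets.Dataset with these columns):
# summary = run_image_audit(my_dataset, label_name="label", image_key="image")
```

The import sits inside the function so that a missing optional dependency surfaces only when the audit is actually requested.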

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, or Windows | Same as core cleanlab |
| Hardware | CPU | No GPU required for image quality checks |
| Python | >= 3.10 | Same as core cleanlab |
| Disk | Moderate | Image datasets can be large |

Dependencies

System Packages

No additional system-level packages required beyond core cleanlab.

Python Packages

  • `cleanvision` >= 0.3.6 (installed by the `[image]` extras group)
  • All packages required by the Datalab dependency environment

Credentials

No credentials required.

Quick Install

pip install 'cleanlab[image]'

Code Evidence

CleanVision import gating from `cleanlab/datalab/internal/adapter/imagelab.py:49-64`:

try:
    from cleanvision import Imagelab
    from datasets.arrow_dataset import Dataset

    if isinstance(dataset, Dataset):
        imagelab = Imagelab(hf_dataset=dataset, image_key=image_key)
    else:
        raise ValueError(
            "For now, only huggingface datasets are supported for running "
            "cleanvision checks inside cleanlab."
        )
except ImportError:
    raise ImportError(
        "Cannot import required image packages. Please install them via: "
        "`pip install cleanlab[image]` or just install cleanlab with "
        "all optional dependencies via: `pip install cleanlab[all]`"
    )
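
The excerpt above fails only at the point where the image checks are requested. A pre-flight check along the following lines could report missing packages earlier. This is a minimal sketch using only the standard library; the helper `missing_modules` and the constant `IMAGE_MODULES` are hypothetical names, and the module list is an assumption drawn from the two imports in the excerpt.

```python
from importlib.util import find_spec

# Top-level modules the image integration imports (assumption based on the
# excerpt above: `cleanvision` and `datasets`).
IMAGE_MODULES = ("cleanvision", "datasets")


def missing_modules(names=IMAGE_MODULES):
    """Return the top-level module names in `names` that cannot be located."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print(f"Missing: {', '.join(missing)}. Try: pip install 'cleanlab[image]'")
    else:
        print("Image quality dependencies found.")
```

`find_spec` only locates a module without importing it, so the check stays cheap even when the packages are present.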

Optional dependency definition from `setup.py:30`:

IMAGE_REQUIRE = DATALAB_REQUIRE + ["cleanvision>=0.3.6"]

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| `Cannot import required image packages` | CleanVision or datasets not installed | `pip install 'cleanlab[image]'` |
| `only huggingface datasets are supported for running cleanvision checks` | Passed a DataFrame or dict instead of a HuggingFace Dataset | Convert data to `datasets.Dataset` format first |
| `cleanvision is required for correlation visualization` | CleanVision not installed but spurious correlation visualization attempted | `pip install 'cleanlab[image]'` |
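
For the second error, one possible conversion path is sketched below, assuming the images are available as file paths with a parallel list of labels. It relies on `Dataset.from_dict` and `cast_column` with the `Image` feature from the HuggingFace `datasets` library. The helper name `to_hf_image_dataset` and the default column names are hypothetical.

```python
def to_hf_image_dataset(image_paths, labels, image_key="image", label_name="label"):
    """Build a datasets.Dataset whose image column is decoded from file paths.

    Hypothetical helper; column names are placeholders.
    """
    if len(image_paths) != len(labels):
        raise ValueError("image_paths and labels must have the same length")

    # Lazy import: requires the `datasets` package from the [image] extras.
    from datasets import Dataset, Image

    ds = Dataset.from_dict(
        {image_key: list(image_paths), label_name: list(labels)}
    )
    # Casting to Image() makes the path column decode to images on access.
    return ds.cast_column(image_key, Image())


# Example with a pandas DataFrame `df` holding "path" and "label" columns:
# dataset = to_hf_image_dataset(df["path"], df["label"])
```

The resulting object can then be passed to Datalab together with the matching `image_key`.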

Compatibility Notes

  • HuggingFace datasets only: CleanVision integration currently only supports `datasets.arrow_dataset.Dataset` objects, not pandas DataFrames or plain dicts.
  • cleanvision < 0.3.6: Not supported. May cause API incompatibilities.
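
To check the version constraint in an existing environment, something like the following could be used. It is a standard-library sketch, not part of cleanlab: `parse_release` and `cleanvision_is_supported` are hypothetical helpers, and the parser deliberately ignores pre-release suffixes, so a build such as `0.3.6rc1` is treated as `0.3.6`.

```python
from importlib.metadata import PackageNotFoundError, version

MIN_CLEANVISION = (0, 3, 6)  # minimum version stated on this page


def parse_release(text):
    """Parse leading numeric components, e.g. '0.3.6' -> (0, 3, 6)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def cleanvision_is_supported(installed=None):
    """True if the given (or installed) cleanvision version meets the minimum."""
    if installed is None:
        try:
            installed = version("cleanvision")
        except PackageNotFoundError:
            return False  # not installed at all
    return parse_release(installed) >= MIN_CLEANVISION
```

Passing a version string explicitly makes the comparison easy to exercise without touching the installed environment.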
