Implementation:Cleanlab Cleanlab Datalab Report

Field	Value
Sources	Cleanlab
Domains	Data_Quality, Dataset_Auditing, Reporting
Last Updated	2026-02-09 12:00 GMT

Overview

Datalab_Report documents Datalab.report, the method that generates and prints a human-readable summary of the dataset quality issues detected by find_issues().

Description

The Datalab.report method constructs a Reporter object and delegates report generation to it. The reporter aggregates the per-example issue results stored in DataIssues, orders issue types by the number of flagged examples, and prints a formatted summary to stdout. The report includes the number of flagged examples per issue type, the top-N most problematic examples for each type (N is set by num_examples), and optional descriptions explaining what each issue type means.
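The aggregation step can be pictured with a small, library-free sketch. It is not the Reporter code: the function name summarize_issues and the plain-dict inputs are hypothetical, and the sketch assumes that a lower score means a more severe issue and that issue types are ordered by how many examples they flag.

```python
from typing import Dict, List, Tuple


def summarize_issues(
    is_issue: Dict[str, List[bool]],
    scores: Dict[str, List[float]],
    num_examples: int = 5,
) -> List[Tuple[str, int, List[int]]]:
    """Return (issue_type, num_flagged, worst_example_indices) rows.

    Hypothetical helper for illustration only; assumes lower score = worse.
    """
    rows = []
    for issue_type, flags in is_issue.items():
        # Indices of the examples flagged for this issue type.
        flagged = [i for i, flag in enumerate(flags) if flag]
        # Keep the num_examples flagged examples with the lowest scores.
        worst = sorted(flagged, key=lambda i: scores[issue_type][i])[:num_examples]
        rows.append((issue_type, len(flagged), worst))
    # Issue types with the most flagged examples come first.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows
```

Each returned row carries what one section of the printed report needs: the issue type, its count, and the indices of the examples to display.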

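If the issue summary is empty (i.e., find_issues() has not been called or was called with no issue types), the method prints a message asking the user to specify issue types in find_issues() and returns early without building a report. This is different from a run in which checks were performed but nothing was flagged.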
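A minimal sketch of such an early-return guard, assuming the summary is a plain list of (issue_type, num_issues) pairs. The function name report_or_skip, the boolean return value, and the message wording are illustrative assumptions, not the library's own.

```python
from typing import List, Tuple


def report_or_skip(issue_summary: List[Tuple[str, int]]) -> bool:
    """Print a summary, or a hint if no checks were run.

    Hypothetical stand-in; returns True only if a report was printed.
    """
    if not issue_summary:
        # Nothing was checked, so there is nothing to report.
        print("No issue types were specified; run find_issues() first.")
        return False
    for issue_type, num_issues in issue_summary:
        print(f"{issue_type}: {num_issues} flagged")
    return True
```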

The method uses a factory pattern (report_factory) to select the appropriate reporter implementation based on whether an image lab is present, allowing image-specific issue types to be reported alongside standard ones.
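The selection logic can be sketched as a function that returns a reporter class rather than an instance. The class names BaseReporterSketch and ImageReporterSketch and the function report_factory_sketch are hypothetical stand-ins for this illustration; only the idea of branching on the presence of an image lab comes from the description above.

```python
from typing import Optional, Type


class BaseReporterSketch:
    """Hypothetical reporter for standard issue types."""

    def kinds(self) -> str:
        return "standard"


class ImageReporterSketch(BaseReporterSketch):
    """Hypothetical reporter that also covers image-specific issue types."""

    def kinds(self) -> str:
        return "standard+image"


def report_factory_sketch(imagelab: Optional[object]) -> Type[BaseReporterSketch]:
    # Return the class; the caller instantiates it with its own arguments.
    if imagelab is None:
        return BaseReporterSketch
    return ImageReporterSketch
```

Returning the class keeps the calling code identical in both cases: it instantiates whatever the factory returns and calls the same reporting method on it.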

Usage

Call report() on a Datalab instance after find_issues() has been run. Adjust parameters to control the level of detail in the output.

Code Reference

Source Location

Repository: cleanlab/cleanlab
File: cleanlab/datalab/datalab.py
Lines: 355--363

Signature

def report(
    self,
    *,
    num_examples: int = 5,
    verbosity: Optional[int] = None,
    include_description: bool = True,
    show_summary_score: bool = False,
    show_all_issues: bool = False,
) -> None

Import

from cleanlab import Datalab
# report is a method of the Datalab instance

I/O Contract

Inputs

Name	Type	Required	Description
`num_examples`	`int`	No (default: `5`)	Number of the most problematic examples to show for each issue type.
`verbosity`	`Optional[int]`	No (default: `None`)	Controls the amount of detail in the report. Valid values are 0 through 4. If `None`, uses the verbosity level set during Datalab initialization.
`include_description`	`bool`	No (default: `True`)	Whether to include a plain-language description of each issue type. Set to `False` once familiar with the issue types.
`show_summary_score`	`bool`	No (default: `False`)	Whether to display the overall severity score per issue type. These scores are not comparable across different issue types.
`show_all_issues`	`bool`	No (default: `False`)	Whether to show all issue types that were checked, including those with zero detected issues. By default, only issue types with at least one flagged example are shown.
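The verbosity fallback described in the table can be sketched as follows. The helper name resolve_verbosity and its second parameter are hypothetical; the sketch only illustrates why the check should be against `None` rather than truthiness, since `0` is a valid explicit value.

```python
from typing import Optional


def resolve_verbosity(verbosity: Optional[int], instance_verbosity: int) -> int:
    """Pick the explicit verbosity if given, else the instance-level one."""
    # Use "is None" so that an explicit verbosity of 0 is respected.
    return instance_verbosity if verbosity is None else verbosity
```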

Outputs

Name	Type	Description
return	`None`	The method prints the report to stdout. No value is returned.
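Because the report is printed rather than returned, code that needs the text (for logging or saving to a file) has to capture stdout. The sketch below uses only the standard library; capture_printed is a hypothetical helper name, and it works with any callable that prints.

```python
import io
from contextlib import redirect_stdout
from typing import Any, Callable


def capture_printed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> str:
    """Call fn and return everything it printed to stdout as a string."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        fn(*args, **kwargs)
    return buffer.getvalue()


# Example with a Datalab instance named lab (assumed to exist):
# report_text = capture_printed(lab.report, num_examples=3)
```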

Usage Examples

Basic Report

from cleanlab import Datalab

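# Placeholders supplied by the caller: my_data is a labeled dataset with a "label" column,
# pred_probs are out-of-sample predicted class probabilities, features are numeric embeddings.
lab = Datalab(data=my_data, label_name="label")
lab.find_issues(pred_probs=pred_probs, features=features)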

# Print the default report
lab.report()

Customized Report

# Reuses lab from the basic example: show more examples, raise verbosity,
# omit descriptions, include summary scores, and list every checked issue type
lab.report(
    num_examples=10,
    verbosity=3,
    include_description=False,
    show_summary_score=True,
    show_all_issues=True,
)

Minimal Report

# Minimal output: fewer examples, low verbosity
lab.report(num_examples=2, verbosity=0)
