Implementation:Cleanlab Cleanlab Datalab Report
| Field | Value |
|---|---|
| Sources | Cleanlab |
| Domains | Data_Quality, Dataset_Auditing, Reporting |
| Last Updated | 2026-02-09 12:00 GMT |
Overview
Datalab.report is the method that prints a human-readable summary of the dataset quality issues detected by find_issues().
Description
The Datalab.report method constructs a Reporter object and delegates the report generation to it. The reporter aggregates the per-example issue results stored in DataIssues, sorts issue types by severity, and prints a formatted summary to stdout. The report includes the number of flagged examples per issue type, the top-N most problematic examples for each type, and optional descriptions explaining what each issue type means.
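The aggregation described above can be illustrated with a minimal sketch in plain Python. It does not use cleanlab's DataIssues or Reporter classes. The data layout, the `summarize` helper, the rule that lower scores are worse, and the ordering by number of flagged examples (a stand-in for severity) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical per-example results: issue type -> list of (is_flagged, score).
# Assumption of this sketch: lower scores mean more severe problems.
PerExample = Dict[str, List[Tuple[bool, float]]]


def summarize(results: PerExample, num_examples: int = 5) -> List[dict]:
    """Count flagged examples per issue type and pick the worst-scoring ones."""
    summary = []
    for issue_type, rows in results.items():
        # Keep (index, score) only for examples flagged with this issue.
        flagged = [(idx, score) for idx, (is_issue, score) in enumerate(rows) if is_issue]
        worst = sorted(flagged, key=lambda pair: pair[1])[:num_examples]
        summary.append(
            {
                "issue_type": issue_type,
                "num_issues": len(flagged),
                "top_examples": [idx for idx, _ in worst],
            }
        )
    # Issue types with the most flagged examples come first.
    summary.sort(key=lambda row: row["num_issues"], reverse=True)
    return summary


results = {
    "label": [(True, 0.05), (False, 0.9), (True, 0.2), (False, 0.8)],
    "outlier": [(False, 0.7), (False, 0.6), (True, 0.1), (False, 0.9)],
}
for row in summarize(results, num_examples=2):
    print(f"{row['issue_type']}: {row['num_issues']} flagged, worst examples {row['top_examples']}")
```

The `num_examples` argument plays the same role here as in report(): it caps how many of the most problematic examples are listed per issue type.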
If no issue checks have produced results (for example, find_issues() has not been called, or was called with no issue types), the method prints a message saying that no issue types were specified and returns early.
The method uses a factory pattern (report_factory) to select the appropriate reporter implementation based on whether an image lab is present, allowing image-specific issue types to be reported alongside standard ones.
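A rough sketch of this control flow is shown below. Only the name report_factory comes from the description above. The Reporter and ImageReporter classes, their constructor arguments, the message text, and the `default_verbosity` parameter are hypothetical stand-ins, not cleanlab's actual implementation.

```python
from typing import Optional, Type


class Reporter:
    """Stand-in for a standard reporter (not cleanlab's actual class)."""

    def __init__(self, issue_summary: dict, verbosity: int) -> None:
        self.issue_summary = issue_summary
        self.verbosity = verbosity

    def report(self, num_examples: int) -> None:
        for issue_type, count in self.issue_summary.items():
            print(f"{issue_type}: {count} flagged (showing up to {num_examples})")


class ImageReporter(Reporter):
    """Stand-in reporter that appends an image-specific section."""

    def report(self, num_examples: int) -> None:
        super().report(num_examples)
        print("(image-specific issue types would be listed here)")


def report_factory(imagelab: Optional[object]) -> Type[Reporter]:
    """Select the reporter class based on whether an image lab is present."""
    return ImageReporter if imagelab is not None else Reporter


def report(
    issue_summary: dict,
    imagelab: Optional[object] = None,
    num_examples: int = 5,
    verbosity: Optional[int] = None,
    default_verbosity: int = 1,
) -> None:
    # Fall back to the instance-level verbosity when none is passed.
    if verbosity is None:
        verbosity = default_verbosity
    # Return early when no checks have produced results.
    if not issue_summary:
        print("No issue types were specified, so there is nothing to report.")
        return
    reporter_cls = report_factory(imagelab)
    reporter_cls(issue_summary, verbosity).report(num_examples)


report({"label": 12, "outlier": 3})   # standard reporter
report({"label": 12}, imagelab=object())  # image-aware reporter
report({})                             # early return
```

Returning a class from the factory, rather than an instance, keeps the selection logic separate from the constructor arguments, so both reporters can be built with the same call.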
Usage
Call report() on a Datalab instance after find_issues() has been run. Adjust parameters to control the level of detail in the output.
Code Reference
Source Location
- Repository: cleanlab/cleanlab
- File: cleanlab/datalab/datalab.py
- Lines: 355--363
Signature
def report(
    self,
    *,
    num_examples: int = 5,
    verbosity: Optional[int] = None,
    include_description: bool = True,
    show_summary_score: bool = False,
    show_all_issues: bool = False,
) -> None
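The bare `*` in the signature makes every parameter keyword-only. The sketch below uses a stand-in function with the same parameter list (with `self` omitted and a placeholder body) to show what that means for callers. It is not the cleanlab method itself.

```python
from typing import Optional


def report(
    *,
    num_examples: int = 5,
    verbosity: Optional[int] = None,
    include_description: bool = True,
    show_summary_score: bool = False,
    show_all_issues: bool = False,
) -> None:
    """Stand-in with the same keyword-only parameter list (self omitted)."""
    print(f"num_examples={num_examples}, verbosity={verbosity}")


report(num_examples=10)  # accepted: argument passed by keyword

try:
    report(10)  # rejected: positional arguments are not allowed after `*`
except TypeError as exc:
    print(f"TypeError: {exc}")
```

In practice this means a call such as lab.report(10) fails, while lab.report(num_examples=10) works.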
Import
from cleanlab import Datalab
# report is a method of the Datalab instance
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| num_examples | int | No (default: 5) | Number of the most problematic examples to show for each issue type. |
| verbosity | Optional[int] | No (default: instance verbosity) | Controls the amount of detail in the report. Valid values are 0 through 4. If None, uses the verbosity level set during Datalab initialization. |
| include_description | bool | No (default: True) | Whether to include a plain-language description of each issue type. Set to False once familiar with the issue types. |
| show_summary_score | bool | No (default: False) | Whether to display the overall severity score per issue type. These scores are not comparable across different issue types. |
| show_all_issues | bool | No (default: False) | Whether to show all issue types that were checked, including those with zero detected issues. By default, only issue types with at least one flagged example are shown. |
Outputs
| Name | Type | Description |
|---|---|---|
| return | None | The method prints the report to stdout. No value is returned. |
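Because the report goes to stdout and nothing is returned, one way to keep the text (for logging or saving to a file) is to redirect stdout while the call runs. The sketch below assumes the report is written through Python's standard `print`/`sys.stdout`. The `capture_report` helper and `fake_report` stand-in are hypothetical names used so the example runs without cleanlab.

```python
import contextlib
import io
from typing import Callable


def capture_report(report_fn: Callable[..., None], **kwargs) -> str:
    """Run a printing function and return what it wrote to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        report_fn(**kwargs)
    return buffer.getvalue()


def fake_report(*, num_examples: int = 5) -> None:
    """Hypothetical stand-in for lab.report so the sketch runs on its own."""
    print(f"Showing top {num_examples} examples per issue type")


text = capture_report(fake_report, num_examples=2)
print(repr(text))
```

With a real Datalab instance, the same helper would be called as capture_report(lab.report, num_examples=2).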
Usage Examples
Basic Report
from cleanlab import Datalab
# Initialize Datalab, then run the issue checks
# (my_data, pred_probs, and features are assumed to be defined already)
lab = Datalab(data=my_data, label_name="label")
lab.find_issues(pred_probs=pred_probs, features=features)
# Print the default report
lab.report()
Customized Report
# Show more examples, include severity scores, show all issue types
lab.report(
    num_examples=10,
    verbosity=3,
    include_description=False,
    show_summary_score=True,
    show_all_issues=True,
)
Minimal Report
# Minimal output: fewer examples, low verbosity
lab.report(num_examples=2, verbosity=0)