Implementation:Cleanlab Cleanlab Datalab Get Issues

From Leeroopedia


| Field | Value |
|---|---|
| Sources | Cleanlab |
| Domains | Data_Quality, Dataset_Auditing, Data_Access |
| Last Updated | 2026-02-09 12:00 GMT |

Overview

Datalab.get_issues is the method that provides programmatic access to per-example issue detection results as a structured pandas DataFrame.

Description

The Datalab.get_issues method retrieves the raw per-example issue detection results from the internal DataIssues container. When called without an issue_name, it returns the full DataFrame containing columns for every issue type that was checked. When called with a specific issue_name, it returns only the columns relevant to that issue type.

Before delegating to the internal DataIssues.get_issues(), the method validates that the requested issue_name is among the list of possible issue types. If an invalid issue name is provided, a ValueError is raised with a message listing the valid options.
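The validate-then-delegate flow described above can be pictured with a small standalone sketch. It is not cleanlab's source code: POSSIBLE_ISSUE_TYPES, get_issues_sketch, the columns_by_issue mapping, and the error message wording are all hypothetical names introduced here, and the real method returns a DataFrame rather than a list of column names.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the issue names Datalab considers valid;
# the real list depends on the configured task.
POSSIBLE_ISSUE_TYPES = ["label", "outlier", "near_duplicate"]


def get_issues_sketch(
    columns_by_issue: Dict[str, List[str]], issue_name: Optional[str] = None
) -> List[str]:
    """Return the column names that would be selected for issue_name."""
    # Validate first, so a typo fails with a message listing the valid options.
    if issue_name is not None and issue_name not in POSSIBLE_ISSUE_TYPES:
        raise ValueError(
            f"Invalid issue_name: {issue_name!r}. "
            f"Valid options: {POSSIBLE_ISSUE_TYPES}"
        )
    if issue_name is None:
        # No filter: every checked issue type contributes its columns.
        return [col for cols in columns_by_issue.values() for col in cols]
    # Assumes the requested issue type was among those that were checked.
    return columns_by_issue[issue_name]
```

Validating before delegating means a misspelled name is reported against the list of valid options instead of surfacing as a lookup failure deeper in the container.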

Usage

Call this method on a Datalab instance after find_issues() has completed. Use the returned DataFrame to filter, sort, and export issue results programmatically.
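As a rough illustration of the filter, sort, and export steps, the sketch below works on a hand-built DataFrame that stands in for the result of lab.get_issues("label"). The values and the example_index column label are made up for illustration; only standard pandas calls are used.

```python
import pandas as pd

# Hand-built stand-in for the DataFrame returned by lab.get_issues("label");
# the values are invented for illustration.
label_issues = pd.DataFrame(
    {
        "is_label_issue": [False, True, True, False],
        "label_score": [0.92, 0.08, 0.31, 0.77],
    }
)

# Filter: the boolean flag column works directly as a row mask.
flagged = label_issues[label_issues["is_label_issue"]]

# Sort: lowest score first, since lower scores indicate more severe issues.
worst_first = flagged.sort_values("label_score")

# Export: with no path given, to_csv returns the CSV text as a string.
# The row index is kept so each row can be traced back to its example.
csv_text = worst_first.to_csv(index_label="example_index")
print(csv_text)
```

Passing a file path to to_csv would write the same content to disk instead of returning it.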

Code Reference

Source Location

Repository
cleanlab/cleanlab
File
cleanlab/datalab/datalab.py
Lines
482

Signature

def get_issues(self, issue_name: Optional[str] = None) -> pd.DataFrame

Import

from cleanlab import Datalab
# get_issues is a method of the Datalab instance

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| issue_name | Optional[str] | No (default: None) | The type of issue to retrieve. Valid values include "label", "outlier", "near_duplicate", "non_iid", "null", "class_imbalance", "underperforming_group", "data_valuation", and others depending on the configured task. If None, returns the full DataFrame with all issue types. |

Outputs

| Name | Type | Description |
|---|---|---|
| return | pd.DataFrame | A DataFrame where each row corresponds to a dataset example. Columns include is_{issue}_issue (bool) indicating whether the example is flagged, and {issue}_score (float) indicating the severity (lower is worse, range 0-1). Additional columns may be present depending on the issue type. Scores are comparable across examples for the same issue type but not across different issue types. |
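The column naming convention makes it straightforward to summarize the full DataFrame with plain pandas. The sketch below uses a hand-built frame as a stand-in for lab.get_issues() with two checked issue types; the values are invented, and it assumes every flag column follows the is_{issue}_issue pattern.

```python
import pandas as pd

# Hand-built stand-in for lab.get_issues() with two checked issue types;
# the values are invented for illustration.
all_issues = pd.DataFrame(
    {
        "is_label_issue": [True, False, True],
        "label_score": [0.10, 0.95, 0.22],
        "is_outlier_issue": [False, False, True],
        "outlier_score": [0.81, 0.90, 0.05],
    }
)

# Flag columns follow the is_{issue}_issue naming pattern.
flag_columns = [
    col
    for col in all_issues.columns
    if col.startswith("is_") and col.endswith("_issue")
]

# Summing a boolean column counts its True values, giving flags per issue type.
flag_counts = all_issues[flag_columns].sum()

# Within one issue type, the lowest score marks the most severe example.
worst_outlier = all_issues["outlier_score"].idxmin()
```

Flag counts can be compared across issue types, but the scores themselves should only be ranked within a single issue type.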

Exceptions

| Exception | Condition |
|---|---|
| ValueError | Raised if issue_name is not None and is not in the list of possible issue types. |

Usage Examples

Get All Issues

from cleanlab import Datalab

# my_data, pred_probs, and features are assumed to be prepared beforehand
lab = Datalab(data=my_data, label_name="label")
lab.find_issues(pred_probs=pred_probs, features=features)

# Get the full issues DataFrame
all_issues = lab.get_issues()
print(all_issues.columns.tolist())
# ['is_label_issue', 'label_score', 'is_outlier_issue', 'outlier_score', ...]

Get Issues for a Specific Type

# Get only label issues
label_issues = lab.get_issues("label")
print(label_issues.columns.tolist())
# ['is_label_issue', 'label_score', ...]

# Find the most problematic examples
worst_labels = label_issues.sort_values("label_score").head(10)
print(worst_labels)

Filter Flagged Examples

# Get all examples flagged as having label issues
label_issues = lab.get_issues("label")
flagged = label_issues[label_issues["is_label_issue"]]
print(f"Found {len(flagged)} examples with label issues")
