
Implementation:Cleanlab Cleanlab TC Display Issues

From Leeroopedia


API: token_classification.summary.display_issues and token_classification.summary.common_label_issues
Source: cleanlab/token_classification/summary.py:L13-22, L139-149
Domains: Machine_Learning, Data_Quality, NLP
Last Updated: 2026-02-09

Overview

Implementation of token-level issue visualization and error pattern summarization for token classification tasks. Provides two functions: display_issues for rendering highlighted sentences and common_label_issues for aggregating error patterns across the dataset.

Description

This module provides two complementary functions for reviewing token classification label issues:

display_issues: Prints sentences with problematic tokens highlighted in color. For each flagged token, it optionally shows the given label and the model's predicted label. The function displays the top N most problematic sentences and supports excluding specific (sentence, token) pairs from display.

common_label_issues: Aggregates all detected label issues into a frequency table showing how often each type of label error occurs (e.g., "B-PER mislabeled as O" appearing 47 times). Returns a DataFrame sorted by frequency for systematic pattern analysis.
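The aggregation behind common_label_issues can be pictured as a frequency count over (given label, predicted label) pairs at the flagged positions. The following is a simplified sketch of that idea, not cleanlab's actual implementation:

```python
from collections import Counter

# Toy (given_label, predicted_label) pairs recovered at flagged token
# positions, e.g. from `labels` and the argmax of `pred_probs`.
flagged_pairs = [("B-PER", "O"), ("B-PER", "O"), ("O", "I-PER"), ("B-PER", "O")]

# Count how often each (given, predicted) error pattern occurs.
pattern_counts = Counter(flagged_pairs)

# Most common pattern first, mirroring the frequency-sorted DataFrame.
for (given, predicted), count in pattern_counts.most_common():
    print(f"{given} mislabeled as {predicted}: {count}x")
```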

Usage

These functions are used after detecting token-level label issues with find_label_issues (from cleanlab.token_classification.filter). They are typically called in a Jupyter notebook environment for interactive review.

Code Reference

Source Location

cleanlab/token_classification/summary.py, lines 13-22 (display_issues) and lines 139-149 (common_label_issues).

Signature

def display_issues(
    issues: list,
    tokens: List[List[str]],
    *,
    labels: Optional[list] = None,
    pred_probs: Optional[list] = None,
    exclude: List[Tuple[int, int]] = [],
    class_names: Optional[List[str]] = None,
    top: int = 20,
) -> None

def common_label_issues(
    issues: List[Tuple[int, int]],
    tokens: List[List[str]],
    *,
    labels: Optional[list] = None,
    pred_probs: Optional[list] = None,
    class_names: Optional[List[str]] = None,
    top: int = 10,
    exclude: List[Tuple[int, int]] = [],
    verbose: bool = True,
) -> pd.DataFrame

Import

from cleanlab.token_classification.summary import display_issues, common_label_issues

I/O Contract

display_issues Inputs

| Parameter | Type | Description |
|-----------|------|-------------|
| issues | list | List of (sentence_index, token_index) tuples identifying tokens with label issues, as returned by find_label_issues. |
| tokens | List[List[str]] | List of N lists, each containing the string tokens for the corresponding sentence. |
| labels | Optional[list] | List of N lists of integer class labels. When provided, the given label is shown for each flagged token. |
| pred_probs | Optional[list] | List of N numpy arrays of shape (T_i, K). When provided, the predicted label is shown for each flagged token. |
| exclude | List[Tuple[int, int]] | List of (sentence_index, token_index) tuples to exclude from display. Defaults to an empty list. |
| class_names | Optional[List[str]] | List of human-readable class names. When provided, class names are shown instead of integer indices. |
| top | int | Maximum number of sentences to display. Defaults to 20. |
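When pred_probs is provided, the predicted label shown for a flagged token corresponds to the highest-probability class at that position. A minimal sketch of that mapping (illustrative values, not cleanlab code):

```python
import numpy as np

class_names = ["O", "B-PER", "I-PER"]

# One sentence's per-token predicted probabilities, shape (T, K).
pred_probs_sentence = np.array([
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
])

# The predicted class for a flagged token is the argmax over classes.
token_index = 1
pred_idx = int(pred_probs_sentence[token_index].argmax())
predicted_label = class_names[pred_idx]
```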

display_issues Output

| Type | Description |
|------|-------------|
| None | Prints highlighted sentences to standard output; does not return a value. |
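Since the function prints rather than returns, its effect is terminal output with flagged tokens visually marked. A rough, hypothetical picture of that kind of highlighting (not cleanlab's actual rendering code) might look like:

```python
# Hypothetical sketch of marking flagged tokens with ANSI color codes;
# cleanlab's actual formatting may differ.
RED = "\033[91m"
RESET = "\033[0m"

def highlight_sentence(tokens, flagged_indices):
    """Return the sentence with flagged tokens wrapped in red ANSI codes."""
    parts = []
    for j, tok in enumerate(tokens):
        parts.append(f"{RED}{tok}{RESET}" if j in flagged_indices else tok)
    return " ".join(parts)

line = highlight_sentence(["John", "lives", "in", "Paris"], {0})
print(line)  # "John" renders in red in a terminal
```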

common_label_issues Inputs

| Parameter | Type | Description |
|-----------|------|-------------|
| issues | List[Tuple[int, int]] | List of (sentence_index, token_index) tuples identifying tokens with label issues. |
| tokens | List[List[str]] | List of N lists, each containing the string tokens for the corresponding sentence. |
| labels | Optional[list] | List of N lists of integer class labels. |
| pred_probs | Optional[list] | List of N numpy arrays of shape (T_i, K). |
| class_names | Optional[List[str]] | List of human-readable class names. |
| top | int | Maximum number of common issue patterns to return. Defaults to 10. |
| exclude | List[Tuple[int, int]] | List of (sentence_index, token_index) tuples to exclude from analysis. |
| verbose | bool | If True, prints the results in addition to returning them. Defaults to True. |

common_label_issues Output

| Type | Description |
|------|-------------|
| pd.DataFrame | DataFrame summarizing the most common label error patterns. Columns include the given label, predicted label, token examples, and frequency count. Sorted by frequency (most common first). |
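Because the result is a plain DataFrame, it can be filtered or re-sorted with standard pandas operations. The column names below are illustrative stand-ins (check the actual DataFrame's columns in your cleanlab version):

```python
import pandas as pd

# Stand-in for the returned summary; column names here are assumptions,
# not cleanlab's guaranteed schema.
df = pd.DataFrame(
    {
        "given_label": ["B-PER", "O", "I-PER"],
        "predicted_label": ["O", "B-PER", "O"],
        "num_label_issues": [47, 12, 3],
    }
)

# Keep only patterns that occur at least 10 times, most frequent first.
frequent = df[df["num_label_issues"] >= 10].sort_values(
    "num_label_issues", ascending=False
)
print(frequent)
```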

Usage Examples

import numpy as np
from cleanlab.token_classification.filter import find_label_issues
from cleanlab.token_classification.summary import display_issues, common_label_issues

# Labels and predictions for a NER dataset
labels = [
    [0, 1, 2, 0],
    [0, 0, 1, 0, 0],
    [1, 2, 0],
]

pred_probs = [
    np.array([
        [0.9, 0.05, 0.05],
        [0.1, 0.8, 0.1],
        [0.1, 0.1, 0.8],
        [0.85, 0.1, 0.05],
    ]),
    np.array([
        [0.95, 0.03, 0.02],
        [0.88, 0.07, 0.05],
        [0.3, 0.4, 0.3],
        [0.9, 0.05, 0.05],
        [0.92, 0.04, 0.04],
    ]),
    np.array([
        [0.15, 0.75, 0.1],
        [0.1, 0.2, 0.7],
        [0.8, 0.1, 0.1],
    ]),
]

tokens = [
    ["John", "lives", "in", "Paris"],
    ["The", "weather", "is", "nice", "today"],
    ["Alice", "Smith", "left"],
]

class_names = ["O", "B-PER", "I-PER"]

# Find issues
issues = find_label_issues(labels, pred_probs)

# Display highlighted sentences with flagged tokens
display_issues(
    issues,
    tokens,
    labels=labels,
    pred_probs=pred_probs,
    class_names=class_names,
    top=10,
)

# Get summary of common error patterns
common_issues_df = common_label_issues(
    issues,
    tokens,
    labels=labels,
    pred_probs=pred_probs,
    class_names=class_names,
    top=5,
)
# Returns DataFrame with columns like: given_label, predicted_label, count
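On later review passes, (sentence_index, token_index) pairs a human has already verified can be passed via the exclude parameter so they are skipped. Conceptually, the filtering it performs amounts to:

```python
# Sketch: suppress (sentence_index, token_index) pairs already reviewed.
issues = [(0, 1), (1, 2), (2, 0)]
reviewed = [(0, 1)]  # pairs confirmed correct on a previous pass

# Conceptually equivalent filtering to the `exclude` parameter:
remaining = [pair for pair in issues if pair not in reviewed]
```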
