Implementation:Cleanlab Cleanlab Estimate Latent

From Leeroopedia


Knowledge Sources
Domains Machine_Learning, Data_Quality
Last Updated 2026-02-09 19:00 GMT

Overview

Concrete tool, provided by the Cleanlab library, for deriving the latent noise transition matrices and the true label prior from a confident joint.

Description

This function takes a confident joint matrix and an array of noisy labels and returns three estimated quantities: the latent true label prior py, the noise matrix P(given_label | true_label), and the inverse noise matrix P(true_label | given_label). It supports four methods for estimating the true prior (py_method) and an optional mode (converge_latent_estimates) that iteratively adjusts the three estimates until they are numerically consistent with one another. The function normalizes the columns of the confident joint to obtain the noise matrix and the rows of the confident joint to obtain the inverse noise matrix, so that each is a valid conditional probability distribution.
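The normalization described above can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not the library's code: the confident joint values are made up, the variable names are hypothetical, and the library may clip values or store the inverse noise matrix in a different orientation.

```python
import numpy as np

# Hypothetical 3-class confident joint (made-up counts):
# rows = given (noisy) label, columns = true label.
confident_joint = np.array([
    [40, 5, 5],
    [10, 30, 0],
    [0, 5, 25],
], dtype=float)

# Column-normalize: column j becomes P(given_label | true_label = j).
noise_matrix_sketch = confident_joint / confident_joint.sum(axis=0, keepdims=True)

# Row-normalize: row i becomes P(true_label | given_label = i).
row_normalized = confident_joint / confident_joint.sum(axis=1, keepdims=True)

print(noise_matrix_sketch.sum(axis=0))  # every column sums to 1
print(row_normalized.sum(axis=1))       # every row sums to 1
```

A class with no confidently counted examples would produce a zero column or row sum, so a real implementation needs to guard against division by zero.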

Usage

Import and use this function after computing the confident joint (via compute_confident_joint) when you need to understand the full noise transition structure of your dataset. The noise matrix is useful for understanding systematic annotation errors, the inverse noise matrix is useful for correcting predictions at inference time, and the true prior is useful for understanding class imbalance after correcting for noise.
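As one illustration of reading the noise matrix for systematic annotation errors, the sketch below ranks its off-diagonal entries. The helper top_confusions and the matrix values are hypothetical and are not part of Cleanlab; the matrix is assumed to be indexed as [given_label, true_label] with columns summing to 1.

```python
import numpy as np

def top_confusions(noise_matrix, k=3):
    """Return the k largest off-diagonal entries as (given, true, prob) tuples."""
    m = np.array(noise_matrix, dtype=float)  # np.array copies, so the input is untouched
    np.fill_diagonal(m, -np.inf)             # ignore the correctly labeled entries
    flat_order = np.argsort(m, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(flat_order, m.shape)
    return [(int(i), int(j), float(m[i, j])) for i, j in zip(rows, cols)]

# Made-up noise matrix: entry [i, j] ~ P(given_label=i | true_label=j).
noise_matrix = np.array([
    [0.80, 0.15, 0.10],
    [0.20, 0.75, 0.00],
    [0.00, 0.10, 0.90],
])

confusions = top_confusions(noise_matrix, k=2)
for given, true, prob in confusions:
    print(f"true class {true} labeled as class {given}: {prob:.2f}")
```

Large off-diagonal entries point at class pairs that annotators tend to confuse, which is usually where a label review is most worthwhile.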

Code Reference

Source Location

  • Repository: cleanlab
  • File: cleanlab/count.py
  • Lines: 715-796

Signature

def estimate_latent(
    confident_joint,
    labels,
    *,
    py_method="cnt",
    converge_latent_estimates=False,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]

Import

from cleanlab.count import estimate_latent

I/O Contract

Inputs

Name Type Required Description
confident_joint np.ndarray Yes The confident joint matrix of shape (K, K) as computed by compute_confident_joint. Entry (i, j) estimates the count of examples with given label i and true label j.
labels np.ndarray Yes Array of noisy class labels of shape (N,) with integer values in range 0..K-1. Used to compute the empirical label distribution.
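py_method str No Method for estimating the true label prior. One of "cnt" (default, derived from the counts in the confident joint), "eqn" (solves the linear relation between the noise matrix, py, and the noisy label distribution), "marginal" (normalized marginal of the confident joint over true labels), or "marginal_ps" (inverse noise matrix applied to the noisy label distribution ps).
converge_latent_estimates bool No If True, iteratively adjust py, the noise matrix, and the inverse noise matrix until they are numerically consistent with one another. Defaults to False.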
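To make the py_method options more concrete, the sketch below shows one plausible way each estimator could be computed from a confident joint. The formulas are an illustrative reading of the option names, not code taken from Cleanlab, and every name here (cj, ps, noise, inv, normalize) is hypothetical. The noisy label distribution ps is taken from the row totals of the made-up confident joint, which is an assumption that makes all four estimates coincide; with real data, ps comes from the labels array and the estimates generally differ.

```python
import numpy as np

# Made-up confident joint: rows = given label, columns = true label.
cj = np.array([[40, 5, 5], [10, 30, 0], [0, 5, 25]], dtype=float)

# Assumption: noisy label distribution taken from the row totals of cj.
ps = cj.sum(axis=1) / cj.sum()

noise = cj / cj.sum(axis=0)   # noise[i, j] ~ P(given=i | true=j), columns sum to 1
inv = cj.T / cj.sum(axis=1)   # inv[j, i]   ~ P(true=j | given=i), columns sum to 1

def normalize(v):
    """Rescale a non-negative vector so that it sums to 1."""
    return v / v.sum()

py_estimates = {
    # Ratio of diagonals, scaled by the noisy label distribution.
    "cnt": normalize(inv.diagonal() / noise.diagonal() * ps),
    # Solve noise @ py = ps for py.
    "eqn": normalize(np.linalg.solve(noise, ps)),
    # Marginal of the confident joint over true labels.
    "marginal": normalize(cj.sum(axis=0)),
    # Inverse noise matrix applied to the noisy label distribution.
    "marginal_ps": normalize(inv @ ps),
}

for name, py in py_estimates.items():
    print(name, np.round(py, 4))
```

The "eqn" variant requires an invertible noise matrix and can behave poorly when the matrix is close to singular, which is one reason a count-based default is a sensible choice.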

Outputs

Name Type Description
py np.ndarray Array of shape (K,) representing the estimated latent prior distribution of true labels. Sums to 1.
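noise_matrix np.ndarray Array of shape (K, K) whose entry (i, j) estimates P(given_label=i | true_label=j). Each column sums to 1.
inv_noise_matrix np.ndarray Array of shape (K, K) estimating P(true_label=j | given_label=i). For each given label i, the probabilities over the true labels j sum to 1.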

Usage Examples

Basic Usage

import numpy as np
from cleanlab.count import compute_confident_joint, estimate_latent

labels = np.array([0, 0, 1, 1, 2, 2, 0, 1, 2, 1])
pred_probs = np.array([
    [0.9, 0.05, 0.05],
    [0.3, 0.6, 0.1],
    [0.1, 0.8, 0.1],
    [0.05, 0.15, 0.8],
    [0.1, 0.1, 0.8],
    [0.05, 0.05, 0.9],
    [0.85, 0.1, 0.05],
    [0.1, 0.7, 0.2],
    [0.0, 0.2, 0.8],
    [0.15, 0.75, 0.1],
])

# Step 1: Compute the confident joint
cj = compute_confident_joint(labels, pred_probs)

# Step 2: Estimate latent noise matrices
py, noise_matrix, inv_noise_matrix = estimate_latent(cj, labels)

print("True label prior:", py)
print("Noise matrix (P(given|true)):\n", noise_matrix)
print("Inverse noise matrix (P(true|given)):\n", inv_noise_matrix)

With Convergence

# Reuses labels and pred_probs from the Basic Usage example above.
from cleanlab.count import compute_confident_joint, estimate_latent

cj = compute_confident_joint(labels, pred_probs)

py, noise_matrix, inv_noise_matrix = estimate_latent(
    cj, labels,
    py_method="marginal",
    converge_latent_estimates=True,
)
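One way to picture what a consistency pass over the latent estimates might do is to alternate Bayes-rule updates between the two matrices. The sketch below is an assumption-laden illustration and not Cleanlab's internal routine: the function make_consistent, its iteration count, and the update order are hypothetical, and the orientation is assumed to be noise[i, j] = P(given=i | true=j) and inv[j, i] = P(true=j | given=i).

```python
import numpy as np

def make_consistent(ps, py, inv_noise_matrix, n_iter=5):
    """Alternate Bayes-rule updates so the two matrices agree with ps and py."""
    inv = np.asarray(inv_noise_matrix, dtype=float)
    for _ in range(n_iter):
        joint = (inv * ps).T               # joint[i, j] = P(given=i, true=j)
        noise = joint / joint.sum(axis=0)  # condition on the true label
        joint = noise * py                 # rebuild the joint from the prior py
        inv = joint.T / joint.sum(axis=1)  # condition on the given label
    return noise, inv

# Made-up, already-consistent inputs derived from one confident joint.
cj = np.array([[40, 5, 5], [10, 30, 0], [0, 5, 25]], dtype=float)
ps = cj.sum(axis=1) / cj.sum()  # noisy label distribution
py = cj.sum(axis=0) / cj.sum()  # true label prior
inv0 = cj.T / cj.sum(axis=1)

noise_out, inv_out = make_consistent(ps, py, inv0)
print(np.round(noise_out, 3))
print(np.round(inv_out, 3))
```

Because these inputs already satisfy Bayes' rule, the loop leaves them unchanged. With independently estimated inputs, each pass nudges the matrices toward agreement.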

Related Pages

Implements Principle
