
Principle: Snorkel Generative Label Model Training

From Leeroopedia
Knowledge Sources
Domains Weak_Supervision, Graphical_Models, Matrix_Completion
Last Updated 2026-02-14 20:00 GMT

Overview

An algorithm that learns the accuracy parameters of noisy labeling functions from their agreement and disagreement patterns, without access to ground truth labels.

Description

Generative Label Model Training is the core algorithmic step in the data programming paradigm. Given a label matrix produced by multiple noisy labeling functions, the label model learns the conditional probability of each LF's output given the true (unobserved) label: P(λ_j | Y).

The key insight is that the agreement and disagreement patterns among labeling functions provide sufficient statistics to estimate their individual accuracies. This is possible because the LFs are assumed to be conditionally independent given the true label Y (or have a known dependency structure).
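As a self-contained illustration of this insight, the simulation below uses the triplet identity for three conditionally independent binary LFs: with votes in {-1, +1} and balanced classes, the pairwise agreement moment satisfies E[λ_i λ_j] = (2a_i − 1)(2a_j − 1), so each accuracy a_j is recoverable from agreement rates alone. This is a moment-based sketch, not Snorkel's actual estimator, and the accuracy values are assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
true_acc = np.array([0.85, 0.75, 0.65])  # assumed LF accuracies for the demo

# Balanced labels in {-1, +1}; each LF votes with Y w.p. its accuracy.
Y = rng.choice([-1, 1], size=n)
correct = rng.random((n, 3)) < true_acc
votes = np.where(correct, Y[:, None], -Y[:, None])

# Under conditional independence, E[l_i * l_j] = (2a_i - 1)(2a_j - 1):
# pairwise agreement moments factor over the unobserved accuracies.
M = (votes.T @ votes) / n

# Triplet identity: (2a_j - 1)^2 = M[j, k] * M[j, l] / M[k, l]
others = {0: (1, 2), 1: (0, 2), 2: (0, 1)}
est_acc = np.array([
    0.5 * (1 + np.sqrt(M[j, k] * M[j, l] / M[k, l]))
    for j, (k, l) in others.items()
])
print(np.round(est_acc, 2))  # close to true_acc, with no ground truth used
```

No ground-truth label is consulted after the simulation step: the estimates come entirely from the observed agreement matrix M.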

The Snorkel label model uses a matrix completion approach over the junction tree of the LF dependency graph. It computes the inverse generalized covariance matrix and performs optimization to recover the conditional LF probability parameters.

Training involves:

  • Computing the augmented label matrix (one-hot encoded votes)
  • Building a clique tree for the dependency structure
  • Optimizing a noise-aware loss function using SGD/Adam
  • Optionally aligning label classes using the Munkres algorithm

Usage

Use this principle after applying labeling functions and analyzing their quality. Train the label model when you have a sufficient set of labeling functions (typically 3+ with reasonable coverage) and want to combine their noisy votes into high-quality probabilistic labels.

Theoretical Basis

The generative model defines the joint distribution:

P(L, Y) = P(Y) ∏_{j=1}^{m} P(λ_j | Y)

under the conditional independence assumption. The label model parameters μ encode:

μ_{j,l,y} = P(λ_j = l | Y = y)

Training minimizes the negative log marginal likelihood of the observed label matrix:

μ̂ = argmin_μ − ∑_{i=1}^{n} log P(L_i ; μ)

with optional L2 regularization and LF precision priors.
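Written out under the conditional independence assumption, the regularized objective can be sketched as follows; here λ is the L2 strength and μ₀ an optional precision prior, both notational assumptions of this sketch:

```latex
\hat{\mu} = \arg\min_{\mu} \; -\sum_{i=1}^{n} \log \sum_{y} P(Y = y) \prod_{j=1}^{m} P(\lambda_j = L_{ij} \mid Y = y) \; + \; \lambda \lVert \mu - \mu_0 \rVert_2^2
```

The inner sum over y marginalizes out the unobserved true label, which is what makes the loss computable from the label matrix alone.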

Pseudo-code:

# Abstract label model training
L_aug = augment_label_matrix(L_train)   # one-hot encode votes once, up front
mu = initialize_parameters(n_lfs, cardinality, prec_init=0.7)
for epoch in range(n_epochs):
    loss = compute_loss(L_aug, mu) + l2 * regularization(mu)  # noise-aware loss + L2
    mu = optimizer_step(mu, loss)
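The abstract loop above can be made concrete with a self-contained sketch. Snorkel's actual implementation optimizes a matrix-completion objective with SGD; as an illustration of fitting the same factorized likelihood, the hypothetical `fit_label_model_em` below uses EM instead, with the `prec_init=0.7` initialization mirroring the pseudocode. All function and variable names here are inventions of this sketch.

```python
import numpy as np

ABSTAIN = -1

def fit_label_model_em(L, cardinality=2, n_iter=100, prec_init=0.7):
    """EM sketch for the conditionally independent generative label model.

    L: (n, m) integer votes in {ABSTAIN, 0, ..., cardinality - 1}.
    Returns the class prior p_y and mu[j, l, y] = P(lambda_j = l | Y = y).
    """
    n, m = L.shape
    p_y = np.full(cardinality, 1.0 / cardinality)
    # Initialize each LF near prec_init accuracy to break label symmetry.
    mu = np.full((m, cardinality, cardinality),
                 (1.0 - prec_init) / (cardinality - 1))
    for y in range(cardinality):
        mu[:, y, y] = prec_init
    for _ in range(n_iter):
        # E-step: posterior over Y for each point, skipping abstains.
        log_q = np.tile(np.log(p_y), (n, 1))
        for j in range(m):
            voted = L[:, j] != ABSTAIN
            log_q[voted] += np.log(mu[j, L[voted, j], :])
        q = np.exp(log_q - log_q.max(axis=1, keepdims=True))
        q /= q.sum(axis=1, keepdims=True)
        # M-step: re-estimate prior and per-LF vote distributions.
        p_y = q.mean(axis=0)
        for j in range(m):
            voted = L[:, j] != ABSTAIN
            denom = q[voted].sum(axis=0) + 1e-6 * cardinality
            for l in range(cardinality):
                mu[j, l] = (q[voted & (L[:, j] == l)].sum(axis=0) + 1e-6) / denom
    return p_y, mu

# Synthetic check: three LFs with known accuracies on balanced binary labels.
rng = np.random.default_rng(0)
n, true_acc = 5000, [0.9, 0.8, 0.7]
Y = rng.integers(0, 2, size=n)
L = np.stack([np.where(rng.random(n) < a, Y, 1 - Y) for a in true_acc], axis=1)
p_y, mu = fit_label_model_em(L)
print(np.round([mu[j, 0, 0] for j in range(3)], 2))  # near [0.9, 0.8, 0.7]
```

The recovered diagonal entries mu[j, y, y] approximate each LF's accuracy, again without any ground-truth labels; the EM posterior q plays the role of the probabilistic training labels.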

Related Pages

Implemented By

Uses Heuristic
