Implementation:Snorkel team Snorkel LabelingFunction Init

Knowledge Sources	Snorkel Snorkel API Docs
Domains	Weak_Supervision, Data_Programming
Last Updated	2026-02-14 20:00 GMT

Overview

A concrete tool, provided by the Snorkel library, for defining labeling functions: programmatic labelers that encode domain heuristics.

Description

The LabelingFunction class and its companion labeling_function decorator provide the primary interface for creating labeling functions in Snorkel. A labeling function wraps a user-defined Python function that takes a data point and returns an integer label or -1 for abstention.

The class supports:

Preprocessors: A chain of BasePreprocessor objects run before the LF logic
Resources: External data (dictionaries, models) injected via keyword arguments
NLP variant: NLPLabelingFunction adds automatic spaCy processing
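
The interplay of preprocessors and resources can be illustrated with a small stand-in written in plain Python. This is only a sketch of the call flow described above, not Snorkel's actual implementation: SketchLF, add_lowercase_text, and contains_spam_phrase are hypothetical names, and the preprocessors here are plain callables rather than BasePreprocessor objects.

```python
from types import SimpleNamespace
from typing import Any, Callable, List, Mapping, Optional

ABSTAIN = -1
SPAM = 1


class SketchLF:
    """Hypothetical stand-in for a labeling function (not Snorkel's code)."""

    def __init__(
        self,
        name: str,
        f: Callable[..., int],
        resources: Optional[Mapping[str, Any]] = None,
        pre: Optional[List[Callable[[Any], Any]]] = None,
    ) -> None:
        self.name = name
        self._f = f
        self._resources = dict(resources or {})
        self._pre = list(pre or [])

    def __call__(self, x: Any) -> int:
        # Run each preprocessor in order; each returns the enriched data point.
        for preprocessor in self._pre:
            x = preprocessor(x)
        # Inject the resources into the user function as keyword arguments.
        return self._f(x, **self._resources)


def add_lowercase_text(x: Any) -> Any:
    """Hypothetical preprocessor: returns a copy with a `text_lower` field."""
    return SimpleNamespace(**vars(x), text_lower=x.text.lower())


def contains_spam_phrase(x: Any, phrases: frozenset) -> int:
    """Core LF logic: SPAM if any known phrase occurs, otherwise abstain."""
    return SPAM if any(p in x.text_lower for p in phrases) else ABSTAIN


lf_phrases = SketchLF(
    name="lf_phrases",
    f=contains_spam_phrase,
    resources={"phrases": frozenset({"buy now", "free money"})},
    pre=[add_lowercase_text],
)

spam_label = lf_phrases(SimpleNamespace(text="BUY NOW and save!"))
other_label = lf_phrases(SimpleNamespace(text="Meeting at noon"))
```

In this sketch the user function never sees the raw data point: it receives the output of the last preprocessor plus one keyword argument per resource, which keeps the heuristic itself short and easy to test in isolation.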

Usage

Import this class when you need to define labeling functions for a weak supervision pipeline. Use the decorator form for simple LFs and the class form when you need explicit control over naming or resources.

Code Reference

Source Location

Repository: snorkel
File: snorkel/labeling/lf/core.py
Lines: L7-142

Signature

class LabelingFunction:
    def __init__(
        self,
        name: str,
        f: Callable[..., int],
        resources: Optional[Mapping[str, Any]] = None,
        pre: Optional[List[BasePreprocessor]] = None,
    ) -> None:
        """
        Args:
            name: Name of the LF (unique identifier).
            f: Function that implements the core LF logic,
               takes a DataPoint and returns an int label.
            resources: Labeling resources passed to f via kwargs.
            pre: Preprocessors to run on data points before LF execution.
        """

class labeling_function:
    def __init__(
        self,
        name: Optional[str] = None,
        resources: Optional[Mapping[str, Any]] = None,
        pre: Optional[List[BasePreprocessor]] = None,
    ) -> None: ...

    def __call__(self, f: Callable[..., int]) -> LabelingFunction: ...
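
The two signatures above follow a common Python pattern: a class-based decorator whose __init__ captures options and whose __call__ receives the decorated function and returns a wrapper object. A minimal plain-Python sketch of that pattern follows. The names SketchLF and sketch_labeling_function are hypothetical, and falling back to the function's own name when no name is given is an assumption of this sketch rather than something stated in the signature.

```python
from typing import Any, Callable, Mapping, Optional

ABSTAIN = -1
SPAM = 1


class SketchLF:
    """Hypothetical wrapper object produced by the decorator."""

    def __init__(
        self,
        name: str,
        f: Callable[..., int],
        resources: Optional[Mapping[str, Any]] = None,
    ) -> None:
        self.name = name
        self._f = f
        self._resources = dict(resources or {})

    def __call__(self, x: Any) -> int:
        return self._f(x, **self._resources)


class sketch_labeling_function:
    """Hypothetical class-based decorator mirroring the shape shown above."""

    def __init__(
        self,
        name: Optional[str] = None,
        resources: Optional[Mapping[str, Any]] = None,
    ) -> None:
        self.name = name
        self.resources = resources

    def __call__(self, f: Callable[..., int]) -> SketchLF:
        # Assumption: use the function's own name when none was supplied.
        name = self.name or f.__name__
        return SketchLF(name=name, f=f, resources=self.resources)


@sketch_labeling_function()
def lf_has_link(x):
    return SPAM if "http" in x["text"] else ABSTAIN


@sketch_labeling_function(name="lf_blocklist", resources={"blocked": {"junk@test.com"}})
def check_sender(x, blocked):
    return SPAM if x["sender"] in blocked else ABSTAIN
```

After decoration, lf_has_link and check_sender are no longer plain functions but wrapper instances, which is why the decorator must be written with parentheses even when no options are passed.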

Import

from snorkel.labeling import LabelingFunction, labeling_function

I/O Contract

Inputs

Name	Type	Required	Description
name	str	Yes	Unique identifier for the labeling function
f	Callable[..., int]	Yes	Core labeling logic; takes DataPoint, returns int label or -1 (abstain)
resources	Optional[Mapping[str, Any]]	No	External resources (dicts, models) passed as kwargs to f
pre	Optional[List[BasePreprocessor]]	No	Preprocessors to run before LF execution

Outputs

Name	Type	Description
LabelingFunction instance	LabelingFunction	Callable object that labels data points; returns int
__call__ result	int	Label for a data point (class index or -1 for abstain)

Usage Examples

Simple LF with Decorator

from types import SimpleNamespace

from snorkel.labeling import labeling_function

ABSTAIN = -1
SPAM = 1
HAM = 0

@labeling_function()
def lf_keyword_check(x):
    """Label as SPAM if 'buy now' appears in text."""
    return SPAM if "buy now" in x.text.lower() else ABSTAIN

# Apply to a single data point (any object with a `text` attribute)
dp = SimpleNamespace(text="Buy now and save!")
label = lf_keyword_check(dp)  # Returns 1 (SPAM)

LF with Resources

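from snorkel.labeling import LabelingFunction

# Label constants used by the LF logic
ABSTAIN = -1
SPAM = 1

known_spammers = {"spammer@example.com", "junk@test.com"}

def check_sender(x, known_spammers):
    """Label as SPAM if the sender is in the known-spammer set."""
    return SPAM if x.sender in known_spammers else ABSTAIN

# The key in `resources` must match the keyword argument name of `f`
lf_known_sender = LabelingFunction(
    name="lf_known_sender",
    f=check_sender,
    resources={"known_spammers": known_spammers},
)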

NLP LF with spaCy

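from snorkel.labeling.lf.nlp import NLPLabelingFunction

# Label constants used by the LF logic
ABSTAIN = -1
HAM = 0

def has_person_entity(x):
    """Label as HAM if spaCy found a PERSON entity, otherwise abstain."""
    for ent in x.doc.ents:
        if ent.label_ == "PERSON":
            return HAM
    return ABSTAIN

# Requires spaCy and an installed English model;
# the parsed document is attached to the data point as `x.doc`
lf_person = NLPLabelingFunction(
    name="lf_person_entity",
    f=has_person_entity,
    text_field="text",
)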
