Implementation:Snorkel team Snorkel LabelingFunction Init
| Knowledge Sources | |
|---|---|
| Domains | Weak_Supervision, Data_Programming |
| Last Updated | 2026-02-14 20:00 GMT |
Overview
Concrete tool for defining labeling functions that encode domain heuristics as programmatic labelers, provided by the Snorkel library.
Description
The LabelingFunction class and its companion labeling_function decorator provide the primary interface for creating labeling functions in Snorkel. A labeling function wraps a user-defined Python function that takes a data point and returns an integer label or -1 for abstention.
The class supports:
- Preprocessors: A chain of BasePreprocessor objects run before the LF logic
- Resources: External data (dictionaries, models) injected via keyword arguments
- NLP variant: NLPLabelingFunction adds automatic spaCy processing
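The call flow described above can be sketched without Snorkel installed. This is a simplified stand-in, not Snorkel's source: `MiniLF`, `lowercase_text`, and `keyword_check` are hypothetical names, and the preprocessor is a plain callable rather than a BasePreprocessor. The sketch assumes preprocessors run in list order and that resources reach the wrapped function as keyword arguments.

```python
from types import SimpleNamespace
from typing import Any, Callable, List, Mapping, Optional

ABSTAIN = -1
SPAM = 1


class MiniLF:
    """Hypothetical stand-in that mimics the LabelingFunction call flow."""

    def __init__(
        self,
        name: str,
        f: Callable[..., int],
        resources: Optional[Mapping[str, Any]] = None,
        pre: Optional[List[Callable[[Any], Any]]] = None,
    ) -> None:
        self.name = name
        self._f = f
        self._resources = dict(resources or {})
        self._pre = list(pre or [])

    def __call__(self, x: Any) -> int:
        # Run each preprocessor in order, feeding its output to the next one.
        for step in self._pre:
            x = step(x)
        # Pass the resources as keyword arguments alongside the data point.
        return self._f(x, **self._resources)


def lowercase_text(x: SimpleNamespace) -> SimpleNamespace:
    """Toy preprocessor: return a copy of the data point with lowercased text."""
    return SimpleNamespace(**{**vars(x), "text": x.text.lower()})


def keyword_check(x: SimpleNamespace, keywords: frozenset) -> int:
    """Vote SPAM if any keyword occurs in the preprocessed text, else abstain."""
    return SPAM if any(k in x.text for k in keywords) else ABSTAIN


lf_keyword = MiniLF(
    name="lf_keyword",
    f=keyword_check,
    resources={"keywords": frozenset({"buy now", "free money"})},
    pre=[lowercase_text],
)

spam_label = lf_keyword(SimpleNamespace(text="BUY NOW and save!"))
ham_label = lf_keyword(SimpleNamespace(text="See you at lunch."))
```

Keeping the keyword set in `resources` rather than in a global keeps the function's dependencies explicit, which is the motivation the Resources bullet points at.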
Usage
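Import this class when you need to define labeling functions for a weak supervision pipeline. Both forms accept name, resources, and pre; the difference is that name is required for the class and optional for the decorator. Use the decorator form when an LF is a single function defined in place, and the class form when you build LFs programmatically, for example several LFs from one function with different names or resources.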
Code Reference
Source Location
- Repository: snorkel
- File: snorkel/labeling/lf/core.py
- Lines: L7-142
Signature
class LabelingFunction:
    def __init__(
        self,
        name: str,
        f: Callable[..., int],
        resources: Optional[Mapping[str, Any]] = None,
        pre: Optional[List[BasePreprocessor]] = None,
    ) -> None:
        """
        Args:
            name: Name of the LF (unique identifier).
            f: Function that implements the core LF logic,
                takes a DataPoint and returns an int label.
            resources: Labeling resources passed to f via kwargs.
            pre: Preprocessors to run on data points before LF execution.
        """

class labeling_function:
    def __init__(
        self,
        name: Optional[str] = None,
        resources: Optional[Mapping[str, Any]] = None,
        pre: Optional[List[BasePreprocessor]] = None,
    ) -> None: ...

    def __call__(self, f: Callable[..., int]) -> LabelingFunction: ...
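The two-step shape of the decorator signature (configuration in `__init__`, the function in `__call__`) can be illustrated with a small stand-in. `MiniLF` and `mini_labeling_function` are hypothetical names for this sketch, not Snorkel classes, and falling back to the function's own `__name__` when no name is given is an assumption about the behaviour rather than something the signature states.

```python
from typing import Any, Callable, Mapping, Optional

ABSTAIN = -1
SPAM = 1


class MiniLF:
    """Hypothetical stand-in for LabelingFunction (name, f, resources only)."""

    def __init__(
        self,
        name: str,
        f: Callable[..., int],
        resources: Optional[Mapping[str, Any]] = None,
    ) -> None:
        self.name = name
        self._f = f
        self._resources = dict(resources or {})

    def __call__(self, x: Any) -> int:
        return self._f(x, **self._resources)


class mini_labeling_function:
    """Hypothetical parameterized decorator that returns a MiniLF."""

    def __init__(
        self,
        name: Optional[str] = None,
        resources: Optional[Mapping[str, Any]] = None,
    ) -> None:
        self.name = name
        self.resources = resources

    def __call__(self, f: Callable[..., int]) -> MiniLF:
        # Assumption: use the function's own name when none was supplied.
        return MiniLF(name=self.name or f.__name__, f=f, resources=self.resources)


@mini_labeling_function()
def lf_has_link(x: str) -> int:
    return SPAM if "http" in x else ABSTAIN


@mini_labeling_function(name="lf_short", resources={"max_len": 5})
def short_message(x: str, max_len: int) -> int:
    return SPAM if len(x) <= max_len else ABSTAIN
```

Because the configuration is taken by `__init__` and the function by `__call__`, a decorator of this shape has to be written with parentheses even when no arguments are passed, as in `@labeling_function()`.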
Import
from snorkel.labeling import LabelingFunction, labeling_function
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Unique identifier for the labeling function |
| f | Callable[..., int] | Yes | Core labeling logic; takes DataPoint, returns int label or -1 (abstain) |
| resources | Optional[Mapping[str, Any]] | No | External resources (dicts, models) passed as kwargs to f |
| pre | Optional[List[BasePreprocessor]] | No | Preprocessors to run before LF execution |
Outputs
| Name | Type | Description |
|---|---|---|
| LabelingFunction instance | LabelingFunction | Callable object that labels data points; returns int |
| __call__ result | int | Label for a data point (class index or -1 for abstain) |
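To make the output contract concrete, the sketch below applies two plain functions that follow it (int class index, or -1 to abstain) to three data points and collects the votes into a matrix with one row per data point and one column per LF. The helper `apply_lfs` and both LFs are hypothetical and written for illustration only; in a real pipeline Snorkel's own appliers would normally do this step.

```python
from types import SimpleNamespace
from typing import Any, Callable, List, Sequence

ABSTAIN = -1
HAM = 0
SPAM = 1


def lf_buy_now(x: Any) -> int:
    """Vote SPAM when the text contains 'buy now', otherwise abstain."""
    return SPAM if "buy now" in x.text.lower() else ABSTAIN


def lf_greeting(x: Any) -> int:
    """Vote HAM when the text opens with 'hello', otherwise abstain."""
    return HAM if x.text.lower().startswith("hello") else ABSTAIN


def apply_lfs(
    lfs: Sequence[Callable[[Any], int]], data_points: Sequence[Any]
) -> List[List[int]]:
    """Return one row per data point and one column per LF."""
    return [[lf(x) for lf in lfs] for x in data_points]


data = [
    SimpleNamespace(text="Buy now and save!"),
    SimpleNamespace(text="Hello, are we still on for Friday?"),
    SimpleNamespace(text="Meeting notes attached."),
]

label_matrix = apply_lfs([lf_buy_now, lf_greeting], data)
# label_matrix == [[1, -1], [-1, 0], [-1, -1]]
```

The last row shows the case where every LF abstains, which is why -1 must stay distinct from any valid class index.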
Usage Examples
Simple LF with Decorator
from types import SimpleNamespace

from snorkel.labeling import labeling_function

ABSTAIN = -1
SPAM = 1
HAM = 0

@labeling_function()
def lf_keyword_check(x):
    """Label as SPAM if 'buy now' appears in text."""
    return SPAM if "buy now" in x.text.lower() else ABSTAIN

# Apply to a data point
dp = SimpleNamespace(text="Buy now and save!")
label = lf_keyword_check(dp)  # Returns 1 (SPAM)
LF with Resources
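from snorkel.labeling import LabelingFunction

# Label constants, repeated so this example runs on its own
ABSTAIN = -1
SPAM = 1

known_spammers = {"spammer@example.com", "junk@test.com"}

def check_sender(x, known_spammers):
    """Label as SPAM if the sender is in the injected known_spammers set."""
    return SPAM if x.sender in known_spammers else ABSTAIN

# The resources key must match the keyword argument name of check_sender
lf_known_sender = LabelingFunction(
    name="lf_known_sender",
    f=check_sender,
    resources={"known_spammers": known_spammers},
)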
NLP LF with spaCy
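from snorkel.labeling.lf.nlp import NLPLabelingFunction

# Label constants, repeated so this example runs on its own
ABSTAIN = -1
HAM = 0

def has_person_entity(x):
    """Label as HAM if spaCy found a PERSON entity."""
    # x.doc is the spaCy Doc produced from the field named by text_field
    for ent in x.doc.ents:
        if ent.label_ == "PERSON":
            return HAM
    return ABSTAIN

# Requires spaCy and an installed spaCy language model
lf_person = NLPLabelingFunction(
    name="lf_person_entity",
    f=has_person_entity,
    text_field="text",
)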