Implementation:Neuml Txtai Labels
| Knowledge Sources | |
|---|---|
| Domains | Text_Classification, NLP |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Zero-shot and fixed-label text classification pipeline provided by the txtai library.
Description
The Labels class wraps Hugging Face Transformers pipelines for text classification. It supports two modes: dynamic (zero-shot classification) where arbitrary label strings are provided at inference time, and fixed (standard text classification) where the model's pre-trained label set is used. The class inherits from HFPipeline, which handles model loading, device placement, and optional quantization.
When dynamic=True (the default), the underlying Hugging Face zero-shot-classification pipeline is used, which evaluates arbitrary candidate labels against the input text using a natural language inference (NLI) model. When dynamic=False, a standard text-classification pipeline is loaded, and the model's trained labels are used directly.
The __call__ method processes results through the outputs method, which handles both dynamic and fixed label result formats, supports flattening to label lists, and applies score thresholding. The limit method filters fixed-classification results using a provided label list, performing case-insensitive and numeric lookups against the model's id2label / label2id configuration.
Usage
Use this pipeline to classify text into categories. For zero-shot classification, provide a list of candidate labels at call time. For fixed classification, load a pre-trained classification model with dynamic=False and optionally filter results to a subset of trained labels. The flatten parameter simplifies output to label strings instead of (id, score) tuples, and supports a float threshold to filter low-confidence labels.
Code Reference
Source Location
- Repository: txtai
- File: src/python/txtai/pipeline/text/labels.py
- Lines: L1-137
Class Definition
class Labels(HFPipeline):
    """
    Applies a text classifier to text. Supports zero shot and standard text classification models
    """
Constructor Signature
def __init__(self, path=None, quantize=False, gpu=True, model=None, dynamic=True, **kwargs):
The constructor delegates to HFPipeline.__init__ with the task set to "zero-shot-classification" when dynamic=True or "text-classification" when dynamic=False. The self.dynamic attribute stores the classification mode for use in __call__.
Call Signature
def __call__(self, text, labels=None, multilabel=False, flatten=None, workers=0, **kwargs):
Import
from txtai.pipeline import Labels
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| text | str or list | Yes | Input text or list of texts to classify. |
| labels | list of str or None | Conditional | List of candidate labels. Required for zero-shot mode (dynamic=True). Optional for fixed classification mode, where it filters results to a subset of trained labels. Pass None to return all trained labels. |
| multilabel | bool or None | No | Controls score normalization. True: labels scored independently (sigmoid for fixed, multi_label for zero-shot). False: scores normalized to sum to 1 (softmax for fixed). None: raw scores returned. Defaults to False. |
| flatten | bool, float, or None | No | When truthy, output is flattened to label strings instead of (id, score) tuples. If True, returns only the top label. If a float, returns all labels with score >= that threshold. Defaults to None. |
| workers | int | No | Number of concurrent workers for data processing. Defaults to 0. |
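The effect of multilabel on fixed-model scores can be illustrated with plain Python. This is a minimal sketch, not txtai code: the logits are made-up values, and softmax and sigmoid are hypothetical helpers that mirror the normalization described in the table.

```python
import math


def softmax(logits):
    """Normalize logits so the scores sum to 1 (the multilabel=False case)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [x / total for x in exps]


def sigmoid(logits):
    """Score each label independently in (0, 1) (the multilabel=True case)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


# Hypothetical raw model outputs for three labels
logits = [2.0, 1.0, -1.0]

exclusive = softmax(logits)    # scores compete with each other
independent = sigmoid(logits)  # scores do not have to sum to 1

print([round(x, 3) for x in exclusive])
print([round(x, 3) for x in independent])
```

With softmax, raising one label's score lowers the others, which suits single-label tasks. With sigmoid, several labels can score highly at once.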
Outputs
| Name | Type | Description |
|---|---|---|
| results | list of tuple or list of str | Without flatten: If input is a string, returns a 1D list of (label_id, score) tuples sorted by highest score. If input is a list, returns a 2D list (one row per input). With flatten: Returns label strings instead of tuples. If flatten=True, returns a single-element list with the top label. If flatten is a float, returns all labels with score >= that threshold. |
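Because the tuples carry label ids rather than names, callers often map the ids back to strings. The following is a minimal sketch, assuming a result shaped as described above; the scores are invented and to_names is a hypothetical helper, not part of txtai. In zero-shot mode the id is the position in the candidate list passed at call time; in fixed mode it is the model's label index, so the list returned by labels() can serve as the lookup.

```python
def to_names(result, names):
    """Replace each label id in (id, score) tuples with its label string."""
    return [(names[label_id], score) for label_id, score in result]


# Candidate labels as they would be passed to the pipeline
candidates = ["positive", "negative", "neutral"]

# Hypothetical pipeline output for one input, sorted by highest score
result = [(1, 0.88), (2, 0.09), (0, 0.03)]

named = to_names(result, candidates)
print(named)
```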
Key Methods
labels()
def labels(self):
Returns a list of all text classification model labels sorted in index order by reading from self.pipeline.model.config.id2label. Only meaningful for fixed classification models (dynamic=False).
outputs(results, labels, flatten)
def outputs(self, results, labels, flatten):
Processes raw pipeline results into the final output format. Handles both dynamic (zero-shot) and fixed result structures. When flatten is set, converts results to label strings and keeps only those scoring at or above a threshold: 0.0 when flatten is a boolean, otherwise the float value itself. A boolean flatten additionally truncates the list to the top label.
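The flattening rule for zero-shot results can be sketched as follows. This is an approximation written from the description above, not the library source: flatten_result is a hypothetical name, and the input dict mimics the labels/scores structure of a Hugging Face zero-shot result, with invented scores.

```python
def flatten_result(result, flatten):
    """Return label strings at or above the threshold implied by flatten."""
    # A boolean flatten uses a threshold of 0.0; a float is the threshold itself
    threshold = 0.0 if isinstance(flatten, bool) else flatten
    kept = [
        label
        for label, score in zip(result["labels"], result["scores"])
        if score >= threshold
    ]
    # flatten=True keeps only the top label; a float keeps every match
    return kept[:1] if isinstance(flatten, bool) else kept


# Hypothetical zero-shot output, already sorted by descending score
raw = {
    "labels": ["positive", "neutral", "negative"],
    "scores": [0.85, 0.12, 0.03],
}

top = flatten_result(raw, True)
above = flatten_result(raw, 0.1)
print(top, above)
```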
limit(result, labels)
def limit(self, result, labels):
Filters fixed-classification results using a provided label list. Resolves labels by either numeric index or case-insensitive string matching against the model's label2id configuration. Returns only those (label_id, score) tuples whose label_id is in the resolved match set.
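The label resolution step can be sketched in plain Python. This is an illustration written from the description above under stated assumptions, not the library source: resolve_ids and limit_result are hypothetical names, numeric labels are assumed to arrive as digit strings, and the id2label mapping and scores are made up.

```python
def resolve_ids(labels, id2label):
    """Map requested labels (digit strings or names) to model label ids."""
    name_to_id = {name.lower(): label_id for label_id, name in id2label.items()}
    matches = set()
    for label in labels:
        if label.isdigit():
            # Numeric lookup against the id keys
            if int(label) in id2label:
                matches.add(int(label))
        elif label.lower() in name_to_id:
            # Case-insensitive lookup against the label names
            matches.add(name_to_id[label.lower()])
    return matches


def limit_result(result, labels, id2label):
    """Keep only the (label_id, score) tuples whose id was requested."""
    matches = resolve_ids(labels, id2label)
    return [(label_id, score) for label_id, score in result if label_id in matches]


# Hypothetical model configuration and classification result
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
result = [(1, 0.9998), (0, 0.0002)]

only_positive = limit_result(result, ["positive"], id2label)
only_zero = limit_result(result, ["0"], id2label)
print(only_positive, only_zero)
```

Labels that match nothing in the model configuration are simply ignored in this sketch, so an entirely unknown label list yields an empty result.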
Inheritance Chain
Labels -> HFPipeline -> Tensors -> Pipeline
The HFPipeline parent handles loading the Hugging Face Transformers pipeline, model quantization, device placement (CPU/GPU), and tokenizer length checking. The Tensors class provides tensor utility methods. Pipeline is the base class defining the batch() helper and the __call__ interface contract.
Usage Examples
Zero-Shot Classification (Dynamic Labels)
from txtai.pipeline import Labels
# Create a zero-shot classifier (default mode)
labels = Labels()
# Classify with arbitrary labels
result = labels("This is the best movie I have ever seen", labels=["positive", "negative"])
# Returns: [(0, 0.98), (1, 0.02)] - label index 0 ("positive") scores highest
# Classify a list of texts
results = labels(
    ["Great food!", "Terrible service"],
    labels=["positive", "negative", "neutral"]
)
# Returns: [[(0, 0.95), ...], [(1, 0.88), ...]]
Fixed-Label Classification
from txtai.pipeline import Labels
# Load a pre-trained sentiment model with fixed labels
labels = Labels("distilbert-base-uncased-finetuned-sst-2-english", dynamic=False)
# Classify using the model's trained labels
result = labels("This product is amazing")
# Returns: [(1, 0.9998), (0, 0.0002)] - POSITIVE scores highest
# Filter to specific labels from the trained set
result = labels("This product is amazing", labels=["POSITIVE"])
# Returns: [(1, 0.9998)]
Flattened Output
from txtai.pipeline import Labels
labels = Labels()
# Get only the top label as a string
result = labels("I love this", labels=["positive", "negative"], flatten=True)
# Returns: ["positive"]
# Get all labels above a threshold
result = labels("I love this", labels=["positive", "negative", "neutral"], flatten=0.1)
# Returns: ["positive"] (only labels with score >= 0.1)