Implementation:Neuml Txtai HFPipeline
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, NLP, Transformers |
| Last Updated | 2026-02-10 01:00 GMT |
Overview
Concrete tool, provided by txtai, for wrapping Hugging Face Transformers pipeline components with quantization and interface normalization.
Description
HFPipeline is a light wrapper around the Hugging Face Transformers pipeline component for selected tasks. It adds support for model quantization (int8 precision on CPU), automatic dtype resolution, and a consistent interface across NLP tasks. The class inspects the signature of the Transformers pipeline function to split keyword arguments into two groups: pipeline-level arguments, which the pipeline function accepts by name, and model-level arguments, which cover everything else. It also detects unbounded tokenizers, typically found in older models, and applies a length check so that inputs stay within the model's limits.
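The keyword-argument split described above can be sketched with the standard library alone. This is a minimal illustration, not txtai's actual code: `fake_pipeline` is a hypothetical stand-in for the Transformers pipeline function, and `split_kwargs` is a name invented for this example. It assumes that any keyword named in the function's signature is a pipeline-level argument and that everything else is model-level.

```python
import inspect


def fake_pipeline(task, model=None, tokenizer=None, device=None, **kwargs):
    """Hypothetical stand-in for the Transformers pipeline function."""
    return task


def split_kwargs(func, **kwargs):
    """Split kwargs into (model_args, pipeline_args) using func's named parameters."""
    # Named parameters of the target function (excludes *args / **kwargs)
    names = inspect.getfullargspec(func).args

    model_args = {k: v for k, v in kwargs.items() if k not in names}
    pipeline_args = {k: v for k, v in kwargs.items() if k in names}
    return model_args, pipeline_args


model_args, pipeline_args = split_kwargs(
    fake_pipeline, tokenizer="my-tokenizer", low_cpu_mem_usage=True
)
print(model_args)     # {'low_cpu_mem_usage': True}
print(pipeline_args)  # {'tokenizer': 'my-tokenizer'}
```

Here `tokenizer` is routed to the pipeline because it appears in the stand-in signature, while `low_cpu_mem_usage` does not and is therefore treated as a model-level argument.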
Usage
Use HFPipeline as the base class for task-specific pipelines that leverage the standard Hugging Face Transformers pipeline abstraction, such as text classification, token classification, question answering, and image-to-text. It is the preferred choice when you want the convenience of the Transformers pipeline API with added quantization and configuration support.
Code Reference
Source Location
- Repository: Neuml_Txtai
- File: src/python/txtai/pipeline/hfpipeline.py
Signature
class HFPipeline(Tensors):
    def __init__(self, task, path=None, quantize=False, gpu=False, model=None, **kwargs)
    def parseargs(self, **kwargs)
    def maxlength(self)
Import
from txtai.pipeline.hfpipeline import HFPipeline
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| task | str | Yes | Pipeline task or category (e.g. "text-classification", "token-classification", "question-answering", "image-to-text"). |
| path | str or tuple | No | Path to model; accepts a Hugging Face model hub id, local path, or (model, tokenizer) tuple. Uses default model for task if not provided. |
| quantize | bool | No | If True, quantizes the model to int8 precision (CPU only). Defaults to False. |
| gpu | bool or int | No | True/False to enable GPU, or a specific GPU device id. Defaults to False. |
| model | Pipeline or HFPipeline | No | Optional existing pipeline model to wrap instead of loading a new one. |
| kwargs | dict | No | Additional keyword arguments. parseargs splits them into model-level arguments and arguments for the Transformers pipeline constructor. |
Outputs
| Name | Type | Description |
|---|---|---|
| self.pipeline | transformers.Pipeline | The underlying Hugging Face Transformers pipeline instance, ready for inference. |
| (from parseargs) | tuple(dict, dict) | A tuple of (model_args, pipeline_args) split from the input kwargs. |
| (from maxlength) | int | The maximum sequence length for generate calls based on model and tokenizer config. |
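As a rough illustration of how a maximum length might be derived and how an unbounded tokenizer can be spotted, the sketch below uses plain stand-in objects rather than real Transformers classes. It assumes that the model config exposes `max_position_embeddings` and that an unbounded tokenizer reports a very large `model_max_length`. `UNBOUNDED`, `is_unbounded`, and `resolve_maxlength` are hypothetical names, and txtai's actual logic may differ.

```python
from types import SimpleNamespace

# Assumed threshold: anything this large is treated as "no real limit"
UNBOUNDED = int(1e12)


def is_unbounded(tokenizer):
    """True if the tokenizer reports no practical length limit."""
    return tokenizer.model_max_length >= UNBOUNDED


def resolve_maxlength(config, tokenizer):
    """Pick a usable maximum sequence length from config and tokenizer."""
    limit = getattr(config, "max_position_embeddings", None)
    if is_unbounded(tokenizer):
        # Fall back to the model's position limit when the tokenizer has none
        return limit
    if limit:
        return min(limit, tokenizer.model_max_length)
    return tokenizer.model_max_length


# Stand-in objects for a model config and two tokenizers
config = SimpleNamespace(max_position_embeddings=512)
old_tokenizer = SimpleNamespace(model_max_length=int(1e30))
new_tokenizer = SimpleNamespace(model_max_length=256)

print(resolve_maxlength(config, old_tokenizer))  # 512
print(resolve_maxlength(config, new_tokenizer))  # 256
```

The point of the fallback is that a tokenizer without a real limit would otherwise allow inputs longer than the model can accept.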
Usage Examples
from txtai.pipeline.hfpipeline import HFPipeline
# Create a text classification pipeline with quantization
pipeline = HFPipeline("text-classification", path="distilbert-base-uncased-finetuned-sst-2-english", quantize=True)
# Run inference
result = pipeline.pipeline("This movie was great!")
# Create a pipeline from a (model, tokenizer) tuple
pipeline = HFPipeline("text-classification", path=("custom-model-path", "custom-tokenizer-path"), gpu=True)
# Get maximum sequence length
max_len = pipeline.maxlength()