Implementation:Neuml Txtai HFModel

Knowledge Sources	Neuml_Txtai
Domains	Machine Learning, NLP, Transformers
Last Updated	2026-02-10 01:00 GMT

Overview

HFModel is a txtai class that wraps Hugging Face Transformers models and adds tensor management, tokenization, and quantization support.

Description

HFModel is a base pipeline class backed by a Hugging Face Transformers model. It extends the Tensors base class to provide device management (CPU/GPU), optional model quantization, and tokenization that handles overflowing tokens by splitting them into separate chunks, returning indices that map each chunk back to its source text. The class serves as a foundation for downstream pipelines such as CrossEncoder and LateEncoder, which need direct model access rather than the higher-level Transformers pipeline API.
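The overflow handling described above can be illustrated with a minimal sketch in plain Python. It is not the txtai implementation: split_overflow, max_length, and pad_id are hypothetical names chosen for illustration, and special tokens (such as an end-of-sequence marker) are ignored for simplicity.

```python
def split_overflow(ids, max_length, pad_id=0):
    """Split a list of token ids into rows of exactly max_length.

    Hypothetical helper for illustration only; the last row is padded
    with pad_id and its attention mask marks the padding with zeros.
    """
    rows, masks = [], []
    for start in range(0, len(ids), max_length):
        chunk = ids[start:start + max_length]
        pad = max_length - len(chunk)
        rows.append(chunk + [pad_id] * pad)
        masks.append([1] * len(chunk) + [0] * pad)
    return rows, masks


# Seven token ids with a maximum length of three yield three rows
rows, masks = split_overflow(list(range(1, 8)), max_length=3)
```

Each row has the same length, so the rows can be stacked into a single tensor, and the mask tells the model which positions hold real tokens.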

Usage

Use HFModel when you need a lower-level wrapper around a Hugging Face model that provides direct control over tokenization, batching, and device placement. It is the preferred base class for custom pipelines that require token-level manipulation or cannot use the standard Transformers pipeline abstraction.

Code Reference

Source Location

Repository: Neuml_Txtai
File: src/python/txtai/pipeline/hfmodel.py

Signature

class HFModel(Tensors):
    def __init__(self, path=None, quantize=False, gpu=False, batch=64)
    def prepare(self, model)
    def tokenize(self, tokenizer, texts)

Import

from txtai.pipeline.hfmodel import HFModel

I/O Contract

Inputs

Name	Type	Required	Description
path	str	No	Path to model; accepts a Hugging Face model hub id or local path. Uses default model for task if not provided.
quantize	bool	No	If True, applies dynamic quantization to the model (CPU only). Defaults to False.
gpu	bool or int	No	True/False to enable GPU, or a specific GPU device id. Defaults to False.
batch	int	No	Batch size used to incrementally process content. Defaults to 64.
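Because gpu accepts either a boolean or a device id, a caller has to normalize it at some point. The sketch below shows one plausible way to do that. The helper resolve_device_id, the accelerator_available flag, and the convention that -1 means CPU are assumptions made for illustration; they are not confirmed here as txtai's API.

```python
def resolve_device_id(gpu, accelerator_available):
    """Map a bool-or-int gpu argument to a device id (-1 means CPU here).

    Hypothetical helper; the -1 convention is an assumption of this sketch.
    """
    if gpu is None or not accelerator_available:
        return -1
    # bool must be checked before int, since bool is a subclass of int
    if isinstance(gpu, bool):
        return 0 if gpu else -1
    return int(gpu)


cpu_choice = resolve_device_id(False, accelerator_available=True)
first_gpu = resolve_device_id(True, accelerator_available=True)
second_gpu = resolve_device_id(1, accelerator_available=True)
fallback = resolve_device_id(True, accelerator_available=False)
```

Since the table describes quantization as CPU only, quantize=True would presumably have an effect only when the resolved device is the CPU.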

Outputs

Name	Type	Description
(from prepare)	model	The prepared (optionally quantized) Hugging Face model.
(from tokenize)	tuple(dict, list)	A tuple of two items: a dict of tensors (input_ids, attention_mask) moved to the target device, and a list of indices that maps each tokenized row back to the position of its source text.
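The indices list matters once a long text has been split into several rows, because per-row model outputs then have to be grouped back by source text. A minimal sketch of that regrouping follows; regroup and the sample values are hypothetical and stand in for real model outputs.

```python
from collections import defaultdict


def regroup(outputs, indices):
    """Collect per-row outputs under the index of the text each row came from."""
    grouped = defaultdict(list)
    for output, index in zip(outputs, indices):
        grouped[index].append(output)
    # Return groups ordered by source text position
    return [grouped[i] for i in sorted(grouped)]


# Hypothetical per-row results: text 0 produced two rows, text 1 produced one
outputs = ["a1", "a2", "b1"]
indices = [0, 0, 1]
merged = regroup(outputs, indices)
```

How the rows of one group are then combined (concatenated, averaged, or otherwise) depends on the downstream pipeline. In this sketch, a text that produced no rows has no group in the result.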

Usage Examples

from transformers import AutoModel, AutoTokenizer

from txtai.pipeline.hfmodel import HFModel

# Create a model wrapper with GPU enabled and quantization off
model = HFModel(path="distilbert-base-uncased", quantize=False, gpu=True, batch=32)

# Prepare a loaded Hugging Face model for inference.
# With quantize=False, no quantization is applied.
raw_model = AutoModel.from_pretrained("distilbert-base-uncased")
prepared = model.prepare(raw_model)

# Tokenize a batch of texts; indices maps each tokenized row back to its source text
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
tokens, indices = model.tokenize(tokenizer, ["Hello world", "txtai is great"])

Related Pages

Environment:Neuml_Txtai_Python_Core_Dependencies
