Implementation:Neuml Txtai SBert Vectors

Knowledge Sources	Neuml_Txtai
Domains	Embeddings, Vectors, Sentence Transformers, Deep Learning
Last Updated	2026-02-10 01:00 GMT

Overview

Concrete txtai tool for building embedding vectors with sentence-transformers (SBERT) models.

Description

The STVectors class extends the base Vectors class to generate embeddings using the sentence-transformers library. This is the default and most commonly used vectors backend in txtai, supporting hundreds of pretrained models from the Hugging Face Hub.

Key features:

GPU management: Supports single-GPU (default), CPU-only, and multi-GPU configurations. Setting gpu="all" enables multi-process pooling across all available accelerator devices (counted via Models.acceleratorcount()), but only when more than one device is available.
Multi-process pooling: When multiple GPUs are detected, the class starts a process pool via model.start_multi_process_pool() for parallel encoding and shuts it down in the close method.
Category-aware encoding: The encode method supports different encoding strategies based on the category parameter: "query" uses encode_query, "data" uses encode_document, and None uses the standard encode method. This supports asymmetric embedding models that encode queries and documents differently.
Additional model and encoding arguments: Extra keyword arguments for model loading and for encoding can be passed via the vectors and encodeargs config keys, respectively.
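
The category rule described above can be sketched without loading a real model. The snippet below is a minimal illustration, not txtai's implementation: StubModel and pick_encoder are hypothetical names, and the stub only mimics the three method names (encode, encode_query, encode_document) mentioned in the description. Sending any unrecognized category to encode is an assumption of this sketch.

```python
class StubModel:
    """Hypothetical stand-in exposing the three encoder method names."""

    def encode(self, data):
        return [("default", text) for text in data]

    def encode_query(self, data):
        return [("query", text) for text in data]

    def encode_document(self, data):
        return [("document", text) for text in data]


def pick_encoder(model, category=None):
    """Return the bound method matching the category (assumed dispatch rule)."""
    if category == "query":
        return model.encode_query
    if category == "data":
        return model.encode_document
    # None, and in this sketch any other value, uses the standard encoder
    return model.encode


model = StubModel()
print(pick_encoder(model, "query")(["what is txtai?"]))
print(pick_encoder(model, "data")(["txtai builds embeddings"]))
print(pick_encoder(model)(["plain text"]))
```

With a real asymmetric model, the practical effect is that search queries and indexed documents can go through different encoding paths while callers pass only the category value.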

Usage

Use STVectors as the primary vectors backend for most txtai use cases. It is the default when the provided Hugging Face model path is not recognized as another model type (LiteLLM, llama.cpp, Model2Vec, or WordVectors). It is suitable for semantic search, clustering, classification, and any other task that requires dense sentence embeddings.

Code Reference

Source Location

Repository: Neuml_Txtai
File: src/python/txtai/vectors/dense/sbert.py

Signature

class STVectors(Vectors):
    def __init__(self, config, scoring, models)
    def loadmodel(self, path) -> SentenceTransformer
    def encode(self, data, category=None) -> ndarray
    def close(self)
    def loadencoder(self, path, device, **kwargs) -> SentenceTransformer

Import

from txtai.vectors.dense.sbert import STVectors

I/O Contract

Inputs

Name	Type	Required	Description
config	dict	Yes	Configuration dictionary. Must include path (str, HF Hub model ID or local path). Optional keys: gpu (bool or "all", default True), vectors (dict of SentenceTransformer constructor kwargs), encodeargs (dict of encoding kwargs), encodebatch (int, encoding batch size).
scoring	Scoring	No	Optional scoring instance for token weighting.
models	object	No	Shared models cache instance.
data	list[str]	Yes (encode)	List of text strings to encode.
category	str	No	Encoding category: "query" for query encoding, "data" for document encoding, None for default encoding.

Outputs

Name	Type	Description
embeddings	ndarray	NumPy array of embedding vectors with shape (n, dimensions).
model	SentenceTransformer	Loaded sentence-transformers model instance.
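
For reference, a configuration that sets every key listed in the Inputs table might look like the sketch below. The values are illustrative assumptions rather than recommended settings: the batch size is arbitrary, and the entries under vectors and encodeargs are only examples of arguments that sentence-transformers accepts.

```python
# Illustrative configuration; only "path" is required
config = {
    # HF Hub model ID or local path
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    # True (default) for a single GPU, False for CPU-only, "all" for multi-GPU
    "gpu": True,
    # Keyword arguments forwarded to the SentenceTransformer constructor
    "vectors": {"trust_remote_code": False},
    # Keyword arguments forwarded to the encoding call
    "encodeargs": {"normalize_embeddings": True},
    # Encoding batch size (arbitrary value for illustration)
    "encodebatch": 64,
}

# List the optional keys that were set alongside the required path
optional = sorted(key for key in config if key != "path")
print(optional)
```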

Usage Examples

from txtai.embeddings import Embeddings

# Use a sentence-transformers model (default backend)
embeddings = Embeddings({
    "path": "sentence-transformers/all-MiniLM-L6-v2"
})

# Index documents
embeddings.index([
    (0, "semantic search with transformers", None),
    (1, "natural language inference tasks", None),
    (2, "text classification and clustering", None),
])

# Search
results = embeddings.search("transformer-based NLP", limit=5)

# Multi-GPU configuration
embeddings = Embeddings({
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    "gpu": "all"
})

# Asymmetric retrieval model; encodeargs are forwarded to the encoding call
embeddings = Embeddings({
    "path": "intfloat/e5-base-v2",
    "encodeargs": {"normalize_embeddings": True}
})

# Additional model arguments
embeddings = Embeddings({
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    "vectors": {"trust_remote_code": True}
})
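
As a rough guide to when the gpu="all" setting shown above actually starts a process pool, the rule from the description can be written as a small predicate. This is a sketch under stated assumptions: should_pool is a hypothetical helper, not part of txtai, and device_count stands in for the value that Models.acceleratorcount() would report.

```python
def should_pool(gpu, device_count):
    """Return True only when gpu is "all" and more than one device exists."""
    return gpu == "all" and device_count > 1


# Walk through a few representative settings
for gpu, count in [("all", 4), ("all", 1), (True, 4), (False, 0)]:
    print(f"gpu={gpu!r}, devices={count} -> pool={should_pool(gpu, count)}")
```

On a single-GPU machine, gpu="all" therefore behaves like the default single-device setup in this sketch, so the same configuration can be used on both kinds of hosts.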
