Implementation:Neuml Txtai LLM Generation Base

Knowledge Sources	Neuml_Txtai
Domains	Machine Learning, NLP, LLM, Text Generation
Last Updated	2026-02-10 01:00 GMT

Overview

Base class for LLM-based text generation in txtai, providing prompt formatting, streaming support, and response cleaning.

Description

Generation is the base class for all generative models in txtai. It provides the common logic for building prompts (including template formatting), formatting inputs as chat messages, cleaning generated results, and handling streaming responses. The class accepts three input formats: plain strings, lists of strings, and lists of chat message dictionaries with role and content keys. For string inputs, it checks for instruction tokens to decide whether the text is already a formatted prompt or should be wrapped as a chat message. It also removes thinking tags (e.g. <think>...</think>) from the output of reasoning models.
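The thinking-tag removal described above can be illustrated with a minimal, standalone sketch. The function name strip_think and the handling of a dangling closing tag are assumptions made for illustration; this is not taken from txtai's cleanthink method.

```python
import re


def strip_think(text: str) -> str:
    """Hypothetical helper: remove <think>...</think> reasoning from model output."""
    # Drop complete <think>...</think> spans; DOTALL lets a span cross newlines
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)

    # Assumption: some models emit only the closing tag, so keep what follows it
    if "</think>" in text:
        text = text.split("</think>", 1)[1]

    return text.strip()


print(strip_think("<think>The user wants a capital city.</think>\nParis"))
```

Text without thinking tags passes through unchanged, apart from surrounding whitespace being stripped.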

Usage

Use Generation as the base class when implementing a new LLM backend for txtai. Subclasses such as HuggingFace, LiteLLM, and OpenCode implement the stream method to provide actual inference. The base class handles prompt construction, output cleaning, and streaming orchestration.
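The division of labor between base class and backend can be sketched without txtai installed. Everything below is a hypothetical stand-in: SketchGeneration and UppercaseBackend are illustrative names, and the simplified stream(text) signature takes a single string rather than the full argument list shown in the Signature section.

```python
from typing import Iterator, List, Union


class SketchGeneration:
    """Hypothetical stand-in for a generation base class; not txtai's code."""

    def __call__(self, text: Union[str, List[str]], stream: bool = False):
        if stream:
            # Streaming: hand the token generator to the caller.
            # Simplification: this sketch only streams a single string input.
            return self.stream(text)

        if isinstance(text, str):
            # String in, string out
            return "".join(self.stream(text))

        # List in, list out
        return ["".join(self.stream(t)) for t in text]

    def stream(self, text: str) -> Iterator[str]:
        # Backends override this method to run actual inference
        raise NotImplementedError


class UppercaseBackend(SketchGeneration):
    """Toy backend that 'generates' by upper-casing the prompt word by word."""

    def stream(self, text: str) -> Iterator[str]:
        for i, word in enumerate(text.split()):
            yield word.upper() if i == 0 else " " + word.upper()


backend = UppercaseBackend()
print(backend("hello world"))
print(list(backend("hello world", stream=True)))
```

In this arrangement the backend only has to yield tokens; joining, batching, and the choice between streaming and returning a full string stay in the base class.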

Code Reference

Source Location

Repository: Neuml_Txtai
File: src/python/txtai/pipeline/llm/generation.py

Signature

class Generation:
    def __init__(self, path=None, template=None, **kwargs)
    def __call__(self, text, maxlength, stream, stop, defaultrole, stripthink, **kwargs)
    def ischat(self)
    def isvision(self)
    def format(self, texts, defaultrole)
    def execute(self, texts, maxlength, stream, stop, **kwargs)
    def clean(self, prompt, result, stripthink)
    def cleanstream(self, results)
    def cleanthink(self, text)
    def response(self, result)
    def stream(self, texts, maxlength, stream, stop, **kwargs)  # abstract

Import

from txtai.pipeline.llm.generation import Generation

I/O Contract

Inputs

Name	Type	Required	Description
text	str or list	Yes	Input text, list of strings, or list of chat message dicts with "role" and "content" keys.
maxlength	int	Yes	Maximum sequence length for generated output.
stream	bool	Yes	If True, streams the response token by token.
stop	list	Yes	List of stop strings to terminate generation.
defaultrole	str	Yes	Default role for text inputs: "auto" (infer), "user" (chat message), or "prompt" (raw prompt).
stripthink	bool	Yes	If True, strips thinking tags from output. Defaults to False when streaming, True otherwise.
template	str	No	Constructor argument (set once, not per call). Prompt template string applied to text inputs using TemplateFormatter.
kwargs	dict	No	Additional generation keyword arguments passed to the underlying model.
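The three defaultrole values can be made concrete with a small sketch of how string inputs might be routed. The function format_inputs, the chat_model flag, and the MARKERS list are all hypothetical; the tokens txtai actually checks for are not listed on this page.

```python
# Hypothetical marker list; real chat templates use many different tokens
MARKERS = ("[INST]", "<|im_start|>", "<|start_header_id|>")


def format_inputs(texts, defaultrole="auto", chat_model=True):
    """Wrap plain strings as chat messages or leave them as raw prompts."""
    formatted = []
    for text in texts:
        if not isinstance(text, str):
            # Already a list of {"role": ..., "content": ...} dicts
            formatted.append(text)
            continue

        role = defaultrole
        if role == "auto":
            # Assumption: text that already carries instruction tokens is a raw prompt
            has_markers = any(marker in text for marker in MARKERS)
            role = "user" if chat_model and not has_markers else "prompt"

        if role == "user":
            formatted.append([{"role": "user", "content": text}])
        else:
            formatted.append(text)

    return formatted


print(format_inputs(["What is machine learning?"]))
print(format_inputs(["[INST] What is machine learning? [/INST]"]))
```

Under these assumptions, "auto" wraps a plain question as a user message but passes a pre-formatted prompt through untouched, while "user" and "prompt" force one behavior or the other.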

Outputs

Name	Type	Description
result	str or list or generator	A single generated string if input was a string, a list of strings if input was a list, or a generator if streaming is enabled.

Usage Examples

from txtai.pipeline import LLM

# Create an LLM pipeline (uses Generation base under the hood)
llm = LLM("google/flan-t5-small")

# Single string input
result = llm("Translate to French: Hello, how are you?")

# Chat-style input with role/content dicts (assumes the loaded model has a chat template)
result = llm([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is machine learning?"}
])

# Batch input
results = llm(["Summarize: ...", "Translate: ..."])

# Streaming
for token in llm("Tell me a story", stream=True):
    print(token, end="")

Related Pages

Environment:Neuml_Txtai_Python_Core_Dependencies
