Implementation:Neuml Txtai LLM Generation Base

From Leeroopedia


Knowledge Sources
Domains: Machine Learning, NLP, LLM, Text Generation
Last Updated: 2026-02-10 01:00 GMT

Overview

A concrete tool, provided by txtai, for LLM-based text generation with prompt formatting, streaming support, and response cleaning.

Description

Generation is the base class for all generative models in txtai. It provides the shared logic for building prompts (including template formatting), formatting inputs as chat messages, cleaning generated results, and handling streaming responses. The class accepts three input formats: plain strings, lists of strings, and lists of chat message dictionaries with role and content keys. It inspects inputs for instruction tokens to decide whether to wrap them as chat messages. It also removes thinking tags (e.g. <think>...</think>) from the output of reasoning models.
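As an illustration of the thinking-tag removal described above, here is a minimal sketch. The helper name strip_think and the regular expression are assumptions made for this example; they are not taken from txtai, whose cleanthink method may work differently.

```python
import re

# Hypothetical helper, not txtai's actual implementation.
# Assumes reasoning text is wrapped in <think>...</think> tags.
THINK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_think(text):
    """Remove <think>...</think> spans and trim surrounding whitespace."""
    return THINK.sub("", text).strip()


raw = "<think>The user wants a greeting.</think>\n\nHello there!"
cleaned = strip_think(raw)
print(cleaned)
```

The non-greedy match with re.DOTALL lets a reasoning block span several lines without swallowing the text between two separate blocks.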

Usage

Use Generation as the base class when implementing a new LLM backend for txtai. Subclasses such as HuggingFace, LiteLLM, and OpenCode implement the stream method to provide actual inference. The base class handles prompt construction, output cleaning, and streaming orchestration.
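The division of labor described above can be sketched with a simplified stand-in. BaseGeneration and EchoGeneration are hypothetical classes written for this example; they do not import or reproduce txtai's Generation, and the stream signature is reduced to two arguments for brevity.

```python
class BaseGeneration:
    """Simplified stand-in for the base class pattern; NOT txtai's Generation."""

    def __call__(self, text, maxlength=64, stream=False):
        tokens = self.stream(text, maxlength)

        # Streaming: hand the generator back to the caller
        if stream:
            return tokens

        # Full generation: consume the generator and join the pieces
        return "".join(tokens)

    def stream(self, text, maxlength):
        # Subclasses provide the actual inference
        raise NotImplementedError


class EchoGeneration(BaseGeneration):
    """Hypothetical backend that 'generates' by echoing the prompt word by word."""

    def stream(self, text, maxlength):
        for i, word in enumerate(text.split()[:maxlength]):
            yield word if i == 0 else " " + word


llm = EchoGeneration()
full = llm("hello from a toy backend")
chunks = list(llm("hello from a toy backend", stream=True))
```

Because the subclass only yields chunks, the base class can serve both the streaming and the non-streaming call paths from one implementation.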

Code Reference

Source Location

  • Repository: Neuml_Txtai
  • File: src/python/txtai/pipeline/llm/generation.py

Signature

class Generation:
    def __init__(self, path=None, template=None, **kwargs): ...
    def __call__(self, text, maxlength, stream, stop, defaultrole, stripthink, **kwargs): ...
    def ischat(self): ...
    def isvision(self): ...
    def format(self, texts, defaultrole): ...
    def execute(self, texts, maxlength, stream, stop, **kwargs): ...
    def clean(self, prompt, result, stripthink): ...
    def cleanstream(self, results): ...
    def cleanthink(self, text): ...
    def response(self, result): ...
    def stream(self, texts, maxlength, stream, stop, **kwargs): ...  # abstract, implemented by subclasses

Import

from txtai.pipeline.llm.generation import Generation

I/O Contract

Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| text | str or list | Yes | Input text, list of strings, or list of chat message dicts with "role" and "content" keys. |
| maxlength | int | Yes | Maximum sequence length for generated output. |
| stream | bool | Yes | If True, streams the response token by token. |
| stop | list | Yes | List of stop strings that terminate generation. |
| defaultrole | str | Yes | Default role for text inputs: "auto" (infer), "user" (chat message), or "prompt" (raw prompt). |
| stripthink | bool | Yes | If True, strips thinking tags from output. Defaults to False when streaming, True otherwise. |
| template | str | No | Constructor argument. Prompt template string applied to text inputs using TemplateFormatter. |
| kwargs | dict | No | Additional generation keyword arguments passed to the underlying model. |
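To make the accepted input shapes concrete, the following sketch normalizes each of them into a list of inputs. The function normalize is hypothetical, Python's str.format stands in for TemplateFormatter, and the instruction-token detection behind "auto" is deliberately left out, so this is an approximation under stated assumptions rather than txtai's code.

```python
def normalize(text, template=None, defaultrole="auto"):
    """Hypothetical sketch of input normalization; not txtai's code."""
    # A string, or a single conversation (list of dicts), counts as one input
    single = isinstance(text, str) or (bool(text) and isinstance(text[0], dict))
    texts = [text] if single else list(text)

    # Apply the template to plain strings only
    # (str.format is an assumed stand-in for TemplateFormatter)
    if template:
        texts = [template.format(text=x) if isinstance(x, str) else x for x in texts]

    # Wrap plain strings as user chat messages when requested;
    # "auto" and "prompt" leave strings unchanged in this sketch
    if defaultrole == "user":
        texts = [[{"role": "user", "content": x}] if isinstance(x, str) else x for x in texts]

    return texts


messages = [{"role": "user", "content": "What is machine learning?"}]
print(normalize("Hi", template="Question: {text}"))
print(normalize("Hi", defaultrole="user"))
print(normalize(messages))
```

Treating a list of dicts as one conversation, rather than as a batch, is what lets a single chat exchange and a batch of prompts share the same entry point.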

Outputs

| Name | Type | Description |
| --- | --- | --- |
| result | str or list or generator | A single generated string if input was a string, a list of strings if input was a list, or a generator if streaming is enabled. |
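Since the return type depends on how the call was made, calling code may want to handle all three shapes uniformly. The helper collect below is a hypothetical caller-side convenience, not part of txtai; it assumes a streamed result is an iterable of string chunks.

```python
def collect(result):
    """Hypothetical helper: turn any of the three result shapes into a list of strings."""
    # Single input: one generated string
    if isinstance(result, str):
        return [result]

    # Batch input: already a list of strings
    if isinstance(result, list):
        return result

    # Streaming: join the chunks yielded by the generator
    return ["".join(result)]


print(collect("one answer"))
print(collect(["first", "second"]))
print(collect(chunk for chunk in ["Once ", "upon ", "a time"]))
```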

Usage Examples

from txtai.pipeline import LLM

# Create an LLM pipeline (uses Generation base under the hood)
llm = LLM("google/flan-t5-small")

# Single string input returns a single string
result = llm("Translate to French: Hello, how are you?")

# Chat-style input with role/content dicts
# (assumes the loaded model supports chat messages)
result = llm([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is machine learning?"}
])

# Batch input returns a list of strings
results = llm(["Summarize: ...", "Translate: ..."])

# Streaming returns a generator of tokens
for token in llm("Tell me a story", stream=True):
    print(token, end="", flush=True)
