Implementation:Neuml Txtai PipelineModel Init

Overview

PipelineModel adapts txtai's LLM pipeline to the smolagents.Model interface, enabling any LLM backend supported by txtai to serve as the reasoning engine for an agent. This class handles message cleaning, inference invocation, stop-sequence trimming, and tool-call extraction.

API Signature

class PipelineModel(Model):
    def __init__(self, path=None, method=None, **kwargs):
        """
        Creates a new LLM model.

        Args:
            path: model path or instance
            method: llm model framework, infers from path if not provided
            kwargs: model keyword arguments
        """

Import

from txtai.agent.model import PipelineModel

Source

src/python/txtai/agent/model.py, lines 15-34.

Parameters

Parameter	Type	Required	Default	Description
`path`	`str` or `LLM`	No	`None`	Model path string (e.g., `"meta-llama/Meta-Llama-3.1-8B-Instruct"`) or a pre-built `LLM` pipeline instance. When a string is provided, a new `LLM` instance is created internally.
`method`	`str` or `None`	No	`None`	LLM model framework to use (e.g., `"litellm"`, `"llama.cpp"`). If not provided, the framework is inferred from the `path`.
`**kwargs`	keyword arguments	No	--	Additional keyword arguments passed to the `LLM` constructor (e.g., `quantize`, `gpu`, `api_key`).

Return Value

The constructor does not return a value. It initialises the PipelineModel instance with the following attributes:

self.llm -- The underlying LLM pipeline instance.
self.maxlength -- Maximum generation length, defaulting to 8192.

The parent Model constructor is also called with:

flatten_messages_as_text set to True unless the LLM supports vision inputs.
model_id set to the LLM generator's path.

How It Works

The constructor performs the following steps:

Checks whether path is already an LLM instance. If so, it is used directly. Otherwise, a new LLM(path, method, **kwargs) is created.
Sets self.maxlength = 8192 as the default maximum generation length.
Calls the parent Model.__init__ with:
- flatten_messages_as_text -- Set to not self.llm.isvision(). Vision-capable models receive messages as structured objects; text-only models receive flattened text.
- model_id -- Set to self.llm.generator.path, identifying the model.
- **kwargs -- Additional keyword arguments are forwarded.

Code Walkthrough

class PipelineModel(Model):
    def __init__(self, path=None, method=None, **kwargs):
        self.llm = path if isinstance(path, LLM) else LLM(path, method, **kwargs)
        self.maxlength = 8192

        # Call parent constructor
        super().__init__(
            flatten_messages_as_text=not self.llm.isvision(),
            model_id=self.llm.generator.path,
            **kwargs
        )

Key Methods

generate

def generate(self, messages, stop_sequences=None, response_format=None, tools_to_call_from=None, **kwargs):

The primary inference method, matching the smolagents specification:

Cleans and normalises the message list via self.clean(messages).
Calls the underlying LLM pipeline: self.llm(messages, maxlength=self.maxlength, stop=stop_sequences, **kwargs).
Trims output after any stop sequences using remove_content_after_stop_sequences.
Wraps the response in a ChatMessage with role "assistant".
If tools_to_call_from is provided, extracts tool call actions from the response text using regex parsing and attaches them to the message's tool_calls list.

parameters

def parameters(self, maxlength):

Updates the maximum generation length for subsequent calls. This is invoked by Agent.__call__ before each agent run.

clean

def clean(self, messages):

Normalises the message list:

Applies smolagents.get_clean_message_list with appropriate role conversions.
Converts any Enum role values to plain strings for cross-framework compatibility.

Usage Examples

From a Model Path

from txtai.agent.model import PipelineModel

model = PipelineModel(path="meta-llama/Meta-Llama-3.1-8B-Instruct")

From a Pre-built LLM Instance

from txtai.pipeline import LLM
from txtai.agent.model import PipelineModel

llm = LLM("meta-llama/Meta-Llama-3.1-8B-Instruct", quantize=True)
model = PipelineModel(path=llm)

With an API-based Model

from txtai.agent.model import PipelineModel

model = PipelineModel(path="gpt-4o", method="litellm")

Setting Maximum Generation Length

model = PipelineModel(path="meta-llama/Meta-Llama-3.1-8B-Instruct")
model.parameters(maxlength=4096)

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment