Implementation:Neuml Txtai PipelineModel Init
Overview
PipelineModel adapts txtai's LLM pipeline to the smolagents.Model interface, enabling any LLM backend supported by txtai to serve as the reasoning engine for an agent. This class handles message cleaning, inference invocation, stop-sequence trimming, and tool-call extraction.
API Signature
class PipelineModel(Model):
def __init__(self, path=None, method=None, **kwargs):
"""
Creates a new LLM model.
Args:
path: model path or instance
method: llm model framework, infers from path if not provided
kwargs: model keyword arguments
"""
Import
from txtai.agent.model import PipelineModel
Source
src/python/txtai/agent/model.py, lines 15-34.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
path |
str or LLM |
No | None |
Model path string (e.g., "meta-llama/Meta-Llama-3.1-8B-Instruct") or a pre-built LLM pipeline instance. When a string is provided, a new LLM instance is created internally.
|
method |
str or None |
No | None |
LLM model framework to use (e.g., "litellm", "llama.cpp"). If not provided, the framework is inferred from the path.
|
**kwargs |
keyword arguments | No | -- | Additional keyword arguments passed to the LLM constructor (e.g., quantize, gpu, api_key).
|
Return Value
The constructor does not return a value. It initialises the PipelineModel instance with the following attributes:
self.llm-- The underlyingLLMpipeline instance.self.maxlength-- Maximum generation length, defaulting to8192.
The parent Model constructor is also called with:
flatten_messages_as_textset toTrueunless the LLM supports vision inputs.model_idset to the LLM generator's path.
How It Works
The constructor performs the following steps:
- Checks whether
pathis already anLLMinstance. If so, it is used directly. Otherwise, a newLLM(path, method, **kwargs)is created. - Sets
self.maxlength = 8192as the default maximum generation length. - Calls the parent
Model.__init__with:flatten_messages_as_text-- Set tonot self.llm.isvision(). Vision-capable models receive messages as structured objects; text-only models receive flattened text.model_id-- Set toself.llm.generator.path, identifying the model.**kwargs-- Additional keyword arguments are forwarded.
Code Walkthrough
class PipelineModel(Model):
def __init__(self, path=None, method=None, **kwargs):
self.llm = path if isinstance(path, LLM) else LLM(path, method, **kwargs)
self.maxlength = 8192
# Call parent constructor
super().__init__(
flatten_messages_as_text=not self.llm.isvision(),
model_id=self.llm.generator.path,
**kwargs
)
Key Methods
generate
def generate(self, messages, stop_sequences=None, response_format=None, tools_to_call_from=None, **kwargs):
The primary inference method, matching the smolagents specification:
- Cleans and normalises the message list via
self.clean(messages). - Calls the underlying LLM pipeline:
self.llm(messages, maxlength=self.maxlength, stop=stop_sequences, **kwargs). - Trims output after any stop sequences using
remove_content_after_stop_sequences. - Wraps the response in a
ChatMessagewith role"assistant". - If
tools_to_call_fromis provided, extracts tool call actions from the response text using regex parsing and attaches them to the message'stool_callslist.
parameters
def parameters(self, maxlength):
Updates the maximum generation length for subsequent calls. This is invoked by Agent.__call__ before each agent run.
clean
def clean(self, messages):
Normalises the message list:
- Applies
smolagents.get_clean_message_listwith appropriate role conversions. - Converts any
Enumrole values to plain strings for cross-framework compatibility.
Usage Examples
From a Model Path
from txtai.agent.model import PipelineModel
model = PipelineModel(path="meta-llama/Meta-Llama-3.1-8B-Instruct")
From a Pre-built LLM Instance
from txtai.pipeline import LLM
from txtai.agent.model import PipelineModel
llm = LLM("meta-llama/Meta-Llama-3.1-8B-Instruct", quantize=True)
model = PipelineModel(path=llm)
With an API-based Model
from txtai.agent.model import PipelineModel
model = PipelineModel(path="gpt-4o", method="litellm")
Setting Maximum Generation Length
model = PipelineModel(path="meta-llama/Meta-Llama-3.1-8B-Instruct")
model.parameters(maxlength=4096)
See Also
- Neuml_Txtai_Agent_LLM_Configuration -- Principle behind configuring the agent LLM backbone
- Neuml_Txtai_Agent_Init -- How PipelineModel is created during Agent construction
- Neuml_Txtai_Agent_Call -- How parameters are set before each agent run