Workflow: NeuML txtai Agent Orchestration

Knowledge Sources	txtai, txtai Agent Docs, txtai Agent Configuration, smolagents
Domains	AI_Agents, LLMs, Tool_Use
Last Updated	2026-02-09 18:00 GMT

Overview

End-to-end process for creating an AI agent that autonomously combines embeddings search, Python functions, web tools, and LLM reasoning to solve complex multi-step user requests.

Description

This workflow demonstrates how to build autonomous AI agents using txtai's Agent class, which is built on top of the Hugging Face smolagents framework. Agents differ from RAG pipelines in that they dynamically decide which tools to use and in what order, iterating through a reasoning loop until they arrive at an answer. Tools can include embeddings databases for semantic search, arbitrary Python functions, built-in capabilities (web search, web page loading), and skill definitions loaded from markdown files. The Agent supports all LLM backends available in txtai: local Hugging Face models, llama.cpp GGUF models, and remote APIs via LiteLLM (OpenAI, Claude, Bedrock, etc.).

Usage

Execute this workflow when the user's task is complex, open-ended, or requires dynamic tool selection that cannot be predetermined. Agents are appropriate when multiple data sources or processing steps may be needed, but the exact sequence depends on intermediate results. For simple, predictable retrieval tasks, prefer the RAG Pipeline workflow instead.

Execution Steps

Step 1: Prepare the Embeddings Database

Set up one or more embeddings databases that the agent can search. These can be local indexes built with the Semantic Search Pipeline workflow, or remote pre-built indexes hosted on Hugging Face Hub. Each database is described with a name and description so the agent understands when to use it.

Key considerations:

Each embeddings tool needs a descriptive name and description for the agent's tool selection
Remote indexes use provider/container configuration (e.g., Hugging Face Hub)
Local indexes use a path to the saved embeddings directory
Multiple embeddings databases can be provided for different knowledge domains
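
A minimal sketch of the two configuration shapes described above. The remote entry follows the provider/container form used in the txtai agent documentation. The local entry is hypothetical: its name, description, and path are placeholders for an index saved earlier.

```python
# Remote index hosted on the Hugging Face Hub (provider/container form).
wikipedia = {
    "name": "wikipedia",
    "description": "Searches a Wikipedia database",
    "provider": "huggingface-hub",
    "container": "neuml/txtai-wikipedia",
}

# Local index (hypothetical): "path" points at a saved embeddings directory.
notes = {
    "name": "notes",
    "description": "Searches internal project notes",
    "path": "indexes/notes",
}

# One entry per knowledge domain; the agent picks between them by description.
embeddings_tools = [wikipedia, notes]
```

The descriptions do the routing work here, so they should state what the index contains rather than how it was built.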

Step 2: Define Custom Tools

Create Python functions that the agent can invoke. Each function must have a docstring describing its purpose, parameters, and return value. The agent uses these descriptions to decide when and how to call each tool. Functions can perform any operation: date/time queries, calculations, API calls, data transformations, etc.

Key considerations:

Functions must have descriptive docstrings that document each parameter and the return value, plus type annotations on the signature
The agent reads the docstring to understand when to use the tool
Functions should be self-contained and handle their own error cases
Simple, synchronous utility functions work well as agent tools
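
The points above can be illustrated with two small functions. `today` mirrors the date/time example commonly shown for txtai agents. `percent_change` is a hypothetical helper added here only to show a documented parameter list and explicit error handling.

```python
from datetime import datetime


def today() -> str:
    """
    Gets the current date and time.

    Returns:
        current date and time as an ISO 8601 string
    """
    return datetime.today().isoformat()


def percent_change(old: float, new: float) -> float:
    """
    Computes the percent change between two values.

    Args:
        old: starting value, must be non-zero
        new: ending value

    Returns:
        percent change from old to new
    """
    # Fail with a clear message instead of a bare ZeroDivisionError.
    if old == 0:
        raise ValueError("old must be non-zero")
    return (new - old) / old * 100.0
```

Raising an exception with a readable message is one way to handle error cases. Returning an error string is another, if the agent should see the failure as an ordinary observation.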

Step 3: Assemble the Tool List

Combine embeddings databases, custom functions, and built-in tools into a single tools list. Built-in tools include "websearch" for internet queries and "webview" for loading web page content. Skill files (skill.md) can also be added as knowledge-retrieval tools.

Key considerations:

Embeddings databases are passed as dictionaries with a name, a description, and either a local path or a provider/container pair
Python functions are passed directly as callable references
Built-in tools are passed as string identifiers ("websearch", "webview")
The agent selects tools based on their descriptions relative to the user's request
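
A sketch of a combined tools list holding all three entry types. The `kind` helper is hypothetical and not part of txtai. It only makes the three shapes explicit, and it simplifies by treating every dictionary as an embeddings tool.

```python
from datetime import datetime


def today() -> str:
    """
    Gets the current date and time.

    Returns:
        current date and time as an ISO 8601 string
    """
    return datetime.today().isoformat()


wikipedia = {
    "name": "wikipedia",
    "description": "Searches a Wikipedia database",
    "provider": "huggingface-hub",
    "container": "neuml/txtai-wikipedia",
}

# Dictionary = embeddings database, callable = Python function,
# string = built-in tool identifier.
tools = [wikipedia, today, "websearch", "webview"]


def kind(tool) -> str:
    """Hypothetical helper: labels a tools-list entry by its shape."""
    if isinstance(tool, dict):
        return "embeddings"
    if callable(tool):
        return "function"
    if isinstance(tool, str):
        return "builtin"
    raise TypeError(f"unsupported tool entry: {tool!r}")
```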

Step 4: Configure and Create the Agent

Instantiate the Agent class with the LLM model path, tools list, and execution parameters. Key parameters include max_steps (limits reasoning iterations), memory (number of prior interactions to retain), and custom instructions or templates for agent behavior.

Key considerations:

The model parameter accepts the same formats as the LLM pipeline (Hugging Face paths, llama.cpp GGUF, LiteLLM provider strings)
max_steps prevents runaway agent loops (default varies by backend)
Memory enables conversational context across multiple agent calls
Custom instructions (including agents.md files) can guide agent behavior and persona
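
A minimal sketch of agent construction, assuming the keyword names given in this step (`model`, `tools`, `max_steps`, `memory`). These names have changed between txtai releases, so check them against the installed version. The model strings are illustrative examples of the three path formats. `agent_config` and `build_agent` are hypothetical helpers.

```python
# Illustrative model strings, one per backend format (adjust to available models).
EXAMPLE_MODELS = {
    "transformers": "Qwen/Qwen3-4B-Instruct-2507",
    "llama.cpp": "unsloth/Qwen3-4B-Instruct-2507-GGUF/Qwen3-4B-Instruct-2507-Q4_K_M.gguf",
    "litellm": "ollama/llama3.1",
}


def agent_config(model, tools, max_steps=10, memory=None):
    """Hypothetical helper: collects the constructor arguments in one place."""
    config = {"model": model, "tools": tools, "max_steps": max_steps}
    # Only set memory when requested (assumed to be a count of prior interactions).
    if memory is not None:
        config["memory"] = memory
    return config


def build_agent(config):
    """Creates the agent. txtai is imported lazily because loading it is expensive."""
    from txtai import Agent

    return Agent(**config)
```

Keeping the configuration in a plain dictionary makes it possible to log or review the settings before any model is loaded.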

Step 5: Execute Agent Queries

Call the agent with natural language requests. The agent enters a reasoning loop: it analyzes the request, selects appropriate tools, executes them, observes results, and iterates until it has a satisfactory answer. The agent autonomously decides the sequence and combination of tools.

Key considerations:

Complex requests may require multiple tool invocations across several reasoning steps
The agent can combine search results from multiple databases with web data and function outputs
Streaming mode provides real-time visibility into the agent's reasoning process
The reset parameter clears conversation memory for fresh interactions
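
A sketch of the calling pattern, assuming the agent is a callable that takes the request text plus keyword options, and that `reset` is accepted as described above. `EchoAgent` is a stand-in used so the example runs without a model. With a real txtai Agent, only the `ask` helper (also hypothetical) would be needed.

```python
class EchoAgent:
    """Stand-in for a txtai Agent: records calls and returns a canned answer."""

    def __init__(self):
        self.calls = []

    def __call__(self, text, **kwargs):
        self.calls.append((text, kwargs))
        return f"answer to: {text}"


def ask(agent, request, fresh=False, **options):
    """Sends a request; fresh=True asks the agent to clear conversation memory."""
    if fresh:
        options["reset"] = True
    return agent(request, **options)


agent = EchoAgent()

# First request starts a clean conversation.
first = ask(agent, "Which city has the larger population, Boston or New York?", fresh=True)

# Follow-up relies on the retained context from the first request.
second = ask(agent, "How far apart are they?")
```

Other call-time options, such as a streaming flag, could be forwarded through `options` in the same way if the installed version supports them.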

GitHub URL

Workflow Repository