Workflow:Neuml Txtai Agent Orchestration
| Knowledge Sources | |
|---|---|
| Domains | AI_Agents, LLMs, Tool_Use |
| Last Updated | 2026-02-09 18:00 GMT |
Overview
End-to-end process for creating an AI agent that autonomously combines embeddings search, Python functions, web tools, and LLM reasoning to solve complex multi-step user requests.
Description
This workflow demonstrates how to build autonomous AI agents using txtai's Agent class, which is built on top of the Hugging Face smolagents framework. Agents differ from RAG pipelines in that they dynamically decide which tools to use and in what order, iterating through a reasoning loop until they arrive at an answer. Tools can include embeddings databases for semantic search, arbitrary Python functions, built-in capabilities (web search, web page loading), and skill definitions loaded from markdown files. The Agent supports all LLM backends available in txtai: local Hugging Face models, llama.cpp GGUF models, and remote APIs via LiteLLM (OpenAI, Claude, Bedrock, etc.).
Usage
Execute this workflow when the user's task is complex, open-ended, or requires dynamic tool selection that cannot be predetermined. Agents are appropriate when multiple data sources or processing steps may be needed, but the exact sequence depends on intermediate results. For simple, predictable retrieval tasks, prefer the RAG Pipeline workflow instead.
Execution Steps
Step 1: Prepare the Embeddings Database
Set up one or more embeddings databases that the agent can search. These can be local indexes built with the Semantic Search Pipeline workflow, or remote pre-built indexes hosted on Hugging Face Hub. Each database is described with a name and description so the agent understands when to use it.
Key considerations:
- Each embeddings tool needs a clear name and a description, because the agent relies on both when selecting tools
- Remote indexes use a provider/container configuration (e.g., the Hugging Face Hub as provider and a hosted index repository as container)
- Local indexes use a path to the saved embeddings directory
- Multiple embeddings databases can be provided for different knowledge domains
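As a minimal sketch, the two kinds of embeddings tool might be declared as plain dictionaries. The "huggingface-hub" provider and "neuml/txtai-wikipedia" container follow the example in the txtai documentation. The local tool's name, description, and path are hypothetical placeholders.

```python
# Remote index hosted on the Hugging Face Hub (provider/container form)
wikipedia = {
    "name": "wikipedia",
    "description": "Searches a Wikipedia database for general knowledge",
    "provider": "huggingface-hub",
    "container": "neuml/txtai-wikipedia",
}

# Local index saved to disk earlier; name, description, and path are placeholders
support_docs = {
    "name": "support_docs",
    "description": "Searches internal product support articles",
    "path": "indexes/support-docs",
}

# One entry per knowledge domain; the agent chooses by name and description
embeddings_tools = [wikipedia, support_docs]
```

Keeping each description specific to its domain should make it easier for the agent to tell the databases apart.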
Step 2: Define Custom Tools
Create Python functions that the agent can invoke. Each function must have a docstring describing its purpose, parameters, and return value. The agent uses these descriptions to decide when and how to call each tool. Functions can perform any operation: date/time queries, calculations, API calls, data transformations, etc.
Key considerations:
- Functions must have descriptive docstrings, along with type annotations for parameters and the return value
- The agent reads the docstring to understand when to use the tool
- Functions should be self-contained and handle their own error cases
- Simple, synchronous utility functions work well as agent tools
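The following sketch shows two functions written in this style. `today` mirrors the example in the txtai documentation, while `days_between` is a hypothetical helper added here only to illustrate documented parameters and error handling.

```python
from datetime import date, datetime


def today() -> str:
    """
    Gets the current date and time.

    Returns:
        current date and time as an ISO 8601 string
    """
    return datetime.today().isoformat()


def days_between(start: str, end: str) -> int:
    """
    Counts the days from one date to another.

    Args:
        start: first date in YYYY-MM-DD format
        end: second date in YYYY-MM-DD format

    Returns:
        number of days from start to end, negative if end is earlier
    """
    try:
        first = date.fromisoformat(start)
        second = date.fromisoformat(end)
    except ValueError as error:
        # Re-raise with a message that explains the expected input format
        raise ValueError(f"Dates must use YYYY-MM-DD format: {error}") from error

    return (second - first).days
```

The docstring, not the function body, is what the agent sees, so each argument is described in terms of the values the agent is expected to supply.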
Step 3: Assemble the Tool List
Combine embeddings databases, custom functions, and built-in tools into a single tools list. Built-in tools include "websearch" for internet queries and "webview" for loading web page content. Skill files (skill.md) can also be added as knowledge-retrieval tools.
Key considerations:
- Embeddings databases are passed as dictionaries with a name, a description, and either a local path or a provider/container pair
- Python functions are passed directly as callable references
- Built-in tools are passed as string identifiers ("websearch", "webview")
- The agent selects tools based on their descriptions relative to the user's request
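A combined list might look like the sketch below, assuming the dictionary keys and built-in identifiers described above. The `describe` helper is hypothetical and exists only to show which kind of entry each item is.

```python
from datetime import datetime


def today() -> str:
    """
    Gets the current date and time.

    Returns:
        current date and time as an ISO 8601 string
    """
    return datetime.today().isoformat()


# Embeddings database described by a dictionary
wikipedia = {
    "name": "wikipedia",
    "description": "Searches a Wikipedia database",
    "provider": "huggingface-hub",
    "container": "neuml/txtai-wikipedia",
}

# Dictionaries, callables, and built-in tool names share one list
tools = [wikipedia, today, "websearch", "webview"]


def describe(tool):
    """Returns a (kind, name) pair for a tool entry (illustrative helper)."""
    if isinstance(tool, dict):
        return "embeddings", tool["name"]
    if callable(tool):
        return "function", tool.__name__
    return "built-in", tool


for entry in tools:
    print(describe(entry))
```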
Step 4: Configure and Create the Agent
Instantiate the Agent class with the LLM model path, tools list, and execution parameters. Key parameters include max_steps (limits reasoning iterations), memory (number of prior interactions to retain), and custom instructions or templates for agent behavior.
Key considerations:
- The model parameter accepts the same formats as the LLM pipeline (Hugging Face paths, llama.cpp GGUF, LiteLLM provider strings)
- max_steps prevents runaway agent loops (default varies by backend)
- Memory enables conversational context across multiple agent calls
- Custom instructions (including agents.md files) can guide agent behavior and persona
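The configuration can be sketched as keyword arguments collected in a dictionary and passed to `Agent`. The model name below is only an example choice, and `agent_config` and `build_agent` are hypothetical helpers. The `memory` keyword is taken from the description above and should be checked against the installed txtai version before relying on it.

```python
def agent_config(tools, model, max_steps=10, memory=None):
    """
    Builds keyword arguments for an agent (illustrative helper).

    Args:
        tools: list of embeddings dictionaries, functions, and built-in tool names
        model: LLM path or provider string
        max_steps: upper bound on reasoning iterations
        memory: number of prior interactions to retain, omitted when None
    """
    config = {"model": model, "tools": tools, "max_steps": max_steps}
    if memory is not None:
        # Assumption: keyword name follows this workflow's description
        config["memory"] = memory
    return config


def build_agent(config):
    """Creates the agent; txtai is imported lazily because loading a model is expensive."""
    from txtai import Agent

    return Agent(**config)


config = agent_config(
    tools=["websearch", "webview"],
    model="Qwen/Qwen3-4B-Instruct-2507",  # example model, swap in any supported LLM
    max_steps=10,
)
print(config)

# agent = build_agent(config)  # uncomment to load the model and create the agent
```

Separating the configuration from construction keeps the settings easy to inspect or log without loading a model.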
Step 5: Execute Agent Queries
Call the agent with natural language requests. The agent enters a reasoning loop: it analyzes the request, selects appropriate tools, executes them, observes results, and iterates until it has a satisfactory answer. The agent autonomously decides the sequence and combination of tools.
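The loop can be pictured with the toy sketch below. It is a conceptual illustration only, not the txtai or smolagents implementation: `run_loop` and `scripted_policy` are hypothetical, and the scripted policy stands in for the LLM's decisions.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A decision is (tool name, argument); a tool name of None means "final answer"
Decision = Tuple[Optional[str], str]


def run_loop(
    request: str,
    decide: Callable[[str, List[str]], Decision],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Runs a decide-act-observe loop until an answer is produced or steps run out."""
    observations: List[str] = []

    for _ in range(max_steps):
        # 1. Analyze the request together with everything observed so far
        name, argument = decide(request, observations)

        # 2. Stop as soon as the policy returns a final answer
        if name is None:
            return argument

        # 3. Execute the chosen tool and record the result for the next step
        observations.append(tools[name](argument))

    return "No answer within the step limit"


def scripted_policy(request: str, observations: List[str]) -> Decision:
    """Stand-in for the LLM: look something up once, then answer."""
    if not observations:
        return "lookup", request
    return None, f"Answer based on: {observations[-1]}"


tools = {"lookup": lambda query: f"notes about {query}"}
print(run_loop("txtai agents", scripted_policy, tools))
```

The step limit plays the same role as max_steps: a policy that never produces a final answer still terminates.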
Key considerations:
- Complex requests may require multiple tool invocations across several reasoning steps
- The agent can combine search results from multiple databases with web data and function outputs
- Streaming mode provides real-time visibility into the agent's reasoning process
- The reset parameter clears conversation memory for fresh interactions
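A query session might be wrapped as in the sketch below. `run_queries` is a hypothetical helper that accepts any callable agent, and `echo_agent` is a stub so the example runs without loading a model. Extra keyword arguments are forwarded unchanged to the agent.

```python
def run_queries(agent, questions, **options):
    """
    Sends each question to the agent and collects the answers.

    Args:
        agent: callable that takes a text request and returns an answer
        questions: list of natural language requests
        options: extra keyword arguments forwarded to every agent call

    Returns:
        list of (question, answer) pairs
    """
    return [(question, agent(question, **options)) for question in questions]


def echo_agent(text, **options):
    """Stub that stands in for a real agent in this example."""
    return f"(stub answer to: {text})"


for question, answer in run_queries(echo_agent, ["What is txtai?"]):
    print(question, "->", answer)
```

With an agent created in Step 4, the same helper could be called as `run_queries(agent, questions, maxlength=16000)`. Streaming and memory reset would be requested through keyword arguments in the same way; the names `stream` and `reset` are assumed from the considerations above and should be confirmed against the installed version.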