Principle:Neuml Txtai Agent Tool Definition

Overview

An agent is only as capable as the tools it can access. The Agent Tool Definition principle addresses how custom tools -- both function tools and skill tools -- are defined, described, and made available to an LLM-based agent in txtai. Good tool definitions enable the language model to understand what each tool does, when to invoke it, and how to pass arguments correctly.

Function Calling in Agents

Modern LLM agent frameworks rely on function calling (also known as tool calling) as their primary mechanism for extending model capabilities. The pattern works as follows:

Each tool is described by a schema comprising a name, a natural-language description, a set of typed input parameters, and an output type.
The agent's system prompt includes these schemas so the LLM can reason about which tool best addresses each sub-task.
When the LLM decides to use a tool, it emits a structured action containing the tool name and argument values.
The orchestrator maps the action to the real function, executes it, and returns the result.
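
The pattern above can be sketched in plain Python. This is a minimal illustration rather than the code txtai or smolagents actually runs: the `TOOLS` registry, the `add` tool, the `render_prompt` and `dispatch` helpers, and the JSON action format are all hypothetical names chosen for this example.

```python
import json

# Hypothetical tool: a plain function plus the schema the LLM would see
def add(a, b):
    return a + b

TOOLS = {
    "add": {
        "description": "Adds two integers.",
        "inputs": {
            "a": {"type": "integer", "description": "First addend"},
            "b": {"type": "integer", "description": "Second addend"},
        },
        "output_type": "integer",
        "target": add,
    }
}

def render_prompt(tools):
    """Describe every tool as text that could be placed in the system prompt."""
    lines = []
    for name, schema in tools.items():
        params = ", ".join(f"{p}: {spec['type']}" for p, spec in schema["inputs"].items())
        lines.append(f"- {name}({params}) -> {schema['output_type']}: {schema['description']}")
    return "\n".join(lines)

def dispatch(action, tools):
    """Map a structured action to the real function, run it, and return the result."""
    call = json.loads(action)
    tool = tools[call["name"]]
    return tool["target"](**call["arguments"])

# A structured action as an LLM might emit it (the JSON shape is an assumption)
action = '{"name": "add", "arguments": {"a": 2, "b": 3}}'

prompt = render_prompt(TOOLS)
result = dispatch(action, TOOLS)
```

Keeping the schema and the callable in one registry entry means the text shown to the model and the function that actually runs cannot drift apart.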

The quality of tool definitions is critical. Vague descriptions lead to incorrect tool selection; poorly typed inputs lead to runtime errors. txtai provides structured interfaces for defining tools so that these schemas are accurate and consistent.

Two Kinds of Custom Tools

Function Tools

A FunctionTool wraps an arbitrary Python callable (function, method, or callable object) along with descriptive metadata. The metadata is provided as a configuration dictionary containing:

name -- Identifier used in the agent prompt and action parsing.
description -- Natural-language explanation of what the function does.
inputs -- Dictionary mapping parameter names to their types and descriptions.
output (or output_type) -- Type of the value the function returns.
target -- The actual callable to invoke.

Function tools are the most flexible tool type. They can wrap any Python function, from simple arithmetic to complex API calls.
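
As an illustration, a configuration dictionary with the keys listed above might look like the following. The `SketchFunctionTool` class is a hypothetical stand-in written for this example so that it runs without txtai installed; it only shows how such a wrapper could store the metadata and delegate to the target. The `celsius_to_fahrenheit` function and the `"any"` default for a missing output type are likewise assumptions of the sketch.

```python
def celsius_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32

# Configuration dictionary using the keys described above
config = {
    "name": "celsius_to_fahrenheit",
    "description": "Converts a temperature from Celsius to Fahrenheit.",
    "inputs": {
        "celsius": {"type": "number", "description": "Temperature in degrees Celsius"}
    },
    "output": "number",
    "target": celsius_to_fahrenheit,
}

class SketchFunctionTool:
    """Hypothetical stand-in: stores the metadata and delegates calls to the target."""

    def __init__(self, config):
        self.name = config["name"]
        self.description = config["description"]
        self.inputs = config["inputs"]
        # Accept either key; falling back to "any" is an assumption of this sketch
        self.output_type = config.get("output", config.get("output_type", "any"))
        self.target = config["target"]

    def forward(self, *args, **kwargs):
        return self.target(*args, **kwargs)

tool = SketchFunctionTool(config)
fahrenheit = tool.forward(celsius=100)
```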

Skill Tools

A SkillTool loads a skill.md file -- a Markdown document with optional YAML frontmatter. The frontmatter provides the tool's name and description; the Markdown body contains the knowledge content. When invoked, the skill tool does not execute code. Instead, it returns the Markdown content alongside the user's request, asking the LLM to find the best answer within that content.

Skill tools are useful for:

Static knowledge bases that are small enough to fit in a prompt.
Guided procedures where the answer is contained in a well-structured document.
Low-code scenarios where non-programmers can contribute tools by writing Markdown files.
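
The behaviour described above can be approximated in a few lines. This is a sketch under stated assumptions, not txtai's SkillTool: `parse_skill` and `skill_forward` are hypothetical helpers, the frontmatter parser only handles flat `key: value` pairs instead of full YAML, and the wording of the returned prompt is invented for the example.

```python
SKILL_MD = """---
name: refund_policy
description: Answers questions about the refund policy
---
# Refund policy

Refunds are available within 30 days of purchase.
"""

def parse_skill(text):
    """Split a skill document into frontmatter metadata and the Markdown body."""
    meta, body = {}, text
    if text.startswith("---\n"):
        header, _, body = text[4:].partition("\n---\n")
        for line in header.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip()
    return meta, body.strip()

def skill_forward(request, body):
    """Return the content alongside the request; no code from the skill is executed."""
    return (
        "Find the best answer to the request using the content below.\n\n"
        f"Request: {request}\n\n"
        f"Content:\n{body}"
    )

meta, body = parse_skill(SKILL_MD)
response = skill_forward("Can I get a refund after two weeks?", body)
```

Because the tool only returns text, the quality of the answer depends on how well the Markdown is organised, which is why well-structured documents suit this tool type.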

Tool Interfaces

All tools in txtai's agent framework implement the smolagents.Tool interface, which requires:

name -- A short, unique string identifier.
description -- A sentence or paragraph explaining the tool's purpose.
inputs -- A dictionary of parameter definitions, each with type and description keys.
output_type -- The type of value returned by the tool.
forward(*args, **kwargs) -- The method that performs the actual work.

By adhering to this interface, all tool types (embeddings, function, skill) present a uniform contract to the agent orchestrator.
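
One way to picture this contract is as a checklist that any tool object must satisfy. The sketch below is purely illustrative and does not use smolagents: `EchoTool`, `Incomplete`, and `missing_members` are hypothetical names, and the check covers only the members listed above.

```python
REQUIRED = ("name", "description", "inputs", "output_type", "forward")

class EchoTool:
    """Hypothetical tool that provides every member of the contract."""
    name = "echo"
    description = "Returns the text it is given."
    inputs = {"text": {"type": "string", "description": "Text to return"}}
    output_type = "string"

    def forward(self, text):
        return text

class Incomplete:
    """Hypothetical tool that omits several members."""
    name = "broken"
    inputs = {"x": {"type": "any"}}

def missing_members(tool):
    """List the contract members a tool lacks, including incomplete input specs."""
    missing = [member for member in REQUIRED if not hasattr(tool, member)]
    for pname, spec in getattr(tool, "inputs", {}).items():
        for key in ("type", "description"):
            if key not in spec:
                missing.append(f"inputs[{pname!r}][{key!r}]")
    return missing

ok = missing_members(EchoTool())
bad = missing_members(Incomplete())
```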

The ToolFactory

The ToolFactory serves as the central registry and construction point for tools. It accepts a heterogeneous list of tool specifications and normalises them:

Tool instances are passed through as-is.
Callables (functions, methods, callable objects) are auto-wrapped using type annotations and docstrings when available, or via manual configuration.
Dictionaries are dispatched to either EmbeddingsTool (for embeddings-backed search) or FunctionTool (for general functions).
String aliases map to built-in defaults ("python", "websearch", "webview").
URL strings starting with http import MCP tool collections via mcpadapt.
File paths ending in .md are loaded as skill tools.

This factory approach means that agent configurations can mix and match tool types freely.
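
The dispatch rules listed above can be sketched as a single classification function. This is not the ToolFactory source: `classify`, `SketchTool`, and `shout` are hypothetical, the example URL and file path are made up, and the rule that a `"path"` key marks an embeddings-backed dictionary is an assumption of this sketch. Raising on unrecognised strings is also a choice made here for clarity.

```python
DEFAULTS = {"python", "websearch", "webview"}

class SketchTool:
    """Hypothetical stand-in for an already constructed tool instance."""

def shout(text: str) -> str:
    return text.upper()

def classify(spec):
    """Return a label naming how a tool specification would be handled."""
    if isinstance(spec, SketchTool):
        return "instance"      # passed through as-is
    if isinstance(spec, str):
        if spec in DEFAULTS:
            return "default"   # built-in alias
        if spec.startswith("http"):
            return "mcp"       # remote MCP tool collection
        if spec.endswith(".md"):
            return "skill"     # Markdown skill file
        raise ValueError(f"Unrecognised tool string: {spec}")
    if isinstance(spec, dict):
        # Assumption of this sketch: a "path" key signals an embeddings-backed tool
        return "embeddings" if "path" in spec else "function"
    if callable(spec):
        return "callable"      # auto-wrapped from annotations and docstring
    raise TypeError(f"Unsupported tool specification: {spec!r}")

specs = [
    SketchTool(),
    shout,
    {"name": "shout", "target": shout},
    {"path": "index-dir"},
    "websearch",
    "http://localhost:8000/mcp",
    "notes/skill.md",
]
labels = [classify(spec) for spec in specs]
```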

Automatic Tool Creation

When a callable has proper type annotations and a Google-style docstring, the ToolFactory.createtool method can automatically generate a fully described tool using the smolagents.tool decorator function. If annotations are missing, it falls back to fromdocs, which parses the docstring to extract parameter descriptions and builds a FunctionTool manually.

This dual-path strategy ensures that:

Well-annotated functions require zero configuration beyond passing the callable.
Legacy or third-party functions without annotations can still be wrapped with reasonable defaults.
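
The two paths can be illustrated with the standard library alone. The sketch below is not the `createtool` or `fromdocs` implementation: `describe`, `arg_docs`, and `TYPE_NAMES` are hypothetical, the docstring parser handles only single-line `name: description` entries, and typing every parameter as "any" on the fallback path is an assumption of this example.

```python
import inspect

# Assumed mapping from Python annotations to schema type strings
TYPE_NAMES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def arg_docs(func):
    """Parse 'name: description' lines from the Args: section of a Google-style docstring."""
    doc = inspect.getdoc(func) or ""
    _, _, section = doc.partition("Args:")
    docs = {}
    for line in section.splitlines():
        if line.strip().startswith("Returns:"):
            break
        name, sep, text = line.partition(":")
        # Drop an optional "(type)" suffix such as "city (str)"
        name = name.split("(")[0].strip()
        if sep and name:
            docs[name] = text.strip()
    return docs

def describe(func):
    """Build an inputs dictionary, using annotations only when every parameter has one."""
    params = inspect.signature(func).parameters
    docs = arg_docs(func)
    annotated = all(p.annotation is not inspect.Parameter.empty for p in params.values())
    inputs = {}
    for name, param in params.items():
        # Fallback path: without full annotations every parameter is typed "any"
        kind = TYPE_NAMES.get(param.annotation, "any") if annotated else "any"
        inputs[name] = {"type": kind, "description": docs.get(name, "")}
    return inputs

def annotated_forecast(city: str, days: int) -> str:
    """
    Gets a weather forecast.

    Args:
        city: Name of the city
        days: Number of days to forecast

    Returns:
        Forecast text
    """
    return f"{days}-day forecast for {city}"

def legacy_forecast(city, days):
    """
    Gets a weather forecast.

    Args:
        city: Name of the city
        days: Number of days to forecast
    """
    return f"{days}-day forecast for {city}"

typed_inputs = describe(annotated_forecast)
untyped_inputs = describe(legacy_forecast)
```

In both cases the parameter descriptions survive; only the type information degrades when annotations are absent.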

Design Considerations

Naming Conventions

Tool names should be lowercase, descriptive, and free of special characters. They appear in the LLM's prompt and in parsed actions. Ambiguous names (e.g., search when multiple search tools exist) degrade agent accuracy.
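
A small check run before building an agent can catch names that break these conventions. The helper below is a hypothetical sketch; the exact pattern (a lowercase letter followed by lowercase letters, digits, or underscores) is an assumption, not a rule enforced by txtai.

```python
import re
from collections import Counter

# Assumed convention: lowercase letter first, then lowercase letters, digits, underscores
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")

def name_problems(names):
    """Report tool names that are malformed or used more than once."""
    problems = [f"invalid: {name}" for name in names if not NAME_PATTERN.match(name)]
    for name, count in Counter(names).items():
        if count > 1:
            problems.append(f"duplicate: {name}")
    return problems

problems = name_problems(["wiki_search", "Web-Search", "search", "search"])
```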

Input Typing

The inputs dictionary uses simple type strings ("string", "any", "integer", etc.). While these are not enforced at runtime, they guide the LLM's argument generation. Setting types to "any" is a safe fallback when the exact type is unknown.

Composability

Because all tools share the same interface, they compose naturally. An agent can be given an embeddings tool, a Python interpreter, a web search tool, and a custom function tool, and the LLM will select among them based on the task at hand.
