Principle:Ollama Ollama ToolCalling
| Knowledge Sources | |
|---|---|
| Domains | Tool Calling, Agent |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Tool Calling enables LLMs served through Ollama to invoke external functions by generating structured tool call requests as part of their output, supporting agentic workflows where the model can interact with external systems, databases, APIs, and code execution environments.
Core Concepts
Tool Definition
Tools are defined as JSON objects specifying a function name, description, and parameter schema (following JSON Schema conventions). These definitions are provided in the chat request and injected into the model's prompt context, informing the model about available tools and the expected format for invoking them. The tool schema serves as a contract between the model's output and the calling application's expectations.
Tool Call Generation
When the model determines that a tool should be invoked, it generates a structured output containing the tool name and arguments. Different model families use different formats for tool calls: some emit JSON objects, others use XML-like tags, and some use model-specific markup. The tool calling system must handle these format variations while presenting a unified interface to the API consumer.
Template-Based Formatting
Tool definitions are injected into the prompt using the model's chat template. Each model family has a specific template for rendering tool definitions in a format the model was trained to understand. The template system handles the translation between Ollama's generic tool definition format and the model-specific format, ensuring correct tool invocation behavior across different model architectures.
Multi-Turn Tool Use
Tool calling typically occurs across multiple conversation turns. The model generates a tool call, the application executes the function and returns the result, and the model incorporates the result into its continued generation. The chat message format includes a dedicated "tool" role for tool results, and the prompt construction logic correctly positions tool call and tool result messages within the conversation context.
Parallel Tool Calls
Some models support generating multiple tool calls in a single response, enabling parallel execution of independent functions. The parsing infrastructure handles extracting multiple tool calls from a single generation and returns them as an array in the API response.
Implementation Notes
Tool template rendering is in tools/template.go, which defines how tool definitions are formatted for each model family. Tool call testing and validation is in tools/tools.go. Extended tool calling support under x/tools/ provides additional model-specific tool call formats. The API layer in server/routes.go handles the tool call request/response protocol, and the OpenAI-compatible layer in openai/ translates between OpenAI's tool calling format and Ollama's internal representation.