Implementation: Ollama Chat Handler
| Knowledge Sources | Details |
|---|---|
| Domains | Systems, Networking, API_Design |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
HTTP handlers in the server package that serve chat and generate inference requests with streaming response delivery.
Description
ChatHandler and GenerateHandler are the primary HTTP endpoint handlers for Ollama's inference API. They orchestrate the complete request lifecycle: parse the request, obtain a model runner from the scheduler, construct the prompt, invoke the inference engine, and stream the response back to the client.
ChatHandler (/api/chat) handles multi-turn conversations with support for tool calling, thinking mode, structured output (JSON schema), and image inputs. GenerateHandler (/api/generate) handles single-turn text completion with raw prompt input.
Both handlers support streaming (default) and non-streaming modes. In streaming mode, partial responses are written as newline-delimited JSON objects flushed after each token batch.
Usage
These are the primary inference endpoints for Ollama clients. ChatHandler is used for conversational interactions; GenerateHandler for raw text completion.
Code Reference
Source Location
- Repository: ollama
- File: server/routes.go
- Lines: L1983-2546 (ChatHandler), L183-664 (GenerateHandler)
Signature
func (s *Server) ChatHandler(c *gin.Context)
func (s *Server) GenerateHandler(c *gin.Context)
Import
import "github.com/ollama/ollama/server"
I/O Contract
Inputs (ChatHandler)
| Name | Type | Required | Description |
|---|---|---|---|
| c | *gin.Context | Yes | HTTP context with api.ChatRequest JSON body |
| Model | string | Yes | Model name (in request body) |
| Messages | []api.Message | Yes | Chat message history |
| Stream | *bool | No | Enable streaming (default: true) |
| Tools | []api.Tool | No | Function calling tool definitions |
| Format | json.RawMessage | No | Structured output schema |
| Options | api.Options | No | Runtime inference options |
Inputs (GenerateHandler)
| Name | Type | Required | Description |
|---|---|---|---|
| c | *gin.Context | Yes | HTTP context with api.GenerateRequest JSON body |
| Model | string | Yes | Model name (in request body) |
| Prompt | string | Yes | Raw text prompt |
| Stream | *bool | No | Enable streaming (default: true) |
| Images | []ImageData | No | Base64-encoded image data for multimodal models |
| Options | api.Options | No | Runtime inference options |
Outputs
| Name | Type | Description |
|---|---|---|
| Streaming response | NDJSON | Sequence of api.ChatResponse/api.GenerateResponse JSON objects |
| Final response | JSON | Single JSON object when stream=false; includes done=true and timing metrics |
Usage Examples
Chat API Call
# Streaming chat request
curl http://localhost:11434/api/chat -d '{
"model": "llama3",
"messages": [
{"role": "user", "content": "Why is the sky blue?"}
]
}'
# Non-streaming
curl http://localhost:11434/api/chat -d '{
"model": "llama3",
"messages": [
{"role": "user", "content": "Why is the sky blue?"}
],
"stream": false
}'
Generate API Call
curl http://localhost:11434/api/generate -d '{
"model": "llama3",
"prompt": "Once upon a time"
}'
Tool Calling
curl http://localhost:11434/api/chat -d '{
"model": "llama3",
"messages": [
{"role": "user", "content": "What is the weather in Paris?"}
],
"tools": [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string"}
},
"required": ["location"]
}
}
}]
}'
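When the model decides to call a tool, the response message carries a tool_calls array instead of plain content, and the client is responsible for executing the function and sending the result back in a follow-up message. The sketch below extracts the first tool call from such a response; the field names mirror the public API but the struct itself is an assumption for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCallResponse models the subset of a chat response that carries
// tool calls; field names follow the public API but are assumptions here.
type toolCallResponse struct {
	Message struct {
		ToolCalls []struct {
			Function struct {
				Name      string                 `json:"name"`
				Arguments map[string]interface{} `json:"arguments"`
			} `json:"function"`
		} `json:"tool_calls"`
	} `json:"message"`
}

// firstToolCall returns the name and arguments of the first tool call
// in a response body, or an error if none is present.
func firstToolCall(body []byte) (string, map[string]interface{}, error) {
	var resp toolCallResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", nil, err
	}
	if len(resp.Message.ToolCalls) == 0 {
		return "", nil, fmt.Errorf("no tool calls in response")
	}
	tc := resp.Message.ToolCalls[0]
	return tc.Function.Name, tc.Function.Arguments, nil
}

func main() {
	// Simulated response to the get_weather request above.
	body := []byte(`{"message":{"role":"assistant","tool_calls":[{"function":{"name":"get_weather","arguments":{"location":"Paris"}}}]}}`)
	name, args, err := firstToolCall(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(name, args["location"])
}
```

After executing the tool, the client appends a message with role "tool" containing the result and re-invokes /api/chat so the model can produce its final answer.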