Implementation: Ollama Inference Handler
| Knowledge Sources | |
|---|---|
| Domains | Systems, Model_Serving |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
A concrete tool for dispatching inference requests through Ollama's native handler pipeline, provided by the server package.
Description
In the OpenAI compatibility context, ChatHandler and GenerateHandler serve as the inference dispatch point. When OpenAI middleware translates a request and replaces the request body, the native handler processes it identically to a native Ollama request. The response is captured by the middleware's custom response writer for format translation.
This implementation is shared with the Response_Streaming principle's implementation (Chat_Handler) but is documented separately here to reflect its dispatch role in the OpenAI compatibility workflow.
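The request-side translation that precedes dispatch can be sketched as a pure function: decode the OpenAI payload, re-encode it in the native shape ChatHandler expects, and the swapped body is all the handler ever sees. This is a simplified illustration under assumed type shapes, not Ollama's actual conversion code, which handles many more fields (options, tools, images); all struct names here are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical, simplified request shapes. The real types in the ollama
// repo carry many more fields; only model/messages/stream are shown.
type openAIChatRequest struct {
	Model    string `json:"model"`
	Messages []struct {
		Role    string `json:"role"`
		Content string `json:"content"`
	} `json:"messages"`
	Stream bool `json:"stream"`
}

type ollamaMsg struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type ollamaChatRequest struct {
	Model    string      `json:"model"`
	Messages []ollamaMsg `json:"messages"`
	Stream   *bool       `json:"stream,omitempty"`
}

// translateBody mimics what the OpenAI middleware does before it swaps
// the request body on the gin context: decode the OpenAI payload and
// re-encode it as a native Ollama chat request.
func translateBody(openAIJSON []byte) ([]byte, error) {
	var in openAIChatRequest
	if err := json.Unmarshal(openAIJSON, &in); err != nil {
		return nil, err
	}
	out := ollamaChatRequest{Model: in.Model, Stream: &in.Stream}
	for _, m := range in.Messages {
		out.Messages = append(out.Messages, ollamaMsg{Role: m.Role, Content: m.Content})
	}
	return json.Marshal(out)
}

func main() {
	body := []byte(`{"model":"llama3","messages":[{"role":"user","content":"Hello"}],"stream":true}`)
	translated, err := translateBody(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(translated))
}
```

After this substitution the native handler decodes the body exactly as it would for a request sent directly to /api/chat.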
Usage
Invoked automatically by the middleware chain. The native handlers are unaware they are serving an OpenAI-format request.
Code Reference
Source Location
- Repository: ollama
- File: server/routes.go
- Lines: L1983-2546 (ChatHandler), L183-664 (GenerateHandler)
Signature
func (s *Server) ChatHandler(c *gin.Context)
func (s *Server) GenerateHandler(c *gin.Context)
Import
import "github.com/ollama/ollama/server"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| c | *gin.Context | Yes | HTTP context with translated Ollama request body (from middleware) |
Outputs
| Name | Type | Description |
|---|---|---|
| Response stream | bytes | Written to c.Writer (which is the middleware's custom ChatWriter/CompletionWriter) |
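The response side can be sketched in the same spirit. Below is a minimal, hypothetical illustration of the per-chunk rewrite a custom writer performs: decode one native Ollama streaming chunk and re-emit it as an OpenAI-style SSE delta line. The field sets are reduced subsets and the function name is invented for this sketch; the real ChatWriter also handles roles, finish reasons, IDs, and the final `[DONE]` event.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified native Ollama streaming chunk (subset of the real response).
type nativeChunk struct {
	Message struct {
		Role    string `json:"role"`
		Content string `json:"content"`
	} `json:"message"`
	Done bool `json:"done"`
}

// Reduced OpenAI-style streaming shapes: a delta inside a choice list.
type delta struct {
	Content string `json:"content"`
}
type choice struct {
	Delta delta `json:"delta"`
}
type openAIChunk struct {
	Choices []choice `json:"choices"`
}

// toSSE converts one native chunk into an OpenAI-format SSE line, the
// kind of rewrite the middleware's custom writer applies on each Write.
func toSSE(raw []byte) (string, error) {
	var in nativeChunk
	if err := json.Unmarshal(raw, &in); err != nil {
		return "", err
	}
	out := openAIChunk{Choices: []choice{{Delta: delta{Content: in.Message.Content}}}}
	b, err := json.Marshal(out)
	if err != nil {
		return "", err
	}
	return "data: " + string(b) + "\n\n", nil
}

func main() {
	line, err := toSSE([]byte(`{"message":{"role":"assistant","content":"Hi"},"done":false}`))
	if err != nil {
		panic(err)
	}
	fmt.Print(line)
}
```

Because the translation happens in the writer, the native handler simply streams bytes to c.Writer and never learns that the client asked for OpenAI format.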
Usage Examples
OpenAI Streaming Chat
# The client sees OpenAI-format SSE stream
curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama3",
"messages": [{"role": "user", "content": "Hello"}],
"stream": true
}'
# Response: data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"Hi"},...}],...}