# Implementation: Ollama MLXRunner Client
| Knowledge Sources | |
|---|---|
| Domains | MLX Runtime, Inference Server |
| Last Updated | 2025-02-15 00:00 GMT |
## Overview
Wraps an MLX runner subprocess to implement the llm.LlamaServer interface, enabling Ollama to serve MLX-based models through its standard server architecture.
## Description
The Client struct manages the lifecycle of an MLX runner subprocess. NewClient spawns the subprocess (ollama runner --mlx-engine) on a dynamically allocated port, configures library paths for MLX discovery, estimates VRAM from the model manifest, and polls the health endpoint until the runner is ready. It implements the full llm.LlamaServer interface by proxying HTTP requests to the subprocess for completion, tokenization, and health checks.
## Usage
Used by the Ollama server to serve safetensors LLM models via the MLX inference engine, providing the same API surface as GGUF models served through llama.cpp.
## Code Reference
### Source Location
- Repository: Ollama
- File: x/mlxrunner/client.go
- Lines: 1-414
### Signature

```go
type Client struct {
	port        int
	modelName   string
	vramSize    uint64
	done        chan error
	client      *http.Client
	lastErr     string
	lastErrLock sync.Mutex
	mu          sync.Mutex
	cmd         *exec.Cmd
}

func NewClient(modelName string) (*Client, error)
func (c *Client) Close() error
func (c *Client) Completion(ctx context.Context, req llm.CompletionRequest, fn func(llm.CompletionResponse)) error
func (c *Client) Ping(ctx context.Context) error
func (c *Client) Tokenize(ctx context.Context, content string) ([]int, error)

var _ llm.LlamaServer = (*Client)(nil)
```
### Import

```go
import "github.com/ollama/ollama/x/mlxrunner"
```
## I/O Contract
### Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| modelName | string | Yes | Name of the model to load |
### Outputs
| Name | Type | Description |
|---|---|---|
| *Client | *Client | Initialized client connected to running MLX subprocess |
| error | error | Non-nil if subprocess fails to start or become ready |
## Usage Examples

```go
client, err := mlxrunner.NewClient("my-model:latest")
if err != nil {
	log.Fatal(err)
}
defer client.Close()

// Stream a completion; the callback is invoked once per response chunk.
ctx := context.Background()
err = client.Completion(ctx, llm.CompletionRequest{
	Prompt: "Hello, world!",
}, func(resp llm.CompletionResponse) {
	fmt.Print(resp.Content)
})
if err != nil {
	log.Fatal(err)
}
```