
Implementation:Ollama MLXRunner Client

From Leeroopedia
Knowledge Sources
Domains: MLX Runtime, Inference Server
Last Updated: 2025-02-15 00:00 GMT

Overview

Wraps an MLX runner subprocess to implement the llm.LlamaServer interface, enabling Ollama to serve MLX-based models through its standard server architecture.

Description

The Client struct manages the lifecycle of an MLX runner subprocess. NewClient spawns the subprocess (ollama runner --mlx-engine) on a dynamically allocated port, configures library paths for MLX discovery, estimates VRAM from the model manifest, and polls the health endpoint until the runner is ready. It implements the full llm.LlamaServer interface by proxying HTTP requests to the subprocess for completion, tokenization, and health checks.

Usage

Used by the Ollama server to serve safetensors LLM models via the MLX inference engine, providing the same API surface as GGUF models served through llama.cpp.

Code Reference

Source Location

  • Repository: Ollama
  • File: x/mlxrunner/client.go
  • Lines: 1-414

Signature

type Client struct {
    port        int
    modelName   string
    vramSize    uint64
    done        chan error
    client      *http.Client
    lastErr     string
    lastErrLock sync.Mutex
    mu          sync.Mutex
    cmd         *exec.Cmd
}

func NewClient(modelName string) (*Client, error)
func (c *Client) Close() error
func (c *Client) Completion(ctx context.Context, req llm.CompletionRequest, fn func(llm.CompletionResponse)) error
func (c *Client) Ping(ctx context.Context) error
func (c *Client) Tokenize(ctx context.Context, content string) ([]int, error)

var _ llm.LlamaServer = (*Client)(nil)
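The dynamically allocated port mentioned in the description is typically obtained by asking the OS for a free ephemeral port. A minimal sketch of that common pattern, assuming (not confirming) that the client uses it:

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the kernel for an ephemeral TCP port by listening on
// port 0, records the assigned port, and releases the listener so a
// subprocess can bind it. Releasing before the child binds leaves a
// small race window; this sketch ignores that.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := freePort()
	if err != nil {
		panic(err)
	}
	fmt.Printf("runner would be started on 127.0.0.1:%d\n", port)
}
```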

Import

import "github.com/ollama/ollama/x/mlxrunner"

I/O Contract

Inputs

Name Type Required Description
modelName string Yes Name of the model to load

Outputs

Name Type Description
*Client *Client Initialized client connected to running MLX subprocess
error error Non-nil if subprocess fails to start or become ready

Usage Examples

ctx := context.Background()

client, err := mlxrunner.NewClient("my-model:latest")
if err != nil {
    log.Fatal(err)
}
defer client.Close()

err = client.Completion(ctx, llm.CompletionRequest{
    Prompt: "Hello, world!",
}, func(resp llm.CompletionResponse) {
    fmt.Print(resp.Content)
})
