Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Principle:Unslothai Unsloth Ollama Deployment

From Leeroopedia


Knowledge Sources
Domains Model_Deployment, Inference
Last Updated 2026-02-07 00:00 GMT

Overview

A deployment configuration technique that generates Ollama Modelfile templates matching the correct chat format for 50+ model families to enable local inference via Ollama.

Description

Ollama is a popular tool for running LLMs locally. It requires a Modelfile that specifies the GGUF model path, chat template format, and generation parameters. Each model family (Llama 3, Mistral, ChatML, Gemma, Qwen, Phi, etc.) requires a different template format.

Unsloth maintains a comprehensive registry mapping model names to Ollama-compatible templates, ensuring that exported GGUF models work correctly with Ollama out of the box. The template includes:

  1. FROM: Path to the GGUF file
  2. TEMPLATE: Go template string defining the chat format
  3. PARAMETER: Generation parameters (temperature, stop tokens)
  4. SYSTEM: Default system prompt

Usage

This principle is automatically applied during GGUF export (save_pretrained_gguf and push_to_hub_gguf). A Modelfile is generated alongside the GGUF file. Can also be used standalone to generate Ollama templates for existing models.

Theoretical Basis

Ollama template generation is a lookup-and-substitution process:

# Abstract Ollama template generation
template_key = MODEL_TO_OLLAMA_TEMPLATE_MAPPER[model_name]
modelfile = OLLAMA_TEMPLATES[template_key]
modelfile = modelfile.replace("{__FILE_LOCATION__}", gguf_path)
modelfile = modelfile.replace("{__EOS_TOKEN__}", eos_token)

The critical constraint is that the Ollama template must exactly match the model's training chat format, otherwise the model will produce degraded output due to template mismatch.

Related Pages

Implemented By

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment