Implementation:Ollama Ollama Llama Model Saver

Knowledge Sources	Ollama
Domains	LLM Inference, Model Serialization
Last Updated	2025-02-15 00:00 GMT

Overview

Implements llama_model_saver, which writes an in-memory llama model (metadata and tensors) back to GGUF format. It is used primarily to write model quantization output.

Description

llama_model_saver takes a reference to the model, and its constructor initializes an empty GGUF context. Type-overloaded add_kv methods write metadata to that context. The container overload supports per-layer arrays, which are written as a single scalar when all values are identical. add_kv_from_model writes all model hyperparameters and vocabulary data as GGUF metadata. add_tensors_from_model adds references to all model tensors. save writes the complete GGUF file to disk.
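The per-layer behaviour described above can be illustrated with a small, self-contained sketch. This is not the code from llama-model-saver.cpp: kv_store, kv_value, add_per_layer_kv and the "example.*" keys are hypothetical names introduced here, and the sketch assumes that only the first n_layer entries of the container are compared and written.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a GGUF metadata store: each key holds either
// one scalar or an array of values.
using kv_value = std::variant<uint32_t, std::vector<uint32_t>>;
using kv_store = std::map<std::string, kv_value>;

// Store a per-layer container under `key`.
// If the first n_layer entries are all equal, a single scalar is stored;
// otherwise the first n_layer entries are stored as an array.
// Returns true when the value was compressed to a scalar.
template <typename Container>
bool add_per_layer_kv(kv_store & store, const std::string & key,
                      const Container & values, const std::size_t n_layer) {
    // Assumption of this sketch: reject an empty or out-of-range layer count.
    if (n_layer == 0 || n_layer > values.size()) {
        throw std::invalid_argument("n_layer must be in [1, values.size()]");
    }

    bool all_same = true;
    for (std::size_t i = 1; i < n_layer; ++i) {
        if (values[i] != values[0]) {
            all_same = false;
            break;
        }
    }

    if (all_same) {
        store[key] = static_cast<uint32_t>(values[0]);
        return true;
    }

    store[key] = std::vector<uint32_t>(
        values.begin(), values.begin() + static_cast<std::ptrdiff_t>(n_layer));
    return false;
}

// Example: a uniform array collapses to a scalar, a mixed one stays an array.
inline void example_usage() {
    kv_store store;
    const std::array<uint32_t, 4> uniform = {8, 8, 8, 8};
    const std::array<uint32_t, 4> mixed   = {8, 8, 4, 4};

    const bool uniform_compressed = add_per_layer_kv(store, "example.head_count",    uniform, 4);
    const bool mixed_compressed   = add_per_layer_kv(store, "example.head_count_kv", mixed,   4);

    assert(uniform_compressed);
    assert(!mixed_compressed);
}
```

Storing a scalar when all layers agree keeps the metadata compact, while models whose layers differ still get the full array.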

Usage

Used by the quantization pipeline to save quantized model weights to GGUF format, and for model format conversion.

Code Reference

Source Location

Repository: Ollama
File: llama/llama.cpp/src/llama-model-saver.cpp
Lines: 1-282

Signature

llama_model_saver::llama_model_saver(const struct llama_model & model);
~llama_model_saver();

void add_kv(const enum llm_kv key, const uint32_t value);
void add_kv(const enum llm_kv key, const int32_t value);
void add_kv(const enum llm_kv key, const float value);
void add_kv(const enum llm_kv key, const bool value);
void add_kv(const enum llm_kv key, const char * value);

template <typename Container>
void add_kv(const enum llm_kv key, const Container & value, const bool per_layer);

void add_tensor(const struct ggml_tensor * tensor);
void add_kv_from_model();
void add_tensors_from_model();
void save(const std::string & fname);
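To show how the scalar add_kv overloads are selected by argument type, here is a minimal sketch with a mock class. kv_recorder and demo_key are hypothetical and stand in for llama_model_saver and llm_kv. The mock only records which overload was chosen and does not touch GGUF.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical key enum standing in for llm_kv.
enum demo_key { DEMO_KEY_A };

// Hypothetical recorder that mirrors the scalar overload set and remembers
// which overload the compiler picked for the last call.
struct kv_recorder {
    std::string last_type;

    void add_kv(const enum demo_key, const uint32_t) { last_type = "uint32"; }
    void add_kv(const enum demo_key, const int32_t)  { last_type = "int32";  }
    void add_kv(const enum demo_key, const float)    { last_type = "float";  }
    void add_kv(const enum demo_key, const bool)     { last_type = "bool";   }
    void add_kv(const enum demo_key, const char *)   { last_type = "string"; }
};

// Each call passes an argument whose type matches one overload exactly.
inline void example_dispatch() {
    kv_recorder rec;

    rec.add_kv(DEMO_KEY_A, uint32_t{32});
    assert(rec.last_type == "uint32");

    rec.add_kv(DEMO_KEY_A, int32_t{-1});
    assert(rec.last_type == "int32");

    rec.add_kv(DEMO_KEY_A, 1.0f);      // float literal, not double
    assert(rec.last_type == "float");

    rec.add_kv(DEMO_KEY_A, true);
    assert(rec.last_type == "bool");

    rec.add_kv(DEMO_KEY_A, "llama");   // string literal decays to const char *
    assert(rec.last_type == "string");
}
```

With an overload set like this, it helps to pass values of the exact target type. An argument such as a double literal has no exact match and makes the call ambiguous.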

Import

#include "llama-model-saver.h"

I/O Contract

Inputs

Name	Type	Required	Description
model	const llama_model &	Yes	The model to serialize, passed to the constructor
fname	const std::string &	Yes	Output file path for save()

Outputs

Name	Type	Description
GGUF file	file	Complete GGUF model file written to disk

Usage Examples

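// `model` is an already-loaded llama_model.
llama_model_saver saver(model);
saver.add_kv_from_model();       // hyperparameters and vocabulary -> GGUF metadata
saver.add_tensors_from_model();  // add references to all model tensors
saver.save("output.gguf");       // write the complete GGUF file to disk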

Related Pages

Principle:Ollama_Ollama_Model_Loading
