Implementation:Ollama Ollama Llama Model Saver
| Knowledge Sources | |
|---|---|
| Domains | LLM Inference, Model Serialization |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Implements the model saver that writes llama model data back to GGUF format, used primarily for model quantization output.
Description
The constructor initializes an empty GGUF context. The class provides type-overloaded add_kv methods that write metadata into that context. The container overload takes a per_layer flag: a per-layer array is written as a single scalar when all of its values are identical. add_kv_from_model writes the model's hyperparameters and vocabulary data as GGUF metadata, add_tensors_from_model adds references to all model tensors, and save writes the complete GGUF file to disk.
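The per-layer compression mentioned in the description can be sketched in isolation. The helper below is hypothetical (collapse_per_layer is not a llama.cpp function, and its parameters are assumptions); it only illustrates the check a writer could make before deciding between one scalar and a full array.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only, not part of llama.cpp.
// Looks at the first n_layer entries of `values`. If they are all equal,
// stores the shared value in `scalar_out` and returns true, meaning a
// single scalar would be enough. Otherwise returns false and leaves
// `scalar_out` untouched, meaning the array itself would have to be written.
template <typename T>
bool collapse_per_layer(const std::vector<T> & values, std::size_t n_layer, T & scalar_out) {
    if (n_layer == 0 || n_layer > values.size()) {
        return false; // nothing to compare, or fewer entries than layers
    }
    const T & first = values[0];
    const bool all_same = std::all_of(
        values.begin(),
        values.begin() + static_cast<std::ptrdiff_t>(n_layer),
        [&first](const T & v) { return v == first; });
    if (all_same) {
        scalar_out = first;
    }
    return all_same;
}
```

For example, head counts of {8, 8, 8, 8} over four layers would collapse to the scalar 8, while {8, 8, 4, 8} would stay an array.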
Usage
Used by the quantization pipeline to save quantized model weights to GGUF format, and for model format conversion.
Code Reference
Source Location
- Repository: Ollama
- File: llama/llama.cpp/src/llama-model-saver.cpp
- Lines: 1-282
Signature
llama_model_saver::llama_model_saver(const struct llama_model & model);
~llama_model_saver();
void add_kv(const enum llm_kv key, const uint32_t value);
void add_kv(const enum llm_kv key, const int32_t value);
void add_kv(const enum llm_kv key, const float value);
void add_kv(const enum llm_kv key, const bool value);
void add_kv(const enum llm_kv key, const char * value);
template <typename Container>
void add_kv(const enum llm_kv key, const Container & value, const bool per_layer);
void add_tensor(const struct ggml_tensor * tensor);
void add_kv_from_model();
void add_tensors_from_model();
void save(const std::string & fname);
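To illustrate how the scalar add_kv overloads are selected by argument type, here is a minimal stand-in that records which overload the compiler picks. mock_kv_writer, its string keys, and the type tags are assumptions made for this sketch; the real class takes an enum llm_kv key and writes into a GGUF context.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for illustration only, not the llama.cpp class.
// Each overload records a tag naming the value type it was selected for.
struct mock_kv_writer {
    std::map<std::string, std::string> type_of; // key -> tag of the chosen overload

    void add_kv(const std::string & key, std::uint32_t) { type_of[key] = "u32";  }
    void add_kv(const std::string & key, std::int32_t)  { type_of[key] = "i32";  }
    void add_kv(const std::string & key, float)         { type_of[key] = "f32";  }
    void add_kv(const std::string & key, bool)          { type_of[key] = "bool"; }
    void add_kv(const std::string & key, const char *)  { type_of[key] = "str";  }
};
```

With an overload set shaped like this, the static type of the argument decides which entry is written, so explicit types or suffixes at the call site help. An unsuffixed literal such as 1.5 is a double and would be ambiguous among these overloads, whereas 1.5f selects the float version.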
Import
#include "llama-model-saver.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | const llama_model & | Yes | The model to serialize |
| fname | const std::string & | Yes | Output file path for save() |
Outputs
| Name | Type | Description |
|---|---|---|
| GGUF file | file | Complete GGUF model file written to disk |
Usage Examples
llama_model_saver saver(model);
saver.add_kv_from_model();
saver.add_tensors_from_model();
saver.save("output.gguf");
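After save returns, one quick sanity check is to confirm that the output begins with the four-byte GGUF magic (the ASCII characters GGUF). The helper below is a sketch under that single assumption; has_gguf_magic is a hypothetical name, not a llama.cpp function, and passing this check says nothing about whether the rest of the file is valid.

```cpp
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper for illustration only, not part of llama.cpp.
// Returns true when the file at `path` can be opened and its first four
// bytes are the ASCII characters "GGUF".
bool has_gguf_magic(const std::string & path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) {
        return false; // file missing or unreadable
    }
    char magic[4] = {0, 0, 0, 0};
    in.read(magic, sizeof(magic));
    if (in.gcount() != 4) {
        return false; // file shorter than the magic itself
    }
    return std::memcmp(magic, "GGUF", 4) == 0;
}
```

A caller might run has_gguf_magic("output.gguf") right after saver.save("output.gguf") and treat a false result as a failed write.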