# Implementation: ggml-org/llama.cpp convert-llama2c-to-ggml
| Knowledge Sources | |
|---|---|
| Domains | Model_Conversion |
| Last Updated | 2026-02-15 00:00 GMT |
## Overview
Converts Andrej Karpathy's llama2.c checkpoint format to the GGUF format used by llama.cpp's inference engine.
## Description
This C++ program defines the llama2.c model structures (`Config`, `TransformerWeights`) and reads the binary checkpoint format. It maps llama2.c weights to GGUF tensor names (e.g., `token_embd.weight`, `blk.%d.attn_q.weight`). The converter copies the vocabulary from an existing GGUF model or from a llama2.c tokenizer file, then writes all metadata (architecture, context length, head counts, etc.) and tensor data to a new GGUF file through the gguf API. The implementation uses GGUF key constants such as `KV_GENERAL_ARCHITECTURE`, `KV_CONTEXT_LENGTH`, and `KV_EMBEDDING_LENGTH`.
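The binary checkpoint starts with the `Config` header, so reading it amounts to one `fread` of the struct. A minimal sketch of that step (the helper name and error handling are illustrative, not the converter's actual code):

```cpp
#include <cstdio>

// llama2.c checkpoint layout: a Config header of seven ints, followed
// by the raw float32 weight tensors.
typedef struct {
    int dim;        // transformer dimension
    int hidden_dim; // FFN hidden dimension
    int n_layers;   // number of layers
    int n_heads;    // number of query heads
    int n_kv_heads; // number of key/value heads
    int vocab_size; // vocabulary size (upstream llama2.c uses its sign
                    // to flag whether the classifier shares weights)
    int seq_len;    // max sequence length
} Config;

// Read only the header; returns false if the file is missing or short.
static bool read_llama2c_header(const char * path, Config * cfg) {
    FILE * f = std::fopen(path, "rb");
    if (!f) {
        return false;
    }
    const size_t n = std::fread(cfg, sizeof(Config), 1, f);
    std::fclose(f);
    return n == 1;
}
```

The weight tensors that follow the header are laid out back to back, so their offsets can be computed from these seven fields alone.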
## Usage
Use this tool to enable interoperability between the educational llama2.c project and llama.cpp's production inference engine, allowing small llama2.c-trained models to run with full llama.cpp optimizations.
## Code Reference
### Source Location
- Repository: ggml-org/llama.cpp
- File: examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp
- Lines: 1-941
### Signature
```cpp
// llama2.c Config structure
typedef struct {
    int dim;        // transformer dimension
    int hidden_dim; // for FFN layers
    int n_layers;   // number of layers
    int n_heads;    // number of query heads
    int n_kv_heads; // number of key/value heads
    int vocab_size; // vocabulary size
    int seq_len;    // max sequence length
} Config;
```
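These header fields translate directly into the `llama.*` metadata values written to the output file. A sketch of that mapping (the struct and function names are illustrative, `Config` is redeclared so the sketch is self-contained, and `rope_dimension_count` is assumed to be the per-head embedding size):

```cpp
#include <cstdint>

// llama2.c header (redeclared here so the sketch compiles on its own).
typedef struct {
    int dim, hidden_dim, n_layers, n_heads, n_kv_heads, vocab_size, seq_len;
} Config;

// GGUF metadata values derived from the llama2.c header.
struct llama_meta {
    uint32_t context_length;       // llama.context_length          <- seq_len
    uint32_t embedding_length;     // llama.embedding_length        <- dim
    uint32_t block_count;          // llama.block_count             <- n_layers
    uint32_t feed_forward_length;  // llama.feed_forward_length     <- hidden_dim
    uint32_t head_count;           // llama.attention.head_count    <- n_heads
    uint32_t head_count_kv;        // llama.attention.head_count_kv <- n_kv_heads
    uint32_t rope_dimension_count; // llama.rope.dimension_count    <- dim / n_heads
};

static llama_meta meta_from_config(const Config & c) {
    llama_meta m;
    m.context_length       = (uint32_t) c.seq_len;
    m.embedding_length     = (uint32_t) c.dim;
    m.block_count          = (uint32_t) c.n_layers;
    m.feed_forward_length  = (uint32_t) c.hidden_dim;
    m.head_count           = (uint32_t) c.n_heads;
    m.head_count_kv        = (uint32_t) c.n_kv_heads;
    m.rope_dimension_count = (uint32_t) (c.dim / c.n_heads); // per-head dim
    return m;
}
```

For the stories15M checkpoint (`dim = 288`, `n_heads = 6`) this would give a per-head RoPE dimension of 48.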
```cpp
// GGUF key definitions
#define KV_GENERAL_ARCHITECTURE "general.architecture"
#define KV_CONTEXT_LENGTH       "llama.context_length"
#define KV_EMBEDDING_LENGTH     "llama.embedding_length"
#define KV_BLOCK_COUNT          "llama.block_count"

// GGUF tensor name templates
#define TN_TOKEN_EMBD "token_embd.weight"
#define TN_ATTN_Q     "blk.%d.attn_q.weight"
#define TN_FFN_GATE   "blk.%d.ffn_gate.weight"
```
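Per-layer tensor names are produced by instantiating the `%d` templates with the layer index. A small sketch of that expansion (the helper name is illustrative):

```cpp
#include <cstdio>
#include <string>

#define TN_ATTN_Q   "blk.%d.attn_q.weight"
#define TN_FFN_GATE "blk.%d.ffn_gate.weight"

// Expand a per-layer tensor-name template such as TN_ATTN_Q
// for a given layer index.
static std::string tensor_name(const char * tmpl, int layer) {
    char buf[128];
    std::snprintf(buf, sizeof(buf), tmpl, layer);
    return std::string(buf);
}
```

For example, `tensor_name(TN_ATTN_Q, 3)` yields `blk.3.attn_q.weight`, matching the `blk.N.*` naming convention GGUF consumers expect.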
### Import
```cpp
#include "ggml.h"
#include "gguf.h"
#include "llama.h"
#include "common.h"
#include "log.h"

#include <unordered_map>
#include <vector>
#include <cstring>
```
## I/O Contract
### Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| checkpoint_file | file path | Yes | Path to the llama2.c binary checkpoint file |
| tokenizer_source | file path | Yes | Path to existing GGUF model (for vocab) or llama2.c tokenizer.bin |
| output_file | file path | Yes | Desired output path for the GGUF file |
### Outputs
| Name | Type | Description |
|---|---|---|
| gguf_file | .gguf file | Converted model with llama architecture metadata, vocabulary, and all transformer weights |
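When `tokenizer_source` is a llama2.c `tokenizer.bin` rather than a GGUF model, the vocabulary records have to be parsed before they can be written into the output file. A minimal reader sketch, assuming the upstream llama2.c layout (one `int` max token length, then `vocab_size` records of `float` score, `int` length, and raw bytes); the function name is illustrative:

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Parse a llama2.c tokenizer.bin into token strings and scores.
static bool read_llama2c_tokenizer(const char * path, int vocab_size,
                                   std::vector<std::string> & tokens,
                                   std::vector<float> & scores) {
    FILE * f = std::fopen(path, "rb");
    if (!f) return false;
    int max_token_length = 0;
    bool ok = std::fread(&max_token_length, sizeof(int), 1, f) == 1;
    for (int i = 0; ok && i < vocab_size; i++) {
        float score = 0.0f;
        int   len   = 0;
        ok = std::fread(&score, sizeof(float), 1, f) == 1 &&
             std::fread(&len,   sizeof(int),   1, f) == 1 &&
             len >= 0;
        if (!ok) break;
        std::string s((size_t) len, '\0');
        if (len > 0) {
            ok = std::fread(&s[0], 1, (size_t) len, f) == (size_t) len;
        }
        if (ok) {
            tokens.push_back(s);
            scores.push_back(score);
        }
    }
    std::fclose(f);
    return ok;
}
```

The scores carried along here correspond to the SentencePiece merge scores that the converter stores next to each token in the GGUF vocabulary.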
## Usage Examples
```sh
./convert-llama2c-to-ggml \
    --copy-vocab-from-model llama-2-7b.gguf \
    --llama2c-model stories15M.bin \
    --llama2c-output-model stories15M.gguf
```