
Implementation:Ggml org Llama cpp Convert Llama2c To GGML

From Leeroopedia
Knowledge Sources
Domains: Model_Conversion
Last Updated: 2026-02-15 00:00 GMT

Overview

Converts Karpathy's llama2.c model format to GGUF format compatible with llama.cpp's inference engine.

Description

This C++ program defines the llama2.c model structures (Config, TransformerWeights) and reads the binary checkpoint format. It maps llama2.c weight names to GGUF tensor names (e.g., token_embd.weight, blk.%d.attn_q.weight), copies the vocabulary from either an existing GGUF model or a llama2.c tokenizer file, and writes the metadata (architecture, context length, head counts, and so on) along with all tensor data to a new GGUF file via the gguf API. GGUF metadata keys are referenced through constants such as KV_GENERAL_ARCHITECTURE, KV_CONTEXT_LENGTH, and KV_EMBEDDING_LENGTH.
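A minimal, self-contained sketch of the Config-to-metadata mapping described above. The field-to-key pairing (seq_len → context_length, dim → embedding_length, n_layers → block_count) follows the key constants shown under Signature below; the helper name and the map return are illustrative only — the real converter writes these values into a gguf context with calls such as gguf_set_val_u32 rather than building a map.

```cpp
#include <cstdint>
#include <map>
#include <string>

// GGUF metadata keys used by the converter (subset; see Signature below).
#define KV_CONTEXT_LENGTH   "llama.context_length"
#define KV_EMBEDDING_LENGTH "llama.embedding_length"
#define KV_BLOCK_COUNT      "llama.block_count"

// Mirrors the llama2.c Config structure shown under Signature below.
struct Config {
    int32_t dim, hidden_dim, n_layers, n_heads, n_kv_heads, vocab_size, seq_len;
};

// Hypothetical helper: collect the per-model metadata the converter emits.
std::map<std::string, uint32_t> config_to_kv(const Config & c) {
    return {
        { KV_CONTEXT_LENGTH,   (uint32_t) c.seq_len  },  // max sequence length
        { KV_EMBEDDING_LENGTH, (uint32_t) c.dim      },  // transformer dimension
        { KV_BLOCK_COUNT,      (uint32_t) c.n_layers },  // number of layers
    };
}
```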

Usage

Use this tool to enable interoperability between the educational llama2.c project and llama.cpp's production inference engine, allowing small llama2.c-trained models to run with full llama.cpp optimizations.

Code Reference

Source Location

  • Repository: Ggml_org_Llama_cpp
  • File: examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp
  • Lines: 1-941

Signature

// llama2.c Config structure
typedef struct {
    int dim;          // transformer dimension
    int hidden_dim;   // for ffn layers
    int n_layers;     // number of layers
    int n_heads;      // number of query heads
    int n_kv_heads;   // number of key/value heads
    int vocab_size;   // vocabulary size
    int seq_len;      // max sequence length
} Config;

// GGUF key definitions
#define KV_GENERAL_ARCHITECTURE  "general.architecture"
#define KV_CONTEXT_LENGTH        "llama.context_length"
#define KV_EMBEDDING_LENGTH      "llama.embedding_length"
#define KV_BLOCK_COUNT           "llama.block_count"

// GGUF tensor name templates
#define TN_TOKEN_EMBD  "token_embd.weight"
#define TN_ATTN_Q      "blk.%d.attn_q.weight"
#define TN_FFN_GATE    "blk.%d.ffn_gate.weight"
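Since the seven Config fields are contiguous int32 values at the start of the checkpoint, the header can be read in one fread. A hedged sketch — the helper name and one-shot read are illustrative; llama2.c itself encodes whether the output classifier shares weights with the token embeddings in the sign of vocab_size, so the reader must take the absolute value:

```cpp
#include <cstdint>
#include <cstdio>

// Mirrors the Config struct above: seven contiguous int32 fields.
struct Config {
    int32_t dim, hidden_dim, n_layers, n_heads, n_kv_heads, vocab_size, seq_len;
};

// Hypothetical helper: read the llama2.c checkpoint header.
// A negative vocab_size signals that the output classifier is NOT
// shared with the token embeddings, so the sign carries information.
bool read_llama2c_header(FILE * f, Config & cfg, bool & shared_classifier) {
    if (fread(&cfg, sizeof(Config), 1, f) != 1) {
        return false;  // short read: not a valid llama2.c checkpoint
    }
    shared_classifier = cfg.vocab_size > 0;
    if (!shared_classifier) {
        cfg.vocab_size = -cfg.vocab_size;  // restore the true vocab size
    }
    return true;
}
```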

Import

#include "ggml.h"
#include "gguf.h"
#include "llama.h"
#include "common.h"
#include "log.h"
#include <unordered_map>
#include <vector>
#include <cstring>

I/O Contract

Inputs

Name              Type       Required  Description
checkpoint_file   file path  Yes       Path to the llama2.c binary checkpoint file
tokenizer_source  file path  Yes       Path to an existing GGUF model (for vocab) or llama2.c tokenizer.bin
output_file       file path  Yes       Desired output path for the GGUF file

Outputs

Name       Type        Description
gguf_file  .gguf file  Converted model with llama architecture metadata, vocabulary, and all transformer weights

Usage Examples

// Command-line usage:
// ./convert-llama2c-to-ggml --copy-vocab-from-model llama-2-7b.gguf \
//     --llama2c-model stories15M.bin \
//     --llama2c-output-model stories15M.gguf
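Inside the converter, the per-layer tensor-name templates shown under Signature are expanded with snprintf for each block index. A tiny illustrative sketch (the helper name tn is hypothetical):

```cpp
#include <cstdio>
#include <string>

// Per-layer tensor name templates, as defined in the converter source.
#define TN_ATTN_Q   "blk.%d.attn_q.weight"
#define TN_FFN_GATE "blk.%d.ffn_gate.weight"

// Hypothetical helper: expand a template into a concrete tensor name
// for a given layer index, e.g. layer 3 of TN_ATTN_Q.
static std::string tn(const char * tmpl, int layer) {
    char buf[128];
    snprintf(buf, sizeof(buf), tmpl, layer);
    return std::string(buf);
}
```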
