Implementation: ggml-org/llama.cpp Compare Logits
| Field | Value |
|---|---|
| Implementation Name | Compare Logits |
| Type | Wrapper Doc |
| Wraps | NumPy-based logit comparison between PyTorch and llama.cpp outputs |
| Status | Active |
Overview
Description
The compare-logits.py script provides a lightweight sanity check for verifying that a converted GGUF model produces logits consistent with the original PyTorch model. It is part of the llama.cpp model conversion verification toolkit, located at examples/model-conversion/scripts/causal/compare-logits.py.
The script performs two checks in sequence:
- Token comparison: verifies that both models produced identical token sequences for the same input, using the `compare_tokens()` utility from `common.py`.
- Logit comparison: loads pre-computed logit vectors from binary files, checks shape compatibility, computes the maximum absolute difference, and compares the top-10 predictions from both models.
The core comparison logic is in the quick_logits_check() function (lines 12-39). The main() function (lines 41-87) handles file discovery via environment variables and orchestrates both checks.
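The core of the logit comparison reduces to two NumPy one-liners: a max-absolute-difference over the two vectors, and an `argsort`-based top-10 extraction. A minimal sketch of that logic, using synthetic logit vectors in place of the real `data/*.bin` files:

```python
import numpy as np

# Synthetic stand-ins for the two models' last-position logit vectors;
# the real script loads these from data/*.bin with np.fromfile.
rng = np.random.default_rng(0)
pytorch_logits = rng.normal(size=32000).astype(np.float32)
llamacpp_logits = pytorch_logits + rng.normal(scale=1e-3, size=32000).astype(np.float32)

# Maximum absolute difference, as computed in quick_logits_check()
max_diff = np.max(np.abs(pytorch_logits - llamacpp_logits))

# Top-10 predictions: argsort ascending, keep the last 10, reverse to descending
pytorch_top10 = np.argsort(pytorch_logits)[-10:][::-1]
llamacpp_top10 = np.argsort(llamacpp_logits)[-10:][::-1]

print(f"Max absolute difference: {max_diff:.4f}")
print(f"Top-10 indices agree: {np.array_equal(pytorch_top10, llamacpp_top10)}")
```

Note that `argsort` sorts ascending, so the slice-and-reverse idiom is what yields the indices of the 10 largest logits in descending order.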
Usage
The script expects two environment variables to be set:
```sh
export MODEL_PATH=/path/to/original/pytorch/model
export CONVERTED_MODEL=/path/to/converted/model.gguf

python examples/model-conversion/scripts/causal/compare-logits.py
```
Prerequisites: logit binary files must be pre-generated by running the original model and the converted model on the same prompt, producing files in the data/ directory.
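The prerequisite files are raw, headerless NumPy dumps. A hypothetical sketch of how a prior "run" script might produce them (the file names follow the `data/pytorch-{model_name}.bin` convention; the array contents here are fabricated placeholders, not real model outputs):

```python
import numpy as np
from pathlib import Path

# Hypothetical stand-in for run-org-model.py: dump last-position logits
# and prompt token IDs in the raw layouts compare-logits.py expects.
data_dir = Path("data")
data_dir.mkdir(exist_ok=True)

logits = np.linspace(-5, 15, 32000, dtype=np.float32)  # fake logit vector
tokens = np.array([1, 42, 7, 99], dtype=np.int32)      # fake prompt token IDs

logits.tofile(data_dir / "pytorch-MyModel.bin")         # raw float32, no header
tokens.tofile(data_dir / "pytorch-MyModel-tokens.bin")  # raw int32, no header
```

Because `tofile()` writes no header, the reading side must know the dtype in advance; this is why the script hard-codes `np.float32` and `np.int32` when loading.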
Code Reference
Source Location
| File | Lines | Description |
|---|---|---|
| `examples/model-conversion/scripts/causal/compare-logits.py` | 12-39 | `quick_logits_check()` function |
| `examples/model-conversion/scripts/causal/compare-logits.py` | 41-87 | `main()` function |
| `examples/model-conversion/scripts/utils/common.py` | 208-248 | `compare_tokens()` utility |
| `examples/model-conversion/scripts/utils/common.py` | 280-299 | `exit_with_warning()` utility with version check |
Signature
quick_logits_check():

```python
def quick_logits_check(pytorch_file, llamacpp_file):
    """Lightweight sanity check before NMSE"""
    try:
        pytorch_logits = np.fromfile(pytorch_file, dtype=np.float32)
        llamacpp_logits = np.fromfile(llamacpp_file, dtype=np.float32)
    except Exception as e:
        print(f"Failed to load files - {e}")
        return False

    # Check shapes match
    if pytorch_logits.shape != llamacpp_logits.shape:
        print(f"Shape mismatch - PyTorch: {pytorch_logits.shape}, llama.cpp: {llamacpp_logits.shape}")
        return False

    # Calculate key metrics
    diff = pytorch_logits - llamacpp_logits
    abs_diff = np.abs(diff)
    max_diff = np.max(abs_diff)

    # Get top 10 predictions from both models
    pytorch_top10 = np.argsort(pytorch_logits)[-10:][::-1]
    llamacpp_top10 = np.argsort(llamacpp_logits)[-10:][::-1]

    print(f"Top 10 PyTorch logits: {pytorch_logits[pytorch_top10]}")
    print(f"Top 10 llama.cpp logits: {llamacpp_logits[llamacpp_top10]}")
    print(f"Max absolute difference: {max_diff:.4f}")

    return True
```
main() file discovery logic:

```python
def main():
    model_path = os.environ.get('MODEL_PATH')
    model_name = get_model_name_from_env_path('MODEL_PATH')

    data_dir = Path("data")
    pytorch_file = data_dir / f"pytorch-{model_name}.bin"

    llamacpp_model_name = get_model_name_from_env_path('CONVERTED_MODEL')
    llamacpp_file = data_dir / f"llamacpp-{llamacpp_model_name}.bin"
```
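The name derivation behind `get_model_name_from_env_path()` lives in `common.py` and is not reproduced here; a plausible standalone sketch, assuming it reduces the path to its final component and strips a trailing `.gguf` extension (the helper below, `model_name_from_path`, is hypothetical, not the real implementation):

```python
import os

def model_name_from_path(env_var):
    """Hypothetical sketch: derive a model name from a path in an env var.

    Assumes the real helper takes the path's final component and removes
    a trailing .gguf extension; model directory names keep dots like '1.7B'.
    """
    value = os.environ.get(env_var)
    if not value:
        return None
    name = os.path.basename(os.path.normpath(value))
    if name.endswith(".gguf"):
        name = name[: -len(".gguf")]
    return name

os.environ["CONVERTED_MODEL"] = "./SmolLM2-1.7B-Instruct-f16.gguf"
print(model_name_from_path("CONVERTED_MODEL"))  # SmolLM2-1.7B-Instruct-f16
```

Stripping only the known `.gguf` suffix (rather than using `Path.stem`) matters for names such as `SmolLM2-1.7B-Instruct`, where a naive last-dot split would truncate the version number.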
Import
```python
import sys
import numpy as np
from pathlib import Path
import os

from common import get_model_name_from_env_path, compare_tokens, exit_with_warning
```
I/O Contract
| Direction | Type | Description |
|---|---|---|
| Input (env) | `MODEL_PATH` | Path to the original PyTorch model directory |
| Input (env) | `CONVERTED_MODEL` | Path to the converted GGUF model file |
| Input (file) | `data/pytorch-{model_name}.bin` | Raw float32 logits from the PyTorch model (generated by a prior script) |
| Input (file) | `data/llamacpp-{model_name}.bin` | Raw float32 logits from the llama.cpp model (generated by a prior script) |
| Input (file) | `data/pytorch-{model_name}-tokens.bin` | int32 token IDs from the PyTorch model |
| Input (file) | `data/llamacpp-{model_name}-tokens.bin` | int32 token IDs from the llama.cpp model |
| Output | stdout | Token comparison result, top-10 logit values for both models, max absolute difference |
| Exit code | 0 | Lightweight check passed; safe to proceed with NMSE |
| Exit code | 1 | Token mismatch or logit comparison failure |
Binary file format:
| File | Dtype | Layout | Description |
|---|---|---|---|
| `pytorch-{name}.bin` | float32 | Flattened 1D array | Last-position logit vector (vocabulary size elements) |
| `llamacpp-{name}.bin` | float32 | Flattened 1D array | Last-position logit vector (vocabulary size elements) |
| `*-tokens.bin` | int32 | Flattened 1D array | Token IDs for the input prompt |
Usage Examples
Full verification workflow:
```sh
# Step 1: Generate reference logits from PyTorch model
export MODEL_PATH=./models/SmolLM2-1.7B-Instruct
python examples/model-conversion/scripts/causal/run-org-model.py

# Step 2: Generate logits from converted GGUF model
export CONVERTED_MODEL=./SmolLM2-1.7B-Instruct-f16.gguf
bash examples/model-conversion/scripts/causal/run-converted-model.sh

# Step 3: Compare logits
python examples/model-conversion/scripts/causal/compare-logits.py
```
Expected output on success:
```
Using converted model: SmolLM2-1.7B-Instruct-f16
Comparing tokens between:
  Original : pytorch-SmolLM2-1.7B-Instruct (42 tokens)
  Converted: llamacpp-SmolLM2-1.7B-Instruct-f16 (42 tokens)
All 42 tokens match!

GGML Model Validation for model SmolLM2-1.7B-Instruct
========================================
PyTorch logits  : data/pytorch-SmolLM2-1.7B-Instruct.bin
llama.cpp logits: data/llamacpp-SmolLM2-1.7B-Instruct-f16.bin
Top 10 PyTorch logits: [15.234 12.891 11.547 ...]
Top 10 llama.cpp logits: [15.234 12.891 11.547 ...]
Max absolute difference: 0.0312
OK: Lightweight model check successful!
Ok to proceed with NMSE check...
```
Expected output on failure (token mismatch):
```
Token count mismatch: 42 vs 40
Token mismatch detected
=====================================================================
Verification failure might be due to a transformers version mismatch:
  Current transformers version: 4.44.0
  Model's required version    : 4.57.1

Consider installing the version specified by the model's config:
  pip install transformers==4.57.1
=====================================================================
```