Implementation: ggml-org/llama.cpp Compare Logits
| Field | Value |
|---|---|
| Implementation Name | Compare Logits |
| Type | Wrapper Doc |
| Wraps | NumPy-based logit comparison between PyTorch and llama.cpp outputs |
| Status | Active |
Overview
Description
The compare-logits.py script provides a lightweight sanity check for verifying that a converted GGUF model produces logits consistent with the original PyTorch model. It is part of the llama.cpp model conversion verification toolkit, located at examples/model-conversion/scripts/causal/compare-logits.py.
The script performs two checks in sequence:
- Token comparison: verifies that both models produced identical token sequences for the same input, using the `compare_tokens()` utility from `common.py`.
- Logit comparison: loads pre-computed logit vectors from binary files, checks shape compatibility, computes the maximum absolute difference, and compares the top-10 predictions from both models.
The core comparison logic is in the quick_logits_check() function (lines 12-39). The main() function (lines 41-87) handles file discovery via environment variables and orchestrates both checks.
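The core of the logit comparison reduces to two NumPy one-liners: a max-absolute-difference over the two vectors, and an `argsort`-based top-10 extraction. A minimal sketch of that logic, using synthetic logit vectors in place of the real `data/*.bin` files:

```python
import numpy as np

# Synthetic stand-ins for the two models' last-position logit vectors;
# the real script loads these from data/*.bin with np.fromfile.
rng = np.random.default_rng(0)
pytorch_logits = rng.normal(size=32000).astype(np.float32)
llamacpp_logits = pytorch_logits + rng.normal(scale=1e-3, size=32000).astype(np.float32)

# Maximum absolute difference, as computed in quick_logits_check()
max_diff = np.max(np.abs(pytorch_logits - llamacpp_logits))

# Top-10 predictions: argsort ascending, keep the last 10, reverse to descending
pytorch_top10 = np.argsort(pytorch_logits)[-10:][::-1]
llamacpp_top10 = np.argsort(llamacpp_logits)[-10:][::-1]

print(f"Max absolute difference: {max_diff:.4f}")
print(f"Top-10 indices agree: {np.array_equal(pytorch_top10, llamacpp_top10)}")
```

Note that `argsort` sorts ascending, so the slice-and-reverse idiom is what yields the indices of the 10 largest logits in descending order.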
Usage
The script expects two environment variables to be set:
```sh
export MODEL_PATH=/path/to/original/pytorch/model
export CONVERTED_MODEL=/path/to/converted/model.gguf

python examples/model-conversion/scripts/causal/compare-logits.py
```
Prerequisites: logit binary files must be pre-generated by running the original model and the converted model on the same prompt, producing files in the data/ directory.
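The prerequisite files are raw, headerless NumPy dumps. A hypothetical sketch of how a prior "run" script might produce them (the file names follow the `data/pytorch-{model_name}.bin` convention; the array contents here are fabricated placeholders, not real model outputs):

```python
import numpy as np
from pathlib import Path

# Hypothetical stand-in for run-org-model.py: dump last-position logits
# and prompt token IDs in the raw layouts compare-logits.py expects.
data_dir = Path("data")
data_dir.mkdir(exist_ok=True)

logits = np.linspace(-5, 15, 32000, dtype=np.float32)  # fake logit vector
tokens = np.array([1, 42, 7, 99], dtype=np.int32)      # fake prompt token IDs

logits.tofile(data_dir / "pytorch-MyModel.bin")         # raw float32, no header
tokens.tofile(data_dir / "pytorch-MyModel-tokens.bin")  # raw int32, no header
```

Because `tofile()` writes no header, the reading side must know the dtype in advance; this is why the script hard-codes `np.float32` and `np.int32` when loading.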
Code Reference
Source Location
| File | Lines | Description |
|---|---|---|
| `examples/model-conversion/scripts/causal/compare-logits.py` | 12-39 | `quick_logits_check()` function |
| `examples/model-conversion/scripts/causal/compare-logits.py` | 41-87 | `main()` function |
| `examples/model-conversion/scripts/utils/common.py` | 208-248 | `compare_tokens()` utility |
| `examples/model-conversion/scripts/utils/common.py` | 280-299 | `exit_with_warning()` utility with version check |
Signature
quick_logits_check():

```python
def quick_logits_check(pytorch_file, llamacpp_file):
    """Lightweight sanity check before NMSE"""
    try:
        pytorch_logits = np.fromfile(pytorch_file, dtype=np.float32)
        llamacpp_logits = np.fromfile(llamacpp_file, dtype=np.float32)
    except Exception as e:
        print(f"Failed to load files - {e}")
        return False

    # Check shapes match
    if pytorch_logits.shape != llamacpp_logits.shape:
        print(f"Shape mismatch - PyTorch: {pytorch_logits.shape}, llama.cpp: {llamacpp_logits.shape}")
        return False

    # Calculate key metrics
    diff = pytorch_logits - llamacpp_logits
    abs_diff = np.abs(diff)
    max_diff = np.max(abs_diff)

    # Get top 10 predictions from both models
    pytorch_top10 = np.argsort(pytorch_logits)[-10:][::-1]
    llamacpp_top10 = np.argsort(llamacpp_logits)[-10:][::-1]

    print(f"Top 10 PyTorch logits: {pytorch_logits[pytorch_top10]}")
    print(f"Top 10 llama.cpp logits: {llamacpp_logits[llamacpp_top10]}")
    print(f"Max absolute difference: {max_diff:.4f}")

    return True
```
main() file discovery logic:

```python
def main():
    model_path = os.environ.get('MODEL_PATH')
    model_name = get_model_name_from_env_path('MODEL_PATH')

    data_dir = Path("data")
    pytorch_file = data_dir / f"pytorch-{model_name}.bin"

    llamacpp_model_name = get_model_name_from_env_path('CONVERTED_MODEL')
    llamacpp_file = data_dir / f"llamacpp-{llamacpp_model_name}.bin"
```
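The name derivation behind `get_model_name_from_env_path()` lives in `common.py` and is not reproduced here; a plausible standalone sketch, assuming it reduces the path to its final component and strips a trailing `.gguf` extension (the helper below, `model_name_from_path`, is hypothetical, not the real implementation):

```python
import os

def model_name_from_path(env_var):
    """Hypothetical sketch: derive a model name from a path in an env var.

    Assumes the real helper takes the path's final component and removes
    a trailing .gguf extension; model directory names keep dots like '1.7B'.
    """
    value = os.environ.get(env_var)
    if not value:
        return None
    name = os.path.basename(os.path.normpath(value))
    if name.endswith(".gguf"):
        name = name[: -len(".gguf")]
    return name

os.environ["CONVERTED_MODEL"] = "./SmolLM2-1.7B-Instruct-f16.gguf"
print(model_name_from_path("CONVERTED_MODEL"))  # SmolLM2-1.7B-Instruct-f16
```

Stripping only the known `.gguf` suffix (rather than using `Path.stem`) matters for names such as `SmolLM2-1.7B-Instruct`, where a naive last-dot split would truncate the version number.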
Import
```python
import sys
import numpy as np
from pathlib import Path
import os

from common import get_model_name_from_env_path, compare_tokens, exit_with_warning
```
I/O Contract
| Direction | Type | Description |
|---|---|---|
| Input (env) | `MODEL_PATH` | Path to the original PyTorch model directory |
| Input (env) | `CONVERTED_MODEL` | Path to the converted GGUF model file |
| Input (file) | `data/pytorch-{model_name}.bin` | Raw float32 logits from the PyTorch model (generated by a prior script) |
| Input (file) | `data/llamacpp-{model_name}.bin` | Raw float32 logits from the llama.cpp model (generated by a prior script) |
| Input (file) | `data/pytorch-{model_name}-tokens.bin` | int32 token IDs from the PyTorch model |
| Input (file) | `data/llamacpp-{model_name}-tokens.bin` | int32 token IDs from the llama.cpp model |
| Output | stdout | Token comparison result, top-10 logit values for both models, max absolute difference |
| Exit code | 0 | Lightweight check passed; safe to proceed with NMSE |
| Exit code | 1 | Token mismatch or logit comparison failure |
Binary file format:
| File | Dtype | Layout | Description |
|---|---|---|---|
| `pytorch-{name}.bin` | float32 | Flattened 1D array | Last-position logit vector (vocabulary size elements) |
| `llamacpp-{name}.bin` | float32 | Flattened 1D array | Last-position logit vector (vocabulary size elements) |
| `*-tokens.bin` | int32 | Flattened 1D array | Token IDs for the input prompt |
Usage Examples
Full verification workflow:
```sh
# Step 1: Generate reference logits from PyTorch model
export MODEL_PATH=./models/SmolLM2-1.7B-Instruct
python examples/model-conversion/scripts/causal/run-org-model.py

# Step 2: Generate logits from converted GGUF model
export CONVERTED_MODEL=./SmolLM2-1.7B-Instruct-f16.gguf
bash examples/model-conversion/scripts/causal/run-converted-model.sh

# Step 3: Compare logits
python examples/model-conversion/scripts/causal/compare-logits.py
```
Expected output on success:
```
Using converted model: SmolLM2-1.7B-Instruct-f16
Comparing tokens between:
  Original : pytorch-SmolLM2-1.7B-Instruct (42 tokens)
  Converted: llamacpp-SmolLM2-1.7B-Instruct-f16 (42 tokens)
All 42 tokens match!

GGML Model Validation for model SmolLM2-1.7B-Instruct
========================================
PyTorch logits  : data/pytorch-SmolLM2-1.7B-Instruct.bin
llama.cpp logits: data/llamacpp-SmolLM2-1.7B-Instruct-f16.bin
Top 10 PyTorch logits: [15.234 12.891 11.547 ...]
Top 10 llama.cpp logits: [15.234 12.891 11.547 ...]
Max absolute difference: 0.0312
OK: Lightweight model check successful!
Ok to proceed with NMSE check...
```
Expected output on failure (token mismatch):
```
Token count mismatch: 42 vs 40
Token mismatch detected
=====================================================================
Verification failure might be due to a transformers version mismatch:
  Current transformers version: 4.44.0
  Model's required version    : 4.57.1

Consider installing the version specified by the model's config:
  pip install transformers==4.57.1
=====================================================================
```