
Implementation:Ggml org Llama cpp Semantic Check

From Leeroopedia
Knowledge Sources
Domains: Model_Conversion, Verification
Last Updated: 2026-02-15 00:00 GMT

Overview

Performs a detailed semantic similarity comparison between PyTorch and llama.cpp embedding outputs to validate conversion quality.

Description

Loads the binary embedding files produced by both models and first verifies that the two runs used the same tokens. For per-token embeddings it then performs a multi-level analysis: a raw magnitude comparison per token, token similarity matrices within each model, cosine similarities between the same token across the two models, and difference metrics between the two similarity matrices (max, mean, RMS). For pooled embeddings it compares the single sentence-level vectors instead. The script reports a quality assessment ranging from "EXCELLENT" (>0.95) to "POOR" (<0.70). On failure it exits with a warning that includes diagnostics for a transformers version mismatch.
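The comparison metrics described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the helper names cosine_similarity_matrix and compare_embeddings are hypothetical, the result keys are borrowed from the Outputs table, and the sketch assumes that every embedding row has a non-zero norm.

```python
import numpy as np


def cosine_similarity_matrix(a, b=None):
    """Pairwise cosine similarity between the rows of a and the rows of b.

    When b is omitted, a is compared against itself.
    Assumes no row is all zeros (the norm would be 0).
    """
    a = np.asarray(a, dtype=np.float64)
    b = a if b is None else np.asarray(b, dtype=np.float64)
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


def compare_embeddings(python_emb, cpp_emb):
    """Hypothetical helper: cross-model and similarity-matrix metrics."""
    python_emb = np.asarray(python_emb, dtype=np.float64)
    cpp_emb = np.asarray(cpp_emb, dtype=np.float64)

    # Within-model token similarity matrices, one per model.
    py_sim = cosine_similarity_matrix(python_emb)
    cpp_sim = cosine_similarity_matrix(cpp_emb)
    diff = np.abs(py_sim - cpp_sim)

    # Same-token similarity across models: row i against row i.
    py_unit = python_emb / np.linalg.norm(python_emb, axis=1, keepdims=True)
    cpp_unit = cpp_emb / np.linalg.norm(cpp_emb, axis=1, keepdims=True)
    cross = np.sum(py_unit * cpp_unit, axis=1)

    return {
        "cross_model_similarities": cross,
        "max_diff": float(diff.max()),
        "mean_diff": float(diff.mean()),
        "rms_diff": float(np.sqrt(np.mean(diff ** 2))),
    }
```

Because cosine similarity ignores vector length, two models whose embeddings differ only by a scale factor score 1.0 on every cross-model similarity. That is presumably why the script also reports raw magnitudes separately.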

Usage

Use this script as the primary semantic validation tool for embedding model conversions. Its detailed diagnostic output helps identify and debug conversion issues.

Code Reference

Source Location

  • Repository: Ggml_org_Llama_cpp
  • File: examples/model-conversion/scripts/utils/semantic_check.py
  • Lines: 1-242

Signature

def cosine_similarity(a, b=None)
def load_embeddings_from_file(filename, n_tokens, n_embd)
def test_single_prompt_similarity(python_emb, cpp_emb, tokens, prompt)
def read_prompt_from_file(prompt_file)
def main()

Import

import numpy as np
import argparse
import os
import importlib
from pathlib import Path
from transformers import AutoTokenizer, AutoConfig, AutoModelForCausalLM, AutoModel
from common import compare_tokens, exit_with_warning

I/O Contract

Inputs

Name | Type | Required | Description
-m / --model-path | str | Yes | Path to the model directory
MODEL_PATH | env var | Yes | Environment variable pointing to the PyTorch model path
CONVERTED_MODEL | env var | Yes | Environment variable pointing to the converted llama.cpp model path
pytorch-{name}.bin | file | Yes | Binary file containing PyTorch embedding outputs
llamacpp-{name}.bin | file | Yes | Binary file containing llama.cpp embedding outputs
tokens | list | Yes (for test_single_prompt_similarity) | Token IDs for alignment verification
prompt | str | Yes (for test_single_prompt_similarity) | Original prompt text
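As an illustration of how one of the binary embedding files might be read, the sketch below assumes the file is a headerless dump of float32 values in row-major order (n_tokens rows of n_embd values). That layout is an assumption, not something this page confirms, and load_flat_float32 is a hypothetical stand-in for load_embeddings_from_file.

```python
import numpy as np


def load_flat_float32(filename, n_tokens, n_embd):
    """Hypothetical loader: read a raw float32 dump into (n_tokens, n_embd).

    Assumes the file has no header and stores values row by row.
    """
    data = np.fromfile(filename, dtype=np.float32)
    expected = n_tokens * n_embd
    if data.size != expected:
        # A size mismatch usually means the wrong token count or width.
        raise ValueError(
            f"{filename}: expected {expected} float32 values, found {data.size}"
        )
    return data.reshape(n_tokens, n_embd)
```

Checking the element count before reshaping turns a mismatched n_tokens or n_embd into an explicit error rather than a silently misaligned matrix.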

Outputs

Name | Type | Description
stdout | text | Detailed comparison report: magnitude ratios, similarity matrices, cross-model cosine similarities, and quality assessment
exit code | int | 0 on success, 1 on failure (similarity below threshold)
test_single_prompt_similarity return | dict | Dictionary with cross_model_similarities, similarity_matrix_diff, max_diff, mean_diff, rms_diff
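The quality assessment in the report can be pictured as a simple threshold mapping. In the sketch below only the two end points come from this page ("EXCELLENT" above 0.95, "POOR" below 0.70). The function name grade_similarity and the "INTERMEDIATE" label are hypothetical placeholders, since the labels the script uses between those bounds are not listed here.

```python
def grade_similarity(mean_similarity):
    """Hypothetical grading helper based on the two documented bounds."""
    if mean_similarity > 0.95:
        return "EXCELLENT"
    if mean_similarity < 0.70:
        return "POOR"
    # Placeholder: the script's labels for the 0.70-0.95 range are not
    # given on this page.
    return "INTERMEDIATE"
```

For example, grade_similarity(0.97) returns "EXCELLENT" and grade_similarity(0.50) returns "POOR".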

Usage Examples

# Shell: run the semantic check from the model conversion scripts directory
export MODEL_PATH=/path/to/pytorch/model
export CONVERTED_MODEL=/path/to/converted/model.gguf

python semantic_check.py -m /path/to/model

# Python: programmatic usage
from semantic_check import cosine_similarity, load_embeddings_from_file

# Load a 10-token, 768-dimensional embedding dump
embeddings = load_embeddings_from_file("embeddings.bin", n_tokens=10, n_embd=768)
# With b omitted, the embeddings are compared against themselves
sim_matrix = cosine_similarity(embeddings)
