Implementation:Ggml org Ggml Magika inference

From Leeroopedia


Implementation Metadata
File Name: examples/magika/main.cpp
Repository: ggml-org/ggml
Lines: 374
Language: C++
Domain Tags: ML_Inference, File_Classification, Example
Status: Active
Last Updated: 2025-05-15 12:00 GMT
Knowledge Sources: ggml-org/ggml repository

Overview

examples/magika/main.cpp is a C++ implementation of Google's Magika file type detector that uses GGML for inference. It classifies file content with a pre-trained neural network and serves as a worked example of GGUF model loading, compute graph construction, and batch inference.

Description

The file defines 113 file type labels (from "ai" to "zlibstream") and the following structures:

  • magika_hparams -- Model hyperparameters: block_size=4096, beg_size=512, mid_size=512, end_size=512, min_file_size_for_dl=16, n_label=113, f_norm_eps=0.001, padding_token=256
  • magika_model -- Model with dense layers, layer normalization, and target label output layers
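
The f_norm_eps hyperparameter listed above is the epsilon term used by layer normalization. As a minimal sketch of what that operation computes, the standalone function below normalizes one vector without using GGML. The name layer_norm and its signature are hypothetical and do not appear in main.cpp; the example itself builds this step as part of a GGML compute graph rather than with a hand-written loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from main.cpp): layer normalization of one vector.
// y[i] = gamma[i] * (x[i] - mean) / sqrt(var + eps) + beta[i]
std::vector<float> layer_norm(const std::vector<float> & x,
                              const std::vector<float> & gamma,
                              const std::vector<float> & beta,
                              float eps = 0.001f) {  // default mirrors f_norm_eps
    assert(!x.empty() && x.size() == gamma.size() && x.size() == beta.size());

    const float n = static_cast<float>(x.size());

    // Mean of the activations.
    float mean = 0.0f;
    for (float v : x) mean += v;
    mean /= n;

    // Biased (population) variance.
    float var = 0.0f;
    for (float v : x) var += (v - mean) * (v - mean);
    var /= n;

    // eps keeps the division well defined when the variance is zero.
    const float inv_std = 1.0f / std::sqrt(var + eps);

    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = gamma[i] * (x[i] - mean) * inv_std + beta[i];
    }
    return y;
}
```

With gamma set to ones and beta set to zeros, the output has zero mean, and a constant input maps to beta because the epsilon prevents a division by zero.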

The inference pipeline:

  1. Loads model weights from GGUF format via gguf_init_from_file
  2. Reads file content and extracts beginning, middle, and end byte segments
  3. Constructs a compute graph with dense layers, layer normalization, and softmax
  4. Runs inference via ggml_backend_graph_compute
  5. Returns the highest-scoring file type label
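
Step 2 of the pipeline can be sketched in plain C++ as follows. The function extract_segments is hypothetical and is not taken from main.cpp. The exact alignment of short files inside each segment (left-aligned beginning, centred middle, right-aligned end) is an assumption made for illustration; only the segment sizes and the padding token value come from the hyperparameters listed above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from main.cpp): build the beg|mid|end token vector.
// Slots that the file cannot fill keep the padding token (256 lies outside the
// byte range 0..255, so it cannot collide with real content).
std::vector<int> extract_segments(const std::vector<std::uint8_t> & data,
                                  int beg_size = 512,
                                  int mid_size = 512,
                                  int end_size = 512,
                                  int padding_token = 256) {
    assert(beg_size >= 0 && mid_size >= 0 && end_size >= 0);

    // Sketch only: int is enough for small inputs, not for multi-gigabyte files.
    const int fsize = static_cast<int>(data.size());
    std::vector<int> tokens(beg_size + mid_size + end_size, padding_token);

    // Beginning: first bytes, left-aligned (assumed layout).
    const int n_beg = std::min(fsize, beg_size);
    for (int i = 0; i < n_beg; ++i) tokens[i] = data[i];

    // Middle: bytes around the centre of the file, centred in the slot (assumed layout).
    const int n_mid   = std::min(fsize, mid_size);
    const int mid_src = (fsize - n_mid) / 2;
    const int mid_dst = beg_size + (mid_size - n_mid) / 2;
    for (int i = 0; i < n_mid; ++i) tokens[mid_dst + i] = data[mid_src + i];

    // End: last bytes, right-aligned (assumed layout).
    const int n_end   = std::min(fsize, end_size);
    const int end_src = fsize - n_end;
    const int end_dst = beg_size + mid_size + (end_size - n_end);
    for (int i = 0; i < n_end; ++i) tokens[end_dst + i] = data[end_src + i];

    return tokens;
}
```

With the default sizes the result always has 1536 entries, so files of any length map to a fixed-size model input.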

Usage

# Build the magika example
cmake -B build
cmake --build build --target magika

# Run file type detection (model path first, then one or more input files)
./build/bin/magika magika.gguf input_file.bin

Code Reference

Source Location

Repository: ggml-org/ggml
File: examples/magika/main.cpp
Lines: 374

Key Signatures

static const char * magika_labels[] = {
    "ai", "apk", "appleplist", "asm", "asp", ...  // 113 labels
};

struct magika_hparams {
    const int block_size = 4096;
    const int beg_size = 512;
    const int mid_size = 512;
    const int end_size = 512;
    const int min_file_size_for_dl = 16;
    const int n_label = 113;
    const float f_norm_eps = 0.001f;
    const int padding_token = 256;
};

struct magika_model {
    struct ggml_tensor * dense_w, * dense_b;
    struct ggml_tensor * layer_norm_gamma, * layer_norm_beta;
    struct ggml_tensor * dense_1_w, * dense_1_b;
    struct ggml_tensor * dense_2_w, * dense_2_b;
    struct ggml_tensor * target_label_w, * target_label_b;
    ggml_backend_t backend = ggml_backend_cpu_init();
};

bool magika_model_load(const std::string & fname, magika_model & model);
struct ggml_tensor * checked_get_tensor(struct ggml_context * ctx, const char * name);

I/O Contract

Inputs

  • Model file -- GGUF-format Magika model weights
  • Input file -- Any file to classify; the first, middle, and last 512 bytes are read

Outputs

  • File type label -- One of 113 content type labels (e.g., "pdf", "jpeg", "python")
  • Confidence score -- Softmax probability for the predicted label
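
The relationship between the two outputs can be illustrated with a small standalone sketch: a softmax turns the raw scores of the final layer into probabilities, and the label with the largest probability is reported together with that probability as its confidence. The function names softmax and top_label are hypothetical and are not part of main.cpp, which computes the softmax inside the GGML graph.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: numerically stable softmax over raw scores (logits).
std::vector<float> softmax(const std::vector<float> & logits) {
    assert(!logits.empty());

    // Subtracting the maximum avoids overflow in exp() without changing the result.
    const float max_logit = *std::max_element(logits.begin(), logits.end());

    std::vector<float> probs(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit);
        sum += probs[i];
    }
    for (float & p : probs) p /= sum;
    return probs;
}

// Hypothetical helper: index and probability of the most likely label.
// On ties, the first label with the maximum probability is returned.
std::pair<std::size_t, float> top_label(const std::vector<float> & probs) {
    assert(!probs.empty());
    const auto it = std::max_element(probs.begin(), probs.end());
    return { static_cast<std::size_t>(it - probs.begin()), *it };
}
```

The returned index would then be used to look up the label name in a table such as magika_labels.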

Usage Examples

File type detection:

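#include <cstdio>

#include "ggml.h"
#include "gguf.h"

// Fragment from inside main(); magika_model and magika_model_load
// are defined in examples/magika/main.cpp, not in a public header

// Load model weights from GGUF and stop if loading fails
magika_model model;
if (!magika_model_load("magika.gguf", model)) {
    fprintf(stderr, "failed to load model\n");
    return 1;
}

// Classify a file
// The model processes the beginning (512 bytes), middle (512 bytes),
// and end (512 bytes) segments of the input file
// and yields a label such as "python", "jpeg", or "pdf"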
