Implementation:Ggml org Ggml Magika inference
| File Name | examples/magika/main.cpp |
| Repository | ggml-org/ggml |
| Lines | 374 |
| Language | C++ |
| Domain Tags | ML_Inference, File_Classification, Example |
| Status | Active |
| Last Updated | 2025-05-15 12:00 GMT |
| Knowledge Sources | ggml-org/ggml repository |
Overview
examples/magika/main.cpp is a C++ implementation of Google's Magika file type detector that uses GGML for inference. It serves as a practical GGML inference example: it classifies file content with a pre-trained neural network and demonstrates GGUF model loading, graph construction, and batch inference.
Description
The file defines 113 file type labels (from "ai" to "zlibstream") and the following structures:
- magika_hparams -- Model hyperparameters: block_size=4096, beg_size=512, mid_size=512, end_size=512, min_file_size_for_dl=16, n_label=113, f_norm_eps=0.001, padding_token=256
- magika_model -- Model with dense layers, layer normalization, and target label output layers
The inference pipeline:
- Loads model weights from GGUF format via gguf_init_from_file
- Reads file content and extracts beginning, middle, and end byte segments
- Constructs a compute graph with dense layers, layer normalization, and softmax
- Runs inference via ggml_backend_graph_compute
- Returns the highest-scoring file type label
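The segment-extraction step can be illustrated with a small standalone sketch. This is not the code from main.cpp: the helper name `extract_segments`, the centring of the middle segment, and the right-alignment of the end segment are assumptions made for illustration. Only the segment size (512) and the padding token (256) come from the hyperparameters listed above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not taken from main.cpp).
// Builds a token buffer of 3 * seg_size entries laid out as [beg | mid | end].
// Positions without file data keep the padding token, which lies outside
// the byte range 0..255 and so cannot be confused with real content.
std::vector<int> extract_segments(const std::vector<uint8_t> & data,
                                  std::size_t seg_size = 512,
                                  int padding_token = 256) {
    assert(seg_size > 0);
    std::vector<int> tokens(3 * seg_size, padding_token);

    const std::size_t n   = data.size();
    const std::size_t len = std::min(n, seg_size);

    // Beginning: first `len` bytes, left-aligned in the first segment.
    for (std::size_t i = 0; i < len; ++i) {
        tokens[i] = data[i];
    }

    // Middle: `len` bytes centred on the file midpoint (assumed alignment).
    const std::size_t mid_start = (n - len) / 2;
    for (std::size_t i = 0; i < len; ++i) {
        tokens[seg_size + i] = data[mid_start + i];
    }

    // End: last `len` bytes, right-aligned so the final byte is the final token.
    for (std::size_t i = 0; i < len; ++i) {
        tokens[3 * seg_size - len + i] = data[n - len + i];
    }

    return tokens;
}
```

For a file shorter than one segment, all three segments see the same bytes and the remaining positions stay at the padding token. The exact offsets and padding layout used by the real example may differ.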
Usage
# Build the magika example
cmake -B build
cmake --build build --target magika
# Run file type detection
./build/bin/magika -m magika.gguf input_file.bin
Code Reference
Source Location
| Repository | File | Lines |
|---|---|---|
| ggml-org/ggml | examples/magika/main.cpp | 374 |
Key Signatures
static const char * magika_labels[] = {
    "ai", "apk", "appleplist", "asm", "asp", ... // 113 labels
};
struct magika_hparams {
    const int block_size = 4096;
    const int beg_size = 512;
    const int mid_size = 512;
    const int end_size = 512;
    const int min_file_size_for_dl = 16;
    const int n_label = 113;
    const float f_norm_eps = 0.001f;
    const int padding_token = 256;
};
struct magika_model {
    struct ggml_tensor * dense_w, * dense_b;
    struct ggml_tensor * layer_norm_gamma, * layer_norm_beta;
    struct ggml_tensor * dense_1_w, * dense_1_b;
    struct ggml_tensor * dense_2_w, * dense_2_b;
    struct ggml_tensor * target_label_w, * target_label_b;
    ggml_backend_t backend = ggml_backend_cpu_init();
};
bool magika_model_load(const std::string & fname, magika_model & model);
struct ggml_tensor * checked_get_tensor(struct ggml_context * ctx, const char * name);
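The layer_norm_gamma and layer_norm_beta tensors, together with f_norm_eps, point to a standard layer normalization step. The standalone sketch below shows that arithmetic on plain vectors, assuming the usual formulation (normalize to zero mean and unit variance, then scale and shift). The function name `layer_norm` is illustrative, and the example itself builds this step as ggml graph operations rather than hand-written loops.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative layer normalization over one feature vector:
//   y[i] = gamma[i] * (x[i] - mean) / sqrt(var + eps) + beta[i]
// The default eps mirrors f_norm_eps = 0.001 from magika_hparams.
std::vector<float> layer_norm(const std::vector<float> & x,
                              const std::vector<float> & gamma,
                              const std::vector<float> & beta,
                              float eps = 0.001f) {
    assert(!x.empty());
    assert(gamma.size() == x.size() && beta.size() == x.size());

    const float n = static_cast<float>(x.size());

    // Mean of the features.
    float mean = 0.0f;
    for (float v : x) {
        mean += v;
    }
    mean /= n;

    // Biased (population) variance of the features.
    float var = 0.0f;
    for (float v : x) {
        var += (v - mean) * (v - mean);
    }
    var /= n;

    // eps keeps the division stable when the variance is near zero.
    const float inv_std = 1.0f / std::sqrt(var + eps);

    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = gamma[i] * (x[i] - mean) * inv_std + beta[i];
    }
    return y;
}
```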
I/O Contract
Inputs
- Model file -- GGUF-format Magika model weights
- Input file -- Any file to classify (reads first/middle/last 512 bytes)
Outputs
- File type label -- One of 113 content type labels (e.g., "pdf", "jpeg", "python")
- Confidence score -- Softmax probability for the predicted label
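The relationship between the two outputs can be sketched as follows: the softmax turns the final layer's raw scores into probabilities, the index of the largest probability selects the label, and that probability is the confidence score. This is a minimal standalone illustration with hypothetical helper names (`softmax`, `argmax`), not the code from main.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <vector>

// Numerically stable softmax: subtracting the maximum logit before
// exponentiating avoids overflow and does not change the result.
std::vector<float> softmax(const std::vector<float> & logits) {
    assert(!logits.empty());
    const float max_logit = *std::max_element(logits.begin(), logits.end());

    std::vector<float> probs(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit);
        sum += probs[i];
    }
    for (float & p : probs) {
        p /= sum;
    }
    return probs;
}

// Index of the largest element; with 113 outputs this would be the
// index into the label table.
std::size_t argmax(const std::vector<float> & v) {
    assert(!v.empty());
    return static_cast<std::size_t>(
        std::distance(v.begin(), std::max_element(v.begin(), v.end())));
}
```

Given a vector `probs` returned by `softmax`, the predicted label would be the entry at index `argmax(probs)` in the label table, and `probs[argmax(probs)]` would be the reported confidence.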
Usage Examples
File type detection:
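#include "ggml.h"
#include "gguf.h"
// magika_model and magika_model_load are defined in examples/magika/main.cpp
// Load model from GGUF (magika_model_load returns false on failure)
magika_model model;
if (!magika_model_load("magika.gguf", model)) {
    // handle the load error
}
// Classify a file
// The model processes beginning (512 bytes), middle (512 bytes),
// and end (512 bytes) segments of the input file
// Returns label like "python", "jpeg", "pdf", etc.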
Related Pages
Implements Principle
Related Implementations
- Implementation:Ggml_org_Ggml_Ggml_init -- Context initialization
- Implementation:Ggml_org_Ggml_Gguf_init_empty -- GGUF format handling
- Implementation:Ggml_org_Ggml_Ggml_build_forward_expand -- Graph construction