Implementation:Ggml org Ggml Mnist model build
ML_Training Model_Architecture GGML 2025-05-15 12:00 GMT
Summary
Builds the forward computation graph for either a fully-connected or convolutional MNIST model by wiring the previously-initialised weight tensors into a sequence of GGML tensor operations. The resulting graph is used for both inference and (with automatic differentiation) training.
API Signature
void mnist_model_build(mnist_model & model)
Source
- File: examples/mnist/mnist-common.cpp, lines 312--383
- Repository: https://github.com/ggml-org/ggml
Parameters
| Name | Type | Description |
|---|---|---|
| model | mnist_model & | Model struct with pre-initialised weight tensors and GGML contexts. Must already contain allocated fc1_weight, fc1_bias, fc2_weight, fc2_bias (FC path) or the corresponding convolutional kernels and dense weights (CNN path). |
Return Value
void -- the function populates model.logits, a tensor of shape [10, nbatch_physical] representing per-class scores for each image in the physical batch.
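To make the [10, nbatch_physical] shape concrete, the following is a minimal sketch of how per-class scores could be turned into predicted digits once the logits have been copied into a flat host buffer. It assumes the 10 scores of one image are stored contiguously, which matches GGML's convention that the first dimension varies fastest. The helper name argmax_per_image is hypothetical and not part of the GGML example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the GGML example): returns the predicted
// class index for each image. `logits` is a flat buffer of shape
// [nclasses, nbatch], i.e. the scores of image i occupy
// logits[i * nclasses] .. logits[i * nclasses + nclasses - 1].
std::vector<int> argmax_per_image(const std::vector<float> & logits, std::size_t nclasses) {
    assert(nclasses > 0 && logits.size() % nclasses == 0);
    const std::size_t nbatch = logits.size() / nclasses;
    std::vector<int> pred(nbatch, 0);
    for (std::size_t i = 0; i < nbatch; ++i) {
        const float * scores = logits.data() + i * nclasses;
        for (std::size_t c = 1; c < nclasses; ++c) {
            if (scores[c] > scores[pred[i]]) {
                pred[i] = static_cast<int>(c); // keep the best class seen so far
            }
        }
    }
    return pred;
}
```

For MNIST, nclasses would be 10 and the returned vector would hold one digit per image in the physical batch.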
Fully-Connected Path
logits = fc2_weight * relu(fc1_weight * images + fc1_bias) + fc2_bias
- ggml_mul_mat(fc1_weight, images) -- project input to hidden dimension.
- ggml_add(..., fc1_bias) -- add first-layer bias.
- ggml_relu(...) -- apply ReLU activation.
- ggml_mul_mat(fc2_weight, ...) -- project hidden to 10 output classes.
- ggml_add(..., fc2_bias) -- add second-layer bias.
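The arithmetic of the fully-connected path can be sketched in plain C++ without GGML. This is an illustration of the formula above, not the code in mnist-common.cpp: the function names dense, relu_inplace and fc_forward are hypothetical, and the row-major storage layout (one row of input weights per output unit) is an assumption chosen for readability.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Dense layer: y[b][o] = sum_i w[o][i] * x[b][i] + bias[o].
// w holds nout rows of nin values; x holds one row of nin values per image.
std::vector<float> dense(const std::vector<float> & w, const std::vector<float> & bias,
                         const std::vector<float> & x, std::size_t nin, std::size_t nout) {
    assert(w.size() == nin * nout && bias.size() == nout && x.size() % nin == 0);
    const std::size_t nbatch = x.size() / nin;
    std::vector<float> y(nbatch * nout);
    for (std::size_t b = 0; b < nbatch; ++b) {
        for (std::size_t o = 0; o < nout; ++o) {
            float acc = bias[o];
            for (std::size_t i = 0; i < nin; ++i) {
                acc += w[o * nin + i] * x[b * nin + i];
            }
            y[b * nout + o] = acc;
        }
    }
    return y;
}

// Element-wise ReLU: negative values are clamped to zero.
void relu_inplace(std::vector<float> & v) {
    for (float & f : v) {
        f = std::max(f, 0.0f);
    }
}

// logits = fc2_weight * relu(fc1_weight * images + fc1_bias) + fc2_bias
std::vector<float> fc_forward(const std::vector<float> & fc1_weight, const std::vector<float> & fc1_bias,
                              const std::vector<float> & fc2_weight, const std::vector<float> & fc2_bias,
                              const std::vector<float> & images,
                              std::size_t ninput, std::size_t nhidden, std::size_t nclasses) {
    std::vector<float> hidden = dense(fc1_weight, fc1_bias, images, ninput, nhidden);
    relu_inplace(hidden);
    return dense(fc2_weight, fc2_bias, hidden, nhidden, nclasses);
}
```

In the real model ninput would be 28 * 28 = 784 and nclasses would be 10; the hidden width is whatever the weight tensors were initialised with.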
CNN Path
reshape(28, 28, 1, batch) -> conv2d(3x3) + ReLU -> maxpool(2x2) -> conv2d(3x3) + ReLU -> maxpool(2x2) -> flatten -> dense -> logits
- ggml_reshape_4d(images, 28, 28, 1, batch) -- reshape flat input to spatial layout.
- ggml_conv_2d(kernel1, ...) followed by ggml_relu -- first convolutional block.
- ggml_pool_2d(..., GGML_OP_POOL_MAX, 2, 2, ...) -- 2x2 max-pooling.
- ggml_conv_2d(kernel2, ...) followed by ggml_relu -- second convolutional block.
- ggml_pool_2d(..., GGML_OP_POOL_MAX, 2, 2, ...) -- second 2x2 max-pooling.
- ggml_permute / ggml_cont / ggml_reshape_2d -- flatten spatial dimensions.
- ggml_mul_mat(dense_weight, ...) + dense_bias -- final dense projection to 10 classes.
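The spatial sizes along the CNN path determine how many features the final dense layer receives. The sketch below traces them with the standard output-size formulas for convolution and pooling. The stride, padding and channel count are not stated in this page, so they are passed in as parameters and should be treated as assumptions; the function names are hypothetical.

```cpp
#include <cassert>

// Output size of a convolution along one axis (dilation 1):
// out = (in + 2 * pad - kernel) / stride + 1
int conv_out_size(int in, int kernel, int stride, int pad) {
    assert(in > 0 && kernel > 0 && stride > 0 && pad >= 0);
    return (in + 2 * pad - kernel) / stride + 1;
}

// Output size of an unpadded pooling window along one axis.
int pool_out_size(int in, int kernel, int stride) {
    assert(in >= kernel && kernel > 0 && stride > 0);
    return (in - kernel) / stride + 1;
}

// Number of features fed to the dense layer after
// conv(3x3) -> pool(2x2) -> conv(3x3) -> pool(2x2) -> flatten.
// `conv_pad` and `channels_out` are assumptions supplied by the caller.
int flattened_features(int hw, int conv_pad, int channels_out) {
    int s = conv_out_size(hw, 3, 1, conv_pad); // first convolution
    s = pool_out_size(s, 2, 2);                // first 2x2 max-pool
    s = conv_out_size(s, 3, 1, conv_pad);      // second convolution
    s = pool_out_size(s, 2, 2);                // second 2x2 max-pool
    return s * s * channels_out;
}
```

With a padding of 1 the 28x28 input shrinks only at the pooling steps (28, 14, 7), whereas with no padding each convolution also trims the border (28, 26, 13, 11, 5), so the dense weight must be sized to match whichever configuration the kernels were built for.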
Parameter Registration
All weight tensors are marked as trainable parameters via ggml_set_param(), enabling the automatic differentiation engine to compute gradients during the backward pass.
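As a rough analogy for what marking a tensor as a parameter achieves, the toy sketch below keeps a flag on each tensor and lets an optimiser step touch only the flagged ones. It does not use GGML and does not reflect GGML's internal data structures; ToyTensor, toy_set_param and toy_sgd_step are hypothetical names introduced for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a tensor: values, gradients, and a "trainable" flag.
struct ToyTensor {
    std::vector<float> data;
    std::vector<float> grad;
    bool is_param = false;
};

// Mark a tensor as trainable and give it a zeroed gradient buffer.
void toy_set_param(ToyTensor & t) {
    t.is_param = true;
    t.grad.assign(t.data.size(), 0.0f);
}

// One plain SGD step: only tensors flagged as parameters are updated,
// inputs and intermediate results are left untouched.
void toy_sgd_step(const std::vector<ToyTensor *> & tensors, float lr) {
    for (ToyTensor * t : tensors) {
        if (!t->is_param) {
            continue;
        }
        assert(t->grad.size() == t->data.size());
        for (std::size_t i = 0; i < t->data.size(); ++i) {
            t->data[i] -= lr * t->grad[i];
        }
    }
}
```

In this picture the weight and bias tensors would be flagged, while the input images and the logits would not.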
GGML Operations Used
- ggml_set_param -- mark tensor as trainable parameter
- ggml_relu -- element-wise ReLU activation
- ggml_add -- element-wise tensor addition (bias)
- ggml_mul_mat -- matrix multiplication (dense layers)
- ggml_reshape_4d -- reshape to 4-D spatial layout
- ggml_conv_2d -- 2-D convolution with learned kernels
- ggml_pool_2d -- 2-D max-pooling
- ggml_reshape_2d -- flatten back to 2-D for dense layer
- ggml_cont -- ensure contiguous memory layout
- ggml_permute -- transpose / reorder dimensions
- ggml_set_name -- assign human-readable name to tensor
- ggml_set_output -- mark tensor as graph output
Language
C++