
Implementation:Ggml_org_Ggml_Ggml_opt_fit

From Leeroopedia



Summary

The ggml_opt_fit function is the high-level training API in GGML that executes a complete training run: it constructs an optimizer context, splits the dataset into training and validation portions, and iterates over epochs by delegating to ggml_opt_epoch. It wraps the full training loop -- including dataset shuffling, forward and backward passes on training batches, forward-only evaluation on validation batches, gradient accumulation, and progress reporting -- into a single function call.

Import

#include "ggml-opt.h"

Dependencies

  • ggml-opt.h -- public header defining the optimization API, including ggml_opt_fit, ggml_opt_epoch, dataset types, loss types, and optimizer types.
  • ggml.h -- core GGML header providing tensor types, computation graph primitives, and context management.

Function Signature

void ggml_opt_fit(
    ggml_backend_sched_t              backend_sched,
    struct ggml_context             * ctx_compute,
    struct ggml_tensor              * inputs,
    struct ggml_tensor              * outputs,
    ggml_opt_dataset_t                dataset,
    enum ggml_opt_loss_type           loss_type,
    enum ggml_opt_optimizer_type      optimizer,
    ggml_opt_get_optimizer_params     get_opt_pars,
    int64_t                           nepoch,
    int64_t                           nbatch_logical,
    float                             val_split,
    bool                              silent);

Source: src/ggml-opt.cpp:L998-1078

Parameters

  • backend_sched (ggml_backend_sched_t) -- Backend scheduler handle that manages device selection and graph execution across one or more backends.
  • ctx_compute (struct ggml_context *) -- GGML context used for allocating intermediate computation tensors during forward and backward passes.
  • inputs (struct ggml_tensor *) -- Input tensor that receives batches of training data. Its shape must match the dataset's data-point dimensionality.
  • outputs (struct ggml_tensor *) -- Output tensor representing the model's predictions; connected to the loss computation graph.
  • dataset (ggml_opt_dataset_t) -- The dataset object (created via ggml_opt_dataset_init) containing all training samples and labels.
  • loss_type (enum ggml_opt_loss_type) -- The loss function to use for training. Typical value: GGML_OPT_LOSS_TYPE_CROSS_ENTROPY for classification tasks.
  • optimizer (enum ggml_opt_optimizer_type) -- The optimization algorithm. Typical value: GGML_OPT_OPTIMIZER_TYPE_ADAMW (Adam with decoupled weight decay).
  • get_opt_pars (ggml_opt_get_optimizer_params) -- Callback function that returns custom optimizer parameters (learning rate, beta values, etc.). Can be NULL to use defaults.
  • nepoch (int64_t) -- Number of training epochs. Each epoch processes the entire training split once.
  • nbatch_logical (int64_t) -- Logical batch size for gradient accumulation. If larger than the physical shard size, gradients are accumulated over multiple shards before each parameter update.
  • val_split (float) -- Fraction of the dataset to reserve for validation (e.g., 0.05 for 5%). Must be in the range [0.0, 1.0). A value of 0 disables validation.
  • silent (bool) -- When true, suppresses progress bars and per-epoch statistics output.
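The interaction between val_split, the dataset's shard size, and nbatch_logical can be made concrete with a little arithmetic. The helpers below are illustrative stand-ins, not the actual GGML internals; they assume the validation share is rounded to whole shards and that nbatch_logical divides evenly into shards, which is the behavior the parameter descriptions above imply.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of whole shards held out for validation.
// A shard is the physical batch granularity of the dataset.
int64_t split_validation_shards(int64_t ndata, int64_t shard_size, float val_split) {
    const int64_t nshards = ndata / shard_size;   // whole shards only
    return (int64_t)(nshards * val_split);        // rounded down to a shard boundary
}

// Hypothetical helper: physical shards accumulated per optimizer step when
// the logical batch is larger than the shard (physical batch) size.
int64_t accumulation_steps(int64_t nbatch_logical, int64_t shard_size) {
    return nbatch_logical / shard_size;           // assumed to divide evenly
}
```

For example, 60000 data points in shards of 500 with val_split = 0.05 yields 6 of 120 shards for validation, and nbatch_logical = 512 over shards of 128 means gradients from 4 shards are accumulated before each parameter update.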

Return Value

This function returns void. The model parameters are updated in-place through the tensors referenced by the computation graph. Training progress (loss, accuracy) is printed to standard output unless silent is true.

Internal Workflow

ggml_opt_fit orchestrates the full training run through the following steps:

  1. Build optimization context -- Allocates a ggml_opt_t context configured with the specified loss type, optimizer, logical batch size, and optimizer parameter callback.
  2. Compute split sizes -- Uses val_split to partition the dataset into training samples and validation samples. Sizes are rounded to shard boundaries.
  3. Epoch loop -- For each epoch from 1 to nepoch:
    1. Calls ggml_opt_epoch to execute one full training epoch.
    2. ggml_opt_epoch shuffles the dataset shards, iterates over training shards (forward + backward + accumulate + update), then iterates over validation shards (forward only).
    3. Collects and reports training loss, training accuracy, validation loss, and validation accuracy.
  4. Cleanup -- Frees the optimization context and any temporary allocations.
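The steps above can be sketched as a minimal control-flow skeleton. All types and names below are illustrative stand-ins rather than the real GGML API; only the ordering of the steps mirrors the workflow described.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the optimizer context built in step 1.
struct MockOptContext { int64_t steps = 0; };

// Stand-in for one ggml_opt_epoch call: train shards update parameters,
// validation shards are evaluated forward-only.
void mock_epoch(MockOptContext &ctx, int64_t ntrain_shards, int64_t nval_shards) {
    ctx.steps += ntrain_shards;   // forward + backward + update per training shard
    (void)nval_shards;            // forward only, no parameter update
}

int64_t mock_fit(int64_t nshards, float val_split, int64_t nepoch) {
    MockOptContext ctx;                                     // step 1: build context
    const int64_t nval   = (int64_t)(nshards * val_split);  // step 2: split sizes
    const int64_t ntrain = nshards - nval;
    for (int64_t epoch = 1; epoch <= nepoch; ++epoch) {     // step 3: epoch loop
        mock_epoch(ctx, ntrain, nval);
    }
    return ctx.steps;                                       // step 4: cleanup is trivial here
}
```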

Lower-Level API: ggml_opt_epoch

The per-epoch logic is handled by ggml_opt_epoch, which ggml_opt_fit calls internally:

void ggml_opt_epoch(
    ggml_opt_context_t        opt_ctx,
    ggml_opt_dataset_t        dataset,
    ggml_opt_result_t         result_train,
    ggml_opt_result_t         result_eval,
    int64_t                   idata_split,
    ggml_opt_epoch_callback   callback_train,
    ggml_opt_epoch_callback   callback_eval);

Source: src/ggml-opt.cpp:L880-923

For each epoch, ggml_opt_epoch performs:

  • Shuffle the dataset shard order.
  • Training iteration -- For each training shard: load data into input/output tensors, run the forward and backward computation graphs, accumulate gradients, and update parameters when a logical batch boundary is reached.
  • Validation iteration -- For each validation shard: load data, run forward-only computation, and accumulate loss/accuracy metrics without updating weights.
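The per-epoch schedule above can be sketched as follows. This is a mock that counts optimizer updates under the stated schedule; the names and the counting logic are illustrative assumptions, not the GGML implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Sketch of one epoch: shuffle the training shard order, run training shards
// with a parameter update at every logical-batch boundary, then run the
// validation shards forward-only. Returns the number of parameter updates.
int64_t mock_opt_epoch(int64_t ntrain_shards, int64_t nval_shards,
                       int64_t opt_period, unsigned seed) {
    std::vector<int64_t> order(ntrain_shards);
    for (int64_t i = 0; i < ntrain_shards; ++i) order[i] = i;
    std::shuffle(order.begin(), order.end(), std::mt19937(seed));  // shuffle shard order

    int64_t updates = 0;
    for (int64_t i = 0; i < ntrain_shards; ++i) {
        // forward + backward on shard order[i]; gradients accumulate ...
        if ((i + 1) % opt_period == 0) {
            ++updates;  // ... and parameters update at each logical batch boundary
        }
    }
    for (int64_t i = 0; i < nval_shards; ++i) {
        // forward only: accumulate loss/accuracy metrics, no update
    }
    return updates;
}
```

With 114 training shards and an accumulation period of 4 shards per logical batch, an epoch performs 28 parameter updates; the trailing partial accumulation is left pending in this sketch.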

Usage Example: MNIST Training

The MNIST example provides a higher-level wrapper that demonstrates typical usage of ggml_opt_fit:

void mnist_model_train(
    mnist_model      & model,
    ggml_opt_dataset_t dataset,
    const int          nepoch,
    const float        val_split);

Source: examples/mnist/mnist-common.cpp:L412-415

This function calls ggml_opt_fit with the model's backend scheduler, compute context, input and output tensors, and the provided training parameters. It uses GGML_OPT_LOSS_TYPE_CROSS_ENTROPY as the loss type and GGML_OPT_OPTIMIZER_TYPE_ADAMW as the optimizer:

#include "ggml-opt.h"
#include "mnist-common.h"

// Assuming model and dataset are already initialized:
mnist_model_train(model, dataset, /*nepoch=*/30, /*val_split=*/0.05f);

// Internally this calls:
// ggml_opt_fit(
//     model.backend_sched,
//     model.ctx_compute,
//     model.inputs,
//     model.outputs,
//     dataset,
//     GGML_OPT_LOSS_TYPE_CROSS_ENTROPY,
//     GGML_OPT_OPTIMIZER_TYPE_ADAMW,
//     NULL,       // get_opt_pars: use defaults
//     30,         // nepoch
//     512,        // nbatch_logical
//     0.05f,      // val_split: 5% for validation
//     false);     // silent: show progress
