Implementation: ggml_build_forward_expand (ggml-org/ggml)
| Knowledge Sources | |
|---|---|
| Domains | ML_Infrastructure, Tensor_Computing |
| Last Updated | 2025-05-15 12:00 GMT |
Overview
Concrete function for building a computation graph (DAG) from an output tensor by recursively walking its dependencies.
Description
ggml_build_forward_expand is a C function in the GGML tensor library that takes a computation graph and an output tensor, then performs a recursive dependency walk from that tensor back through all of its sources. Every tensor encountered during the walk is classified as either a node (a tensor produced by an operation such as GGML_OP_MUL_MAT or GGML_OP_ADD) or a leaf (a raw input or weight tensor with no operation). The walk populates the graph's internal nodes and leafs arrays in topologically-sorted order, producing a directed acyclic graph (DAG) that is ready for execution by any GGML backend.
The function is designed to be called multiple times on the same graph with different output tensors, allowing users to incrementally build complex, multi-output graphs.
Usage
Call this function after composing tensor operations with helpers like ggml_mul_mat and ggml_add, passing the final result tensor. The populated graph can then be dispatched to a backend via ggml_backend_graph_compute or similar routines.
Code Reference
Source Location
- Repository: GGML
- File: src/ggml.c
- Lines: 6830-6832
Signature
void ggml_build_forward_expand(struct ggml_cgraph * cgraph, struct ggml_tensor * tensor);
Key Operations Used to Build the Graph
The following operation helpers create tensor nodes that ggml_build_forward_expand later collects into the DAG:
| Operation | Signature | File | Lines |
|---|---|---|---|
| ggml_mul_mat | struct ggml_tensor * ggml_mul_mat(struct ggml_context * ctx, struct ggml_tensor * a, struct ggml_tensor * b); | src/ggml.c | 3176-3191 |
| ggml_add | struct ggml_tensor * ggml_add(struct ggml_context * ctx, struct ggml_tensor * a, struct ggml_tensor * b); | src/ggml.c | 1971-1976 |
Dependencies
- ggml.h — Public API header providing all type definitions and function declarations.
Import
#include "ggml.h"
Language
- C
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| cgraph | struct ggml_cgraph * | Yes | The computation graph to populate. Must have been previously allocated (e.g., via ggml_new_graph). |
| tensor | struct ggml_tensor * | Yes | The output tensor whose full dependency tree will be walked and added to the graph. |
Outputs
| Name | Type | Description |
|---|---|---|
| (none — mutates cgraph) | void | The function populates cgraph->nodes with operation tensors and cgraph->leafs with input/weight tensors, both in topologically-sorted order. |
Usage Examples
The following example builds a simple matmul + bias add computation graph — one of the most common patterns in neural-network inference — and then executes it.
#include "ggml.h"
#include "ggml-backend.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
// -----------------------------------------------------------
// 1. Initialize a GGML context with enough memory for tensors
// -----------------------------------------------------------
struct ggml_init_params params = {
.mem_size = 64 * 1024 * 1024, // 64 MiB workspace
.mem_buffer = NULL, // let GGML allocate
.no_alloc = false,
};
struct ggml_context * ctx = ggml_init(params);
if (ctx == NULL) {
fprintf(stderr, "ggml_init failed\n");
return 1;
}
// -----------------------------------------------------------
// 2. Create input and weight tensors (leafs in the graph)
// -----------------------------------------------------------
// Input matrix X: 4 rows x 3 columns
struct ggml_tensor * X = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 3, 4);
ggml_set_name(X, "X");
// Weight matrix W: 3 columns x 2 output features
struct ggml_tensor * W = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 3, 2);
ggml_set_name(W, "W");
// Bias vector b: 2 elements (one per output feature)
struct ggml_tensor * b = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 2);
ggml_set_name(b, "b");
// -----------------------------------------------------------
// 3. Fill tensors with sample data
// -----------------------------------------------------------
float * X_data = (float *) X->data;
for (int i = 0; i < 4 * 3; i++) { X_data[i] = (float)(i + 1); }
float * W_data = (float *) W->data;
for (int i = 0; i < 2 * 3; i++) { W_data[i] = 0.5f; }
float * b_data = (float *) b->data;
b_data[0] = 1.0f;
b_data[1] = 2.0f;
// -----------------------------------------------------------
// 4. Define operations — this does NOT execute anything yet.
// Each call returns a new tensor node with an operation type
// and pointers back to its source tensors.
// -----------------------------------------------------------
// Matrix multiply: result = X @ W^T (shape 4x2)
// Internally tagged as GGML_OP_MUL_MAT
struct ggml_tensor * mul_result = ggml_mul_mat(ctx, W, X);
ggml_set_name(mul_result, "XW");
// Element-wise bias addition: output = result + b
// Internally tagged as GGML_OP_ADD
struct ggml_tensor * output = ggml_add(ctx, mul_result, b);
ggml_set_name(output, "output");
// -----------------------------------------------------------
// 5. Build the computation graph (DAG)
// ggml_build_forward_expand walks backwards from 'output',
// discovers mul_result -> {W, X} and output -> {mul_result, b},
// and records everything in topological order.
// -----------------------------------------------------------
struct ggml_cgraph * graph = ggml_new_graph(ctx);
ggml_build_forward_expand(graph, output);
// At this point the graph contains:
// leafs: [X, W, b] (input/weight tensors, GGML_OP_NONE)
// nodes: [mul_result, output] (operation tensors, in dependency order)
printf("Graph built: %d nodes, %d leafs\n",
graph->n_nodes, graph->n_leafs);
// -----------------------------------------------------------
// 6. Execute the graph on the CPU backend
// -----------------------------------------------------------
ggml_graph_compute_with_ctx(ctx, graph, /* n_threads = */ 4);
// -----------------------------------------------------------
// 7. Read back the results
// -----------------------------------------------------------
float * out_data = (float *) output->data;
printf("Output (4x2):\n");
for (int row = 0; row < 4; row++) {
for (int col = 0; col < 2; col++) {
printf(" %8.2f", out_data[row * 2 + col]);
}
printf("\n");
}
// -----------------------------------------------------------
// 8. Cleanup
// -----------------------------------------------------------
ggml_free(ctx);
return 0;
}
What the graph looks like:
    X (leaf)     W (leaf)      b (leaf)
        \         /               |
         \       /                |
        mul_result                |
     [GGML_OP_MUL_MAT]            |
               \                  |
                \                 /
                 \               /
                  output (node)
                 [GGML_OP_ADD]
The call to ggml_build_forward_expand(graph, output) walks this tree from output upward, visiting mul_result (and its sources W and X) and b. It inserts each tensor exactly once into either the nodes or leafs array, in an order that respects all data dependencies.