Implementation: ggml_build_forward_expand (ggml-org/ggml)
| Knowledge Sources | |
|---|---|
| Domains | ML_Infrastructure, Tensor_Computing |
| Last Updated | 2025-05-15 12:00 GMT |
Overview
Concrete function for building a computation graph (DAG) from an output tensor by recursively walking its dependencies.
Description
ggml_build_forward_expand is a C function in the GGML tensor library that takes a computation graph and an output tensor, then performs a recursive dependency walk from that tensor back through all of its sources. Every tensor encountered during the walk is classified as either a node (a tensor produced by an operation such as GGML_OP_MUL_MAT or GGML_OP_ADD) or a leaf (a raw input or weight tensor with no operation). The walk populates the graph's internal nodes and leafs arrays in topologically-sorted order, producing a directed acyclic graph (DAG) that is ready for execution by any GGML backend.
The function is designed to be called multiple times on the same graph with different output tensors, allowing users to incrementally build complex, multi-output graphs.
Usage
Call this function after composing tensor operations with helpers like ggml_mul_mat and ggml_add, passing the final result tensor. The populated graph can then be dispatched to a backend via ggml_backend_graph_compute or similar routines.
Code Reference
Source Location
- Repository: GGML
- File: src/ggml.c
- Lines: 6830-6832
Signature
void ggml_build_forward_expand(struct ggml_cgraph * cgraph, struct ggml_tensor * tensor);
Key Operations Used to Build the Graph
The following operation helpers create tensor nodes that ggml_build_forward_expand later collects into the DAG:
| Operation | Signature | File | Lines |
|---|---|---|---|
| ggml_mul_mat | struct ggml_tensor * ggml_mul_mat(struct ggml_context * ctx, struct ggml_tensor * a, struct ggml_tensor * b); | src/ggml.c | 3176-3191 |
| ggml_add | struct ggml_tensor * ggml_add(struct ggml_context * ctx, struct ggml_tensor * a, struct ggml_tensor * b); | src/ggml.c | 1971-1976 |
Dependencies
- ggml.h — Public API header providing all type definitions and function declarations.
Import
#include "ggml.h"
Language
- C
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| cgraph | struct ggml_cgraph * | Yes | The computation graph to populate. Must have been previously allocated (e.g., via ggml_new_graph). |
| tensor | struct ggml_tensor * | Yes | The output tensor whose full dependency tree will be walked and added to the graph. |
Outputs
| Name | Type | Description |
|---|---|---|
| (none — mutates cgraph) | void | The function populates cgraph->nodes with operation tensors and cgraph->leafs with input/weight tensors, both in topologically-sorted order. |
Usage Examples
The following example builds a simple matmul + bias add computation graph — one of the most common patterns in neural-network inference — and then executes it.
#include "ggml.h"
#include "ggml-backend.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
// -----------------------------------------------------------
// 1. Initialize a GGML context with enough memory for tensors
// -----------------------------------------------------------
struct ggml_init_params params = {
.mem_size = 64 * 1024 * 1024, // 64 MiB workspace
.mem_buffer = NULL, // let GGML allocate
.no_alloc = false,
};
struct ggml_context * ctx = ggml_init(params);
if (ctx == NULL) {
fprintf(stderr, "ggml_init failed\n");
return 1;
}
// -----------------------------------------------------------
// 2. Create input and weight tensors (leafs in the graph)
// -----------------------------------------------------------
// Input matrix X: 4 rows x 3 columns
struct ggml_tensor * X = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 3, 4);
ggml_set_name(X, "X");
// Weight matrix W: 3 columns x 2 output features
struct ggml_tensor * W = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 3, 2);
ggml_set_name(W, "W");
// Bias vector b: 2 elements (one per output feature)
struct ggml_tensor * b = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 2);
ggml_set_name(b, "b");
// -----------------------------------------------------------
// 3. Fill tensors with sample data
// -----------------------------------------------------------
float * X_data = (float *) X->data;
for (int i = 0; i < 4 * 3; i++) { X_data[i] = (float)(i + 1); }
float * W_data = (float *) W->data;
for (int i = 0; i < 2 * 3; i++) { W_data[i] = 0.5f; }
float * b_data = (float *) b->data;
b_data[0] = 1.0f;
b_data[1] = 2.0f;
// -----------------------------------------------------------
// 4. Define operations — this does NOT execute anything yet.
// Each call returns a new tensor node with an operation type
// and pointers back to its source tensors.
// -----------------------------------------------------------
// Matrix multiply: result = X @ W^T (shape 4x2)
// Internally tagged as GGML_OP_MUL_MAT
struct ggml_tensor * mul_result = ggml_mul_mat(ctx, W, X);
ggml_set_name(mul_result, "XW");
// Element-wise bias addition: output = result + b
// Internally tagged as GGML_OP_ADD
struct ggml_tensor * output = ggml_add(ctx, mul_result, b);
ggml_set_name(output, "output");
// -----------------------------------------------------------
// 5. Build the computation graph (DAG)
// ggml_build_forward_expand walks backwards from 'output',
// discovers mul_result -> {W, X} and output -> {mul_result, b},
// and records everything in topological order.
// -----------------------------------------------------------
struct ggml_cgraph * graph = ggml_new_graph(ctx);
ggml_build_forward_expand(graph, output);
// At this point the graph contains:
// leafs: [X, W, b] (input/weight tensors, GGML_OP_NONE)
// nodes: [mul_result, output] (operation tensors, in dependency order)
printf("Graph built: %d nodes, %d leafs\n",
graph->n_nodes, graph->n_leafs);
// -----------------------------------------------------------
// 6. Execute the graph on the CPU backend
// -----------------------------------------------------------
ggml_graph_compute_with_ctx(ctx, graph, /* n_threads = */ 4);
// -----------------------------------------------------------
// 7. Read back the results
// -----------------------------------------------------------
float * out_data = (float *) output->data;
printf("Output (4x2):\n");
for (int row = 0; row < 4; row++) {
for (int col = 0; col < 2; col++) {
printf(" %8.2f", out_data[row * 2 + col]);
}
printf("\n");
}
// -----------------------------------------------------------
// 8. Cleanup
// -----------------------------------------------------------
ggml_free(ctx);
return 0;
}
What the graph looks like:
    X (leaf)     W (leaf)      b (leaf)
        \         /               |
         \       /                |
        mul_result                |
     [GGML_OP_MUL_MAT]            |
               \                  |
                \                 /
                 \               /
                  output (node)
                 [GGML_OP_ADD]
The call to ggml_build_forward_expand(graph, output) walks this tree from output upward, visiting mul_result (and its sources W and X) and b. It inserts each tensor exactly once into either the nodes or leafs array, in an order that respects all data dependencies.