Implementation:Ollama Ollama Llama Model Grok
| Knowledge Sources | |
|---|---|
| Domains | LLM Inference, Model Architecture |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Implements the ggml computation graph builder for the Grok model architecture from xAI.
Description
The llm_build_grok constructor builds a transformer graph with RoPE-based positional encoding, RMS-normalized self-attention with Q/K/V projections and optional biases, and Mixture-of-Experts (MoE) feed-forward layers in every transformer block. It produces the final logits through an output projection for autoregressive generation.
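To make the two less familiar building blocks in that description concrete, the following is a minimal standalone sketch of RMS normalization and top-k expert routing on plain `std::vector<float>` data. It is not taken from grok.cpp and does not use ggml; the helper names `rms_norm` and `route_top_k`, the default `eps` value, and the choice to leave the selected gate weights unrenormalized are all assumptions made for illustration, and the actual graph builder may differ on each point.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper (not part of llama.cpp): scale x so that its
// root-mean-square is approximately 1. Learned per-channel weights are omitted.
std::vector<float> rms_norm(const std::vector<float> & x, float eps = 1e-5f) {
    assert(!x.empty());
    float sum_sq = 0.0f;
    for (float v : x) {
        sum_sq += v * v;
    }
    const float mean_sq = sum_sq / static_cast<float>(x.size());
    const float scale   = 1.0f / std::sqrt(mean_sq + eps);

    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = x[i] * scale;
    }
    return y;
}

// Hypothetical helper: softmax over the router logits, then keep the k
// experts with the highest probability. Returns (expert index, gate weight)
// pairs sorted by descending weight. The weights are NOT renormalized here.
std::vector<std::pair<int, float>> route_top_k(const std::vector<float> & logits, int k) {
    assert(k > 0 && static_cast<std::size_t>(k) <= logits.size());

    // Numerically stable softmax: subtract the maximum logit first.
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<float> probs(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit);
        sum += probs[i];
    }
    for (float & p : probs) {
        p /= sum;
    }

    // Order expert indices so the first k have the largest probabilities.
    std::vector<int> idx(logits.size());
    std::iota(idx.begin(), idx.end(), 0);
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
                      [&probs](int a, int b) { return probs[a] > probs[b]; });

    std::vector<std::pair<int, float>> selected;
    selected.reserve(static_cast<std::size_t>(k));
    for (int i = 0; i < k; ++i) {
        selected.emplace_back(idx[i], probs[idx[i]]);
    }
    return selected;
}
```

In an MoE feed-forward layer, each token's hidden state would be passed through only the selected experts, and the expert outputs would be combined using the returned gate weights. In a graph builder these steps are expressed as tensor operations over the whole batch rather than per-token loops as shown here.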
Usage
Enables Ollama to run xAI Grok models through the llama.cpp inference engine by defining how the model's MoE architecture maps to ggml tensor operations.
Code Reference
Source Location
- Repository: Ollama
- File: llama/llama.cpp/src/models/grok.cpp
- Lines: 1-159
Signature
llm_build_grok::llm_build_grok(
const llama_model & model,
const llm_graph_params & params) : llm_graph_context(params);
Import
#include "models.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | const llama_model & | Yes | Loaded model with Grok MoE weights |
| params | const llm_graph_params & | Yes | Graph construction parameters |
Outputs
| Name | Type | Description |
|---|---|---|
| ggml graph | ggml_cgraph | Complete Grok MoE computation graph |
Usage Examples
auto builder = llm_build_grok(model, params);