Implementation:Ollama Ollama Llama Model RWKV6

Knowledge Sources	Ollama
Domains	LLM Inference, Model Architecture
Last Updated	2025-02-15 00:00 GMT

Overview

Implements the base class for RWKV6-family model graph builders, providing the shared channel mix and time mix layer implementations.

Description

The llm_build_rwkv6_base class provides two key methods. build_rwkv6_channel_mix constructs the channel mixing layer: it applies a learned interpolation (lerp) between the current and previous tokens, a squared ReLU activation on the key projection, and a receptance gate on the output. build_rwkv6_time_mix constructs the more complex time mixing layer: it applies a 5-way learned interpolation (for the w, k, v, r, and g parameters), WKV attention with recurrent state via ggml_rwkv_wkv6, and group normalization. Together these implement the RWKV6 linear attention mechanism.
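
The channel mix described above can be sketched on plain vectors, without ggml. This is a minimal illustration, not the code in rwkv6-base.cpp: the ChannelMixWeights struct, the helper names (matvec, lerp_shift, channel_mix), and the dense row-major matrices are all assumptions made for the example. It shows the lerp between cur and x_prev, the squared ReLU on the key projection, and the sigmoid receptance gate on the output.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>;  // row-major: Mat[row][col]

// y = W * x for a small dense matrix (stand-in for a ggml matrix multiply).
static Vec matvec(const Mat & W, const Vec & x) {
    Vec y(W.size(), 0.0f);
    for (std::size_t i = 0; i < W.size(); ++i) {
        for (std::size_t j = 0; j < x.size(); ++j) {
            y[i] += W[i][j] * x[j];
        }
    }
    return y;
}

// Learned interpolation: cur + (x_prev - cur) * mu, element-wise.
static Vec lerp_shift(const Vec & cur, const Vec & x_prev, const Vec & mu) {
    Vec out(cur.size());
    for (std::size_t i = 0; i < cur.size(); ++i) {
        out[i] = cur[i] + (x_prev[i] - cur[i]) * mu[i];
    }
    return out;
}

// Hypothetical weight bundle for this sketch (not the llama_layer fields).
struct ChannelMixWeights {
    Vec mu_k;  // lerp coefficients for the key input
    Vec mu_r;  // lerp coefficients for the receptance input
    Mat W_k;   // key projection
    Mat W_v;   // value projection
    Mat W_r;   // receptance projection
};

static Vec channel_mix(const ChannelMixWeights & w, const Vec & cur, const Vec & x_prev) {
    const Vec xk = lerp_shift(cur, x_prev, w.mu_k);
    const Vec xr = lerp_shift(cur, x_prev, w.mu_r);

    // Key projection followed by squared ReLU.
    Vec k = matvec(w.W_k, xk);
    for (float & e : k) {
        const float relu = e > 0.0f ? e : 0.0f;
        e = relu * relu;
    }

    // Receptance gate: sigmoid(W_r * xr) scales the value projection.
    const Vec r   = matvec(w.W_r, xr);
    Vec       out = matvec(w.W_v, k);
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] *= 1.0f / (1.0f + std::exp(-r[i]));
    }
    return out;
}
```

With mu set to 0 the interpolation returns the current token unchanged, and with mu set to 1 it returns the previous token, which is why a single learned vector per projection is enough to blend the two.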

Usage

Provides the shared base layer implementations used by all RWKV6-variant models, enabling Ollama to run RWKV6 recurrent models with their unique linear-attention mechanism instead of standard transformer attention.
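
To make the contrast with standard attention concrete, the following sketch runs one step of a WKV6-style recurrence for a single head on plain vectors. It is an illustration under stated assumptions, not the ggml_rwkv_wkv6 kernel itself: the function name wkv6_step and the vector-of-vectors state are hypothetical, w is assumed to be a per-channel decay factor already mapped into (0, 1), and u is assumed to be the per-channel bonus applied to the current token. The point is that each token only reads and updates a fixed-size state, rather than attending over all earlier tokens.

```cpp
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>;  // state[i][j], head_size x head_size

// One recurrence step for a single head.
// r, k, v: receptance, key, value for the current token.
// w: per-channel decay (assumed already in (0, 1)).
// u: per-channel bonus for the current token.
// state is updated in place and carried to the next token.
static Vec wkv6_step(Mat & state, const Vec & r, const Vec & k, const Vec & v,
                     const Vec & w, const Vec & u) {
    const std::size_t n = r.size();
    Vec y(n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            const float kv = k[i] * v[j];
            // Output mixes the current token (weighted by u) with the carried state.
            y[j] += r[i] * (u[i] * kv + state[i][j]);
            // Decay the old state, then accumulate the current key/value product.
            state[i][j] = state[i][j] * w[i] + kv;
        }
    }
    return y;
}
```

Because the state has a fixed size per head, the cost of processing a token in this sketch does not grow with the number of tokens seen so far.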

Code Reference

Source Location

Repository: Ollama
File: llama/llama.cpp/src/models/rwkv6-base.cpp
Lines: 1-162

Signature

llm_build_rwkv6_base::llm_build_rwkv6_base(
    const llama_model & model,
    const llm_graph_params & params) : llm_graph_context(params), model(model) {}

ggml_tensor * llm_build_rwkv6_base::build_rwkv6_channel_mix(
    const llama_layer * layer,
    ggml_tensor * cur,
    ggml_tensor * x_prev,
    llm_arch arch) const;

ggml_tensor * llm_build_rwkv6_base::build_rwkv6_time_mix(
    llm_graph_input_rs * inp,
    ggml_tensor * cur,
    ggml_tensor * x_prev,
    const llama_ubatch & ubatch,
    int il) const;

Import

#include "models.h"

I/O Contract

Inputs

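Name	Type	Required	Description
layer	const llama_layer *	Yes	Current layer with RWKV6 weights (channel mix)
arch	llm_arch	Yes	Model architecture identifier (channel mix)
inp	llm_graph_input_rs *	Yes	Recurrent-state graph input (time mix)
cur	ggml_tensor *	Yes	Current hidden state (both methods)
x_prev	ggml_tensor *	Yes	Previous timestep hidden state (both methods)
ubatch	const llama_ubatch &	Yes	Current micro-batch (time mix)
il	int	Yes	Layer index (time mix)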
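
The relationship between cur and x_prev can be illustrated with a small token-shift sketch. This is an assumption-laden example, not code from the repository: the function name token_shift and the carry parameter are hypothetical, and the real graph builders produce x_prev as a ggml tensor. The sketch only shows the idea that row t of x_prev holds the hidden state of token t-1, with the first row taken from state carried over from earlier tokens.

```cpp
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;

// Builds x_prev for a run of tokens.
// hidden[t] is the hidden state of token t.
// carry holds the hidden state of the last token seen before this run;
// it is updated in place so the next run can continue from it.
static std::vector<Vec> token_shift(const std::vector<Vec> & hidden, Vec & carry) {
    std::vector<Vec> x_prev;
    x_prev.reserve(hidden.size());
    for (const Vec & h : hidden) {
        x_prev.push_back(carry);  // token t sees the state of token t-1
        carry = h;                // current token becomes "previous" for the next
    }
    return x_prev;
}
```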

Outputs

Name	Type	Description
(return value)	ggml_tensor *	Output tensor from channel mix or time mix

Usage Examples

// Used as a base class by RWKV6 model builders:
struct llm_build_rwkv6 : public llm_build_rwkv6_base {
    llm_build_rwkv6(const llama_model & model, const llm_graph_params & params);
};
// Channel mix and time mix methods are called within the model graph build

Related Pages

Principle:Ollama_Ollama_LLM_Inference_Pipeline
