Implementation:Ollama Ollama Llama Model RWKV6
| Knowledge Sources | |
|---|---|
| Domains | LLM Inference, Model Architecture |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Implements the base class for RWKV6-family model graph builders, providing the shared channel mix and time mix layer implementations.
Description
The llm_build_rwkv6_base class provides two key methods. build_rwkv6_channel_mix constructs the channel mixing layer: it applies a learned interpolation (lerp) between the current and previous tokens, passes the key branch through a squared ReLU activation, and gates the output with the receptance. build_rwkv6_time_mix constructs the more complex time mixing layer, which uses a 5-way learned interpolation (for the w, k, v, r, and g parameters), WKV attention with recurrent state via ggml_rwkv_wkv6, and group normalization. Together, these steps implement the RWKV6 linear attention mechanism.
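The channel mix described above can be sketched for a single token in plain C++, without ggml. This is a minimal illustration, not the code in rwkv6-base.cpp: the ChannelMixWeights struct, its field names, the matvec helper, and the use of a sigmoid for the receptance gate are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>;  // row-major: Mat[row][col]

// Hypothetical helper: dense matrix-vector product y = W * x.
static Vec matvec(const Mat & W, const Vec & x) {
    Vec y(W.size(), 0.0f);
    for (std::size_t i = 0; i < W.size(); ++i) {
        assert(W[i].size() == x.size());
        for (std::size_t j = 0; j < x.size(); ++j) {
            y[i] += W[i][j] * x[j];
        }
    }
    return y;
}

// Hypothetical stand-in for the channel-mix weights of one layer.
struct ChannelMixWeights {
    Vec lerp_k;      // per-channel mix coefficient for the key branch
    Vec lerp_r;      // per-channel mix coefficient for the receptance branch
    Mat key;         // d_ff x d
    Mat value;       // d x d_ff
    Mat receptance;  // d x d
};

// Channel mix for one token: lerp, squared ReLU on the key branch,
// then gate the value projection with the (assumed sigmoid) receptance.
Vec channel_mix(const ChannelMixWeights & w, const Vec & cur, const Vec & x_prev) {
    const std::size_t d = cur.size();
    assert(x_prev.size() == d && w.lerp_k.size() == d && w.lerp_r.size() == d);

    Vec xk(d), xr(d);
    for (std::size_t i = 0; i < d; ++i) {
        const float sx = x_prev[i] - cur[i];  // difference to the previous token
        xk[i] = cur[i] + sx * w.lerp_k[i];
        xr[i] = cur[i] + sx * w.lerp_r[i];
    }

    Vec r = matvec(w.receptance, xr);
    for (float & x : r) {
        x = 1.0f / (1.0f + std::exp(-x));     // sigmoid gate
    }

    Vec k = matvec(w.key, xk);
    for (float & x : k) {
        const float relu = std::max(x, 0.0f);
        x = relu * relu;                      // squared ReLU
    }

    Vec out = matvec(w.value, k);
    assert(out.size() == r.size());
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] *= r[i];                       // receptance-gated output
    }
    return out;
}
```

A lerp coefficient of 0 keeps only the current token and a coefficient of 1 keeps only the previous one, so each channel learns how much of the previous token to blend in.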
Usage
Provides the shared base layer implementations used by all RWKV6-variant models. This lets Ollama run RWKV6 recurrent models, which use a linear-attention mechanism with a recurrent state in place of standard transformer attention.
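To make the recurrent, linear-attention idea concrete, the sketch below performs one WKV6 step for a single attention head in plain C++. It is an illustration under stated assumptions rather than the ggml kernel: the function name wkv6_step and its parameter layout are hypothetical, w is assumed to be a per-channel decay already mapped into (0, 1), and u is assumed to be the per-channel bonus applied to the current token.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: one WKV6 step for a single head of size n.
// `state` is an n x n row-major matrix that carries information
// across timesteps; it is updated in place.
std::vector<float> wkv6_step(const std::vector<float> & r,
                             const std::vector<float> & k,
                             const std::vector<float> & v,
                             const std::vector<float> & w,  // assumed decay in (0, 1)
                             const std::vector<float> & u,  // assumed current-token bonus
                             std::vector<float> & state) {
    const std::size_t n = r.size();
    assert(k.size() == n && v.size() == n && w.size() == n && u.size() == n);
    assert(state.size() == n * n);

    std::vector<float> out(n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            const float kv = k[i] * v[j];      // outer product k^T v
            float & s = state[i * n + j];
            out[j] += r[i] * (u[i] * kv + s);  // read: bonus-weighted kv plus past state
            s = s * w[i] + kv;                 // write: decay old state, add new kv
        }
    }
    return out;
}
```

Each step costs O(n^2) per head regardless of how many tokens came before, because the history is folded into the fixed-size state rather than kept as a growing key/value cache.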
Code Reference
Source Location
- Repository: Ollama
- File: llama/llama.cpp/src/models/rwkv6-base.cpp
- Lines: 1-162
Signature
// Constructor: forwards params to llm_graph_context and stores the model reference.
llm_build_rwkv6_base::llm_build_rwkv6_base(
    const llama_model & model,
    const llm_graph_params & params) : llm_graph_context(params), model(model) {}

// Channel mixing layer.
ggml_tensor * llm_build_rwkv6_base::build_rwkv6_channel_mix(
    const llama_layer * layer,
    ggml_tensor * cur,
    ggml_tensor * x_prev,
    llm_arch arch) const;

// Time mixing layer.
ggml_tensor * llm_build_rwkv6_base::build_rwkv6_time_mix(
    llm_graph_input_rs * inp,
    ggml_tensor * cur,
    ggml_tensor * x_prev,
    const llama_ubatch & ubatch,
    int il) const;
Import
#include "models.h"
I/O Contract
Inputs
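| Name | Type | Required | Description |
|---|---|---|---|
| layer | const llama_layer * | Yes (channel mix) | Current layer with RWKV6 weights |
| cur | ggml_tensor * | Yes | Current hidden state |
| x_prev | ggml_tensor * | Yes | Previous timestep hidden state |
| arch | llm_arch | Yes (channel mix) | Model architecture identifier |
| inp | llm_graph_input_rs * | Yes (time mix) | Recurrent-state graph input |
| ubatch | const llama_ubatch & | Yes (time mix) | Current micro-batch |
| il | int | Yes (time mix) | Layer index |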
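
The relationship between cur and x_prev can be illustrated with a small token-shift sketch: for each token, x_prev is the hidden state of the token before it, and the first token of a batch uses a state carried over from the previous batch. This is an assumption-labeled illustration of the idea, not the graph code itself; the token_shift function and the carry parameter are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;

// Hypothetical helper: build x_prev for every token in a batch.
// `carry` holds the last hidden state of the previous batch on entry
// and the last hidden state of this batch on return.
std::vector<Vec> token_shift(const std::vector<Vec> & xs, Vec & carry) {
    std::vector<Vec> x_prev;
    x_prev.reserve(xs.size());
    for (const Vec & x : xs) {
        assert(x.size() == carry.size());
        x_prev.push_back(carry);  // previous token (or carried state for the first one)
        carry = x;                // current token becomes "previous" for the next step
    }
    return x_prev;
}
```

For a batch of three tokens x0, x1, x2 and an initial carry c, this yields x_prev = [c, x0, x1] and leaves x2 in carry for the next batch.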
Outputs
| Name | Type | Description |
|---|---|---|
| (return value) | ggml_tensor * | Output tensor from channel or time mixing |
Usage Examples
// Used as a base class by RWKV6 model builders:
struct llm_build_rwkv6 : public llm_build_rwkv6_base {
    llm_build_rwkv6(const llama_model & model, const llm_graph_params & params);
};
// The channel mix and time mix methods are called while the model graph is built.