Principle:LLMBook zh LLMBook zh github io Rotary Position Embedding

From Leeroopedia


Knowledge Sources
Domains: Deep_Learning, Model_Architecture
Last Updated: 2026-02-08 04:29 GMT

Overview

Rotary Position Embedding (RoPE) is a position encoding mechanism that injects positional information by rotating query and key vectors in 2D subspaces, so that attention scores become aware of relative position through dot-product geometry.

Description

Rotary Position Embedding (RoPE) encodes absolute position information into query and key vectors by applying rotation matrices. Each pair of adjacent dimensions in the embedding is treated as a 2D subspace and rotated by an angle proportional to the position index, with a different rotation frequency for each subspace. The key property is that the dot product between a rotated query and a rotated key depends on their positions only through the relative offset between them, so relative position information emerges from an absolute position encoding. RoPE is the position encoding method used in LLaMA and many other modern LLMs, replacing learned absolute embeddings and sinusoidal encodings.

Usage

Use this principle when designing or understanding position-aware attention mechanisms in Transformer models. RoPE is applied to the query and key projections before computing attention scores; the value vectors are left unrotated. It is the standard position encoding for LLaMA, Mistral, Qwen, and other modern decoder-only architectures.

Theoretical Basis

RoPE applies a rotation to each 2D subspace of the query and key vectors:

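$$
\mathrm{RoPE}(x_m, \theta_i) =
\begin{pmatrix}
x_m^{(2i)} \cos(m\theta_i) - x_m^{(2i+1)} \sin(m\theta_i) \\
x_m^{(2i)} \sin(m\theta_i) + x_m^{(2i+1)} \cos(m\theta_i)
\end{pmatrix}
$$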

Where:

  • m is the position index
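  • $\theta_i = 10000^{-2i/d}$ is the rotation frequency for the i-th subspace, where d is the dimension of the vector being rotated and i = 0, ..., d/2 − 1
  • The dot product $\langle \mathrm{RoPE}(q_m), \mathrm{RoPE}(k_n) \rangle$ depends on the positions m and n only through the offset $m - n$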
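The relative-position property can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation: the helper names `rope_frequencies`, `rope`, and `dot`, as well as the example vectors, are illustrative assumptions. It rotates each adjacent pair by the angle m·θ_i and compares two query/key placements that share the same offset.

```python
import math

def rope_frequencies(d, base=10000.0):
    """theta_i = base^(-2i/d) for i = 0 .. d/2 - 1 (hypothetical helper)."""
    return [base ** (-2.0 * i / d) for i in range(d // 2)]

def rope(x, m, base=10000.0):
    """Rotate each adjacent pair (x[2i], x[2i+1]) by the angle m * theta_i."""
    d = len(x)
    out = [0.0] * d
    for i, theta in enumerate(rope_frequencies(d, base)):
        c, s = math.cos(m * theta), math.sin(m * theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * c - b * s
        out[2 * i + 1] = a * s + b * c
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Arbitrary example vectors with d = 8 (illustrative values only)
q = [0.3, -1.2, 0.7, 0.5, -0.4, 0.9, 1.1, -0.6]
k = [1.0, 0.2, -0.8, 0.4, 0.6, -0.3, 0.1, 0.5]

score_a = dot(rope(q, 5), rope(k, 2))      # offset m - n = 3
score_b = dot(rope(q, 103), rope(k, 100))  # same offset, both positions shifted by 98
print(abs(score_a - score_b) < 1e-9)
```

The two scores agree up to floating-point error because composing the transpose of one 2D rotation with another yields a rotation by the difference of their angles, so only m − n survives in the dot product.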

Pseudo-code Logic:

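# Abstract algorithm description (NOT a real implementation)
# cos, sin: tables of cos(m * theta_i) and sin(m * theta_i) for each position m,
# repeated so they cover both halves of the last dimension (size d)
def rotate_half(x):
    x1, x2 = x[..., :d//2], x[..., d//2:]
    return concat(-x2, x1)  # pairs dimension j with dimension j + d//2
q_embed = q * cos + rotate_half(q) * sin
k_embed = k * cos + rotate_half(k) * sin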
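The rotate_half formulation above can be made concrete for a single vector. This is a hedged sketch in plain Python using lists instead of tensors; `cos_sin_tables` and `apply_rope` are hypothetical helper names, and the example vectors are arbitrary. Note that this layout pairs dimension j with dimension j + d/2 rather than adjacent dimensions, which differs from the adjacent-pair formula only by a fixed permutation of the dimensions.

```python
import math

def rotate_half(x):
    """Map (x1, x2) to (-x2, x1), where x1 and x2 are the two halves of x."""
    h = len(x) // 2
    return [-v for v in x[h:]] + x[:h]

def cos_sin_tables(m, d, base=10000.0):
    """cos(m * theta_i) and sin(m * theta_i), each repeated to cover both halves."""
    angles = [m * base ** (-2.0 * i / d) for i in range(d // 2)]
    cos = [math.cos(a) for a in angles] * 2
    sin = [math.sin(a) for a in angles] * 2
    return cos, sin

def apply_rope(x, m, base=10000.0):
    """x_embed = x * cos + rotate_half(x) * sin, elementwise, at position m."""
    cos, sin = cos_sin_tables(m, len(x), base)
    rot = rotate_half(x)
    return [xi * c + ri * s for xi, c, ri, s in zip(x, cos, rot, sin)]

# Arbitrary example vectors with d = 8 (illustrative values only)
q = [0.3, -1.2, 0.7, 0.5, -0.4, 0.9, 1.1, -0.6]
k = [1.0, 0.2, -0.8, 0.4, 0.6, -0.3, 0.1, 0.5]

q_embed = apply_rope(q, 5)  # query at position 5
k_embed = apply_rope(k, 2)  # key at position 2
score = sum(a * b for a, b in zip(q_embed, k_embed))
print(score)
```

In a batched implementation the cos and sin tables would typically be precomputed once for all positions and broadcast across heads; the per-call computation here is only for readability.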
