Implementation:LLMBook zh LLMBook zh github io Apply Rotary Pos Emb

Knowledge Sources	LLMBook-zh; RoFormer: Enhanced Transformer with Rotary Position Embedding
Domains	Deep_Learning, Model_Architecture
Last Updated	2026-02-08 04:29 GMT

Overview

Standalone functions for applying Rotary Position Embeddings (RoPE) to query and key tensors.

Description

This implementation provides two functions: `rotate_half` and `apply_rotary_pos_emb`. The `rotate_half` helper splits the last dimension of a tensor into two halves [x1, x2] and returns [-x2, x1], which is the rotation step of RoPE. The `apply_rotary_pos_emb` function takes precomputed cosine and sine tables, indexes them by position, and applies them to the query and key tensors using the RoPE formula `x * cos + rotate_half(x) * sin`. These functions are used inside the attention mechanism of LLaMA-style models to inject position information into the attention computation.
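As a small illustration of the rotation step described above, the sketch below applies a `rotate_half` written to match that description to a four-element vector. The function body is an assumption reconstructed from the description, not a verbatim copy of the repository file. Element i is paired with element i + d/2, and each pair (a, b) becomes (-b, a), which is a 90-degree rotation of that pair.

```python
import torch

def rotate_half(x):
    # Split the last dimension into two equal halves x1 and x2.
    half = x.shape[-1] // 2
    x1, x2 = x[..., :half], x[..., half:]
    # Return [-x2, x1]: the second half is negated and moved to the front.
    return torch.cat((-x2, x1), dim=-1)

x = torch.tensor([1.0, 2.0, 3.0, 4.0])
rotated = rotate_half(x)
print(rotated.tolist())  # [-3.0, -4.0, 1.0, 2.0]
```

Applying the helper twice yields `-x`, as expected for two successive 90-degree rotations.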

Usage

Import these functions when implementing or studying the attention mechanism of LLaMA-family models. They are called after the Q/K projections and before the attention score computation. The cos/sin tables are precomputed from the maximum sequence length and the per-head dimension (head_dim), not the full hidden dimension.
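The precomputation of the cos/sin tables is not part of the two functions on this page. One way it could be done is sketched below. The helper name `build_rope_cache` is hypothetical, and the frequency base of 10000 is an assumption following the choice made in the RoFormer paper. The angles are duplicated across the two halves of the last dimension so that they line up with the pairing used by `rotate_half`.

```python
import torch

def build_rope_cache(max_seq_len, head_dim, base=10000.0):
    # Hypothetical helper; base=10000 is an assumption (RoFormer's choice).
    # One inverse frequency per 2-D subspace: base^(-2i/d), i = 0 .. d/2 - 1.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    positions = torch.arange(max_seq_len).float()
    # Angle m * theta_i for each position m and frequency i: (max_seq_len, head_dim / 2).
    angles = torch.outer(positions, inv_freq)
    # Duplicate so columns i and i + head_dim / 2 share the same angle.
    emb = torch.cat((angles, angles), dim=-1)
    return emb.cos(), emb.sin()

cos, sin = build_rope_cache(max_seq_len=512, head_dim=64)
print(cos.shape, sin.shape)  # torch.Size([512, 64]) torch.Size([512, 64])
```

The resulting tables have the (max_seq_len, head_dim) shape that `apply_rotary_pos_emb` expects for its `cos` and `sin` arguments.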

Code Reference

Source Location

Repository: LLMBook-zh
File: code/5.2 RoPE.py
Lines: 1-14

Signature

def rotate_half(x):
    """
    Splits the input tensor into two halves [x1, x2] along the last dimension
    and swaps them, negating the original second half.

    Args:
        x: Input tensor of shape (..., d).
    Returns:
        Rotated tensor of shape (..., d) where [-x2, x1] replaces [x1, x2].
    """

def apply_rotary_pos_emb(q, k, cos, sin, position_ids):
    """
    Applies rotary position embeddings to query and key tensors.

    Args:
        q: Query tensor of shape (batch, heads, seq_len, head_dim).
        k: Key tensor of shape (batch, heads, seq_len, head_dim).
        cos: Cosine values of shape (max_seq_len, head_dim).
        sin: Sine values of shape (max_seq_len, head_dim).
        position_ids: Position indices of shape (batch, seq_len).
    Returns:
        Tuple of (q_embed, k_embed) with position information encoded.
    """
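The signatures above omit the function bodies. A minimal sketch consistent with the documented shapes and with the common LLaMA-style formulation is given below; it is an assumption about how the bodies could look, not a verbatim copy of the repository file. The tables are indexed by `position_ids` to get per-token rows, and a heads axis is inserted so they broadcast against (batch, heads, seq_len, head_dim).

```python
import torch

def rotate_half(x):
    half = x.shape[-1] // 2
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat((-x2, x1), dim=-1)

def apply_rotary_pos_emb(q, k, cos, sin, position_ids):
    # Gather rows per token: (batch, seq_len, head_dim),
    # then add a heads axis: (batch, 1, seq_len, head_dim).
    cos = cos[position_ids].unsqueeze(1)
    sin = sin[position_ids].unsqueeze(1)
    # Rotate each 2-D subspace by its position-dependent angle.
    q_embed = (q * cos) + (rotate_half(q) * sin)
    k_embed = (k * cos) + (rotate_half(k) * sin)
    return q_embed, k_embed

# Sanity check with a zero-angle table (cos = 1, sin = 0): inputs pass through unchanged.
q = torch.randn(2, 4, 8, 16)
k = torch.randn(2, 4, 8, 16)
cos = torch.ones(32, 16)
sin = torch.zeros(32, 16)
position_ids = torch.arange(8).unsqueeze(0).expand(2, -1)
q_embed, k_embed = apply_rotary_pos_emb(q, k, cos, sin, position_ids)
```

With a zero rotation angle the outputs equal the inputs, which is a quick way to confirm that the broadcasting is wired up correctly.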

Import

import torch
# Functions defined locally in code/5.2 RoPE.py

I/O Contract

Inputs

Name	Type	Required	Description
q	torch.Tensor	Yes	Query tensor (batch, heads, seq_len, head_dim)
k	torch.Tensor	Yes	Key tensor (batch, heads, seq_len, head_dim)
cos	torch.Tensor	Yes	Precomputed cosine values (max_seq_len, head_dim)
sin	torch.Tensor	Yes	Precomputed sine values (max_seq_len, head_dim)
position_ids	torch.Tensor	Yes	Position indices (batch, seq_len)

Outputs

Name	Type	Description
q_embed	torch.Tensor	Query with rotary position encoding applied
k_embed	torch.Tensor	Key with rotary position encoding applied

Usage Examples

import torch

# Example: apply RoPE to query and key projections
# (assumes apply_rotary_pos_emb from code/5.2 RoPE.py is already defined in scope)
batch, heads, seq_len, head_dim = 2, 32, 128, 64
q = torch.randn(batch, heads, seq_len, head_dim)
k = torch.randn(batch, heads, seq_len, head_dim)

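# Placeholder cos/sin tables: random values are used here only to show the shapes.
# In practice these are precomputed from the RoPE frequency table.
cos = torch.randn(512, head_dim)  # max_seq_len=512
sin = torch.randn(512, head_dim)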
position_ids = torch.arange(seq_len).unsqueeze(0).expand(batch, -1)

q_embed, k_embed = apply_rotary_pos_emb(q, k, cos, sin, position_ids)
# q_embed.shape == (2, 32, 128, 64)
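The example above only demonstrates shapes. To see what the functions achieve with real tables, the sketch below checks two properties that follow from RoPE being a rotation: vector norms are preserved, and the query-key dot product depends only on the offset between positions. The function bodies and the frequency base of 10000 are assumptions (a common LLaMA-style formulation), not taken verbatim from the repository.

```python
import torch

def rotate_half(x):
    half = x.shape[-1] // 2
    return torch.cat((-x[..., half:], x[..., :half]), dim=-1)

def apply_rotary_pos_emb(q, k, cos, sin, position_ids):
    cos = cos[position_ids].unsqueeze(1)
    sin = sin[position_ids].unsqueeze(1)
    return (q * cos) + (rotate_half(q) * sin), (k * cos) + (rotate_half(k) * sin)

torch.manual_seed(0)
head_dim, max_seq_len, seq_len = 64, 512, 16

# Real cos/sin tables; base 10000 is an assumption.
inv_freq = 1.0 / (10000.0 ** (torch.arange(0, head_dim, 2).float() / head_dim))
angles = torch.outer(torch.arange(max_seq_len).float(), inv_freq)
emb = torch.cat((angles, angles), dim=-1)
cos, sin = emb.cos(), emb.sin()

# Place the same content vector at every position so only position varies.
q = torch.randn(head_dim).expand(1, 1, seq_len, head_dim)
k = torch.randn(head_dim).expand(1, 1, seq_len, head_dim)
position_ids = torch.arange(seq_len).unsqueeze(0)
q_embed, k_embed = apply_rotary_pos_emb(q, k, cos, sin, position_ids)

# scores[m, n] is the dot product of the query at position m and the key at position n.
scores = q_embed[0, 0] @ k_embed[0, 0].T
same_offset = torch.allclose(scores[3, 1], scores[10, 8], atol=1e-4)  # both have m - n = 2
norms_kept = torch.allclose(q_embed.norm(dim=-1), q.norm(dim=-1), atol=1e-4)
print(same_offset, norms_kept)
```

Because the content vectors are identical at every position, any difference between entries of `scores` comes from position alone, so matching entries along a diagonal reflect the relative-position behaviour.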

Related Pages

Environment:LLMBook_zh_LLMBook_zh_github_io_PyTorch_CUDA_GPU_Environment
