Implementation:Ggml org Ggml Hexagon rope ops
| File Name | src/ggml-hexagon/htp/rope-ops.c |
| Repository | ggml-org/ggml |
| Lines | 480 |
| Language | C |
| Domain Tags | ML_Infrastructure, DSP_Computing, Position_Encoding |
| Status | Active |
| Last Updated | 2025-05-15 12:00 GMT |
| Knowledge Sources | ggml-org/ggml repository |
Overview
rope-ops.c is the DSP-side implementation of Rotary Position Embedding (RoPE) for the Hexagon DSP and its HVX vector unit. It supports both the normal and the NeoX rope modes. RoPE encodes token positions in transformer models such as LLaMA and Mistral, and this implementation also supports extended context lengths through YaRN frequency scaling.
Description
The file defines HTP_ROPE_TYPE_NORMAL (0) and HTP_ROPE_TYPE_NEOX (2) since ggml.h cannot be included in DSP builds. It uses a rope_th_ctx struct holding:
- Frequency parameters -- freq_base, freq_scale, theta_scale
- YaRN parameters -- ext_factor, attn_factor, beta_fast, beta_slow, corr_dims
- Configuration -- n_dims, mode, n_ctx_orig, sections[4]
The rope_cache_init function implements the YaRN (Yet another RoPE extensioN) algorithm for position-dependent frequency scaling, based on the reference implementation. The rope_yarn_ramp function computes interpolation weights between extrapolated and interpolated frequencies for the correction dimension range.
Usage
Dispatched from the DSP-side message loop for GGML_OP_ROPE operations.
Code Reference
Source Location
| Repository | File | Lines |
|---|---|---|
| ggml-org/ggml | src/ggml-hexagon/htp/rope-ops.c | 480 |
Key Signatures
#define HTP_ROPE_TYPE_NORMAL 0
#define HTP_ROPE_TYPE_NEOX   2

struct rope_th_ctx {
    int32_t n_dims;
    int32_t mode;
    int32_t n_ctx_orig;
    int32_t sections[4];
    float freq_base, freq_scale, ext_factor, attn_factor;
    float beta_fast, beta_slow, theta_scale;
    float corr_dims[2];
    struct htp_ops_context * octx;
};

static float rope_yarn_ramp(const float low, const float high, const int i0);

static void rope_cache_init(const float theta_base, const float freq_scale,
                            const float * freq_factors, float * corr_dims, const uint32_t ne0,
                            const float ext_factor, const float mscale, float * cache,
                            const float theta_scale);
I/O Contract
Inputs
- src0 -- Input tensor to apply position encoding to
- Position indices -- Current sequence positions
- Frequency parameters -- Base frequency, scale, and optional frequency factors
- YaRN configuration -- Extension factor, correction dimensions
Outputs
- dst -- Tensor with rotary position encoding applied
Usage Examples
RoPE cache initialization (YaRN algorithm):
// Initialize cos/sin cache for rotary embeddings
rope_cache_init(theta_base, freq_scale, freq_factors, corr_dims,
ne0, ext_factor, mscale, cache, theta_scale);
// cache[i+0] = cos(theta_final) * mscale
// cache[i+1] = sin(theta_final) * mscale
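Once the cache holds the (cos, sin) pairs, each row is rotated pair by pair. The scalar sketch below illustrates how the two modes are generally understood to differ in ggml: normal mode pairs adjacent elements, while NeoX mode pairs element k with element k + n_dims/2. The function and constant names are hypothetical, the mode values 0 and 2 follow the defines listed under Key Signatures, and the file's actual HVX code path is not reproduced here.

```c
/* Hypothetical stand-ins for HTP_ROPE_TYPE_NORMAL / HTP_ROPE_TYPE_NEOX. */
#define SKETCH_ROPE_TYPE_NORMAL 0
#define SKETCH_ROPE_TYPE_NEOX   2

/* Rotate one row of ne0 floats; only the first n_dims elements are rotated. */
static void sketch_rope_row(const float * src, float * dst, const float * cache,
                            int n_dims, int ne0, int mode) {
    for (int i0 = 0; i0 < n_dims; i0 += 2) {
        const float c = cache[i0 + 0]; /* cos(theta) * mscale */
        const float s = cache[i0 + 1]; /* sin(theta) * mscale */

        int a, b;
        if (mode == SKETCH_ROPE_TYPE_NEOX) {
            a = i0 / 2;           /* first half of the rotated block  */
            b = a + n_dims / 2;   /* matching element in second half  */
        } else {
            a = i0;               /* adjacent pair                    */
            b = i0 + 1;
        }

        const float x0 = src[a];
        const float x1 = src[b];
        dst[a] = x0 * c - x1 * s;
        dst[b] = x0 * s + x1 * c;
    }
    /* Elements beyond n_dims pass through unchanged. */
    for (int i = n_dims; i < ne0; i++) {
        dst[i] = src[i];
    }
}
```

Both modes read the same cache layout; only the choice of which two elements form a pair changes.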
Related Pages
Implements Principle
Related Implementations
- Implementation:Ggml_org_Ggml_Hexagon_htp_main -- Message dispatcher
- Implementation:Ggml_org_Ggml_Hexagon_flash_attn -- Flash attention (uses position info)
- Implementation:Ggml_org_Ggml_Hexagon_backend -- Host-side backend