Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:InternLM Lmdeploy BlockIterator

From Leeroopedia


Knowledge Sources
Domains GPU_Kernels, Attention
Last Updated 2026-02-07 15:00 GMT

Overview

Implements a tile-based iterator over paged KV cache blocks, providing SetTile, Advance, and typed pointer access for key, value, and quantization parameter data.

Description

BlockIterator navigates the paged KV cache by converting tile indices to (block_id, block_ti) coordinates via the underlying BlockHead. It supports reverse iteration (tiles are processed from back to front in FlashAttention). BlockIteratorFactory constructs iterators for a given (batch, head) pair, and a specialization of CreateCacheIterFactory bridges the factory to the kernel parameter struct. The convenience alias GetBlockIterFactory assembles the full factory type from template parameters.

Usage

Created by the attention kernel prologue to iterate over KV cache tiles during the mainloop. The factory is constructed on the host side and passed to the kernel.

Code Reference

Source Location

Signature

template<class BlockHead, int CTA_S>
struct BlockIterator {
    BlockHead block_head_;
    char**    block_ptrs_;

    __device__ BlockIterator(BlockHead block_head, char** block_ptrs);
    __device__ void SetTile(int iter);
    __device__ void Advance();

    template<int Index>
    __device__ auto OffsetPtr(int offset) const;
    // Index: 0=k_data, 1=v_data, 2=k_param, 3=v_param
};

template<class T, class Tkv, class BlockLayout_, int CTA_S>
struct BlockIteratorFactory {
    __device__ auto Create(int batch_idx, int head_idx);
};

template<class T, class Tkv, int CTA_S, int HeadDim>
using GetBlockIterFactory = BlockIteratorFactory<T, Tkv, block::Layout<block::Config<T, Tkv, HeadDim>>, CTA_S>;

Import

#include "src/turbomind/kernels/attention/block_iterator.h"

I/O Contract

Inputs

Name Type Required Description
block_head BlockHead Yes Block head accessor for a specific layer/head
block_ptrs char** Yes Per-sequence block pointer array
iter int Yes Tile index to seek to (for SetTile)

Outputs

Name Type Description
OffsetPtr<0> Tkv* Pointer to K data at current tile + offset
OffsetPtr<1> Tkv* Pointer to V data at current tile + offset
OffsetPtr<2> T* Pointer to K quantization params at current tile
OffsetPtr<3> T* Pointer to V quantization params at current tile

Usage Examples

auto factory = GetBlockIterFactory<half, half, 64, 128>{layout, block_ptrs, cu_block_nums, layer_id};
auto iter = factory.Create(batch_idx, kv_head_idx);
iter.SetTile(last_tile);
// In mainloop:
auto k_ptr = iter.OffsetPtr<0>(thread_offset);
iter.Advance();

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment