Implementation:InternLM Lmdeploy BlockIterator
| Knowledge Sources | |
|---|---|
| Domains | GPU_Kernels, Attention |
| Last Updated | 2026-02-07 15:00 GMT |
Overview
Implements a tile-based iterator over paged KV cache blocks, providing SetTile, Advance, and typed pointer access for key, value, and quantization parameter data.
Description
BlockIterator navigates the paged KV cache by converting tile indices to (block_id, block_ti) coordinates via the underlying BlockHead. It supports reverse iteration (tiles are processed from back to front in FlashAttention). BlockIteratorFactory constructs iterators for a given (batch, head) pair, and a specialization of CreateCacheIterFactory bridges the factory to the kernel parameter struct. The convenience alias GetBlockIterFactory assembles the full factory type from template parameters.
Usage
Created by the attention kernel prologue to iterate over KV cache tiles during the mainloop. The factory is constructed on the host side and passed to the kernel.
Code Reference
Source Location
- Repository: InternLM_Lmdeploy
- File: src/turbomind/kernels/attention/block_iterator.h
- Lines: 1-102
Signature
template<class BlockHead, int CTA_S>
struct BlockIterator {
BlockHead block_head_;
char** block_ptrs_;
__device__ BlockIterator(BlockHead block_head, char** block_ptrs);
__device__ void SetTile(int iter);
__device__ void Advance();
template<int Index>
__device__ auto OffsetPtr(int offset) const;
// Index: 0=k_data, 1=v_data, 2=k_param, 3=v_param
};
template<class T, class Tkv, class BlockLayout_, int CTA_S>
struct BlockIteratorFactory {
__device__ auto Create(int batch_idx, int head_idx);
};
template<class T, class Tkv, int CTA_S, int HeadDim>
using GetBlockIterFactory = BlockIteratorFactory<T, Tkv, block::Layout<block::Config<T, Tkv, HeadDim>>, CTA_S>;
Import
#include "src/turbomind/kernels/attention/block_iterator.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| block_head | BlockHead | Yes | Block head accessor for a specific layer/head |
| block_ptrs | char** | Yes | Per-sequence block pointer array |
| iter | int | Yes | Tile index to seek to (for SetTile) |
Outputs
| Name | Type | Description |
|---|---|---|
| OffsetPtr<0> | Tkv* | Pointer to K data at current tile + offset |
| OffsetPtr<1> | Tkv* | Pointer to V data at current tile + offset |
| OffsetPtr<2> | T* | Pointer to K quantization params at current tile |
| OffsetPtr<3> | T* | Pointer to V quantization params at current tile |
Usage Examples
auto factory = GetBlockIterFactory<half, half, 64, 128>{layout, block_ptrs, cu_block_nums, layer_id};
auto iter = factory.Create(batch_idx, kv_head_idx);
iter.SetTile(last_tile);
// In mainloop:
auto k_ptr = iter.OffsetPtr<0>(thread_offset);
iter.Advance();