Implementation:LMCache LMCache SegmentTokenDatabase Process Tokens

Knowledge Sources	LMCache
Domains	Caching, NLP
Last Updated	2026-02-09 00:00 GMT

Overview

Concrete tool for splitting token sequences on separator strings and generating segment-independent cache keys, provided by the SegmentTokenDatabase class.

Description

The SegmentTokenDatabase.process_tokens method splits the input token sequence by scanning for the separator token pattern (the tokenized form of blend_special_str) and yields a (start, end, key) tuple for each segment. Each key is a content-addressable hash of that segment's tokens only, so it does not depend on where the segment sits in the sequence. The _fast_split_by_subtensor helper performs the tensor scan that locates the separator pattern.
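The splitting and keying idea can be illustrated with a minimal pure-Python sketch. This is not the LMCache implementation: it works on plain lists rather than tensors, and split_on_separator, segment_key, and process_segments are hypothetical names. It also assumes that separator tokens are excluded from segments, that empty segments are skipped, and that SHA-256 stands in for whatever hash the library uses.

```python
import hashlib
from typing import Iterator


def split_on_separator(tokens: list[int], sep: list[int]) -> Iterator[tuple[int, int]]:
    """Yield (start, end) index pairs for the token runs between separators."""
    if not sep:
        raise ValueError("separator must be non-empty")
    n, m = len(tokens), len(sep)
    start = i = 0
    while i <= n - m:
        if tokens[i:i + m] == sep:
            if i > start:          # assumption: skip empty segments
                yield start, i
            i += m                 # assumption: separator is not part of any segment
            start = i
        else:
            i += 1
    if start < n:
        yield start, n


def segment_key(segment: list[int]) -> str:
    """Hash only the segment's own token IDs, so the key ignores position."""
    payload = ",".join(map(str, segment)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def process_segments(tokens: list[int], sep: list[int]) -> Iterator[tuple[int, int, str]]:
    """Yield (start, end, key) per segment, mirroring the shape described above."""
    for start, end in split_on_separator(tokens, sep):
        yield start, end, segment_key(tokens[start:end])
```

In this sketch, two segments with identical token IDs receive the same key even when they appear at different offsets, which is the property the description calls position independence.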

Usage

Called internally by LMCacheEngine.store and LMCacheEngine.retrieve when blending is enabled. The SegmentTokenDatabase is selected automatically by LMCacheEngineBuilder._Create_token_database when config.enable_blending=True.
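To show why position-independent keys matter for store and retrieve, here is a toy sketch under stated assumptions. ToySegmentCache, iter_segments, and the string payloads are hypothetical stand-ins and do not reflect the LMCacheEngine API; the payload merely records which token range a cached entry came from.

```python
import hashlib


def _key(segment: list[int]) -> str:
    # Content-only hash: the same tokens give the same key at any offset.
    return hashlib.sha256(",".join(map(str, segment)).encode("utf-8")).hexdigest()


def iter_segments(tokens: list[int], sep: list[int]):
    """Yield (start, end, key) for each non-empty run between separators."""
    assert sep, "separator must be non-empty"
    m = len(sep)
    start = i = 0
    while i < len(tokens):
        if tokens[i:i + m] == sep:
            if i > start:
                yield start, i, _key(tokens[start:i])
            i += m
            start = i
        else:
            i += 1
    if start < len(tokens):
        yield start, len(tokens), _key(tokens[start:])


class ToySegmentCache:
    """Hypothetical in-memory store keyed by segment hash."""

    def __init__(self, sep: list[int]):
        self.sep = sep
        self._store: dict[str, str] = {}

    def store(self, tokens: list[int]) -> None:
        for start, end, key in iter_segments(tokens, self.sep):
            # Placeholder payload; a real cache would hold KV tensors here.
            self._store.setdefault(key, f"kv[{start}:{end}]")

    def retrieve(self, tokens: list[int]) -> list[tuple[int, int, str]]:
        hits = []
        for start, end, key in iter_segments(tokens, self.sep):
            if key in self._store:
                hits.append((start, end, self._store[key]))
        return hits
```

Storing one prompt and then retrieving with a prompt that reorders the same segments still produces hits, because each lookup depends only on the segment's contents.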

Code Reference

Source Location

Repository: LMCache
File: lmcache/v1/token_database.py
Lines: L399-L521

Signature

class SegmentTokenDatabase(TokenDatabase):
    def process_tokens(
        self,
        tokens: Optional[Union[torch.Tensor, list[int]]] = None,
        hashes: Optional[list[int]] = None,
        offsets: Optional[list[int]] = None,
        mask: Optional[torch.Tensor] = None,
        make_key: bool = True,
        request_configs: Optional[dict] = None,
    ) -> Iterable[ProcessTokensResult]:
        """Split tokens on separator and yield per-segment (start, end, key).

        Args:
            tokens: Input token IDs containing separator tokens
            hashes: Pre-computed hashes (alternative to tokens)
            offsets: Chunk offsets (with hashes)
            mask: Boolean storage/retrieval mask
            make_key: Whether to generate CacheEngineKey (True) or raw hash
            request_configs: Optional per-request config overrides
        """

Import

from lmcache.v1.token_database import SegmentTokenDatabase

I/O Contract

Inputs

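Name	Type	Required	Description
tokens	Optional[Union[torch.Tensor, list[int]]]	No*	Token IDs containing separator tokens (* either tokens or hashes must be provided)
hashes	Optional[list[int]]	No*	Pre-computed hashes, used as an alternative to tokens
offsets	Optional[list[int]]	No	Chunk offsets, supplied together with hashes
mask	Optional[torch.Tensor]	No	Boolean mask for selective storage or retrieval
make_key	bool	No	Whether to generate a CacheEngineKey (True, the default) or a raw hash
request_configs	Optional[dict]	No	Optional per-request config overrides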

Outputs

Name	Type	Description
yields	Iterable[ProcessTokensResult]	(start, end, key) tuples per segment; key is a CacheEngineKey when make_key=True, otherwise a raw hash

Related Pages

Implements Principle

Principle:LMCache_LMCache_Segment_Based_KV_Caching
