

Implementation:Vllm project Vllm Sequence

From Leeroopedia


Knowledge Sources
Domains Pipeline_Parallel, Distributed_Inference
Last Updated 2026-02-08 00:00 GMT

Overview

Defines the IntermediateTensors data structure for passing hidden states and residuals between pipeline-parallel stages during distributed inference.

Description

This module provides the IntermediateTensors dataclass, which wraps a dictionary of named tensors (dict[str, torch.Tensor]) in a dictionary-like interface: a string key returns the tensor stored under that name, a slice key returns a new IntermediateTensors with the same slice applied to every tensor, and items() iterates over the name-tensor pairs. It also carries an optional kv_connector_output for KV cache transfers between pipeline stages. The class is a dataclass rather than a msgspec.Struct for compatibility with PyTorch Dynamo tracing, and its __init__ is written out by hand so that Dynamo can attribute the constructor to this source file, which it could not do for an __init__ generated by the dataclass decorator.
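The key-dispatch behaviour described above can be sketched with a small stand-in class. This is a minimal illustration, not the vLLM implementation: TensorBundle is a hypothetical name, and plain Python lists replace torch.Tensor so that the sketch runs without any dependencies.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TensorBundle:
    """Hypothetical stand-in for IntermediateTensors; not vLLM code."""

    tensors: dict[str, Any]

    # Written by hand: @dataclass leaves an __init__ that is defined in the
    # class body untouched instead of generating its own.
    def __init__(self, tensors: dict[str, Any]) -> None:
        self.tensors = tensors

    def __getitem__(self, key):
        if isinstance(key, str):
            # String key: return the single named entry.
            return self.tensors[key]
        if isinstance(key, slice):
            # Slice key: apply the same slice to every entry and wrap the
            # results in a new bundle.
            return TensorBundle({k: v[key] for k, v in self.tensors.items()})
        raise TypeError(f"unsupported key type: {type(key).__name__}")

    def __setitem__(self, key: str, value: Any) -> None:
        self.tensors[key] = value

    def items(self):
        return self.tensors.items()


# Plain lists stand in for tensors here.
bundle = TensorBundle({"hidden_states": [1, 2, 3, 4], "residual": [5, 6, 7, 8]})
first_half = bundle[0:2]
names = [name for name, _ in bundle.items()]
```

With real tensors the slice would act on the leading dimension of each tensor, which is why a slice key yields a smaller batch under the same names.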

Usage

Use this data structure in pipeline-parallel model execution to transfer intermediate hidden states between pipeline stages. Every pipeline stage except the last returns an IntermediateTensors instance containing the hidden states and residuals needed by the next stage.
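The hand-off pattern can be sketched as a toy loop. Everything here is an assumption made for illustration: StageHandoff and run_stage are hypothetical names, lists of floats replace tensors, and the stages run one after another in a single process, whereas a real deployment places each stage on its own rank and transfers the tensors between them.

```python
from dataclasses import dataclass


@dataclass
class StageHandoff:
    """Hypothetical stand-in for IntermediateTensors."""

    tensors: dict


def run_stage(stage_idx, num_stages, handoff):
    """Toy stage: adding 1.0 to each hidden value stands in for real layers."""
    hidden = [h + 1.0 for h in handoff.tensors["hidden_states"]]
    residual = handoff.tensors["residual"]
    if stage_idx < num_stages - 1:
        # Not the last stage: pass both named values on to the next stage.
        return StageHandoff({"hidden_states": hidden, "residual": residual})
    # Last stage: combine hidden values and residual into the final output.
    return [h + r for h, r in zip(hidden, residual)]


num_stages = 3
state = StageHandoff({"hidden_states": [0.0, 0.0], "residual": [0.5, 0.5]})
kinds = []
for idx in range(num_stages):
    state = run_stage(idx, num_stages, state)
    kinds.append(type(state).__name__)
```

Only the final iteration produces a plain output; the two earlier ones return the hand-off object that the next stage consumes.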

Code Reference

Source Location

Signature

@dataclass
class IntermediateTensors:
    tensors: dict[str, torch.Tensor]
    kv_connector_output: KVConnectorOutput | None

    def __init__(
        self,
        tensors: dict[str, torch.Tensor],
        kv_connector_output: KVConnectorOutput | None = None,
    ) -> None: ...

    def __getitem__(self, key: str | slice): ...
    def __setitem__(self, key: str, value: torch.Tensor): ...
    def items(self): ...
    def __len__(self) -> int: ...
    def __eq__(self, other: object) -> bool: ...
    def __repr__(self) -> str: ...
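
The signature does not say what __len__ and __eq__ compute. One plausible reading, assumed here and not confirmed by this page, is that the length counts the named entries and that two instances are equal when they hold the same names with equal values. NamedValues is a hypothetical stand-in that uses lists in place of tensors.

```python
class NamedValues:
    """Hypothetical stand-in; lists replace torch.Tensor."""

    def __init__(self, tensors):
        self.tensors = tensors

    def __len__(self):
        # Assumed: the number of named entries, not the batch size.
        return len(self.tensors)

    def __eq__(self, other):
        if not isinstance(other, NamedValues):
            return False
        if self.tensors.keys() != other.tensors.keys():
            return False
        # With real tensors, a whole-tensor check such as torch.equal
        # would take the place of == here.
        return all(self.tensors[k] == other.tensors[k] for k in self.tensors)

    def __repr__(self):
        return f"NamedValues(tensors={self.tensors})"


a = NamedValues({"hidden_states": [1, 2], "residual": [3, 4]})
b = NamedValues({"hidden_states": [1, 2], "residual": [3, 4]})
c = NamedValues({"hidden_states": [9, 9], "residual": [3, 4]})
```

Comparing tensors with == directly would produce an element-wise result and not a single bool, which is why a whole-tensor comparison is the natural choice in the real class.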

Import

from vllm.sequence import IntermediateTensors

I/O Contract

Inputs

Name | Type | Required | Description
tensors | dict[str, torch.Tensor] | Yes | Named tensors (e.g., "hidden_states", "residual") to pass between stages
kv_connector_output | KVConnectorOutput or None | No | Optional KV cache connector output for cross-stage KV transfers (default None)

Outputs

Name | Type | Description
tensor | torch.Tensor | Individual tensor retrieved by string key via __getitem__
sliced | IntermediateTensors | New IntermediateTensors with sliced tensors, returned by __getitem__ with a slice key
items | ItemsView | Key-value pairs of all stored tensors via items()

Usage Examples

from vllm.sequence import IntermediateTensors
import torch

# Create intermediate tensors to pass between pipeline stages
intermediate = IntermediateTensors(
    tensors={
        "hidden_states": torch.randn(32, 4096),
        "residual": torch.randn(32, 4096),
    }
)

# Access a specific tensor by name
hidden = intermediate["hidden_states"]  # torch.Tensor [32, 4096]

# Slice for a subset of the batch
batch_slice = intermediate[0:16]  # IntermediateTensors with tensors sliced to [16, 4096]

# Replace a tensor (the zeros are a placeholder with the same shape)
new_hidden_states = torch.zeros(32, 4096)
intermediate["hidden_states"] = new_hidden_states

# Iterate over stored tensors
for name, tensor in intermediate.items():
    print(f"{name}: {tensor.shape}")
