Implementation:InternLM Lmdeploy Response Dataclass
| Knowledge Sources | |
|---|---|
| Domains | LLM_Inference, Data_Structures |
| Last Updated | 2026-02-07 15:00 GMT |
Overview
A dataclass provided by the LMDeploy library that encapsulates the results of an inference request, including the generated text, token counts, and finish reason.
Description
The Response dataclass packages all output information from a single inference request. It provides text output, token statistics, finish reason, optional logprobs, and an extend() method for incremental streaming aggregation.
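The streaming aggregation that extend() supports can be illustrated with a stand-in class. This is a minimal sketch, not LMDeploy's implementation: ChunkResponse is a hypothetical dataclass, and the merge rules (concatenate text and token IDs, take the token counts and finish reason from the newest chunk) are assumptions about how incremental chunks might be combined.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ChunkResponse:
    """Hypothetical stand-in for Response; not the LMDeploy class."""
    text: str
    generate_token_len: int
    input_token_len: int
    finish_reason: Optional[str] = None
    token_ids: List[int] = field(default_factory=list)

    def extend(self, other: "ChunkResponse") -> "ChunkResponse":
        # Assumed merge rules: append the new text and token IDs, and take
        # the running counts and finish reason from the newest chunk.
        self.text += other.text
        self.token_ids += other.token_ids
        self.generate_token_len = other.generate_token_len
        self.input_token_len = other.input_token_len
        self.finish_reason = other.finish_reason
        return self


# Simulated stream: each chunk carries new text plus a cumulative token count.
chunks = [
    ChunkResponse("Hello", 1, 5, None, [101]),
    ChunkResponse(", world", 3, 5, None, [102, 103]),
    ChunkResponse("!", 4, 5, "stop", [104]),
]

merged = ChunkResponse("", 0, 0)
for chunk in chunks:
    merged.extend(chunk)

print(merged.text, merged.generate_token_len, merged.finish_reason)
```

Returning self from extend() lets a caller chain merges or fold a stream into a single accumulated object.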
Usage
Pipeline.__call__() returns Response objects, and Pipeline.stream_infer() yields them incrementally. Access response.text for the generated string, response.finish_reason to check why generation ended, and response.generate_token_len to track output token usage.
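Checking the completion status might look like the sketch below. The describe_finish helper is hypothetical, and the stand-in objects only mimic the attribute names of Response. The sketch assumes the common convention that 'stop' means generation ended naturally, 'length' means a token limit was reached, and None means no finish reason has been set yet.

```python
from types import SimpleNamespace


def describe_finish(response) -> str:
    """Map finish_reason to a short status string (hypothetical helper)."""
    if response.finish_reason == "stop":
        return "complete"
    if response.finish_reason == "length":
        return "truncated"
    return "in progress"


# Stand-in objects with the same attribute names as Response.
done = SimpleNamespace(text="AI is ...", finish_reason="stop")
cut = SimpleNamespace(text="AI is a field that", finish_reason="length")
partial = SimpleNamespace(text="AI", finish_reason=None)

for r in (done, cut, partial):
    print(describe_finish(r))
```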
Code Reference
Source Location
- Repository: lmdeploy
- File: lmdeploy/messages.py
- Lines: L460-547
Signature
@dataclass
class Response:
    text: str                                                  # Generated text
    generate_token_len: int                                    # Output token count
    input_token_len: int                                       # Input token count
    finish_reason: Optional[Literal['stop', 'length']] = None  # Stop reason
    token_ids: List[int] = field(default_factory=list)         # Output token IDs
    logprobs: List[Dict[int, float]] = None                    # Per-token logprobs
    logits: torch.Tensor = None                                # Raw logits tensor
    last_hidden_state: torch.Tensor = None                     # Hidden state
    index: int = 0                                             # Batch position index

    def extend(self, other: 'Response') -> 'Response':
        """Merge another response into this one (for streaming)."""
        ...
Import
from lmdeploy.messages import Response
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| text | str | Yes | Generated text content |
| generate_token_len | int | Yes | Number of tokens generated |
| input_token_len | int | Yes | Number of input tokens, including those added by the chat template |
Outputs
| Name | Type | Description |
|---|---|---|
| text | str | The generated text |
| finish_reason | 'stop', 'length', or None | Why generation ended (None if unset) |
| generate_token_len | int | Output token count |
| input_token_len | int | Input token count |
| token_ids | List[int] | Raw output token IDs |
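The two token-count fields are enough to tally usage across several responses. The sketch below is illustrative only: total_usage is a hypothetical helper, the key names in its result are arbitrary, and the stand-in objects merely carry the same attribute names as Response.

```python
from types import SimpleNamespace


def total_usage(responses):
    """Sum input and output token counts over responses (hypothetical helper)."""
    items = list(responses)  # materialize so the sequence can be scanned twice
    prompt = sum(r.input_token_len for r in items)
    completion = sum(r.generate_token_len for r in items)
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": prompt + completion,
    }


# Stand-in objects with the same attribute names as Response.
batch = [
    SimpleNamespace(input_token_len=12, generate_token_len=40),
    SimpleNamespace(input_token_len=9, generate_token_len=25),
]

usage = total_usage(batch)
print(usage)
```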
Usage Examples
Response Inspection
from lmdeploy import pipeline

# Build a pipeline for the chat model, then run a single prompt.
pipe = pipeline('internlm/internlm2_5-7b-chat')
response = pipe('What is AI?')

# Inspect the fields of the returned Response.
print(f"Text: {response.text}")
print(f"Input tokens: {response.input_token_len}")
print(f"Output tokens: {response.generate_token_len}")
print(f"Finish reason: {response.finish_reason}")
print(f"Token IDs: {response.token_ids[:10]}...")  # first 10 IDs only

# Release the engine's resources when done.
pipe.close()