Implementation:Sgl project Sglang Generation Output Dict
| Knowledge Sources | |
|---|---|
| Domains | LLM_Serving, Data_Processing |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Concrete pattern for extracting generated text and metadata from SGLang Engine output dictionaries.
Description
The output of Engine.generate is a Python dict for a single prompt, or a list of dicts for a batch. The primary field is "text", which contains the generated completion. Additional metadata includes token counts and the finish reason. This is a pattern, not a class: users access fields with standard dict key indexing.
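Because the return type depends on whether one prompt or a batch was passed, downstream code may want to normalize it before iterating. The following is a minimal sketch that assumes only the dict-or-list shape described above. `as_result_list` is a hypothetical helper name, not part of SGLang, and the sample dicts are stand-ins rather than real model output.

```python
from typing import Any, Dict, List, Union


# Hypothetical helper (not part of SGLang): normalize generate() output
# so callers can always iterate over a list of result dicts.
def as_result_list(
    output: Union[Dict[str, Any], List[Dict[str, Any]]]
) -> List[Dict[str, Any]]:
    if isinstance(output, dict):
        return [output]  # single prompt: wrap the dict in a list
    return list(output)  # batch: shallow copy of the list


# Stand-in data shaped like the output described above.
single = {"text": "4", "meta_info": {}}
batch = [{"text": "a", "meta_info": {}}, {"text": "b", "meta_info": {}}]

texts_single = [r["text"] for r in as_result_list(single)]
texts_batch = [r["text"] for r in as_result_list(batch)]
```

With this in place, the same loop can handle both call styles without a type check at every call site.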
Usage
Access the "text" key from the output dict to retrieve generated content. Use "meta_info" for debugging or monitoring token usage and finish reasons.
Code Reference
Source Location
- Repository: sglang
- File: python/sglang/srt/managers/io_struct.py (output format definition)
Interface Specification
# Output dict structure from Engine.generate
result: Dict = {
    "text": str,  # Generated text
    "meta_info": {
        "finish_reason": {
            "type": str,  # "stop" or "length"
            "matched": Optional[int],
        },
        "completion_tokens": int,
        "prompt_tokens": int,
    },
    "input_token_num": int,
    "output_token_num": int,
}

# Access pattern
generated_text = result["text"]
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| result | Dict | Yes | Output dict from Engine.generate |
Outputs
| Name | Type | Description |
|---|---|---|
| text | str | Generated text completion |
| meta_info | Dict | Metadata (finish_reason, token counts) |
| input_token_num | int | Number of input tokens processed |
| output_token_num | int | Number of output tokens generated |
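For logging or monitoring, the fields above can be flattened into a single record. The sketch below is an illustration, not SGLang code. `summarize_result` is a hypothetical name, and it reads metadata with `.get` on the assumption that a given SGLang version may omit a key, so a missing key yields `None` rather than a `KeyError`.

```python
from typing import Any, Dict


# Hypothetical helper (not part of SGLang): flatten one result dict
# into a record that is easy to log.
def summarize_result(result: Dict[str, Any]) -> Dict[str, Any]:
    meta = result.get("meta_info") or {}
    finish = meta.get("finish_reason") or {}
    return {
        "text": result["text"],  # required field; raises KeyError if absent
        "finish_type": finish.get("type"),
        "prompt_tokens": meta.get("prompt_tokens"),
        "completion_tokens": meta.get("completion_tokens"),
    }


# Stand-in data following the interface specification (not real model output).
sample = {
    "text": "4",
    "meta_info": {
        "finish_reason": {"type": "stop", "matched": None},
        "completion_tokens": 1,
        "prompt_tokens": 8,
    },
}
summary = summarize_result(sample)
```

Only "text" is indexed directly, since it is the one field this pattern treats as always present.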
Usage Examples
Basic Text Extraction
# Assumes `engine` is an already-initialized SGLang Engine instance
output = engine.generate("What is 2+2?", {"temperature": 0, "max_new_tokens": 16})
# Extract generated text
answer = output["text"]
print(f"Answer: {answer}")
# Check token usage
print(f"Input tokens: {output['input_token_num']}")
print(f"Output tokens: {output['output_token_num']}")
Batch Output Processing
prompts = ["Hello", "World", "Test"]
outputs = engine.generate(prompts, {"max_new_tokens": 64})
for i, out in enumerate(outputs):
    print(f"Prompt {i}: {out['text'][:50]}...")
    print(f"  Finish reason: {out['meta_info']['finish_reason']['type']}")
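When monitoring a batch, a finish reason of "length" conventionally means generation stopped at the token limit rather than at a stop condition, so the text may be cut off. The sketch below flags such outputs and totals completion tokens, assuming the `meta_info` layout from the interface specification. `batch_report` is a hypothetical helper, and the sample dicts are stand-ins for real Engine output.

```python
from typing import Any, Dict, List


# Hypothetical helper (not part of SGLang): summarize a batch of result dicts.
def batch_report(outputs: List[Dict[str, Any]]) -> Dict[str, Any]:
    truncated: List[int] = []
    total_completion = 0
    for i, out in enumerate(outputs):
        meta = out.get("meta_info") or {}
        finish = meta.get("finish_reason") or {}
        if finish.get("type") == "length":
            truncated.append(i)  # likely cut off at the token limit
        total_completion += meta.get("completion_tokens") or 0
    return {
        "truncated_indices": truncated,
        "total_completion_tokens": total_completion,
    }


# Stand-in batch following the interface specification (not real model output).
sample_outputs = [
    {
        "text": "Hi there.",
        "meta_info": {
            "finish_reason": {"type": "stop", "matched": None},
            "completion_tokens": 3,
            "prompt_tokens": 1,
        },
    },
    {
        "text": "Once upon a",
        "meta_info": {
            "finish_reason": {"type": "length", "matched": None},
            "completion_tokens": 64,
            "prompt_tokens": 1,
        },
    },
]
report = batch_report(sample_outputs)
```

Prompts whose indices appear in `truncated_indices` are candidates for a retry with a larger `max_new_tokens`.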