Implementation:Sgl project Sglang Generation Output Dict

Knowledge Sources	SGLang
Domains	LLM_Serving, Data_Processing
Last Updated	2026-02-10 00:00 GMT

Overview

Concrete pattern for extracting generated text and metadata from SGLang Engine output dictionaries.

Description

Engine.generate returns a Python dict for a single prompt, or a list of dicts for a batch of prompts. The primary field is "text", which contains the generated completion. Additional metadata, stored under "meta_info", includes token counts and the finish reason. This is a pattern, not a class: users access fields with standard dict key indexing.
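
Because the return type depends on whether one prompt or a batch was submitted, downstream code often benefits from normalizing both cases to a list. The following is a minimal sketch of that idea; the helper names `as_output_list` and `extract_texts` are hypothetical and not part of SGLang, and the dicts passed in the comments are made-up stand-ins for real engine output.

```python
from typing import Any, Dict, List, Union

OutputDict = Dict[str, Any]


def as_output_list(output: Union[OutputDict, List[OutputDict]]) -> List[OutputDict]:
    """Wrap a single output dict in a list; copy a batch list as-is."""
    if isinstance(output, dict):
        return [output]
    return list(output)


def extract_texts(output: Union[OutputDict, List[OutputDict]]) -> List[str]:
    """Return the "text" field of every output, in the order received."""
    return [item["text"] for item in as_output_list(output)]
```

With this in place, the same call site can handle `extract_texts(engine.generate(prompt, params))` whether `prompt` is a string or a list of strings.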

Usage

Access the "text" key from the output dict to retrieve generated content. Use "meta_info" for debugging or monitoring token usage and finish reasons.
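
For monitoring, it can be convenient to flatten the fields of interest into a single record. The sketch below assumes the dict layout described on this page; `summarize_output` is a hypothetical helper, and it uses `.get` for the metadata so that a missing key yields `None` instead of raising, since the exact set of metadata keys may vary between SGLang versions.

```python
from typing import Any, Dict


def summarize_output(result: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one output dict into a small record for logging."""
    meta = result.get("meta_info") or {}
    finish = meta.get("finish_reason") or {}
    return {
        "text": result["text"],  # the primary field; expected to be present
        "finish_type": finish.get("type"),
        "prompt_tokens": meta.get("prompt_tokens"),
        "completion_tokens": meta.get("completion_tokens"),
    }
```

Indexing "text" directly keeps a malformed result loud (a KeyError), while the optional metadata degrades quietly.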

Code Reference

Source Location

Repository: sglang
File: python/sglang/srt/managers/io_struct.py (output format definition)

Interface Specification

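# Output dict structure from Engine.generate
# (schematic: the values shown are the expected types, not real data)
from typing import Dict, Optional

result: Dict = {
    "text": str,                    # Generated text
    "meta_info": {
        "finish_reason": {
            "type": str,            # "stop" or "length"
            "matched": Optional[int],  # Matched stop token, if any
        },
        "completion_tokens": int,   # Number of tokens generated
        "prompt_tokens": int,       # Number of tokens in the prompt
    },
    "input_token_num": int,
    "output_token_num": int,
}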

# Access pattern
generated_text = result["text"]
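
The finish reason type listed above distinguishes a completion that stopped on its own ("stop") from one that hit the token limit ("length"). A minimal sketch of a truncation check built on that field follows; `was_truncated` is a hypothetical helper name, and treating missing metadata as "not truncated" is an assumption made for this example.

```python
from typing import Any, Dict


def was_truncated(result: Dict[str, Any]) -> bool:
    """Return True when generation ended because the length limit was reached."""
    meta = result.get("meta_info") or {}
    finish = meta.get("finish_reason") or {}
    return finish.get("type") == "length"
```

A caller might use this to decide whether to retry with a larger max_new_tokens value.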

I/O Contract

Inputs

Name	Type	Required	Description
result	Dict	Yes	Output dict from Engine.generate

Outputs

Name	Type	Description
text	str	Generated text completion
meta_info	Dict	Metadata (finish_reason, token counts)
input_token_num	int	Number of input tokens processed
output_token_num	int	Number of output tokens generated
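
Token counts can be totalled across a batch to track overall usage. The sketch below reads the "prompt_tokens" and "completion_tokens" entries under "meta_info" as listed in the interface specification; `total_token_usage` is a hypothetical helper, and counting a missing entry as zero is an assumption of this example.

```python
from typing import Any, Dict, Iterable


def total_token_usage(outputs: Iterable[Dict[str, Any]]) -> Dict[str, int]:
    """Sum prompt and completion token counts over a batch of output dicts."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0}
    for out in outputs:
        meta = out.get("meta_info") or {}
        for key in totals:
            # Missing counts contribute zero (an assumption of this sketch).
            totals[key] += int(meta.get(key, 0))
    return totals
```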

Usage Examples

Basic Text Extraction

# Assumes `engine` is an already-initialized SGLang Engine instance
output = engine.generate("What is 2+2?", {"temperature": 0, "max_new_tokens": 16})

# Extract generated text
answer = output["text"]
print(f"Answer: {answer}")

# Check token usage via meta_info
meta = output["meta_info"]
print(f"Input tokens: {meta['prompt_tokens']}")
print(f"Output tokens: {meta['completion_tokens']}")

Batch Output Processing

prompts = ["Hello", "World", "Test"]
outputs = engine.generate(prompts, {"max_new_tokens": 64})

for i, out in enumerate(outputs):
    print(f"Prompt {i}: {out['text'][:50]}...")
    print(f"  Finish reason: {out['meta_info']['finish_reason']['type']}")

Related Pages

Implements Principle

Principle:Sgl_project_Sglang_Generation_Output_Processing
