Implementation:Vllm project Vllm Structured Output Parsing

Knowledge Sources	vLLM, vLLM Docs
Domains	LLM Inference, Structured Output, Validation
Last Updated	2026-02-08 13:00 GMT

Overview

A concrete pattern for parsing and validating structured outputs after constrained generation, using the Python standard library and Pydantic.

Description

After LLM.generate() returns a list of RequestOutput objects, the generated text is available as a raw string via output.outputs[0].text. This implementation covers the standard patterns for parsing that text into typed data structures and validating it against the original constraint.

The parsing strategy depends on the constraint type used during generation:

JSON outputs: Use json.loads() to parse into a Python dictionary, or use Pydantic's Model.model_validate_json() to parse directly into a validated Pydantic model instance.
Regex outputs: Use re.fullmatch() to verify that the entire output matches the expected pattern, or re.match() when only the start of the output needs to match. Extract named groups from the match object if applicable.
Choice outputs: Check that the output text is one of the original choices, for example with a membership test (text in choices).
Grammar outputs: Apply application-specific parsing appropriate to the grammar (e.g., an SQL parser for SQL grammar outputs).
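
The examples further down cover JSON, regex, and choice outputs but not grammar outputs. As one possible sketch for the SQL case, the standard-library sqlite3 module can compile a statement against an in-memory database without running it. The helper name check_sql and the users table are assumptions made for illustration, and SQLite's dialect may not match the grammar used during generation.

```python
import sqlite3
from typing import Optional

def check_sql(statement: str, schema_ddl: str) -> Optional[str]:
    """Return None if SQLite can compile `statement`, else the error message."""
    conn = sqlite3.connect(":memory:")
    try:
        # Create the (hypothetical) tables the statement is expected to reference
        conn.executescript(schema_ddl)
        # EXPLAIN compiles the statement without executing it
        conn.execute("EXPLAIN " + statement)
        return None
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return str(exc)
    finally:
        conn.close()

schema = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);"
print(check_sql("SELECT name FROM users WHERE id = 1", schema))  # None
print(check_sql("SELECT name FORM users", schema))               # an error message
```

A check like this catches syntax errors and references to unknown tables or columns, but it says nothing about whether the query does what the prompt asked for.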

Usage

Use this pattern as the final step in any structured output workflow. Extract the text from RequestOutput, parse it into the appropriate type, and optionally apply additional semantic validation.
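
The extract, parse, validate sequence described above can be sketched with the standard library alone. The function name parse_person and the specific rules (non-empty name, age between 0 and 150) are hypothetical; they stand in for whatever semantic checks an application needs beyond the generation constraint.

```python
import json

def parse_person(text: str) -> dict:
    """Parse a JSON object, then apply checks a schema alone may not express."""
    # Syntactic step: raises json.JSONDecodeError (a ValueError subclass) on bad JSON
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    # Semantic step: hypothetical business rules, not part of the constraint
    name = data.get("name")
    if not isinstance(name, str) or not name.strip():
        raise ValueError("name must be a non-empty string")
    age = data.get("age")
    if not isinstance(age, int) or isinstance(age, bool) or not 0 <= age <= 150:
        raise ValueError(f"implausible age: {age!r}")
    return data

# Stand-in for outputs[0].outputs[0].text
text = '{"name": "Marie Curie", "age": 66}'
person = parse_person(text)
print(person["name"], person["age"])
```

Keeping the syntactic and semantic steps separate makes it easier to tell a generation problem (malformed text) from a content problem (well-formed but implausible values).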

Code Reference

Source Location

Repository: Python standard library (json, re) and Pydantic (third-party package)
File: json module, re module, and pydantic.BaseModel.model_validate_json()

Signature

# JSON parsing (stdlib)
json.loads(s: str | bytes) -> Any

# Regex matching (stdlib)
re.fullmatch(pattern: str, string: str) -> re.Match | None
re.match(pattern: str, string: str) -> re.Match | None

# Pydantic JSON validation (classmethod, Pydantic v2; returns an instance of the calling model class)
BaseModel.model_validate_json(json_data: str | bytes | bytearray) -> Self
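
The two regex functions listed above differ in how much of the string they require to match. A small illustration, using a made-up phone-number pattern rather than anything from the vLLM examples:

```python
import re

pattern = r"\d{3}-\d{4}"
candidate = "555-1234 extra"

# re.match anchors only at the start, so trailing text is accepted
print(re.match(pattern, candidate) is not None)       # True
# re.fullmatch requires the whole string to match
print(re.fullmatch(pattern, candidate) is not None)   # False
print(re.fullmatch(pattern, "555-1234") is not None)  # True
```

For verifying constrained output, re.fullmatch() is usually the safer default, since it rejects trailing text that re.match() would silently accept.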

Import

import json
import re
from pydantic import BaseModel

I/O Contract

Inputs

Name	Type	Required	Description
output_text	`str`	Yes	The generated text from `RequestOutput.outputs[0].text`
json_schema	`dict`	No (JSON outputs only)	The original JSON Schema used during generation, for reference validation
pydantic_model	`type[BaseModel]`	No (JSON outputs only)	The Pydantic model class for type-safe parsing via `model_validate_json()`
regex_pattern	`str`	No (regex outputs only)	The regex pattern used during generation, for match verification
choice_list	`list[str]`	No (choice outputs only)	The choice list used during generation, for membership verification

Outputs

Name	Type	Description
parsed_json	`dict | BaseModel`	Parsed JSON object as a dictionary (from `json.loads()`) or a validated Pydantic model instance (from `model_validate_json()`)
regex_match	`re.Match | None`	Match object confirming the output matches the pattern, or `None` on failure
choice_result	`str`	The selected choice string, verified against the original list

Usage Examples

Parsing JSON Output with json.loads

import json
from vllm import LLM, SamplingParams
from vllm.sampling_params import StructuredOutputsParams
from pydantic import BaseModel

class CarDescription(BaseModel):
    brand: str
    model: str
    car_type: str

llm = LLM(model="Qwen/Qwen2.5-3B-Instruct", max_model_len=100)
structured = StructuredOutputsParams(json=CarDescription.model_json_schema())
sampling_params = SamplingParams(structured_outputs=structured, max_tokens=50)

outputs = llm.generate("Describe the most iconic 90s car as JSON", sampling_params)
text = outputs[0].outputs[0].text

# Parse into a plain dictionary
data = json.loads(text)
print(data["brand"])   # e.g., "Toyota"
print(data["model"])   # e.g., "Supra"
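
When generation stops at the token limit, the text can end in the middle of a JSON value, and json.loads() then raises json.JSONDecodeError. The sketch below guards against that. The helper name load_json_or_none is hypothetical. The finish_reason argument is meant to carry the value of outputs[0].outputs[0].finish_reason, which is assumed here to be "length" when the max_tokens limit was reached.

```python
import json
from typing import Any, Optional

def load_json_or_none(text: str, finish_reason: Optional[str] = None) -> Optional[Any]:
    """Return the parsed value, or None if the text is truncated or malformed."""
    # Assumption: "length" signals that generation hit the token limit
    if finish_reason == "length":
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

complete = '{"brand": "Toyota", "model": "Supra", "car_type": "coupe"}'
truncated = '{"brand": "Toyota", "model": "Sup'
print(load_json_or_none(complete))
print(load_json_or_none(truncated))  # None
```

Returning None keeps the sketch short but cannot distinguish a failure from a literal JSON null. Code that expects null as a valid output would need a different failure signal.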

Parsing JSON Output with Pydantic model_validate_json

from pydantic import BaseModel

class CarDescription(BaseModel):
    brand: str
    model: str
    car_type: str

# Assuming `text` is the generated JSON string from LLM.generate()
car = CarDescription.model_validate_json(text)
print(car.brand)      # type-safe access
print(car.car_type)   # validated against model constraints

Validating Regex Output

import re

pattern = r"\w+@\w+\.com"
# Assuming `text` is the generated text from LLM.generate()
match = re.fullmatch(pattern, text.strip())
if match:
    print(f"Valid email: {match.group()}")
else:
    print("Output did not match expected pattern")
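
Named groups, mentioned in the description, let the validation step also extract fields from the matched text. The sketch below adds group names to the same email pattern for the validation side only. Whether a given constrained-decoding backend accepts named groups in the generation pattern is not covered here, and the sample text is a stand-in for real model output.

```python
import re

# Same shape as the generation pattern, with named groups added for extraction
pattern = re.compile(r"(?P<user>\w+)@(?P<domain>\w+)\.com")

# Stand-in for outputs[0].outputs[0].text
text = "alice@example.com\n"

match = pattern.fullmatch(text.strip())
if match:
    parts = match.groupdict()
    print(parts["user"], parts["domain"])  # alice example
else:
    print("Output did not match expected pattern")
```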

Validating Choice Output

choices = ["Positive", "Negative"]
# Assuming `text` is the generated text from LLM.generate()
# Use an explicit check rather than assert, which is skipped under `python -O`
if text not in choices:
    raise ValueError(f"Unexpected output: {text!r}")
print(f"Sentiment: {text}")
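
If downstream code benefits from a typed value rather than a raw string, one option is to map the choice onto an Enum whose values are the choice strings. The Sentiment class below is an illustrative assumption, not part of vLLM.

```python
from enum import Enum

class Sentiment(Enum):
    POSITIVE = "Positive"
    NEGATIVE = "Negative"

# The choice list passed to generation can be derived from the enum values
choices = [member.value for member in Sentiment]

# Stand-in for outputs[0].outputs[0].text
text = "Positive"

# Enum lookup by value raises ValueError if the text is not a known choice
sentiment = Sentiment(text)
print(sentiment)  # Sentiment.POSITIVE
```

Deriving the choice list from the enum keeps the generation constraint and the validation logic from drifting apart.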

Full End-to-End Example with Error Handling

from pydantic import BaseModel, ValidationError
from vllm import LLM, SamplingParams
from vllm.sampling_params import StructuredOutputsParams

class Person(BaseModel):
    name: str
    age: int

llm = LLM(model="Qwen/Qwen2.5-3B-Instruct", max_model_len=200)
structured = StructuredOutputsParams(json=Person.model_json_schema())
sampling_params = SamplingParams(structured_outputs=structured, max_tokens=100)

outputs = llm.generate("Generate a JSON for a famous scientist", sampling_params)
text = outputs[0].outputs[0].text

try:
    person = Person.model_validate_json(text)
    print(f"Name: {person.name}, Age: {person.age}")
except ValidationError as e:
    # Pydantic v2 raises ValidationError for malformed JSON as well as for
    # schema violations, so no separate json.JSONDecodeError handler is needed
    print(f"Validation failed: {e}")
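
When several prompts are generated in one call, each result can be parsed independently so that one bad output does not discard the rest. The sketch below uses plain strings as stand-ins for the text of each RequestOutput, and the helper name parse_batch is hypothetical.

```python
import json

def parse_batch(texts):
    """Parse each text separately, collecting successes and failures by index."""
    parsed, failures = [], []
    for index, text in enumerate(texts):
        try:
            parsed.append((index, json.loads(text)))
        except json.JSONDecodeError as exc:
            failures.append((index, str(exc)))
    return parsed, failures

# Stand-ins for [o.outputs[0].text for o in outputs]; the second is truncated
texts = ['{"name": "Ada Lovelace", "age": 36}', '{"name": "Alan Tu']
parsed, failures = parse_batch(texts)
print(len(parsed), len(failures))  # 1 1
```

Keeping the original index with each result makes it possible to retry only the failed prompts.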

Related Pages

Implements Principle

Principle:Vllm_project_Vllm_Output_Validation
