Implementation:Vllm project Vllm Pydantic Schema Generation
| Knowledge Sources | |
|---|---|
| Domains | LLM Inference, Structured Output, Schema Definition |
| Last Updated | 2026-02-08 13:00 GMT |
Overview
A concrete pattern for defining the output schema (JSON Schema, regex, grammar, or choice list) ahead of constrained generation in vLLM, using Pydantic models and standard Python types.
Description
Before running constrained generation in vLLM, the user must define the desired output format. This implementation covers the four supported schema types and how to produce them:
- JSON Schema via Pydantic: Define a Pydantic BaseModel subclass with typed fields, then call model_json_schema() to produce a JSON Schema dictionary. Enums can be used for constrained field values.
- Regex: Supply a Python regex string that the entire output must match.
- GBNF Grammar: Supply a string in GBNF (GGML BNF) notation defining a context-free grammar.
- Choice List: Supply a list[str] of allowed output values.
Each schema type is passed into StructuredOutputsParams as the corresponding keyword argument (json, regex, grammar, or choice).
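As a minimal sketch of the "one schema type per request" convention described above, the helper below collects whichever schema the caller supplies into a keyword dictionary and rejects zero or multiple schema types. The name build_structured_kwargs is hypothetical and is not part of vLLM; only the keyword names json, regex, grammar, and choice come from this page.

```python
from typing import Any, Optional


def build_structured_kwargs(
    json: Optional[dict] = None,
    regex: Optional[str] = None,
    grammar: Optional[str] = None,
    choice: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Hypothetical helper (not a vLLM API): keep exactly one schema type."""
    candidates = {"json": json, "regex": regex, "grammar": grammar, "choice": choice}
    # Drop every schema type the caller did not provide.
    supplied = {name: value for name, value in candidates.items() if value is not None}
    if len(supplied) != 1:
        raise ValueError(f"expected exactly one schema type, got {sorted(supplied)}")
    return supplied


kwargs = build_structured_kwargs(choice=["Positive", "Negative"])
```

Assuming vLLM is installed, the resulting dictionary could then be unpacked as StructuredOutputsParams(**kwargs).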
Usage
Use this pattern at the beginning of any structured output workflow. Define the schema in user code before constructing StructuredOutputsParams and SamplingParams.
Code Reference
Source Location
- Repository: vllm
- File: examples/offline_inference/structured_outputs.py
Signature
# JSON Schema via Pydantic
pydantic.BaseModel.model_json_schema() -> dict
# Regex pattern
pattern: str # e.g., r"\w+@\w+\.com\n"
# GBNF grammar
grammar: str # e.g., 'root ::= select_statement\n...'
# Choice list
choices: list[str] # e.g., ["Positive", "Negative"]
Import
from pydantic import BaseModel
from enum import Enum
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| Pydantic model class | type[BaseModel] | No (use one schema type) | A Pydantic model whose model_json_schema() method produces the JSON Schema dict |
| regex | str | No (use one schema type) | A regular expression pattern the output must match |
| grammar | str | No (use one schema type) | A GBNF grammar string defining the output language |
| choice | list[str] | No (use one schema type) | A list of allowed output strings |
Outputs
| Name | Type | Description |
|---|---|---|
| json_schema | dict | JSON Schema dictionary produced by model_json_schema(), ready to pass to StructuredOutputsParams(json=...) |
| regex_pattern | str | Regex string ready to pass to StructuredOutputsParams(regex=...) |
| grammar_string | str | GBNF grammar string ready to pass to StructuredOutputsParams(grammar=...) |
| choice_list | list[str] | List of strings ready to pass to StructuredOutputsParams(choice=...) |
Usage Examples
JSON Schema via Pydantic Model
from enum import Enum

from pydantic import BaseModel


class CarType(str, Enum):
    sedan = "sedan"
    suv = "SUV"
    truck = "Truck"
    coupe = "Coupe"


class CarDescription(BaseModel):
    brand: str
    model: str
    car_type: CarType


json_schema = CarDescription.model_json_schema()
# json_schema is a dict like:
# {'properties': {'brand': {'type': 'string'}, ...}, 'required': [...], ...}
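The same model can also check a generated string after the fact. The sketch below assumes Pydantic v2 (where model_json_schema() and model_validate_json() are available); the sample JSON strings are made up for illustration and do not come from a real model run.

```python
from enum import Enum

from pydantic import BaseModel, ValidationError


class CarType(str, Enum):
    sedan = "sedan"
    suv = "SUV"
    truck = "Truck"
    coupe = "Coupe"


class CarDescription(BaseModel):
    brand: str
    model: str
    car_type: CarType


schema = CarDescription.model_json_schema()
# The enum is emitted under "$defs" and referenced from the car_type property.
allowed_car_types = schema["$defs"]["CarType"]["enum"]

# Hypothetical output text; in practice this string would come from generation.
sample_output = '{"brand": "Toyota", "model": "Corolla", "car_type": "sedan"}'
car = CarDescription.model_validate_json(sample_output)

# A car_type outside the enum fails validation.
try:
    CarDescription.model_validate_json('{"brand": "X", "model": "Y", "car_type": "van"}')
    rejected = False
except ValidationError:
    rejected = True
```

Validating with the model that produced the schema keeps the schema definition and the parsing step in one place.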
Regex Pattern
email_pattern = r"\w+@\w+\.com\n"
# Pass to StructuredOutputsParams(regex=email_pattern)
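Because the pattern must cover the entire output, a quick local check with the standard re module can catch surprises such as the required trailing newline. This is only a sketch using Python's regex dialect; the regex engine used during constrained generation may support a different feature set, and matches_entirely is a hypothetical helper name.

```python
import re

email_pattern = r"\w+@\w+\.com\n"


def matches_entirely(pattern: str, text: str) -> bool:
    """Return True only if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


ok = matches_entirely(email_pattern, "alan@example.com\n")
missing_newline = matches_entirely(email_pattern, "alan@example.com")
wrong_tld = matches_entirely(email_pattern, "alan@example.org\n")
dotted_local_part = matches_entirely(email_pattern, "alan.turing@example.com\n")
```

Only the first string matches: the pattern requires the trailing newline and a .com suffix, and \w+ does not admit a dot in the local part.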
GBNF Grammar
simplified_sql_grammar = """
root ::= select_statement
select_statement ::= "SELECT " column " from " table " where " condition
column ::= "col_1 " | "col_2 "
table ::= "table_1 " | "table_2 "
condition ::= column "= " number
number ::= "1 " | "2 "
"""
# Pass to StructuredOutputsParams(grammar=simplified_sql_grammar)
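This grammar is finite, so the strings it admits can be listed by hand. The sketch below expands the rules manually with itertools.product; it is not a GBNF parser, and the variable names are illustrative only.

```python
from itertools import product

# Terminals copied from the grammar above, including their trailing spaces.
columns = ["col_1 ", "col_2 "]
tables = ["table_1 ", "table_2 "]
numbers = ["1 ", "2 "]

# condition ::= column "= " number
conditions = [column + "= " + number for column, number in product(columns, numbers)]

# select_statement ::= "SELECT " column " from " table " where " condition
statements = [
    "SELECT " + column + " from " + table + " where " + condition
    for column, table, condition in product(columns, tables, conditions)
]

for statement in statements[:3]:
    print(repr(statement))
```

The expansion yields 16 distinct statements. Because each terminal carries its own trailing space, the generated text contains double spaces before "from" and "where", which is worth knowing before parsing the output downstream.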
Choice List
choices = ["Positive", "Negative"]
# Pass to StructuredOutputsParams(choice=choices)