Principle:Guardrails ai Guardrails Pydantic Schema Definition
| Knowledge Sources | |
|---|---|
| Domains | Schema_Definition, Structured_Output |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
A schema definition principle that uses Pydantic models to simultaneously define output structure and embed per-field validation rules for LLM-generated data.
Description
Pydantic Schema Definition is the technique of defining a Pydantic BaseModel where each field specifies its type, constraints, and an optional list of Guardrails validators through the json_schema_extra metadata. This dual-purpose schema serves both as a JSON Schema definition for guiding LLM output structure and as a validation specification that Guardrails extracts at runtime to validate each field independently.
This approach leverages Pydantic's type system and metadata capabilities to create a single source of truth for both output shape and quality requirements, avoiding the need to maintain separate schema and validation definitions.
Usage
Apply this pattern when you need LLMs to produce structured JSON output that must conform to both a specific shape (fields, types) and quality constraints (length, format, content safety). Define a Pydantic model with Field(json_schema_extra={"validators": [...]}) for each field that needs validation.
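A minimal sketch of such a model, assuming Pydantic v2. MaxLength is a hypothetical stand-in for a real Guardrails validator class, and the Person and Address models are illustrative names that do not come from the Guardrails codebase:

```python
from dataclasses import dataclass

from pydantic import BaseModel, Field


@dataclass
class MaxLength:
    """Hypothetical placeholder for a Guardrails validator (not a real Guardrails class)."""

    limit: int


class Address(BaseModel):
    city: str = Field(
        description="City name",
        json_schema_extra={"validators": [MaxLength(limit=50)]},
    )


class Person(BaseModel):
    # Type and description shape the output; "validators" carries the quality rules.
    name: str = Field(
        description="Full name",
        json_schema_extra={"validators": [MaxLength(limit=100)]},
    )
    # A field without Guardrails validators can still use plain Pydantic constraints.
    age: int = Field(description="Age in years", ge=0)
    address: Address


# Pydantic keeps the metadata on the field, where it can be read back later.
name_validators = Person.model_fields["name"].json_schema_extra["validators"]
```

In real use the list would presumably hold instances of actual Guardrails validators; the placeholder only shows where they sit in the field definition.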
Theoretical Basis
The pattern operates in two phases:
- Definition Phase: The user defines a Pydantic BaseModel with type annotations and places validator metadata in each field's json_schema_extra
- Extraction Phase: Guardrails traverses the model's fields, extracts validator instances from json_schema_extra["validators"], builds a validator map keyed by JSON path (e.g., $.name, $.address.city), and generates a JSON Schema for LLM prompting
This creates a mapping from each field path to its validators, enabling field-level validation of structured output.
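The extraction phase can be approximated with a short recursive traversal. This is an illustrative sketch assuming Pydantic v2, not the Guardrails implementation: build_validator_map and FakeValidator are hypothetical names, and the sketch only follows directly nested models (it does not handle lists, optionals, or unions).

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Type

from pydantic import BaseModel, Field


@dataclass(frozen=True)
class FakeValidator:
    """Hypothetical placeholder for a Guardrails validator instance."""

    name: str


class Address(BaseModel):
    city: str = Field(json_schema_extra={"validators": [FakeValidator("non-empty")]})


class Person(BaseModel):
    name: str = Field(json_schema_extra={"validators": [FakeValidator("max-length")]})
    age: int
    address: Address


def build_validator_map(model: Type[BaseModel], prefix: str = "$") -> Dict[str, List[Any]]:
    """Collect validators from json_schema_extra, keyed by JSON path."""
    validator_map: Dict[str, List[Any]] = {}
    for field_name, field in model.model_fields.items():
        path = f"{prefix}.{field_name}"
        extra = field.json_schema_extra
        # json_schema_extra may be None or a callable; only dicts carry validators here.
        if isinstance(extra, dict) and extra.get("validators"):
            validator_map[path] = list(extra["validators"])
        # Recurse into nested models so child fields get paths like $.address.city.
        annotation = field.annotation
        if isinstance(annotation, type) and issubclass(annotation, BaseModel):
            validator_map.update(build_validator_map(annotation, path))
    return validator_map


validator_map = build_validator_map(Person)
```

Fields without validators, such as age above, simply do not appear in the resulting map, so a field-level validation pass can skip them.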