Implementation:Guardrails ai Guardrails Schema Generator
| Knowledge Sources | |
|---|---|
| Domains | Schema, Data Generation |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
The Schema Generator module produces synthetic example data from JSON Schema definitions, using the Faker library and random value generation to create realistic sample objects.
Description
This module provides the generate_example function and a collection of supporting generators that can take any valid JSON Schema and produce a corresponding sample Python object populated with randomly generated data. The generator handles all standard JSON Schema types (string, integer, number, boolean, array, object, null) and supports advanced schema features including:
- Schema compositions: oneOf, anyOf, and allOf. The generator randomly picks a sub-schema or merges the sub-schemas as appropriate.
- Conditional sub-schemas: if/then/else blocks are evaluated against generated values to determine which properties to include.
- Enums and constants: enum values are randomly selected; const values are returned directly.
- String formats: Recognizes standard formats (date, date-time, time, email, url/uri, percentage) and naming conventions (snake_case, camelCase, Title Case) to produce appropriately formatted strings.
- String patterns: Uses the rstr library to generate strings matching a given regex pattern.
- Numeric constraints: Respects minimum, maximum, exclusiveMinimum, exclusiveMaximum, and multipleOf.
- Array constraints: Respects minItems, maxItems, and uniqueItems.
- JSON references: Uses jsonref to dereference $ref pointers before generation.
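To illustrate how the numeric keywords listed above can interact, here is a minimal standard-library sketch for integers. It is not the module's gen_num implementation: the name sketch_gen_int, the fallback window of 100 when a bound is missing, and the restriction to a positive integer multipleOf are all assumptions made for this example.

```python
import random
from math import ceil, floor
from typing import Any, Dict


def sketch_gen_int(schema: Dict[str, Any]) -> int:
    """Pick a random integer that satisfies the schema's numeric keywords."""
    lows, highs = [], []
    if "minimum" in schema:
        lows.append(ceil(schema["minimum"]))
    if "exclusiveMinimum" in schema:
        # Smallest integer strictly greater than the exclusive bound.
        lows.append(floor(schema["exclusiveMinimum"]) + 1)
    if "maximum" in schema:
        highs.append(floor(schema["maximum"]))
    if "exclusiveMaximum" in schema:
        # Largest integer strictly less than the exclusive bound.
        highs.append(ceil(schema["exclusiveMaximum"]) - 1)

    # Assumption: fall back to a window of 100 when a bound is missing.
    if lows and highs:
        low, high = max(lows), min(highs)
    elif lows:
        low = max(lows)
        high = low + 100
    elif highs:
        high = min(highs)
        low = high - 100
    else:
        low, high = 0, 100

    # Assumption: multipleOf is a positive integer in this sketch.
    step = schema.get("multipleOf", 1)
    # Work in units of `step` so every candidate is an exact multiple.
    first = -(-low // step)  # ceiling division
    last = high // step      # floor division
    if first > last:
        raise ValueError("no integer satisfies the given constraints")
    return random.randint(first, last) * step


print(sketch_gen_int({"minimum": 18, "maximum": 99, "multipleOf": 5}))
```

Converting the range into units of the step before sampling avoids a retry loop: every drawn value is already a valid multiple inside the bounds.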
The module leverages the Faker library extensively, and will attempt to match property names to Faker providers for contextually relevant data (e.g., a property named email will generate a realistic email address).
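The property-name matching described above can be pictured as an attribute lookup on the Faker instance. The sketch below shows one way such a lookup might work; it uses a hypothetical StandInFaker class in place of a real Faker object so that it runs without third-party packages, and pick_provider is an illustrative name, not a function from the module.

```python
import random
from typing import Callable, Optional


class StandInFaker:
    """Hypothetical stand-in for a Faker instance with three providers."""

    def word(self) -> str:
        return random.choice(["movement", "peace", "technology"])

    def email(self) -> str:
        return f"{self.word()}@example.com"

    def city(self) -> str:
        return random.choice(["Lisbon", "Osaka", "Quito"])


def pick_provider(
    fake: StandInFaker, property_name: Optional[str]
) -> Callable[[], str]:
    """Return the provider named after the property, else fall back to word()."""
    # Ignore missing names and private/dunder attributes.
    if not property_name or property_name.startswith("_"):
        return fake.word
    candidate = getattr(fake, property_name, None)
    return candidate if callable(candidate) else fake.word


fake = StandInFaker()
print(pick_provider(fake, "email")())      # uses the email provider
print(pick_provider(fake, "nickname")())   # no such provider: falls back to word()
```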
Usage
Use generate_example when you need to create sample data for testing, documentation, prompt engineering, or schema validation. It is particularly useful for generating example payloads that conform to a Guard's output schema.
Code Reference
Source Location
- Repository: Guardrails
- File: guardrails/schema/generator.py
- Lines: 1-351
Signature
def generate_example(
    json_schema: Dict[str, Any], *, property_name: Optional[str] = None
) -> Any: ...
def gen_num(schema: Dict[str, Any]) -> Union[int, float]: ...
def gen_string(schema: Dict[str, Any], *, property_name: Optional[str] = None) -> str: ...
def gen_array(schema: Dict[str, Any], *, property_name: Optional[str] = None) -> List[Any]: ...
def gen_object(schema: Dict[str, Any]) -> Dict[str, Any]: ...
def gen_formatted_string(format: str, default: str) -> str: ...
def gen_from_type(schema: Dict[str, Any], *, property_name: Optional[str] = None) -> Any: ...
def gen_from_enum(enum: List[Any]) -> Any: ...
def evaluate_if_block(schema: Dict[str, Any], value: Any) -> Any: ...
def pick_sub_schema(
    schema: Dict[str, Any], sub_schema_key: str, *, property_name: Optional[str] = None
) -> Any: ...
def evaluate_all_of(
    schema: Dict[str, Any], value: Any, *, property_name: Optional[str] = None
) -> Any: ...
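The helper names above suggest that generation dispatches on schema keywords and recurses into nested schemas. The following is a minimal sketch of that idea using only the standard library. It is not the module's code: sketch_generate, the word list, and the default bounds are hypothetical, and compositions, formats, patterns, and $ref handling are left out.

```python
import random
from typing import Any, Dict

WORDS = ["movement", "peace", "technology"]  # placeholder vocabulary


def sketch_generate(schema: Dict[str, Any]) -> Any:
    """Produce a sample value by dispatching on the schema's keywords."""
    if "const" in schema:
        return schema["const"]
    if "enum" in schema:
        return random.choice(schema["enum"])

    kind = schema.get("type", "string")
    if kind == "object":
        props = schema.get("properties", {})
        return {name: sketch_generate(sub) for name, sub in props.items()}
    if kind == "array":
        # Assumed defaults when the size keywords are absent.
        min_items = schema.get("minItems", 0)
        max_items = schema.get("maxItems", min_items + 2)
        count = random.randint(min_items, max_items)
        return [sketch_generate(schema.get("items", {})) for _ in range(count)]
    if kind in ("integer", "number"):
        # Simplified bounds handling; assumes integer-valued bounds.
        low = schema.get("minimum", 0)
        high = schema.get("maximum", low + 100)
        low = min(low, high)
        if kind == "integer":
            return random.randint(low, high)
        return random.uniform(low, high)
    if kind == "boolean":
        return random.choice([True, False])
    if kind == "null":
        return None
    return random.choice(WORDS)  # "string" and anything unrecognized


sample = sketch_generate({
    "type": "object",
    "properties": {
        "age": {"type": "integer", "minimum": 18, "maximum": 99},
        "color": {"enum": ["red", "green", "blue"]},
        "tags": {"type": "array", "items": {"type": "string"}, "minItems": 1, "maxItems": 3},
    },
})
print(sample)
```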
Import
from guardrails.schema.generator import generate_example
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| json_schema | Dict[str, Any] | Yes | A valid JSON Schema dictionary. May contain $ref pointers, which are dereferenced automatically. |
| property_name | Optional[str] | No | An optional hint used to select a contextually relevant Faker provider (e.g., "email", "address"). |
Outputs
| Name | Type | Description |
|---|---|---|
| (return) | Any | A randomly generated Python object conforming to the input JSON Schema. The concrete type depends on the schema (dict, list, str, int, float, bool, or None). |
Usage Examples
from guardrails.schema.generator import generate_example

# Generate from a simple object schema
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer", "minimum": 18, "maximum": 99},
        "email": {"type": "string", "format": "email"},
        "tags": {
            "type": "array",
            "items": {"type": "string"},
            "minItems": 1,
            "maxItems": 3,
        },
    },
}
example = generate_example(schema)
# Example output (values are random and vary between runs): {
#     "name": "technology",
#     "age": 42,
#     "email": "jane.doe@example.com",
#     "tags": ["movement", "peace"]
# }

# Generate from a schema with enums
enum_schema = {
    "type": "string",
    "enum": ["red", "green", "blue"],
}
color = generate_example(enum_schema)
# Returns one of: "red", "green", or "blue"