Implementation: ggml-org/llama.cpp Pydantic Models to Grammar
| Knowledge Sources | |
|---|---|
| Domains | Grammar, Structured_Output |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Converts Pydantic model definitions into GBNF (GGML BNF) grammars that constrain LLM output to valid structured formats.
Description
Maps Pydantic types (string, int, float, bool, enums, lists, unions, optionals, nested models, dicts) to GBNF rules via the `PydanticDataType` enum and recursive type introspection. Also generates human-readable documentation from model definitions for use in system prompts. Supports custom features like triple-quoted strings, markdown code blocks, regex patterns, and integer/float precision constraints via `json_schema_extra`. Can create dynamic Pydantic models from functions or OpenAI-style dictionary definitions.
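To illustrate the recursive type introspection the description refers to, here is a minimal, hypothetical sketch of mapping Python type hints to GBNF-style rule names. It is not the module's actual `map_pydantic_type_to_gbnf` implementation; the function name, the naming scheme for composite rules, and the subset of types covered are all simplifications for illustration.

```python
from enum import Enum
from typing import Any, List, Optional, Union, get_args, get_origin

# Illustrative subset of the PydanticDataType values from the module.
GBNF_NAMES = {str: "string", int: "integer", float: "float",
              bool: "boolean", type(None): "null"}

def map_type_to_gbnf(tp: Any) -> str:
    """Hypothetical, simplified mapper: resolves containers and unions recursively."""
    origin = get_origin(tp)
    if origin in (list, set):                 # e.g. list[int] -> "integer-list"
        (inner,) = get_args(tp)
        return f"{map_type_to_gbnf(inner)}-list"
    if origin is Union:                       # Optional[str] is Union[str, None]
        return "-or-".join(map_type_to_gbnf(a) for a in get_args(tp))
    if isinstance(tp, type) and issubclass(tp, Enum):
        return "enum"
    return GBNF_NAMES.get(tp, "object")       # nested models fall through to "object"
```

The real module follows the same pattern but emits full GBNF production rules rather than just rule names, and handles dicts, custom classes, and precision constraints.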
Usage
Use this module when you need to constrain LLM generation to valid JSON matching specified Pydantic schemas. It enables structured output extraction and function calling by bridging typed Python data models with GBNF grammar constraints.
Code Reference
Source Location
- Repository: ggml-org/llama.cpp
- File: examples/pydantic_models_to_grammar.py
- Lines: 1-1322
Signature
class PydanticDataType(Enum):
    STRING = "string"
    TRIPLE_QUOTED_STRING = "triple_quoted_string"
    MARKDOWN_CODE_BLOCK = "markdown_code_block"
    BOOLEAN = "boolean"
    INTEGER = "integer"
    FLOAT = "float"
    OBJECT = "object"
    ARRAY = "array"
    ENUM = "enum"
    ANY = "any"
    NULL = "null"
    CUSTOM_CLASS = "custom-class"
    CUSTOM_DICT = "custom-dict"
    SET = "set"

def map_pydantic_type_to_gbnf(pydantic_type: type[Any]) -> str: ...

def generate_gbnf_grammar_and_documentation(
    pydantic_model_list, outer_object_name, outer_object_content,
    model_prefix, fields_prefix, list_of_outputs=False
) -> tuple[str, str]: ...

def create_dynamic_model_from_function(func: Callable) -> type[BaseModel]: ...

def convert_dictionary_to_pydantic_model(dictionary: dict, model_name: str) -> type[BaseModel]: ...

def add_run_method_to_dynamic_model(model: type[BaseModel], func: Callable) -> type[BaseModel]: ...
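`convert_dictionary_to_pydantic_model` consumes OpenAI-style function definitions. A sketch of that dictionary shape is below; the field names follow the general OpenAI function-calling convention, and the exact keys this module accepts should be checked against the source.

```python
# Hypothetical example of an OpenAI-style function definition dictionary.
weather_function = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

# convert_dictionary_to_pydantic_model(weather_function, "GetWeather") would then
# produce a dynamic Pydantic model with a required `city` field.
```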
Import
from __future__ import annotations
import inspect
import json
import re
from copy import copy
from enum import Enum
from typing import Any, Callable, List, Optional, Union, get_args, get_origin, get_type_hints
from docstring_parser import parse
from pydantic import BaseModel, create_model
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| pydantic_model_list | list[type[BaseModel]] | Yes | List of Pydantic model classes to convert to GBNF grammar |
| outer_object_name | str | Yes | Name for the outer wrapper object in the grammar |
| outer_object_content | str | Yes | Name for the content field within the outer wrapper |
| model_prefix | str | No | Prefix for model names in documentation |
| fields_prefix | str | No | Prefix for field names in documentation |
| list_of_outputs | bool | No | Whether to generate a grammar for a list of output objects |
Outputs
| Name | Type | Description |
|---|---|---|
| grammar | str | GBNF grammar string constraining generation to valid JSON matching the Pydantic models |
| documentation | str | Human-readable documentation string describing the models for use in system prompts |
Usage Examples
from pydantic import BaseModel, Field
from pydantic_models_to_grammar import generate_gbnf_grammar_and_documentation
class WeatherQuery(BaseModel):
    """Query for weather information."""
    city: str = Field(..., description="City name")
    unit: str = Field("celsius", description="Temperature unit")

grammar, docs = generate_gbnf_grammar_and_documentation(
    pydantic_model_list=[WeatherQuery],
    outer_object_name="function",
    outer_object_content="function_parameters",
    model_prefix="Function",
    fields_prefix="Parameters",
)
# Use grammar with llama-server /completion endpoint
# The grammar ensures output is valid JSON matching the WeatherQuery schema
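The generated grammar is passed to llama-server in the `grammar` field of a `/completion` request body. A minimal sketch of assembling that payload follows; the server URL, prompt text, and helper name are illustrative, and only the JSON body is built here (no HTTP call is made).

```python
import json

def build_completion_payload(prompt: str, grammar: str, n_predict: int = 128) -> str:
    """Assemble a JSON body for llama-server's /completion endpoint.

    The `grammar` field constrains sampling to strings derivable
    from the supplied GBNF grammar.
    """
    payload = {
        "prompt": prompt,
        "grammar": grammar,
        "n_predict": n_predict,
    }
    return json.dumps(payload)

# body = build_completion_payload("What is the weather in Paris?", grammar)
# POST `body` to http://localhost:8080/completion with Content-Type: application/json;
# the response's "content" field will be valid JSON matching WeatherQuery.
```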