Implementation:Ggml org Llama cpp Json Schema To Grammar Py
| Knowledge Sources | |
|---|---|
| Domains | JSON, Grammar |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Converts JSON Schema definitions into GBNF (GGML BNF) grammar rules that constrain llama.cpp text generation to produce valid JSON matching a given schema.
Description
The `SchemaConverter` class recursively traverses a JSON Schema, handling types (string, number, integer, boolean, null, object, array), constraints (min/max values, patterns, enums, required properties, additionalProperties), and composition and reference keywords (anyOf, oneOf, allOf, $ref). It generates GBNF grammar rules with the help of functions such as `_build_repetition` for array/string length constraints and `_generate_min_max_int` for integer range constraints. A `TrieNode` class optimizes literal string alternatives into compact grammar rules.
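The way a length constraint maps onto a GBNF repetition operator can be illustrated with a minimal sketch. The function name `repetition_sketch` is hypothetical and the logic is a simplified assumption, not the body of `_build_repetition`; in particular it ignores the separator handling that the real helper accepts through `separator_rule`.

```python
from typing import Optional


def repetition_sketch(item: str, min_items: int, max_items: Optional[int]) -> str:
    """Map a (min, max) count onto a GBNF repetition suffix (illustrative only)."""
    if max_items == 0:
        return ""                      # nothing may be emitted
    if min_items == 0 and max_items == 1:
        return f"{item}?"              # optional
    if min_items == 0 and max_items is None:
        return f"{item}*"              # zero or more
    if min_items == 1 and max_items is None:
        return f"{item}+"              # one or more
    upper = "" if max_items is None else str(max_items)
    return f"{item}{{{min_items},{upper}}}"  # bounded form: item{m,n} or item{m,}


for bounds in [(0, None), (1, None), (2, 5), (3, None)]:
    print(bounds, repetition_sketch("item", *bounds))
```

The bounded `{m,n}` form keeps the grammar short compared with spelling out every allowed count as a separate alternative.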
Usage
Use this module when you need to constrain LLM output to valid JSON conforming to a specific JSON Schema. Typical cases are structured data extraction, function-calling responses, and any scenario in which the model's output must follow a guaranteed JSON structure.
Code Reference
Source Location
- Repository: Ggml_org_Llama_cpp
- File: examples/json_schema_to_grammar.py
- Lines: 1-837
Signature
    class BuiltinRule:
        def __init__(self, content: str, deps: list = None): ...

    class SchemaConverter:
        def __init__(self, *, prop_order, allow_fetch, dotall, raw_pattern): ...
        def resolve_refs(self, schema: dict, url: str) -> dict: ...
        def _generate_union_rule(self, name: str, alt_schemas: list) -> str: ...
        def visit(self, schema: dict, name: str) -> str: ...
        def _add_rule(self, name: str, rule: str) -> str: ...
        def format_grammar(self) -> str: ...

    class TrieNode:
        def __init__(self): ...
        def insert(self, word: str): ...
        def to_regex(self) -> str: ...

    def _build_repetition(item_rule, min_items, max_items, separator_rule=None): ...
    def _generate_min_max_int(min_value, max_value, out, decimals_left=16, top_level=True): ...
    def main(): ...
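The prefix-sharing idea behind a trie can be sketched as follows. `TrieSketch` is a hypothetical stand-in written for illustration, not the `TrieNode` from the source file, and emitting a Python regular expression (rather than a GBNF rule) is an assumption made so the result can be checked with the standard `re` module.

```python
import re


class TrieSketch:
    """Hypothetical prefix trie over literal strings (illustrative only)."""

    def __init__(self):
        self.children = {}
        self.is_end = False

    def insert(self, word: str) -> None:
        node = self
        for ch in word:
            node = node.children.setdefault(ch, TrieSketch())
        node.is_end = True

    def to_regex(self) -> str:
        # One alternative per child: the escaped character plus the child's pattern.
        alts = [re.escape(ch) + child.to_regex()
                for ch, child in sorted(self.children.items())]
        if not alts:
            return ""
        body = alts[0] if len(alts) == 1 else "(?:" + "|".join(alts) + ")"
        # If a word ends at this node, everything after it is optional.
        return "(?:" + body + ")?" if self.is_end else body


trie = TrieSketch()
for word in ["car", "cat", "dog"]:
    trie.insert(word)

pattern = trie.to_regex()
print(pattern)
print(bool(re.fullmatch(pattern, "cat")), bool(re.fullmatch(pattern, "ca")))
```

Because "car" and "cat" share the prefix "ca", the pattern spells that prefix once and branches only on the final character, which is the compaction the trie provides over a flat list of alternatives.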
Import
    from __future__ import annotations
    import argparse
    import itertools
    import json
    import re
    import sys
    from typing import Any, List, Optional, Set, Tuple, Union
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| schema | dict | Yes | JSON Schema object to convert to GBNF grammar |
| prop_order | dict | No | Mapping from property name to its position, used to order object properties in the grammar |
| allow_fetch | bool | No | Whether to allow fetching remote $ref schemas |
| dotall | bool | No | Whether '.' in regex patterns should match newlines |
| raw_pattern | bool | No | Whether to treat string patterns as raw patterns, emitted without surrounding quotes or quote escaping |
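As an illustration of the `prop_order` input, the sketch below builds a name-to-index mapping from a list and uses it to sort property names. The helper `order_properties` is hypothetical, and the rule that unlisted properties keep their original relative order after the listed ones is an assumption about the intended behavior.

```python
def order_properties(names, prop_order):
    """Sort names by their prop_order index; unlisted names follow in original order."""
    ranked = sorted(
        enumerate(names),
        key=lambda pair: (prop_order.get(pair[1], len(prop_order)), pair[0]),
    )
    return [name for _, name in ranked]


# Build the mapping from a preferred ordering of property names.
prop_order = {name: idx for idx, name in enumerate(["id", "name"])}

ordered = order_properties(["age", "name", "id"], prop_order)
print(ordered)  # ['id', 'name', 'age']
```

A fixed property order matters for grammar-constrained generation because the grammar has to commit to the sequence in which object keys are emitted.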
Outputs
| Name | Type | Description |
|---|---|---|
| grammar | str | GBNF grammar string that constrains generation to valid JSON matching the input schema |
Usage Examples
    # Convert a JSON Schema to GBNF grammar
    # (assumes the directory containing json_schema_to_grammar.py is on sys.path)
    from json_schema_to_grammar import SchemaConverter

    schema = {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer", "minimum": 0, "maximum": 150}
        },
        "required": ["name", "age"]
    }

    converter = SchemaConverter(prop_order={}, allow_fetch=False, dotall=False, raw_pattern=False)
    # resolve_refs returns the schema with its $ref entries resolved
    schema = converter.resolve_refs(schema, "input")
    # An empty name makes the top-level rule the grammar's "root" rule
    converter.visit(schema, "")
    grammar = converter.format_grammar()
    print(grammar)