Implementation:Ollama Ollama Llama Grammar Types
| Knowledge Sources | |
|---|---|
| Domains | Grammar, Constrained Generation |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Header declaring the grammar engine types: grammar element types, grammar state structures, GBNF parser, the Ollama-specific vocabulary wrapper, and the internal grammar API.
Description
Defines the llama_gretype enum for grammar element types (end, alt, rule ref, char, negated char, char range upper bound, char alt, char any, token, negated token). Declares the ollama_vocab struct, which holds a token-to-piece map and the set of end-of-generation (EOG) token IDs used by Ollama's custom grammar support. Defines llama_grammar_element for a single element of a grammar rule (a rule is a sequence of elements), llama_partial_utf8 for the state of an incomplete UTF-8 sequence, llama_grammar_candidate for token candidates during filtering, and llama_grammar_parser for parsing GBNF text into rule structures. The llama_grammar struct holds the grammar stacks, the rules, both vocabulary references, the lazy-evaluation state, and the trigger patterns for conditional grammar activation.
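To make the element encoding concrete, the following is a minimal, self-contained sketch of how a rule such as `root ::= "hi" | [a-z]` could be laid out as a flat sequence of elements. The `sketch_`-prefixed types are local stand-ins that mirror the names in the header, and `make_example_rule` and `count_alternatives` are hypothetical helpers, not part of llama-grammar.h. The layout assumes the usual convention in which an ALT element separates alternatives, a CHAR_RNG_UPPER element turns the preceding CHAR into an inclusive range, and an END element terminates the rule.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-ins mirroring the header's enum and struct (not the real declarations).
enum sketch_gretype {
    SKETCH_END, SKETCH_ALT, SKETCH_RULE_REF,
    SKETCH_CHAR, SKETCH_CHAR_NOT, SKETCH_CHAR_RNG_UPPER,
    SKETCH_CHAR_ALT, SKETCH_CHAR_ANY,
    SKETCH_TOKEN, SKETCH_TOKEN_NOT,
};

struct sketch_element {
    sketch_gretype type;
    uint32_t       value; // code point or rule id, depending on type
};

using sketch_rule = std::vector<sketch_element>;

// Assumed element layout for: root ::= "hi" | [a-z]
inline sketch_rule make_example_rule() {
    return {
        {SKETCH_CHAR, 'h'},           // first alternative: 'h' 'i'
        {SKETCH_CHAR, 'i'},
        {SKETCH_ALT, 0},              // separator between alternatives
        {SKETCH_CHAR, 'a'},           // second alternative: lower bound 'a' ...
        {SKETCH_CHAR_RNG_UPPER, 'z'}, // ... extended to the range [a-z]
        {SKETCH_END, 0},              // end of rule
    };
}

// Hypothetical helper: number of alternatives in a rule under this layout.
inline std::size_t count_alternatives(const sketch_rule & rule) {
    assert(!rule.empty() && rule.back().type == SKETCH_END);
    std::size_t n = 1;
    for (const auto & e : rule) {
        if (e.type == SKETCH_ALT) {
            ++n;
        }
    }
    return n;
}
```

Keeping a rule as a flat vector with in-band ALT and END markers lets the engine track its position in the grammar with plain pointers or indices into that vector.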
Usage
Include this header when implementing grammar-constrained decoding or working with the grammar engine internals. The ollama_vocab type is an Ollama-specific addition for grammar operations without a full model context.
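As a rough illustration of what the vocabulary wrapper provides, here is a self-contained stand-in with the same data members. The struct name `sketch_vocab`, the helper `make_example_vocab`, the sample token IDs and pieces, and all method bodies are assumptions made for this sketch; the real definitions live in the Ollama sources and may differ (for example in how unknown token IDs are handled).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Stand-in for ollama_vocab: same fields, assumed method bodies.
struct sketch_vocab {
    std::map<uint32_t, std::string> token_to_piece_map;
    std::set<uint32_t>              special_eog_ids;

    // Look up the text piece for a token id.
    // This sketch throws std::out_of_range for unknown ids.
    const std::string & token_to_piece(const uint32_t token) const {
        return token_to_piece_map.at(token);
    }

    // True if the token marks end of generation.
    bool is_eog(const uint32_t token) const {
        return special_eog_ids.count(token) > 0;
    }

    // Register n_tokens (id, piece) pairs from parallel arrays.
    void add_token_pieces(const uint32_t * tokens, std::size_t n_tokens, const char ** pieces) {
        assert(n_tokens == 0 || (tokens != nullptr && pieces != nullptr));
        for (std::size_t i = 0; i < n_tokens; ++i) {
            token_to_piece_map[tokens[i]] = pieces[i];
        }
    }

    // Add ids to the EOG set (this sketch accumulates rather than replaces).
    void set_eog_tokens(const uint32_t * tokens, std::size_t n_tokens) {
        assert(n_tokens == 0 || tokens != nullptr);
        for (std::size_t i = 0; i < n_tokens; ++i) {
            special_eog_ids.insert(tokens[i]);
        }
    }
};

// Build a tiny three-token vocabulary for demonstration.
inline sketch_vocab make_example_vocab() {
    const uint32_t ids[]    = {0, 1, 2};
    const char *   pieces[] = {"hello", " world", "</s>"};
    const uint32_t eog[]    = {2};

    sketch_vocab v;
    v.add_token_pieces(ids, 3, pieces);
    v.set_eog_tokens(eog, 1);
    return v;
}
```

Because the wrapper only needs token pieces and EOG IDs, a caller can populate it from plain arrays without loading model weights, which is the point of having a vocabulary type separate from llama_vocab.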
Code Reference
Source Location
- Repository: Ollama
- File: llama/llama.cpp/src/llama-grammar.h
- Lines: 1-206
Signature
struct ollama_vocab {
    std::map<uint32_t, std::string> token_to_piece_map;
    std::set<uint32_t> special_eog_ids;
    const std::string & token_to_piece(const uint32_t token) const;
    bool is_eog(const uint32_t token) const;
};

enum llama_gretype {
    LLAMA_GRETYPE_END, LLAMA_GRETYPE_ALT, LLAMA_GRETYPE_RULE_REF,
    LLAMA_GRETYPE_CHAR, LLAMA_GRETYPE_CHAR_NOT, LLAMA_GRETYPE_CHAR_RNG_UPPER,
    LLAMA_GRETYPE_CHAR_ALT, LLAMA_GRETYPE_CHAR_ANY,
    LLAMA_GRETYPE_TOKEN, LLAMA_GRETYPE_TOKEN_NOT,
};

typedef struct llama_grammar_element {
    enum llama_gretype type;
    uint32_t value;
} llama_grammar_element;

struct llama_grammar_parser {
    const llama_vocab * vocab;
    std::map<std::string, uint32_t> symbol_ids;
    llama_grammar_rules rules;
    bool parse(const char * src);
};

struct llama_grammar {
    const llama_vocab * vocab;
    const ollama_vocab * o_vocab;
    const llama_grammar_rules rules;
    llama_grammar_stacks stacks;
    llama_partial_utf8 partial_utf8;
    bool lazy = false;
    bool awaiting_trigger = false;
    std::string trigger_buffer;
    std::vector<llama_token> trigger_tokens;
    std::vector<llama_grammar_trigger_pattern> trigger_patterns;
};
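The lazy, awaiting_trigger, and trigger_buffer fields suggest a two-phase flow: output is unconstrained until a trigger is seen, after which the grammar applies. The sketch below illustrates that idea only. It is deliberately simplified to plain substring triggers (`trigger_words`, a hypothetical field) instead of the header's trigger_tokens and llama_grammar_trigger_pattern entries, and `sketch_lazy_state` and `sketch_accept_piece` are names invented for this example, not functions from the header.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the lazy-activation fields of llama_grammar.
struct sketch_lazy_state {
    bool lazy             = false;
    bool awaiting_trigger = false;
    std::string trigger_buffer;
    std::vector<std::string> trigger_words; // assumption: plain substrings, not regex patterns
};

// Feed one decoded text piece.
// Returns true if grammar constraints are active after this piece.
inline bool sketch_accept_piece(sketch_lazy_state & st, const std::string & piece) {
    // Waiting for a trigger only makes sense for a lazy grammar.
    assert(st.lazy || !st.awaiting_trigger);

    if (!st.awaiting_trigger) {
        return true; // grammar is already active
    }

    // Accumulate text so a trigger split across several pieces is still found.
    st.trigger_buffer += piece;
    for (const auto & word : st.trigger_words) {
        if (!word.empty() && st.trigger_buffer.find(word) != std::string::npos) {
            st.awaiting_trigger = false;
            st.trigger_buffer.clear();
            return true;
        }
    }
    return false; // still unconstrained
}
```

Buffering the generated text, rather than checking each piece in isolation, matters because a tokenizer can split a trigger such as `<tool_call>` across several tokens.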
Import
#include "llama-grammar.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| src | const char * | Yes | GBNF grammar source text to parse |
| vocab | const llama_vocab * | No | Vocabulary for token-level grammar operations |
| o_vocab | const ollama_vocab * | No | Ollama vocabulary wrapper for custom grammar support |
Outputs
| Name | Type | Description |
|---|---|---|
| rules | llama_grammar_rules | Parsed grammar rules |
| stacks | llama_grammar_stacks | Grammar state stacks for tracking position |
| success | bool | Return value of llama_grammar_parser::parse; true if parsing succeeded |
Usage Examples
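#include "llama-grammar.h"

// Parse a grammar. `vocab` is a const llama_vocab * supplied by the caller.
llama_grammar_parser parser(vocab);
bool ok = parser.parse("root ::= \"hello\" | \"world\"");
if (!ok) {
    // Parsing failed: handle the error before using parser.rules
}

// Access parsed rules
const auto & rules = parser.rules;

// Ollama vocab for grammar without full model.
// token_ids, n_tokens, pieces, eog_ids, and n_eog are supplied by the caller.
ollama_vocab o_vocab;
o_vocab.add_token_pieces(token_ids, n_tokens, pieces);
o_vocab.set_eog_tokens(eog_ids, n_eog);

// Check grammar element types by walking the parsed rules
// (reading an uninitialized llama_grammar_element would be undefined behavior)
for (const auto & rule : rules) {
    for (const llama_grammar_element & elem : rule) {
        if (elem.type == LLAMA_GRETYPE_CHAR) {
            // Character match element; elem.value holds the code point
        }
    }
}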