Implementation:Ggml org Llama cpp Jinja Lexer Header
| Knowledge Sources | |
|---|---|
| Domains | Template_Engine, Parsing |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Declares the types used to tokenize Jinja template source strings: the token definition, the lexer struct, the lexer result, and the lexer exception.
Description
This header defines the `token` struct with a comprehensive `type` enum covering all Jinja syntax elements: text, literals, identifiers, operators, brackets, statement/expression delimiters, and comments. The `lexer` struct provides character classification helpers (`is_word`, `is_integer`), escape character mappings, an ordered mapping table for multi-character operators, and the `tokenize()` method. The `lexer_result` struct bundles the token vector with the source string, and `lexer_exception` extends `std::runtime_error` with source position context via `fmt_error_with_source`.
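The ordering of the multi-character operator table matters because a lexer that tries entries in sequence must see `{{` before `{` and `==` before `=`, otherwise the shorter spelling wins. The following is a minimal sketch of that idea and is not the header's actual table. The namespace `optable_sketch`, the `kind` enum, `operator_table()`, and `match_operator()` are hypothetical names introduced for illustration, and the bodies of `is_word` and `is_integer` are assumed definitions (alphanumeric or underscore, and decimal digit).

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

namespace optable_sketch {

// Hypothetical token kinds; the real enum is jinja::token::type.
enum class kind { open_expression, close_expression, comparison, equals,
                  open_curly, close_curly, none };

// Assumed character classes: word characters are alphanumeric or '_'.
inline bool is_word(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Assumed character class: a single decimal digit.
inline bool is_integer(char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
}

// Ordered table: every longer spelling is listed before its own prefix.
inline const std::vector<std::pair<std::string, kind>> & operator_table() {
    static const std::vector<std::pair<std::string, kind>> table = {
        {"{{", kind::open_expression},
        {"}}", kind::close_expression},
        {"==", kind::comparison},
        {"=",  kind::equals},
        {"{",  kind::open_curly},
        {"}",  kind::close_curly},
    };
    return table;
}

// Returns the first table entry that matches src at pos, or {"", kind::none}.
inline std::pair<std::string, kind> match_operator(const std::string & src, std::size_t pos) {
    if (pos > src.size()) {
        return {"", kind::none};
    }
    for (const auto & entry : operator_table()) {
        if (src.compare(pos, entry.first.size(), entry.first) == 0) {
            return entry;
        }
    }
    return {"", kind::none};
}

} // namespace optable_sketch
```

With this ordering, `match_operator("{{ x }}", 0)` yields the `open_expression` entry while `match_operator("{ x }", 0)` falls through to `open_curly`. Swapping the `{{` and `{` rows would make the first call match the single brace instead.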
Usage
Include this header when implementing or extending the Jinja template parser. The lexer is the first stage of template processing, converting raw template source into a token stream consumed by the parser to build the AST.
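To make the "source string in, token stream out" stage concrete, here is a deliberately tiny, self-contained sketch of a tokenizer that only understands plain text, `{{`, `}}`, and identifiers. It does not use the real header: the namespace `pipeline_sketch`, the `kind` enum, the `tok` struct, and this `tokenize()` function are hypothetical stand-ins, and the actual `jinja::lexer` handles far more syntax (statements, comments, literals, operators).

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

namespace pipeline_sketch {

// Hypothetical subset of token kinds.
enum class kind { text, open_expression, close_expression, identifier };

// Mirrors the shape of jinja::token: type, text, and position in the source.
struct tok {
    kind        t;
    std::string value;
    std::size_t pos;
};

// Splits src into text and {{ identifier }} tokens.
// Throws std::runtime_error on any other character inside an expression.
inline std::vector<tok> tokenize(const std::string & src) {
    std::vector<tok> out;
    std::size_t i = 0;
    bool in_expr = false;
    while (i < src.size()) {
        if (!in_expr) {
            // Everything up to the next "{{" (or end of input) is plain text.
            const std::size_t open = src.find("{{", i);
            const std::size_t end  = (open == std::string::npos) ? src.size() : open;
            if (end > i) {
                out.push_back({kind::text, src.substr(i, end - i), i});
            }
            if (open == std::string::npos) {
                break;
            }
            out.push_back({kind::open_expression, "{{", open});
            in_expr = true;
            i = open + 2;
            continue;
        }
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i; // whitespace inside an expression is skipped
            continue;
        }
        if (src.compare(i, 2, "}}") == 0) {
            out.push_back({kind::close_expression, "}}", i});
            in_expr = false;
            i += 2;
            continue;
        }
        const std::size_t start = i;
        while (i < src.size() &&
               (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
            ++i;
        }
        if (i == start) {
            throw std::runtime_error("unexpected character at position " + std::to_string(i));
        }
        out.push_back({kind::identifier, src.substr(start, i - start), start});
    }
    return out;
}

} // namespace pipeline_sketch
```

A parser stage would then walk the returned vector in order, using each token's `t` to decide which grammar rule applies and `pos` to report errors against the original source.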
Code Reference
Source Location
- Repository: Ggml_org_Llama_cpp
- File: common/jinja/lexer.h
- Lines: 1-157
Signature
namespace jinja {

struct token {
    enum type {
        eof, text, numeric_literal, string_literal, identifier,
        equals, open_paren, close_paren, open_statement, close_statement,
        open_expression, close_expression, open_square_bracket, close_square_bracket,
        open_curly_bracket, close_curly_bracket, comma, dot, colon, pipe,
        call_operator, additive_binary_operator, multiplicative_binary_operator,
        comparison_binary_operator, unary_operator, comment,
    };
    type t;
    std::string value;
    size_t pos;
};

struct lexer_result {
    std::vector<token> tokens;
    std::string source;
};

struct lexer {
    static bool is_word(char c);
    static bool is_integer(char c);
    lexer_result tokenize(const std::string & source);
};

struct lexer_exception : public std::runtime_error {
    lexer_exception(const std::string & msg, const std::string & source, size_t pos);
};

} // namespace jinja
Import
#include "jinja/lexer.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| source | const std::string & | Yes | Raw Jinja template source string to tokenize |
Outputs
| Name | Type | Description |
|---|---|---|
| tokenize return | lexer_result | Contains the vector of tokens and the original source string |
| lexer_exception | exception | Thrown on lexer errors with source position context |
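The exception constructor takes the message, the full source, and a position, which is enough to derive a human-readable location. The sketch below shows one way such an exception could build its message. It is an assumption for illustration only: `error_sketch`, `format_with_position()`, and `position_error` are hypothetical names, and the output format of the real `fmt_error_with_source` is not documented here and may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>

namespace error_sketch {

// Hypothetical helper: converts an offset into a 1-based line and column
// and appends it to the message. Not the real fmt_error_with_source.
inline std::string format_with_position(const std::string & msg,
                                        const std::string & source,
                                        std::size_t pos) {
    pos = std::min(pos, source.size()); // clamp out-of-range positions
    std::size_t line = 1;
    std::size_t col  = 1;
    for (std::size_t i = 0; i < pos; ++i) {
        if (source[i] == '\n') {
            ++line;
            col = 1;
        } else {
            ++col;
        }
    }
    return msg + " at line " + std::to_string(line) + ", column " + std::to_string(col);
}

// Same constructor shape as jinja::lexer_exception, with the formatted
// message passed to std::runtime_error so that what() carries the location.
struct position_error : public std::runtime_error {
    position_error(const std::string & msg, const std::string & source, std::size_t pos)
        : std::runtime_error(format_with_position(msg, source, pos)) {}
};

} // namespace error_sketch
```

Because the location is baked into the base-class message, callers that only catch `std::runtime_error` still see the position through `what()`.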
Usage Examples
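#include "jinja/lexer.h"

#include <cstdio>

int main() {
    jinja::lexer lex;
    try {
        // Token sequence expected for this input:
        //   jinja::token::text             for "Hello "
        //   jinja::token::open_expression  for "{{"
        //   jinja::token::identifier       for "name"
        //   jinja::token::close_expression for "}}"
        //   jinja::token::text             for "!"
        jinja::lexer_result result = lex.tokenize("Hello {{ name }}!");
        for (const auto & tok : result.tokens) {
            // tok.t is the token type, tok.value its text, tok.pos its position in the source
            std::printf("%d '%s' @ %zu\n", static_cast<int>(tok.t), tok.value.c_str(), tok.pos);
        }
    } catch (const jinja::lexer_exception & e) {
        // e.what() carries the error message with source position context
        std::fprintf(stderr, "%s\n", e.what());
        return 1;
    }
    return 0;
}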