Implementation:Ollama Ollama Sampling Ext
| Knowledge Sources | |
|---|---|
| Domains | CGoBridge, Sampling |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
C-compatible wrapper around llama.cpp's C++ sampling, grammar, and JSON schema conversion APIs, enabling CGo interop from Ollama's Go code.
Description
Provides C-linkage wrapper functions that bridge Go's CGo layer to llama.cpp's C++ APIs:
- common_sampler_cinit creates a sampler from a flat C params struct by mapping its fields (top_k, top_p, min_p, temp, penalties, seed, grammar) to common_params_sampling.
- common_sampler_csample, common_sampler_caccept, common_sampler_creset, and common_sampler_cfree wrap the sampler lifecycle operations.
- schema_to_grammar converts a JSON schema string to a GBNF grammar string using nlohmann JSON parsing and json_schema_to_grammar.
- llama_load_vocab_from_file and llama_free_vocab load and release a vocabulary from GGUF files.
- grammar_init, grammar_free, grammar_apply, and grammar_accept manage grammar-constrained decoding with custom vocabulary handling via ollama_vocab.
All functions use try/catch blocks to safely convert C++ exceptions to null returns or zero-length results.
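The exception handling described above can be illustrated with a minimal, self-contained sketch. It is not the code from sampling_ext.cpp: demo_convert and demo_convert_c are hypothetical names, and demo_convert merely stands in for a C++ call that may throw (the role played by JSON parsing and json_schema_to_grammar). The sketch assumes the wrapper reports failure as a zero-length result and truncates output that does not fit the caller's buffer.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a C++ API that may throw.
static std::string demo_convert(const std::string &input) {
    if (input.empty()) {
        throw std::invalid_argument("empty input");
    }
    return "root ::= " + input;
}

// Hypothetical C-linkage wrapper: no exception may cross this boundary.
// Returns the number of bytes written (excluding the terminator), or 0 on failure.
extern "C" int demo_convert_c(const char *input, char *out, size_t max_len) {
    if (out == nullptr || max_len == 0) {
        return 0;
    }
    out[0] = '\0'; // leave an empty string behind on every failure path
    if (input == nullptr) {
        return 0;
    }
    try {
        std::string result = demo_convert(input);
        size_t len = result.size();
        if (len >= max_len) {
            len = max_len - 1; // truncate, keeping room for the terminator
        }
        std::memcpy(out, result.data(), len);
        out[len] = '\0';
        return static_cast<int>(len);
    } catch (const std::exception &) {
        return 0; // the exception becomes a zero-length result
    }
}
```

A caller on the Go side only ever sees an int and a C string, so it can treat a return value of 0 as "no grammar produced" without knowing anything about C++ exceptions.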
Usage
Called by Ollama's Go code (llama.go) through CGo to access sampling, grammar enforcement, and schema-to-grammar conversion without directly calling C++ APIs.
Code Reference
Source Location
- Repository: Ollama
- File: llama/sampling_ext.cpp
- Lines: 1-136
Signature
struct common_sampler *common_sampler_cinit(const struct llama_model *model,
struct common_sampler_cparams *params);
void common_sampler_cfree(struct common_sampler *sampler);
void common_sampler_creset(struct common_sampler *sampler);
void common_sampler_caccept(struct common_sampler *sampler, llama_token id,
bool apply_grammar);
llama_token common_sampler_csample(struct common_sampler *sampler,
struct llama_context *ctx, int idx);
int schema_to_grammar(const char *json_schema, char *grammar, size_t max_len);
struct llama_vocab * llama_load_vocab_from_file(const char * fname);
void llama_free_vocab(struct llama_vocab * vocab);
struct llama_grammar *grammar_init(char* grammar, uint32_t* tokens,
size_t n_tokens, const char** pieces, uint32_t* eog_tokens,
size_t n_eog_tokens);
void grammar_free(struct llama_grammar *g);
void grammar_apply(struct llama_grammar *g, struct llama_token_data_array *tokens);
void grammar_accept(struct llama_grammar *g, llama_token id);
Import
#include "sampling_ext.h"
#include "sampling.h"
#include "json-schema-to-grammar.h"
#include "llama.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | llama_model * | Yes | Loaded LLM model for sampler initialization |
| params | common_sampler_cparams * | Yes | Flat C struct with sampling parameters (top_k, top_p, temp, etc.) |
| json_schema | const char * | Yes | JSON schema string for grammar generation |
| grammar | char * | Yes | GBNF grammar string for constrained decoding |
| tokens | uint32_t * | Yes | Token IDs for vocabulary construction |
| pieces | const char ** | Yes | Token piece strings for vocabulary construction |
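The params input is described as a flat C struct whose fields are mapped onto a C++ parameter object. The sketch below shows that kind of mapping in isolation. The types demo_cparams and demo_params and the function demo_map_params are hypothetical; they are not the real common_sampler_cparams or common_params_sampling layouts, and only a subset of the fields named in this page is shown.

```cpp
#include <cstdint>
#include <string>

// Hypothetical flat struct: plain C types only, so it can be filled through CGo.
extern "C" {
struct demo_cparams {
    int32_t top_k;
    float top_p;
    float min_p;
    float temp;
    uint32_t seed;
    const char *grammar; // may be NULL when no grammar is requested
};
}

// Hypothetical C++-side parameters, standing in for common_params_sampling.
struct demo_params {
    int32_t top_k = 0;
    float top_p = 0.0f;
    float min_p = 0.0f;
    float temp = 0.0f;
    uint32_t seed = 0;
    std::string grammar; // owns its own copy of the grammar text
};

// Copy each field across; the only non-trivial step is the nullable C string.
demo_params demo_map_params(const demo_cparams &c) {
    demo_params p;
    p.top_k = c.top_k;
    p.top_p = c.top_p;
    p.min_p = c.min_p;
    p.temp = c.temp;
    p.seed = c.seed;
    if (c.grammar != nullptr) {
        p.grammar = c.grammar;
    }
    return p;
}
```

Keeping the C struct free of C++ types is what lets Go populate it directly; copying the grammar into a std::string means the C++ side does not depend on the lifetime of the caller's buffer.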
Outputs
| Name | Type | Description |
|---|---|---|
| sampler | common_sampler * | Initialized sampler, or nullptr on failure |
| grammar | char * | Generated GBNF grammar string from JSON schema, written into the caller-provided buffer |
| grammar handle | llama_grammar * | Initialized grammar for constrained decoding |
| sampled_token | llama_token | Next token selected by the sampler |
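Pointer outputs such as the sampler and grammar handles follow an opaque-handle pattern: an init function returns a pointer or nullptr, and a matching free function releases it. The following sketch shows that pattern under stated assumptions. demo_grammar, demo_grammar_init, and demo_grammar_free are hypothetical names, and the validation rule (rejecting an empty string) is invented purely so the failure path can be exercised.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical C++ object hidden behind an opaque pointer.
struct demo_grammar {
    std::string source;
    explicit demo_grammar(const std::string &src) : source(src) {
        if (src.empty()) {
            throw std::invalid_argument("empty grammar");
        }
    }
};

// Returns a heap-allocated handle, or nullptr if construction fails.
extern "C" demo_grammar *demo_grammar_init(const char *src) {
    if (src == nullptr) {
        return nullptr;
    }
    try {
        return new demo_grammar(src);
    } catch (const std::exception &) {
        return nullptr; // constructor or allocation failure becomes a null handle
    }
}

// Releases a handle returned by demo_grammar_init.
extern "C" void demo_grammar_free(demo_grammar *g) {
    delete g; // deleting nullptr is a no-op
}
```

Callers should check the handle against nullptr before passing it to any other function and should call the free function exactly once per successful init.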
Usage Examples
// Initialize sampler from CGo
common_sampler_cparams params = {};
params.top_k = 40;
params.top_p = 0.9f;
params.temp = 0.8f;
params.seed = 42;
common_sampler * sampler = common_sampler_cinit(model, &params);
// Sample a token
llama_token id = common_sampler_csample(sampler, ctx, -1);
common_sampler_caccept(sampler, id, true);
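// Convert JSON schema to grammar (a zero-length result signals failure)
char grammar[4096];
int len = schema_to_grammar("{\"type\":\"object\"}", grammar, sizeof(grammar));
// Release the sampler when it is no longer needed
common_sampler_cfree(sampler);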