Implementation:Ollama Ollama Sampling Ext

Knowledge Sources	Ollama
Domains	CGoBridge, Sampling
Last Updated	2025-02-15 00:00 GMT

Overview

C-compatible wrapper around llama.cpp's C++ sampling, grammar, and JSON schema conversion APIs, enabling CGo interop from Ollama's Go code.

Description

Provides C-linkage wrapper functions that bridge Go's CGo layer to llama.cpp's C++ APIs. common_sampler_cinit creates a sampler from a flat C params struct by mapping fields (top_k, top_p, min_p, temp, penalties, seed, grammar) to common_params_sampling. common_sampler_csample/caccept/creset/cfree wrap sampler lifecycle operations. schema_to_grammar converts a JSON schema string to a GBNF grammar string using nlohmann JSON parsing and json_schema_to_grammar. llama_load_vocab_from_file/llama_free_vocab load vocabulary from GGUF files. grammar_init/grammar_free/grammar_apply/grammar_accept manage grammar-constrained decoding with custom vocabulary handling via ollama_vocab. All functions use try/catch blocks to safely convert C++ exceptions to null returns or zero-length results.
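The exception handling described above can be sketched in isolation. This is a minimal illustration of the pattern, not the code in sampling_ext.cpp: demo_object, make_demo_object, demo_cinit, demo_cfree, and demo_usage are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical C++ type standing in for a sampler-like object.
// It is not part of Ollama or llama.cpp.
struct demo_object {
    std::string name;
};

// Hypothetical C++ factory that reports bad input by throwing.
static demo_object *make_demo_object(const char *name) {
    if (name == nullptr || *name == '\0') {
        throw std::invalid_argument("name must be non-empty");
    }
    return new demo_object{name};
}

extern "C" {

// C-linkage wrapper: exceptions must not cross into C or Go callers,
// so any failure is converted to a nullptr return.
demo_object *demo_cinit(const char *name) {
    try {
        return make_demo_object(name);
    } catch (const std::exception &) {
        return nullptr;
    }
}

// Releases an object created by demo_cinit; deleting nullptr is a no-op.
void demo_cfree(demo_object *obj) {
    delete obj;
}

} // extern "C"

// Usage sketch: the caller tests for nullptr instead of catching exceptions.
void demo_usage() {
    demo_object *obj = demo_cinit("sampler");
    assert(obj != nullptr);
    demo_cfree(obj);
    assert(demo_cinit("") == nullptr);
}
```

Returning a sentinel value keeps the error path expressible in plain C, which is what a CGo caller sees.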

Usage

Called by Ollama's Go code (llama.go) through CGo to access sampling, grammar enforcement, and schema-to-grammar conversion without directly calling C++ APIs.

Code Reference

Source Location

Repository: Ollama
File: llama/sampling_ext.cpp
Lines: 1-136

Signature

struct common_sampler *common_sampler_cinit(const struct llama_model *model,
    struct common_sampler_cparams *params);
void common_sampler_cfree(struct common_sampler *sampler);
void common_sampler_creset(struct common_sampler *sampler);
void common_sampler_caccept(struct common_sampler *sampler, llama_token id,
    bool apply_grammar);
llama_token common_sampler_csample(struct common_sampler *sampler,
    struct llama_context *ctx, int idx);

int schema_to_grammar(const char *json_schema, char *grammar, size_t max_len);

struct llama_vocab * llama_load_vocab_from_file(const char * fname);
void llama_free_vocab(struct llama_vocab * vocab);

struct llama_grammar *grammar_init(char* grammar, uint32_t* tokens,
    size_t n_tokens, const char** pieces, uint32_t* eog_tokens,
    size_t n_eog_tokens);
void grammar_free(struct llama_grammar *g);
void grammar_apply(struct llama_grammar *g, struct llama_token_data_array *tokens);
void grammar_accept(struct llama_grammar *g, llama_token id);

Import

#include "sampling_ext.h"
#include "sampling.h"
#include "json-schema-to-grammar.h"
#include "llama.h"

I/O Contract

Inputs

Name	Type	Required	Description
model	llama_model *	Yes	Loaded LLM model for sampler initialization
params	common_sampler_cparams *	Yes	Flat C struct with sampling parameters (top_k, top_p, temp, etc.)
json_schema	const char *	Yes	JSON schema string for grammar generation
grammar	char *	Yes	GBNF grammar string for constrained decoding (grammar_init); in schema_to_grammar the parameter of the same name is the output buffer
tokens	uint32_t *	Yes	Token IDs for vocabulary construction
pieces	const char **	Yes	Token piece strings for vocabulary construction
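
The flat params struct exists so that Go can fill in plain scalar fields and C strings without touching C++ types. A rough sketch of that mapping style follows, under loud assumptions: demo_cparams, demo_params, and to_cpp_params are hypothetical stand-ins, and the real common_sampler_cparams and common_params_sampling have their own field sets and defaults.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical flat struct: only scalars and C strings, so it is usable from C and CGo.
struct demo_cparams {
    int32_t     top_k;
    float       top_p;
    float       temp;
    uint32_t    seed;
    const char *grammar; // may be null when no grammar is requested
};

// Hypothetical C++-side parameters that own their string data.
struct demo_params {
    int32_t     top_k = 0;
    float       top_p = 1.0f;
    float       temp  = 1.0f;
    uint32_t    seed  = 0;
    std::string grammar;
};

// Copy each field across; a null grammar pointer becomes an empty string,
// because constructing std::string from nullptr is undefined behavior.
demo_params to_cpp_params(const demo_cparams &c) {
    demo_params p;
    p.top_k   = c.top_k;
    p.top_p   = c.top_p;
    p.temp    = c.temp;
    p.seed    = c.seed;
    p.grammar = c.grammar != nullptr ? c.grammar : "";
    return p;
}

// Usage sketch: zero-initialize, set the fields of interest, then convert.
void params_usage() {
    demo_cparams c = {};
    c.top_k = 40;
    c.seed  = 42;
    demo_params p = to_cpp_params(c);
    assert(p.top_k == 40);
    assert(p.grammar.empty());
}
```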

Outputs

Name	Type	Description
sampler	common_sampler *	Initialized sampler, or nullptr on failure
grammar_str	char *	Generated GBNF grammar string from JSON schema
g	llama_grammar *	Initialized grammar for constrained decoding
sampled_token	llama_token	Next token selected by the sampler

Usage Examples

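// Initialize a sampler (C++ side of the CGo bridge)
common_sampler_cparams params = {};
params.top_k = 40;
params.top_p = 0.9f;
params.temp = 0.8f;
params.seed = 42;
common_sampler * sampler = common_sampler_cinit(model, &params);

// common_sampler_cinit returns nullptr on failure, so check before use
if (sampler != nullptr) {
    // Sample a token, then record it in the sampler state
    llama_token id = common_sampler_csample(sampler, ctx, -1);
    common_sampler_caccept(sampler, id, true);

    // Release the sampler once sampling is finished
    common_sampler_cfree(sampler);
}

// Convert a JSON schema to a GBNF grammar; this call does not need a sampler
char grammar[4096];
int len = schema_to_grammar("{\"type\":\"object\"}", grammar, sizeof(grammar));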
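
Because schema_to_grammar writes into a caller-supplied buffer of max_len bytes, the caller needs to reason about truncation. The sketch below shows one way a bounded copy of that kind could behave; copy_truncated and copy_usage are hypothetical helpers written for illustration, and the exact truncation and return-value semantics of the real function should be checked against sampling_ext.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copy src into a caller-owned buffer of max_len bytes,
// truncating if necessary and always NUL-terminating.
// Returns the number of bytes copied, excluding the terminator.
size_t copy_truncated(const std::string &src, char *dst, size_t max_len) {
    if (dst == nullptr || max_len == 0) {
        return 0; // nowhere to write
    }
    size_t len = src.size();
    if (len >= max_len) {
        len = max_len - 1; // leave room for the terminator
    }
    std::memcpy(dst, src.data(), len);
    dst[len] = '\0';
    return len;
}

// Usage sketch: a result of max_len - 1 bytes suggests the output may have been cut off.
void copy_usage() {
    char buf[8];
    size_t n = copy_truncated("root ::= object", buf, sizeof(buf));
    assert(n == sizeof(buf) - 1);
    assert(std::strcmp(buf, "root ::") == 0);
}
```

Under this scheme a truncated grammar is not valid GBNF, so a caller would typically size the buffer generously and treat a full buffer as an error.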

Related Pages

Principle:Ollama_Ollama_CGoBridge
