Implementation: Ollama Sampler Sample
| Knowledge Sources | |
|---|---|
| Domains | NLP, Probability, Inference |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
A concrete tool, provided by the sample package, for sampling token IDs from model logits.
Description
The Sampler struct encapsulates the full token sampling pipeline. NewSampler constructs a sampler with the specified parameters (temperature, topK, topP, minP, seed, grammar). Sample takes raw logit values from the model and returns a single token ID.
When temperature is 0, the sampler uses greedy decoding (argmax). Otherwise, it applies the full pipeline: top-k sort and filter, temperature scaling, softmax normalization, top-p filtering, min-p thresholding, and weighted random selection.
For grammar-constrained generation, the sampler first attempts the grammar on the top token. If rejected, it applies grammar masking to all tokens and re-samples.
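The fast-path/fallback flow described above can be sketched as follows. This is an illustrative stand-in, not the real GrammarSampler API: the `tokenChecker` interface, the `evenOnly` toy grammar, and the greedy re-sample are assumptions for the sake of a runnable example.

```go
package main

import (
	"fmt"
	"math"
)

// tokenChecker is a hypothetical stand-in for a grammar constraint:
// it reports whether a token is valid in the current grammar state.
type tokenChecker interface {
	Accepts(token int32) bool
}

// evenOnly is a toy checker that accepts only even token IDs.
type evenOnly struct{}

func (evenOnly) Accepts(t int32) bool { return t%2 == 0 }

// argmax returns the index of the largest logit.
func argmax(logits []float32) int32 {
	best := 0
	for i, v := range logits {
		if v > logits[best] {
			best = i
		}
	}
	return int32(best)
}

// sampleConstrained sketches the fallback: try the unconstrained top token
// first; only if the grammar rejects it, mask invalid tokens and re-sample
// (greedily here, for simplicity).
func sampleConstrained(logits []float32, g tokenChecker) int32 {
	top := argmax(logits)
	if g.Accepts(top) {
		return top // fast path: no masking needed
	}
	// Fallback: mask rejected tokens to -Inf, then pick from what remains.
	masked := make([]float32, len(logits))
	for i := range logits {
		if g.Accepts(int32(i)) {
			masked[i] = logits[i]
		} else {
			masked[i] = float32(math.Inf(-1))
		}
	}
	return argmax(masked)
}

func main() {
	logits := []float32{0.1, 5.0, 2.0, 3.0}
	// Top token (index 1) is odd, so the toy grammar rejects it and the
	// masked re-sample falls back to the best even-indexed token.
	fmt.Println(sampleConstrained(logits, evenOnly{})) // prints 2
}
```

The point of the fast path is that grammar masking touches every vocabulary entry, so skipping it when the top token is already valid saves work on most steps.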
The transform functions (temperature, softmax, topK, topP, minP) are defined in sample/transforms.go.
Usage
Used internally by the inference server when generating tokens. A new Sampler is created per inference request based on the API options.
Code Reference
Source Location
- Repository: ollama
- File: sample/samplers.go (Sampler, Sample, NewSampler), sample/transforms.go (temperature, topK, topP, minP, softmax)
- Lines: samplers.go:L28-70 (Sample), samplers.go:L130-165 (NewSampler), transforms.go:L29-130 (transforms)
Signature
func (s *Sampler) Sample(logits []float32) (int32, error)
func NewSampler(
temperature float32,
topK int,
topP float32,
minP float32,
seed int,
grammar *GrammarSampler,
) Sampler
Import
import "github.com/ollama/ollama/sample"
I/O Contract
Inputs (Sample)
| Name | Type | Required | Description |
|---|---|---|---|
| logits | []float32 | Yes | Raw logit vector from model, one value per vocabulary token |
Inputs (NewSampler)
| Name | Type | Required | Description |
|---|---|---|---|
| temperature | float32 | Yes | Sampling temperature (0 = greedy, higher = more random) |
| topK | int | Yes | Number of top tokens to consider (0 = disabled) |
| topP | float32 | Yes | Nucleus sampling threshold (1.0 = disabled) |
| minP | float32 | Yes | Minimum probability threshold (0.0 = disabled) |
| seed | int | Yes | Random seed (-1 = non-deterministic) |
| grammar | *GrammarSampler | No | Grammar constraint for structured output |
Outputs
| Name | Type | Description |
|---|---|---|
| token ID | int32 | Selected token index into the vocabulary |
| error | error | Non-nil if the logit vector is empty or the softmax sum is NaN |
Usage Examples
Creating and Using a Sampler
import "github.com/ollama/ollama/sample"
// Create a sampler with typical chat settings
sampler := sample.NewSampler(
0.7, // temperature
40, // topK
0.9, // topP
0.0, // minP (disabled)
-1, // seed (random)
nil, // no grammar
)
// Sample from model output logits
logits := model.Forward(tokens) // []float32 of vocab size
tokenID, err := sampler.Sample(logits)
if err != nil {
// handle error
}
Grammar-Constrained Sampling
import "github.com/ollama/ollama/sample"
// Create a grammar sampler for JSON output
grammar, err := sample.NewGrammarSampler(tokenizer, jsonBNFGrammar)
if err != nil {
// handle error
}
defer grammar.Free()
sampler := sample.NewSampler(0.0, 0, 1.0, 0.0, -1, grammar)
tokenID, err := sampler.Sample(logits)