Implementation: Ollama Sampler Sample
| Knowledge Sources | |
|---|---|
| Domains | NLP, Probability, Inference |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
A concrete tool, provided by the sample package, for sampling token IDs from model logits.
Description
The Sampler struct encapsulates the full token sampling pipeline. NewSampler constructs a sampler with the specified parameters (temperature, topK, topP, minP, seed, grammar). Sample takes raw logit values from the model and returns a single token ID.
When temperature is 0, the sampler uses greedy decoding (argmax). Otherwise, it applies the full pipeline: top-k sort and filter, temperature scaling, softmax normalization, top-p filtering, min-p thresholding, and weighted random selection.
For grammar-constrained generation, the sampler first attempts the grammar on the top token. If rejected, it applies grammar masking to all tokens and re-samples.
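The fast-path/fallback flow described above can be sketched as follows. This is an illustrative stand-in, not the real GrammarSampler API: the `tokenChecker` interface, the `evenOnly` toy grammar, and the greedy re-sample are assumptions for the sake of a runnable example.

```go
package main

import (
	"fmt"
	"math"
)

// tokenChecker is a hypothetical stand-in for a grammar constraint:
// it reports whether a token is valid in the current grammar state.
type tokenChecker interface {
	Accepts(token int32) bool
}

// evenOnly is a toy checker that accepts only even token IDs.
type evenOnly struct{}

func (evenOnly) Accepts(t int32) bool { return t%2 == 0 }

// argmax returns the index of the largest logit.
func argmax(logits []float32) int32 {
	best := 0
	for i, v := range logits {
		if v > logits[best] {
			best = i
		}
	}
	return int32(best)
}

// sampleConstrained sketches the fallback: try the unconstrained top token
// first; only if the grammar rejects it, mask invalid tokens and re-sample
// (greedily here, for simplicity).
func sampleConstrained(logits []float32, g tokenChecker) int32 {
	top := argmax(logits)
	if g.Accepts(top) {
		return top // fast path: no masking needed
	}
	// Fallback: mask rejected tokens to -Inf, then pick from what remains.
	masked := make([]float32, len(logits))
	for i := range logits {
		if g.Accepts(int32(i)) {
			masked[i] = logits[i]
		} else {
			masked[i] = float32(math.Inf(-1))
		}
	}
	return argmax(masked)
}

func main() {
	logits := []float32{0.1, 5.0, 2.0, 3.0}
	// Top token (index 1) is odd, so the toy grammar rejects it and the
	// masked re-sample falls back to the best even-indexed token.
	fmt.Println(sampleConstrained(logits, evenOnly{})) // prints 2
}
```

The point of the fast path is that grammar masking touches every vocabulary entry, so skipping it when the top token is already valid saves work on most steps.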
The transform functions (temperature, softmax, topK, topP, minP) are defined in sample/transforms.go.
Usage
Used internally by the inference server when generating tokens. A new Sampler is created per inference request based on the API options.
Code Reference
Source Location
- Repository: ollama
- File: sample/samplers.go (Sampler, Sample, NewSampler), sample/transforms.go (temperature, topK, topP, minP, softmax)
- Lines: samplers.go:L28-70 (Sample), samplers.go:L130-165 (NewSampler), transforms.go:L29-130 (transforms)
Signature
func (s *Sampler) Sample(logits []float32) (int32, error)
func NewSampler(
temperature float32,
topK int,
topP float32,
minP float32,
seed int,
grammar *GrammarSampler,
) Sampler
Import
import "github.com/ollama/ollama/sample"
I/O Contract
Inputs (Sample)
| Name | Type | Required | Description |
|---|---|---|---|
| logits | []float32 | Yes | Raw logit vector from model, one value per vocabulary token |
Inputs (NewSampler)
| Name | Type | Required | Description |
|---|---|---|---|
| temperature | float32 | Yes | Sampling temperature (0 = greedy, higher = more random) |
| topK | int | Yes | Number of top tokens to consider (0 = disabled) |
| topP | float32 | Yes | Nucleus sampling threshold (1.0 = disabled) |
| minP | float32 | Yes | Minimum probability threshold (0.0 = disabled) |
| seed | int | Yes | Random seed (-1 = non-deterministic) |
| grammar | *GrammarSampler | No | Grammar constraint for structured output |
Outputs
| Name | Type | Description |
|---|---|---|
| token ID | int32 | Selected token index into the vocabulary |
| error | error | Non-nil if the logit vector is empty or the softmax sum is NaN |
Usage Examples
Creating and Using a Sampler
import "github.com/ollama/ollama/sample"
// Create a sampler with typical chat settings
sampler := sample.NewSampler(
0.7, // temperature
40, // topK
0.9, // topP
0.0, // minP (disabled)
-1, // seed (random)
nil, // no grammar
)
// Sample from model output logits
logits := model.Forward(tokens) // []float32 of vocab size
tokenID, err := sampler.Sample(logits)
if err != nil {
// handle error
}
Grammar-Constrained Sampling
import "github.com/ollama/ollama/sample"
// Create a grammar sampler for JSON output
grammar, err := sample.NewGrammarSampler(tokenizer, jsonBNFGrammar)
if err != nil {
// handle error
}
defer grammar.Free()
sampler := sample.NewSampler(0.0, 0, 1.0, 0.0, -1, grammar)
tokenID, err := sampler.Sample(logits)