Implementation:Ggml org Llama cpp Llama Sampler Chain Init

Aspect	Detail
Implementation Name	Llama Sampler Chain Init
Doc Type	API Doc
Category	Sampling
Workflow	Interactive_Chat
Applies To	llama.cpp
Status	Active

Overview

Description

The llama_sampler_chain_init function creates a new, empty sampler chain, and llama_sampler_chain_add appends individual samplers to it. Together, these two functions form the API for building a composable token sampling pipeline. The chain is itself a llama_sampler that delegates to its child samplers in sequence: when applied, it runs each child sampler's apply method on the token data array in the order the children were added. The last sampler in the chain (for example greedy or dist) is the one that selects a token.
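The sequential delegation described above can be illustrated with a small self-contained toy model. Every name below (ToyCandidate, ToyTokenArray, ToyChain, toy_top_k, toy_temp, toy_greedy, toy_sample) is a hypothetical stand-in invented for this sketch and is not part of llama.cpp. The sketch only shows that stages run in insertion order and that the last stage picks the token.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT llama.cpp types.
struct ToyCandidate {
    int   id;     // token id
    float logit;  // unnormalized score
};

struct ToyTokenArray {
    std::vector<ToyCandidate> data;
    int selected = -1;  // index into data chosen by the final stage, -1 if none
};

// A stage mutates the candidate array in place, like a sampler's apply step.
using ToyStage = std::function<void(ToyTokenArray &)>;

struct ToyChain {
    std::vector<ToyStage> stages;

    void add(ToyStage s) { stages.push_back(std::move(s)); }

    // Applying the chain runs every stage in insertion order.
    void apply(ToyTokenArray & cur) const {
        for (const ToyStage & s : stages) {
            s(cur);
        }
    }
};

// Filter stage: keep only the k highest-logit candidates (sorted descending).
inline ToyStage toy_top_k(std::size_t k) {
    return [k](ToyTokenArray & cur) {
        std::sort(cur.data.begin(), cur.data.end(),
                  [](const ToyCandidate & a, const ToyCandidate & b) { return a.logit > b.logit; });
        if (cur.data.size() > k) {
            cur.data.resize(k);
        }
    };
}

// Scaling stage: divide every logit by the temperature t (t must be > 0).
inline ToyStage toy_temp(float t) {
    return [t](ToyTokenArray & cur) {
        assert(t > 0.0f);
        for (ToyCandidate & c : cur.data) {
            c.logit /= t;
        }
    };
}

// Selection stage: pick the highest-logit candidate still in the array.
inline ToyStage toy_greedy() {
    return [](ToyTokenArray & cur) {
        assert(!cur.data.empty());
        auto it = std::max_element(cur.data.begin(), cur.data.end(),
                  [](const ToyCandidate & a, const ToyCandidate & b) { return a.logit < b.logit; });
        cur.selected = static_cast<int>(it - cur.data.begin());
    };
}

// Build filter -> temperature -> selection and return the chosen token id.
inline int toy_sample(ToyTokenArray cur) {
    ToyChain chain;
    chain.add(toy_top_k(2));
    chain.add(toy_temp(0.5f));
    chain.add(toy_greedy());
    chain.apply(cur);
    return cur.data[static_cast<std::size_t>(cur.selected)].id;
}
```

In this toy, a chain that contains only filter stages leaves selected at -1, which mirrors why a real chain is normally terminated with a selecting sampler such as greedy or dist.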

Usage

These functions are called once during application initialization to build the sampling pipeline. The chain is then passed to llama_sampler_sample on each iteration of the generation loop. The chain takes ownership of all added samplers and frees them when llama_sampler_free is called on the chain.

Code Reference

Attribute	Value
Source Location	`include/llama.h:1266` (chain_init), `include/llama.h:1269` (chain_add), `src/llama-sampler.cpp:792-882`
Signature (chain_init)	`struct llama_sampler * llama_sampler_chain_init(struct llama_sampler_chain_params params)`
Signature (chain_add)	`void llama_sampler_chain_add(struct llama_sampler * chain, struct llama_sampler * smpl)`
Import	`#include "llama.h"`

Supporting API:

// Get default chain parameters
struct llama_sampler_chain_params llama_sampler_chain_default_params(void);

// Create a sampler chain
struct llama_sampler * llama_sampler_chain_init(struct llama_sampler_chain_params params);

// Add a sampler to the chain (takes ownership)
void llama_sampler_chain_add(struct llama_sampler * chain, struct llama_sampler * smpl);

// Query chain contents
struct llama_sampler * llama_sampler_chain_get(struct llama_sampler * chain, int32_t i);
int llama_sampler_chain_n(const struct llama_sampler * chain);

// Remove a sampler (releases ownership)
struct llama_sampler * llama_sampler_chain_remove(struct llama_sampler * chain, int32_t i);

// Available sampler initializers
struct llama_sampler * llama_sampler_init_greedy(void);
struct llama_sampler * llama_sampler_init_dist(uint32_t seed);
struct llama_sampler * llama_sampler_init_top_k(int32_t k);
struct llama_sampler * llama_sampler_init_top_p(float p, size_t min_keep);
struct llama_sampler * llama_sampler_init_min_p(float p, size_t min_keep);
struct llama_sampler * llama_sampler_init_temp(float t);

Chain params struct:

typedef struct llama_sampler_chain_params {
    bool no_perf; // when true, performance timings are not measured
} llama_sampler_chain_params;

I/O Contract

Direction	Name	Type	Description
Input (chain_init)	params	`struct llama_sampler_chain_params`	Chain configuration (e.g., performance timing toggle)
Output (chain_init)	return	`struct llama_sampler *`	A new empty sampler chain
Input (chain_add)	chain	`struct llama_sampler *`	The chain to add to
Input (chain_add)	smpl	`struct llama_sampler *`	The sampler to append (ownership transferred)
Output (chain_add)	return	`void`	No return value; the sampler is appended to the chain's internal list

Preconditions:

chain_init: The params struct should be obtained from llama_sampler_chain_default_params()
chain_add: chain must have been created by llama_sampler_chain_init; smpl must be a valid sampler that is not already owned by another chain

Postconditions:

After chain_init: Returns an empty chain with no child samplers
After chain_add: The chain owns the sampler; do not call llama_sampler_free on the added sampler individually

Ownership:

The chain takes ownership of every sampler added via chain_add
Calling llama_sampler_free(chain) frees the chain and all owned samplers
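Calling llama_sampler_chain_remove(chain, i) releases ownership of the sampler at index i and returns it; the chain will no longer free that sampler, so the caller becomes responsible for calling llama_sampler_free on it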
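The ownership rules above can be modeled in a few lines of standalone C++. This is a toy sketch, not the llama.cpp implementation: ToySampler, ToyOwningChain and toy_live_after_chain_freed are hypothetical names, and std::unique_ptr is used here only to make the transfer of ownership explicit, whereas the real API works with raw pointers and llama_sampler_free.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT llama.cpp types.
struct ToySampler {
    int * live;  // points at a counter of samplers that are still alive
    explicit ToySampler(int * counter) : live(counter) { ++*live; }
    ~ToySampler() { --*live; }
    ToySampler(const ToySampler &) = delete;
    ToySampler & operator=(const ToySampler &) = delete;
};

class ToyOwningChain {
public:
    // add: the chain becomes the owner of the sampler.
    void add(std::unique_ptr<ToySampler> s) { owned_.push_back(std::move(s)); }

    // n: number of samplers currently owned by the chain.
    std::size_t n() const { return owned_.size(); }

    // remove: hand the sampler at index i back to the caller, who now owns it.
    std::unique_ptr<ToySampler> remove(std::size_t i) {
        assert(i < owned_.size());
        std::unique_ptr<ToySampler> out = std::move(owned_[i]);
        owned_.erase(owned_.begin() + static_cast<std::ptrdiff_t>(i));
        return out;
    }

private:
    // Destroying the chain destroys every sampler it still owns.
    std::vector<std::unique_ptr<ToySampler>> owned_;
};

// Adds two samplers, removes one, destroys the chain, and returns how many
// samplers are still alive at that point (the removed one is kept by the caller).
inline int toy_live_after_chain_freed(int & live) {
    std::unique_ptr<ToySampler> kept;
    {
        ToyOwningChain chain;
        chain.add(std::unique_ptr<ToySampler>(new ToySampler(&live)));
        chain.add(std::unique_ptr<ToySampler>(new ToySampler(&live)));
        kept = chain.remove(0);  // caller owns this sampler from here on
    }                            // chain destroyed: its remaining sampler is freed
    return live;                 // 'kept' is still alive when this value is read
}
```

The removed sampler outlives the chain and is released only when the caller lets go of it, which corresponds to calling llama_sampler_free on a sampler returned by llama_sampler_chain_remove.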

Usage Examples

// Initialize the sampler chain with default params
llama_sampler * smpl = llama_sampler_chain_init(llama_sampler_chain_default_params());

// Add samplers in order: filter -> temperature -> selection
llama_sampler_chain_add(smpl, llama_sampler_init_min_p(0.05f, 1));
llama_sampler_chain_add(smpl, llama_sampler_init_temp(0.8f));
llama_sampler_chain_add(smpl, llama_sampler_init_dist(LLAMA_DEFAULT_SEED));

// Use during generation loop
llama_token new_token_id = llama_sampler_sample(smpl, ctx, -1);

// Cleanup
llama_sampler_free(smpl);

Alternative: deterministic (greedy) sampling:

llama_sampler * smpl = llama_sampler_chain_init(llama_sampler_chain_default_params());
llama_sampler_chain_add(smpl, llama_sampler_init_greedy());

Alternative: top-k + top-p + temperature:

llama_sampler * smpl = llama_sampler_chain_init(llama_sampler_chain_default_params());
llama_sampler_chain_add(smpl, llama_sampler_init_top_k(50));
llama_sampler_chain_add(smpl, llama_sampler_init_top_p(0.9f, 1));
llama_sampler_chain_add(smpl, llama_sampler_init_temp(0.8f));
llama_sampler_chain_add(smpl, llama_sampler_init_dist(LLAMA_DEFAULT_SEED));
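To make the effect of the two filtering stages in this chain concrete, here is a standalone sketch of top-k followed by top-p on a small logit vector. It assumes the usual definitions (top-k keeps the k highest-scoring candidates; top-p keeps the smallest prefix whose cumulative probability reaches p, but never fewer than min_keep candidates). The helpers toy_softmax and toy_top_k_top_p are hypothetical and are not llama.cpp functions; the library's own implementation may differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Toy helpers for illustration only; these are NOT llama.cpp functions.

// Softmax over logits, subtracting the maximum for numerical stability.
inline std::vector<float> toy_softmax(const std::vector<float> & logits) {
    assert(!logits.empty());
    const float max_l = *std::max_element(logits.begin(), logits.end());
    std::vector<float> p(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        p[i] = std::exp(logits[i] - max_l);
        sum += p[i];
    }
    for (float & v : p) {
        v /= sum;
    }
    return p;
}

// Applies top-k, then top-p, and returns the surviving logits sorted descending.
inline std::vector<float> toy_top_k_top_p(std::vector<float> logits,
                                          std::size_t k, float p, std::size_t min_keep) {
    assert(k > 0);
    // Top-k: sort descending and truncate to the k best candidates.
    std::sort(logits.begin(), logits.end(), std::greater<float>());
    if (logits.size() > k) {
        logits.resize(k);
    }
    // Top-p: renormalize the survivors and keep the shortest prefix whose
    // cumulative probability reaches p, subject to the min_keep floor.
    const std::vector<float> probs = toy_softmax(logits);
    float cum = 0.0f;
    std::size_t keep = logits.size();
    for (std::size_t i = 0; i < probs.size(); ++i) {
        cum += probs[i];
        if (cum >= p && i + 1 >= min_keep) {
            keep = i + 1;
            break;
        }
    }
    logits.resize(keep);
    return logits;
}
```

For logits {2, 1, 0, -1} with k = 3, the three survivors have probabilities of roughly 0.67, 0.24 and 0.09, so p = 0.9 keeps the first two. Temperature and the final dist sampler would then act only on those remaining candidates.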
