Implementation:Ggml org Llama cpp Perplexity CLI Args

Aspect	Detail
Implementation Name	Perplexity CLI Args
Doc Type	Wrapper Doc
Domain	Model Perplexity Evaluation
Purpose	CLI argument parsing for perplexity evaluation
Related Workflow	Model_Perplexity_Evaluation

Overview

Description

This implementation documents the command-line argument definitions for the llama-perplexity tool. Arguments are defined in common/arg.cpp using the common_arg registration system and are tagged with LLAMA_EXAMPLE_PERPLEXITY through set_examples(). Most are exclusive to this tool, while --chunks is shared with the imatrix and retrieval examples. These arguments control evaluation mode selection, per-mode task counts, output format, and stride-based computation.

Usage

Arguments are specified on the command line when invoking llama-perplexity. They are parsed by common_params_parse() in the tool's main() function and stored in a common_params structure that is passed to the evaluation functions.
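
The registration-and-filter idea can be illustrated with a small standalone sketch. Every name below (example_kind, toy_params, toy_arg, all_args, toy_parse) is hypothetical and much simpler than the real common_arg and common_params types; it only shows how one shared option table can be narrowed to the flags tagged for a given tool, assuming that untagged flags are rejected as unknown.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not the llama.cpp types.
enum example_kind { EXAMPLE_PERPLEXITY, EXAMPLE_IMATRIX, EXAMPLE_RETRIEVAL };

struct toy_params {
    int  n_chunks  = -1;
    bool hellaswag = false;
};

struct toy_arg {
    std::string flag;                                               // e.g. "--chunks"
    bool takes_value;                                               // true if a value follows the flag
    std::function<void(toy_params &, const std::string &)> handler; // stores the parsed value
    std::set<example_kind> examples;                                // tools that accept this flag
};

// One shared table for every tool, mirroring the idea of a single arg file.
inline std::vector<toy_arg> all_args() {
    return {
        {"--chunks", true,
         [](toy_params & p, const std::string & v) { p.n_chunks = std::stoi(v); },
         {EXAMPLE_IMATRIX, EXAMPLE_PERPLEXITY, EXAMPLE_RETRIEVAL}},
        {"--hellaswag", false,
         [](toy_params & p, const std::string &) { p.hellaswag = true; },
         {EXAMPLE_PERPLEXITY}},
    };
}

// Parse tokens, accepting only flags registered for `ex`.
// Returns false on an unknown flag or a missing value.
inline bool toy_parse(const std::vector<std::string> & args, example_kind ex, toy_params & out) {
    const std::vector<toy_arg> table = all_args();
    for (std::size_t i = 0; i < args.size(); ++i) {
        const toy_arg * match = nullptr;
        for (const toy_arg & a : table) {
            if (a.flag == args[i] && a.examples.count(ex) > 0) { match = &a; break; }
        }
        if (match == nullptr) return false;
        std::string value;
        if (match->takes_value) {
            if (i + 1 >= args.size()) return false;
            value = args[++i];
        }
        match->handler(out, value);
    }
    return true;
}
```

In this sketch, passing --hellaswag while parsing for EXAMPLE_IMATRIX fails, because the flag is not tagged for that tool.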

Code Reference

Aspect	Detail
Source Location (chunks)	`common/arg.cpp:1314`
Source Location (eval args)	`common/arg.cpp:2015-2076`
Signature	N/A (argument registration, not a callable function)
Import	`#include "arg.h"`

Chunk count argument (common/arg.cpp:1314):

add_opt(common_arg(
    {"--chunks"}, "N",
    string_format("max number of chunks to process (default: %d, -1 = all)", params.n_chunks),
    [](common_params & params, int value) {
        params.n_chunks = value;
    }
).set_examples({LLAMA_EXAMPLE_IMATRIX, LLAMA_EXAMPLE_PERPLEXITY, LLAMA_EXAMPLE_RETRIEVAL}));
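
The help text states that -1 means "all". A minimal sketch of how a consumer might resolve that sentinel follows; effective_chunks is a hypothetical helper, and capping the request at the number of chunks available in the tokenized input is an assumption about the evaluation loop rather than something this page shows.

```cpp
#include <algorithm>

// Hypothetical helper: a negative request means "process every available chunk";
// otherwise the request is capped at what the input actually contains.
inline int effective_chunks(int requested, int available) {
    return requested < 0 ? available : std::min(requested, available);
}
```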

Evaluation mode arguments (common/arg.cpp:2015-2076):

add_opt(common_arg(
    {"--hellaswag"},
    "compute HellaSwag score over random tasks from datafile supplied with -f",
    [](common_params & params) {
        params.hellaswag = true;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));

add_opt(common_arg(
    {"--hellaswag-tasks"}, "N",
    string_format("number of tasks to use when computing the HellaSwag score (default: %zu)",
                  params.hellaswag_tasks),
    [](common_params & params, int value) {
        params.hellaswag_tasks = value;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));

add_opt(common_arg(
    {"--winogrande"},
    "compute Winogrande score over random tasks from datafile supplied with -f",
    [](common_params & params) {
        params.winogrande = true;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));

add_opt(common_arg(
    {"--winogrande-tasks"}, "N",
    string_format("number of tasks to use when computing the Winogrande score (default: %zu)",
                  params.winogrande_tasks),
    [](common_params & params, int value) {
        params.winogrande_tasks = value;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));

add_opt(common_arg(
    {"--multiple-choice"},
    "compute multiple choice score over random tasks from datafile supplied with -f",
    [](common_params & params) {
        params.multiple_choice = true;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));

add_opt(common_arg(
    {"--multiple-choice-tasks"}, "N",
    string_format("number of tasks to use when computing the multiple choice score (default: %zu)",
                  params.multiple_choice_tasks),
    [](common_params & params, int value) {
        params.multiple_choice_tasks = value;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));

add_opt(common_arg(
    {"--kl-divergence"},
    "computes KL-divergence to logits provided via --kl-divergence-base",
    [](common_params & params) {
        params.kl_divergence = true;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));

add_opt(common_arg(
    {"--save-all-logits", "--kl-divergence-base"}, "FNAME",
    "set logits file",
    [](common_params & params, const std::string & value) {
        params.logits_file = value;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));

add_opt(common_arg(
    {"--ppl-stride"}, "N",
    string_format("stride for perplexity calculation (default: %d)", params.ppl_stride),
    [](common_params & params, int value) {
        params.ppl_stride = value;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));

add_opt(common_arg(
    {"--ppl-output-type"}, "<0|1>",
    string_format("output type for perplexity calculation (default: %d)", params.ppl_output_type),
    [](common_params & params, int value) {
        params.ppl_output_type = value;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));
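
Because each mode flag only sets a boolean, the tool has to pick one evaluation path afterwards, and the single logits_file field serves two purposes depending on whether --kl-divergence is set. The sketch below illustrates that with hypothetical types (mode_flags, eval_mode, logits_use); the precedence order used when several mode flags are given is an assumption, not something the registrations above specify.

```cpp
#include <string>

// Hypothetical mirror of the relevant common_params fields.
struct mode_flags {
    bool hellaswag       = false;
    bool winogrande      = false;
    bool multiple_choice = false;
    bool kl_divergence   = false;
    std::string logits_file;
};

enum class eval_mode { perplexity, hellaswag, winogrande, multiple_choice, kl_divergence };

// Assumed precedence: first flag set wins, plain perplexity is the fallback.
inline eval_mode select_mode(const mode_flags & f) {
    if (f.hellaswag)       return eval_mode::hellaswag;
    if (f.winogrande)      return eval_mode::winogrande;
    if (f.multiple_choice) return eval_mode::multiple_choice;
    if (f.kl_divergence)   return eval_mode::kl_divergence;
    return eval_mode::perplexity;
}

enum class logits_use { none, read_base, write_all };

// --save-all-logits and --kl-divergence-base write the same field, so the role
// of the file is decided by the kl_divergence flag.
inline logits_use logits_role(const mode_flags & f) {
    if (f.logits_file.empty()) return logits_use::none;
    return f.kl_divergence ? logits_use::read_base : logits_use::write_all;
}
```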

I/O Contract

Direction	Name	CLI Flag	Type	Description
Input	hellaswag	`--hellaswag`	bool	Enable HellaSwag evaluation mode
Input	hellaswag_tasks	`--hellaswag-tasks N`	int (stored as size_t)	Number of HellaSwag tasks to evaluate
Input	winogrande	`--winogrande`	bool	Enable Winogrande evaluation mode
Input	winogrande_tasks	`--winogrande-tasks N`	int (stored as size_t)	Number of Winogrande tasks to evaluate
Input	multiple_choice	`--multiple-choice`	bool	Enable multiple choice evaluation mode
Input	multiple_choice_tasks	`--multiple-choice-tasks N`	int (stored as size_t)	Number of multiple choice tasks to evaluate
Input	kl_divergence	`--kl-divergence`	bool	Enable KL divergence computation mode
Input	logits_file	`--save-all-logits FNAME` / `--kl-divergence-base FNAME`	string	Path to logits file (written when saving reference logits, read as the base when `--kl-divergence` is set)
Input	ppl_stride	`--ppl-stride N`	int	Stride for sliding-window perplexity (0 = disabled)
Input	ppl_output_type	`--ppl-output-type <0|1>`	int	Output format (0 = compact, 1 = verbose)
Input	n_chunks	`--chunks N`	int	Max chunks to process (-1 = all)
Output	params		`common_params`	Populated parameter structure passed to evaluation functions

Usage Examples

Example 1: Standard perplexity with common options

./llama-perplexity -m model.gguf \
    -f wikitext-2-raw/wiki.test.raw \
    --ctx-size 512 \
    --batch-size 2048 \
    --chunks 50 \
    --ppl-output-type 1 \
    -ngl 35

Example 2: HellaSwag evaluation

./llama-perplexity -m model.gguf \
    -f hellaswag_val_full.txt \
    --hellaswag \
    --hellaswag-tasks 10042

Example 3: KL divergence workflow

# Step 1: Save reference logits from FP16 model
./llama-perplexity -m model-f16.gguf \
    -f wikitext-2-raw/wiki.test.raw \
    --save-all-logits logits-f16.bin

# Step 2: Compute KL divergence of quantized model
./llama-perplexity -m model-q4.gguf \
    -f wikitext-2-raw/wiki.test.raw \
    --kl-divergence \
    --kl-divergence-base logits-f16.bin
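
For orientation, the quantity compared in step 2 is the KL divergence between the reference model's token distribution P and the tested model's distribution Q, KL(P || Q) = sum over i of p_i * (log p_i - log q_i). The sketch below computes it for one token position from two rows of raw logits. The function names are hypothetical and this is not the tool's implementation, which also has to read the saved logits file and aggregate over all positions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable log-softmax over one non-empty row of logits.
inline std::vector<double> log_softmax(const std::vector<float> & logits) {
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    double sum = 0.0;
    for (float x : logits) sum += std::exp(double(x) - max_logit);
    const double log_z = max_logit + std::log(sum);
    std::vector<double> out;
    out.reserve(logits.size());
    for (float x : logits) out.push_back(x - log_z);
    return out;
}

// KL(P || Q) for one token position: P from the base (reference) logits,
// Q from the model under test. Both rows must have the same vocabulary size.
inline double kl_divergence_row(const std::vector<float> & base, const std::vector<float> & test) {
    const std::vector<double> lp = log_softmax(base);
    const std::vector<double> lq = log_softmax(test);
    double kl = 0.0;
    for (std::size_t i = 0; i < lp.size(); ++i) {
        kl += std::exp(lp[i]) * (lp[i] - lq[i]);
    }
    return kl;
}
```

Identical rows give a divergence of zero, and any mismatch gives a positive value, which is why the reference logits are usually taken from the highest-precision model available.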

Example 4: Stride-based perplexity

./llama-perplexity -m model.gguf \
    -f wikitext-2-raw/wiki.test.raw \
    --ppl-stride 256 \
    --ctx-size 512
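
The idea behind a stride is that consecutive evaluation windows overlap instead of tiling the text. The following is a generic sliding-window sketch, assuming windows of n_ctx tokens that advance by stride tokens; window_starts is a hypothetical helper, and the tool's own loop may size or count windows differently.

```cpp
#include <vector>

// Start offsets of every full window of n_ctx tokens, advancing by `stride`.
// A non-positive stride is treated as "strided mode disabled" and yields no windows.
inline std::vector<int> window_starts(int n_tokens, int n_ctx, int stride) {
    std::vector<int> starts;
    if (stride <= 0 || n_ctx <= 0) return starts;
    for (int s = 0; s + n_ctx <= n_tokens; s += stride) {
        starts.push_back(s);
    }
    return starts;
}
```

With 1024 tokens, a 512-token window and a stride of 256, this yields starts at 0, 256 and 512, so each window shares half of its tokens with the previous one.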
