Implementation:Ggml org Llama cpp Perplexity CLI Args
| Aspect | Detail |
|---|---|
| Implementation Name | Perplexity CLI Args |
| Doc Type | Wrapper Doc |
| Domain | Model Perplexity Evaluation |
| Purpose | CLI argument parsing for perplexity evaluation |
| Related Workflow | Model_Perplexity_Evaluation |
Overview
Description
This implementation documents the command-line argument definitions for the llama-perplexity tool. Arguments are defined in common/arg.cpp through the common_arg registration system, and each one is tagged via set_examples() with the example types that accept it, here LLAMA_EXAMPLE_PERPLEXITY (--chunks is also shared with LLAMA_EXAMPLE_IMATRIX and LLAMA_EXAMPLE_RETRIEVAL). These arguments control evaluation mode selection, task counts, the logits file, output format, stride-based computation, and the number of chunks processed.
Usage
Arguments are specified on the command line when invoking llama-perplexity. They are parsed by common_params_parse() in the tool's main() function and stored in a common_params structure that is passed to the evaluation functions.
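The flow described above (look up a flag, read its value if it takes one, let a handler write into a parameter struct) can be sketched in miniature. This is a simplified stand-in and not the real common_arg or common_params_parse() code: demo_params, demo_opt, demo_options and demo_parse are hypothetical names, and only a few of the fields are mirrored.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for common_params; only a few fields are mirrored.
struct demo_params {
    bool        hellaswag  = false;
    int         n_chunks   = -1;  // -1 = all, as in the --chunks help text
    int         ppl_stride = 0;   // 0 = stride mode disabled
    std::string logits_file;
};

// One registered option: whether it consumes a value, and how to apply it.
struct demo_opt {
    bool takes_value;
    std::function<void(demo_params &, const std::string &)> apply;
};

// Build the flag table. Aliases map to the same handler.
inline std::map<std::string, demo_opt> demo_options() {
    std::map<std::string, demo_opt> opts;
    opts["--hellaswag"]  = {false, [](demo_params & p, const std::string &)   { p.hellaswag = true; }};
    opts["--chunks"]     = {true,  [](demo_params & p, const std::string & v) { p.n_chunks = std::stoi(v); }};
    opts["--ppl-stride"] = {true,  [](demo_params & p, const std::string & v) { p.ppl_stride = std::stoi(v); }};
    const demo_opt logits = {true, [](demo_params & p, const std::string & v) { p.logits_file = v; }};
    opts["--save-all-logits"]    = logits;
    opts["--kl-divergence-base"] = logits;
    return opts;
}

// Walk the argument list and apply each handler to a fresh params struct.
inline demo_params demo_parse(const std::vector<std::string> & args) {
    const std::map<std::string, demo_opt> opts = demo_options();
    demo_params params;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const auto it = opts.find(args[i]);
        if (it == opts.end()) {
            throw std::invalid_argument("unknown argument: " + args[i]);
        }
        assert(it->second.apply);  // every registered option must carry a handler
        std::string value;
        if (it->second.takes_value) {
            if (i + 1 >= args.size()) {
                throw std::invalid_argument("missing value for " + args[i]);
            }
            value = args[++i];     // consume the value that follows the flag
        }
        it->second.apply(params, value);
    }
    return params;
}
```

Calling demo_parse({"--hellaswag", "--chunks", "50"}) would return a struct with hellaswag set and n_chunks equal to 50, leaving the other fields at their defaults.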
Code Reference
| Aspect | Detail |
|---|---|
| Source Location (chunks) | common/arg.cpp:1314 |
| Source Location (eval args) | common/arg.cpp:2015-2076 |
| Signature | N/A (argument registration, not a callable function) |
| Import | #include "arg.h" |
Chunk count argument (common/arg.cpp:1314):
add_opt(common_arg(
    {"--chunks"}, "N",
    string_format("max number of chunks to process (default: %d, -1 = all)", params.n_chunks),
    [](common_params & params, int value) {
        params.n_chunks = value;
    }
).set_examples({LLAMA_EXAMPLE_IMATRIX, LLAMA_EXAMPLE_PERPLEXITY, LLAMA_EXAMPLE_RETRIEVAL}));
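The "-1 = all" convention can be read as a clamp against the number of chunks the input actually contains. The following is a minimal illustration, assuming one chunk corresponds to one full context window of n_ctx tokens; the helper name effective_chunks is hypothetical and is not part of llama.cpp.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: how many chunks would be processed for a given --chunks value.
// Assumption: a chunk is one full context window of n_ctx tokens.
inline int effective_chunks(int n_tokens, int n_ctx, int n_chunks) {
    assert(n_ctx > 0);
    const int n_chunk_max = n_tokens / n_ctx;  // whole windows available in the input
    if (n_chunks < 0) {
        return n_chunk_max;                    // negative value (the default -1) = all
    }
    return std::min(n_chunks, n_chunk_max);    // never exceed what the input provides
}
```

Under this assumption, an input of 100000 tokens with a context size of 512 holds 195 whole chunks, so --chunks 50 would process 50 of them and --chunks -1 would process all 195.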
Evaluation mode arguments (common/arg.cpp:2015-2076):
add_opt(common_arg(
    {"--hellaswag"},
    "compute HellaSwag score over random tasks from datafile supplied with -f",
    [](common_params & params) {
        params.hellaswag = true;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));
add_opt(common_arg(
    {"--hellaswag-tasks"}, "N",
    string_format("number of tasks to use when computing the HellaSwag score (default: %zu)",
        params.hellaswag_tasks),
    [](common_params & params, int value) {
        params.hellaswag_tasks = value;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));
add_opt(common_arg(
    {"--winogrande"},
    "compute Winogrande score over random tasks from datafile supplied with -f",
    [](common_params & params) {
        params.winogrande = true;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));
add_opt(common_arg(
    {"--winogrande-tasks"}, "N",
    string_format("number of tasks to use when computing the Winogrande score (default: %zu)",
        params.winogrande_tasks),
    [](common_params & params, int value) {
        params.winogrande_tasks = value;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));
add_opt(common_arg(
    {"--multiple-choice"},
    "compute multiple choice score over random tasks from datafile supplied with -f",
    [](common_params & params) {
        params.multiple_choice = true;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));
add_opt(common_arg(
    {"--multiple-choice-tasks"}, "N",
    string_format("number of tasks to use when computing the multiple choice score (default: %zu)",
        params.multiple_choice_tasks),
    [](common_params & params, int value) {
        params.multiple_choice_tasks = value;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));
add_opt(common_arg(
    {"--kl-divergence"},
    "computes KL-divergence to logits provided via --kl-divergence-base",
    [](common_params & params) {
        params.kl_divergence = true;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));
add_opt(common_arg(
    {"--save-all-logits", "--kl-divergence-base"}, "FNAME",
    "set logits file",
    [](common_params & params, const std::string & value) {
        params.logits_file = value;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));
add_opt(common_arg(
    {"--ppl-stride"}, "N",
    string_format("stride for perplexity calculation (default: %d)", params.ppl_stride),
    [](common_params & params, int value) {
        params.ppl_stride = value;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));
add_opt(common_arg(
    {"--ppl-output-type"}, "<0|1>",
    string_format("output type for perplexity calculation (default: %d)", params.ppl_output_type),
    [](common_params & params, int value) {
        params.ppl_output_type = value;
    }
).set_examples({LLAMA_EXAMPLE_PERPLEXITY}));
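Because the four mode flags are independent booleans, a consumer of the parsed parameters has to decide which mode wins when more than one is set. One plausible way to do that is sketched below. The precedence order (registration order, first flag set wins, plain perplexity as the fallback) and the consistency check are assumptions made for illustration and are not confirmed by the registrations above; eval_flags, eval_mode, select_mode and flags_are_consistent are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the fields set by the mode flags above.
struct eval_flags {
    bool        hellaswag       = false;
    bool        winogrande      = false;
    bool        multiple_choice = false;
    bool        kl_divergence   = false;
    std::string logits_file;
};

enum class eval_mode { perplexity, hellaswag, winogrande, multiple_choice, kl_divergence };

// Assumption: flags are checked in registration order and the first one set wins.
// With no mode flag set, plain perplexity is the fallback.
inline eval_mode select_mode(const eval_flags & f) {
    if (f.hellaswag)       return eval_mode::hellaswag;
    if (f.winogrande)      return eval_mode::winogrande;
    if (f.multiple_choice) return eval_mode::multiple_choice;
    if (f.kl_divergence)   return eval_mode::kl_divergence;
    return eval_mode::perplexity;
}

// KL-divergence compares against previously saved logits, so a file must be named.
inline bool flags_are_consistent(const eval_flags & f) {
    return !f.kl_divergence || !f.logits_file.empty();
}
```

A check like flags_are_consistent reflects the help text for --kl-divergence, which expects the base logits to be supplied through --kl-divergence-base.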
I/O Contract
| Direction | Name | CLI Flag | Type | Description |
|---|---|---|---|---|
| Input | hellaswag | --hellaswag | bool | Enable HellaSwag evaluation mode |
| Input | hellaswag_tasks | --hellaswag-tasks N | int | Number of HellaSwag tasks to evaluate |
| Input | winogrande | --winogrande | bool | Enable Winogrande evaluation mode |
| Input | winogrande_tasks | --winogrande-tasks N | int | Number of Winogrande tasks to evaluate |
| Input | multiple_choice | --multiple-choice | bool | Enable multiple choice evaluation mode |
| Input | multiple_choice_tasks | --multiple-choice-tasks N | int | Number of multiple choice tasks |
| Input | kl_divergence | --kl-divergence | bool | Enable KL divergence computation mode |
| Input | logits_file | --save-all-logits FNAME / --kl-divergence-base FNAME | string | Path to logits file (save or load) |
| Input | ppl_stride | --ppl-stride N | int | Stride for sliding-window perplexity (0 = disabled) |
| Input | ppl_output_type | --ppl-output-type <0\|1> | int | Output format (0 = compact, 1 = verbose) |
| Input | n_chunks | --chunks N | int | Max chunks to process (-1 = all) |
| Output | params | N/A | common_params | Populated parameter structure passed to evaluation functions |
Usage Examples
Example 1: Standard perplexity with common options
./llama-perplexity -m model.gguf \
    -f wikitext-2-raw/wiki.test.raw \
    --ctx-size 512 \
    --batch-size 2048 \
    --chunks 50 \
    --ppl-output-type 1 \
    -ngl 35
Example 2: HellaSwag evaluation
./llama-perplexity -m model.gguf \
    -f hellaswag_val_full.txt \
    --hellaswag \
    --hellaswag-tasks 10042
Example 3: KL divergence workflow
# Step 1: Save reference logits from FP16 model
./llama-perplexity -m model-f16.gguf \
    -f wikitext-2-raw/wiki.test.raw \
    --save-all-logits logits-f16.bin
# Step 2: Compute KL divergence of quantized model
./llama-perplexity -m model-q4.gguf \
    -f wikitext-2-raw/wiki.test.raw \
    --kl-divergence \
    --kl-divergence-base logits-f16.bin
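For orientation, the quantity behind this workflow is the standard KL divergence between the two models' next-token distributions. The sketch below computes it for a single token position from raw logits, with the base model taken as the reference distribution (an assumption about direction). The function names are hypothetical, and neither the tool's actual implementation nor the format of the logits file is shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable log-softmax over one position's logits.
inline std::vector<double> log_softmax(const std::vector<double> & logits) {
    assert(!logits.empty());
    const double max_logit = *std::max_element(logits.begin(), logits.end());
    double sum_exp = 0.0;
    for (double l : logits) {
        sum_exp += std::exp(l - max_logit);  // shift by the max to avoid overflow
    }
    const double log_z = max_logit + std::log(sum_exp);
    std::vector<double> out;
    out.reserve(logits.size());
    for (double l : logits) {
        out.push_back(l - log_z);
    }
    return out;
}

// KL(P_base || P_test) for one token position, in nats.
inline double kl_from_logits(const std::vector<double> & base,
                             const std::vector<double> & test) {
    assert(base.size() == test.size());
    const std::vector<double> lp = log_softmax(base);
    const std::vector<double> lq = log_softmax(test);
    double kl = 0.0;
    for (std::size_t i = 0; i < lp.size(); ++i) {
        kl += std::exp(lp[i]) * (lp[i] - lq[i]);  // p * (log p - log q)
    }
    return kl;
}
```

Identical logits give a divergence of zero, and the value grows as the quantized model's distribution drifts from the reference. A per-dataset figure would average this over all evaluated positions.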
Example 4: Stride-based perplexity
./llama-perplexity -m model.gguf \
    -f wikitext-2-raw/wiki.test.raw \
    --ppl-stride 256 \
    --ctx-size 512
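The interaction between --ppl-stride and --ctx-size can be pictured as a window of ctx-size tokens that advances by stride tokens at a time. The helper below enumerates window start positions for that generic sliding-window scheme. It is only an illustration: window_starts is a hypothetical name, and how llama-perplexity places its windows and scores the tokens inside each one is not covered by this page and may differ.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: start offsets of every full window of n_ctx tokens
// when the window advances by `stride` tokens per step.
inline std::vector<int> window_starts(int n_tokens, int n_ctx, int stride) {
    assert(n_ctx > 0 && stride > 0);
    std::vector<int> starts;
    for (int start = 0; start + n_ctx <= n_tokens; start += stride) {
        starts.push_back(start);  // window covers [start, start + n_ctx)
    }
    return starts;
}
```

With the values from Example 4 and a 1024-token input, the windows would start at 0, 256 and 512, so consecutive windows overlap by 256 tokens.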