Implementation: ggml-org/llama.cpp Tokenize Tool
| Knowledge Sources | |
|---|---|
| Domains | Tokenization |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
CLI tool that tokenizes text input using a model's tokenizer and displays the resulting tokens with their IDs.
Description
The tool loads a GGUF model to access its tokenizer and reads prompt text from a CLI argument, a file, or stdin. It tokenizes the input using `llama_tokenize` with configurable options (BOS token, escape sequences, special token parsing) and outputs either human-readable token strings with IDs or just the numerical token IDs in a Python-parseable format such as `[1, 2, 3]`. It handles Windows-specific UTF-8 argument encoding via `CommandLineToArgvW` and supports `--show-count` to display the total token count.
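The `--ids` output format described above can be sketched as follows. This is an illustrative stand-in, not the verbatim source: the real tool prints directly with `printf` while iterating the token vector, and `format_ids` is a hypothetical helper introduced here for clarity.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: render token IDs as a Python-parseable list,
// e.g. {1, 2, 3} -> "[1, 2, 3]", matching the --ids output shape.
static std::string format_ids(const std::vector<int> & tokens) {
    std::string out = "[";
    for (size_t i = 0; i < tokens.size(); ++i) {
        if (i > 0) {
            out += ", ";
        }
        out += std::to_string(tokens[i]);
    }
    out += "]";
    return out;
}
```

A caller would simply `printf("%s\n", format_ids(tokens).c_str());` after tokenizing.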
Usage
Use this tool for debugging tokenizer behavior, prompt engineering, tokenizer validation, and understanding how a model's tokenizer breaks down text into token boundaries.
Code Reference
Source Location
- Repository: ggml-org/llama.cpp
- File: tools/tokenize/tokenize.cpp
- Lines: 1-416
Signature
// Main entry point
int main(int argc, char ** argv);
// Utility functions
static void print_usage_information(const char * argv0);
static std::string read_prompt_from_file(const char * filepath, bool & success);
static std::vector<std::string> ingest_args(int raw_argc, char ** raw_argv);
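A plausible implementation of `read_prompt_from_file` is a whole-file slurp with an out-parameter success flag, matching the signature above. This is a sketch under that assumption, not the verbatim source; the actual error handling and messages may differ.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Read an entire file into a string; sets `success` to false on any
// open failure and returns an empty string in that case.
static std::string read_prompt_from_file(const char * filepath, bool & success) {
    success = false;
    std::ifstream in(filepath, std::ios::binary);
    if (!in) {
        return "";
    }
    std::ostringstream ss;
    ss << in.rdbuf();   // slurp the whole stream
    success = true;
    return ss.str();
}
```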
Import
#include "common.h"
#include "llama.h"
#include <cstdio>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| -m, --model | string | Yes | Path to the GGUF model file (used for its tokenizer) |
| -p, --prompt | string | No | Text to tokenize (from CLI argument) |
| -f, --file | string | No | Path to a file containing text to tokenize |
| --stdin | flag | No | Read text to tokenize from standard input |
| --ids | flag | No | Output only numerical token IDs in Python list format |
| --no-bos | flag | No | Do not prepend BOS token |
| --no-escape | flag | No | Do not process escape sequences (\\n, \\t, etc.) |
| --no-parse-special | flag | No | Do not parse special/control tokens |
| --show-count | flag | No | Print total token count |
| --log-disable | flag | No | Suppress model loading log output |
Outputs
| Name | Type | Description |
|---|---|---|
| token output | stdout | Token strings with IDs, or numerical IDs in Python list format |
| token count | stdout | Total number of tokens (when --show-count is used) |
| return code | int | 0 on success, 1 on error |
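The escape processing that `--no-escape` disables turns literal two-character sequences like `\n` and `\t` in the prompt into real control characters before tokenization. The sketch below is a simplified stand-in handling only `\n`, `\t`, and `\\` (llama.cpp's common code has its own, more complete escape helper; this is not that function).

```cpp
#include <string>

// Simplified escape processing: convert literal "\n", "\t", "\\"
// sequences into the corresponding characters. Unknown escapes are
// passed through unchanged.
static std::string process_escapes(const std::string & in) {
    std::string out;
    out.reserve(in.size());
    for (size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\\' && i + 1 < in.size()) {
            switch (in[++i]) {
                case 'n':  out += '\n'; break;
                case 't':  out += '\t'; break;
                case '\\': out += '\\'; break;
                default:   out += '\\'; out += in[i]; break;
            }
        } else {
            out += in[i];
        }
    }
    return out;
}
```

With `--no-escape`, a prompt like `"a\nb"` is tokenized with the backslash and `n` as ordinary characters instead of a newline.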
Usage Examples
# Basic tokenization with human-readable output
./tokenize -m model.gguf -p "Hello, world!"
# Output only token IDs in Python format
./tokenize -m model.gguf -p "Hello, world!" --ids
# Tokenize from file, show count
./tokenize -m model.gguf -f input.txt --show-count
# Read from stdin
echo "Hello world" | ./tokenize -m model.gguf --stdin