Principle: BigScience Workshop Petals Output Decoding
| Knowledge Sources | |
|---|---|
| Domains | NLP, Postprocessing |
| Last Updated | 2026-02-09 14:00 GMT |
Overview
The process of converting generated token ID sequences back into human-readable text strings, including handling of special tokens, subword merging, and whitespace normalization.
Description
Output Decoding is the final step in any text generation pipeline. After the model produces a sequence of integer token IDs, these must be converted back to text. The decoding process reverses the tokenization encoding, merging subword tokens back into complete words and handling special tokens (BOS, EOS, PAD).
In the Petals context, decoding happens entirely on the client side, using the same tokenizer instance that was used for encoding. The key consideration is the skip_special_tokens parameter, which controls whether model control tokens (such as the BOS and EOS markers, e.g. <s> and </s>) appear in the output text.
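The effect of skip_special_tokens can be shown with a minimal toy decoder (the vocabulary and the decode helper below are illustrative, not the Petals or Hugging Face API):

```python
# Toy vocabulary with BOS/EOS markers; 'Ġ' marks a leading space,
# as in GPT-2 style tokenizers.
vocab = {0: "<s>", 1: "Hello", 2: "Ġworld", 3: "</s>"}
special_tokens = {"<s>", "</s>"}

def decode(token_ids, skip_special_tokens=True):
    pieces = [vocab[i] for i in token_ids]
    if skip_special_tokens:
        # Drop control tokens before merging
        pieces = [p for p in pieces if p not in special_tokens]
    # Post-processing: replace the space marker with a real space
    return "".join(pieces).replace("Ġ", " ")

print(decode([0, 1, 2, 3]))                             # -> 'Hello world'
print(decode([0, 1, 2, 3], skip_special_tokens=False))  # -> '<s>Hello world</s>'
```

With skip_special_tokens=False, the control markers leak into the visible output, which is rarely what an application wants.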
Usage
Use this principle as the final step after autoregressive generation to produce readable text output. Always use the same tokenizer instance that was used for encoding the input.
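The same-tokenizer requirement can be illustrated with a toy round trip (hypothetical vocabulary; a real pipeline would use the tokenizer object from the model's library):

```python
# A tiny shared vocabulary; decoding with the same vocabulary used for
# encoding recovers the original text exactly.
vocab = ["<s>", "</s>", "Ġthe", "Ġcat", "the", "cat"]
id_of = {tok: i for i, tok in enumerate(vocab)}

def encode(text):
    # Simplistic word-level encoding with a GPT-2 style 'Ġ' space marker
    ids = []
    for i, word in enumerate(text.split(" ")):
        ids.append(id_of[word if i == 0 else "Ġ" + word])
    return ids

def decode(ids):
    return "".join(vocab[i] for i in ids).replace("Ġ", " ")

ids = encode("the cat")
print(decode(ids))  # -> 'the cat'
```

Decoding these IDs with a different tokenizer's vocabulary would map the same integers to unrelated strings, which is why the encoding tokenizer must be reused.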
Theoretical Basis
Subword detokenization:
The decoder reverses the BPE/WordPiece encoding:
- Map each token ID to its string representation
- Concatenate all token strings
- Apply model-specific post-processing (e.g., removing Ġ prefix characters in GPT-2 style tokenizers, or ▁ in SentencePiece)
- Optionally filter out special tokens
# Abstract decoding algorithm (pseudocode)
token_strings = [vocab[id] for id in token_ids]
if skip_special_tokens:
    token_strings = [s for s in token_strings if s not in special_tokens]
text = post_process(concatenate(token_strings))
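The post_process step above is model-specific. A sketch of the SentencePiece convention, where ▁ marks the start of a word (vocabulary below is hypothetical):

```python
# SentencePiece-style detokenization: '▁' marks word starts and is
# replaced with a space; the leading space is then stripped.
sp_vocab = {0: "▁Hello", 1: "▁wor", 2: "ld", 3: "</s>"}

def sp_decode(token_ids, skip_special_tokens=True):
    pieces = [sp_vocab[i] for i in token_ids]
    if skip_special_tokens:
        pieces = [p for p in pieces if p != "</s>"]
    return "".join(pieces).replace("▁", " ").lstrip()

print(sp_decode([0, 1, 2, 3]))  # -> 'Hello world'
```

Note that "world" is split across two subword tokens ("▁wor" + "ld"); plain concatenation merges them, since only ▁ carries word-boundary information.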