Principle: BigScience Workshop Petals Output Decoding
| Knowledge Sources | |
|---|---|
| Domains | NLP, Postprocessing |
| Last Updated | 2026-02-09 14:00 GMT |
Overview
The process of converting generated token ID sequences back into human-readable text strings, including handling of special tokens, subword merging, and whitespace normalization.
Description
Output Decoding is the final step in any text generation pipeline. After the model produces a sequence of integer token IDs, these must be converted back to text. The decoding process reverses the tokenization encoding, merging subword tokens back into complete words and handling special tokens (BOS, EOS, PAD).
In the Petals context, decoding happens entirely on the client side, using the same tokenizer instance that was used for encoding. The key consideration is the skip_special_tokens parameter, which controls whether model control tokens (such as the BOS and EOS markers, e.g. <s> and </s>) appear in the output text.
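The effect of skip_special_tokens can be shown with a minimal toy decoder (the vocabulary and the decode helper below are illustrative, not the Petals or Hugging Face API):

```python
# Toy vocabulary with BOS/EOS markers; 'Ġ' marks a leading space,
# as in GPT-2 style tokenizers.
vocab = {0: "<s>", 1: "Hello", 2: "Ġworld", 3: "</s>"}
special_tokens = {"<s>", "</s>"}

def decode(token_ids, skip_special_tokens=True):
    pieces = [vocab[i] for i in token_ids]
    if skip_special_tokens:
        # Drop control tokens before merging
        pieces = [p for p in pieces if p not in special_tokens]
    # Post-processing: replace the space marker with a real space
    return "".join(pieces).replace("Ġ", " ")

print(decode([0, 1, 2, 3]))                             # -> 'Hello world'
print(decode([0, 1, 2, 3], skip_special_tokens=False))  # -> '<s>Hello world</s>'
```

With skip_special_tokens=False, the control markers leak into the visible output, which is rarely what an application wants.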
Usage
Use this principle as the final step after autoregressive generation to produce readable text output. Always use the same tokenizer instance that was used for encoding the input.
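The same-tokenizer requirement can be illustrated with a toy round trip (hypothetical vocabulary; a real pipeline would use the tokenizer object from the model's library):

```python
# A tiny shared vocabulary; decoding with the same vocabulary used for
# encoding recovers the original text exactly.
vocab = ["<s>", "</s>", "Ġthe", "Ġcat", "the", "cat"]
id_of = {tok: i for i, tok in enumerate(vocab)}

def encode(text):
    # Simplistic word-level encoding with a GPT-2 style 'Ġ' space marker
    ids = []
    for i, word in enumerate(text.split(" ")):
        ids.append(id_of[word if i == 0 else "Ġ" + word])
    return ids

def decode(ids):
    return "".join(vocab[i] for i in ids).replace("Ġ", " ")

ids = encode("the cat")
print(decode(ids))  # -> 'the cat'
```

Decoding these IDs with a different tokenizer's vocabulary would map the same integers to unrelated strings, which is why the encoding tokenizer must be reused.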
Theoretical Basis
Subword detokenization:
The decoder reverses the BPE/WordPiece encoding:
- Map each token ID to its string representation
- Concatenate all token strings
- Apply model-specific post-processing (e.g., removing Ġ prefix characters in GPT-2 style tokenizers, or ▁ in SentencePiece)
- Optionally filter out special tokens
# Abstract decoding algorithm (pseudocode)
token_strings = [vocab[id] for id in token_ids]
if skip_special_tokens:
    token_strings = [s for s in token_strings if s not in special_tokens]
text = post_process(concatenate(token_strings))
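The post_process step above is model-specific. A sketch of the SentencePiece convention, where ▁ marks the start of a word (vocabulary below is hypothetical):

```python
# SentencePiece-style detokenization: '▁' marks word starts and is
# replaced with a space; the leading space is then stripped.
sp_vocab = {0: "▁Hello", 1: "▁wor", 2: "ld", 3: "</s>"}

def sp_decode(token_ids, skip_special_tokens=True):
    pieces = [sp_vocab[i] for i in token_ids]
    if skip_special_tokens:
        pieces = [p for p in pieces if p != "</s>"]
    return "".join(pieces).replace("▁", " ").lstrip()

print(sp_decode([0, 1, 2, 3]))  # -> 'Hello world'
```

Note that "world" is split across two subword tokens ("▁wor" + "ld"); plain concatenation merges them, since only ▁ carries word-boundary information.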