Principle:Lucidrains X transformers Encoder Decoder Generation

From Leeroopedia


Metadata

Field | Value
Paper | Attention Is All You Need
Repository | x-transformers
Domains | Deep_Learning, NLP, Inference
Last Updated | 2026-02-08 18:00 GMT

Overview

A sequence generation procedure for encoder-decoder models that produces output tokens autoregressively, conditioned on the encoded source sequence.

Description

Generating from an encoder-decoder model involves three stages:

  • Encode the source sequence — the encoder processes the entire source sequence once, producing a set of hidden representations.
  • Initialize the decoder — the decoder is seeded with a start token (or prefix tokens) to begin generation.
  • Autoregressive decoding with cross-attention — at each step, the decoder generates the next output token while cross-attending to the encoder outputs. The token is appended to the decoder input and the process repeats.
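
The three stages above can be sketched in a few lines of plain Python. This is a minimal illustration, not the x-transformers implementation: `toy_encoder`, `toy_decoder_step`, and `generate` are hypothetical stand-ins, the "cross-attention" is reduced to reading one memory slot per step, and token selection is greedy.

```python
from typing import List, Sequence

# Hypothetical toy stand-ins for a trained model; these are NOT x-transformers APIs.

def toy_encoder(source: Sequence[int]) -> List[float]:
    """Stage 1: map each source token to one 'hidden' number."""
    return [float(tok) for tok in source]

def toy_decoder_step(prefix: Sequence[int], memory: Sequence[float],
                     vocab_size: int) -> List[float]:
    """Return one score per vocabulary entry for the next token.

    The score depends on the prefix length (autoregression) and on the
    encoder memory (a crude stand-in for cross-attention).
    """
    position = len(prefix) - 1                      # tokens generated so far
    target = int(memory[position % len(memory)]) % vocab_size
    return [1.0 if tok == target else 0.0 for tok in range(vocab_size)]

def generate(source: Sequence[int], start_token: int, max_len: int,
             vocab_size: int = 10) -> List[int]:
    memory = toy_encoder(source)      # stage 1: encode once
    decoded = [start_token]           # stage 2: seed the decoder
    for _ in range(max_len):          # stage 3: autoregressive loop
        scores = toy_decoder_step(decoded, memory, vocab_size)
        next_token = max(range(vocab_size), key=scores.__getitem__)  # greedy pick
        decoded.append(next_token)    # feed the new token back in
    return decoded[1:]                # drop the start token

print(generate([3, 1, 4], start_token=0, max_len=5))
```

Note that `toy_encoder` is called once, outside the loop, while `toy_decoder_step` is called once per generated token and receives the same `memory` every time.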

The encoder output is computed once and reused for all decoder steps, so the encoder cost is paid once per source sequence rather than once per generated token. The decoder uses the same sampling strategies available in standalone autoregressive generation (top-k, top-p, temperature, etc.).
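
As an illustration of the sampling strategies named above, the following sketch applies temperature scaling, top-k filtering, and top-p (nucleus) filtering to a list of next-token logits. The function names are hypothetical and the code uses only the Python standard library; the filtering functions shipped with x-transformers may differ in naming and in how thresholds are defined.

```python
import math
import random
from typing import List

NEG_INF = float("-inf")

def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                              # subtract max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(logits: List[float], k: int) -> List[float]:
    """Keep the k largest logits and mask the rest with -inf."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    keep = set(order[:k])
    return [x if i in keep else NEG_INF for i, x in enumerate(logits)]

def top_p_filter(logits: List[float], p: float) -> List[float]:
    """Keep the smallest set of tokens whose cumulative probability reaches p."""
    probs = softmax(logits)
    order = sorted(range(len(logits)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return [x if i in keep else NEG_INF for i, x in enumerate(logits)]

def sample_next(logits: List[float], temperature: float = 1.0) -> int:
    """Draw one token index from the (possibly filtered) logits."""
    probs = softmax(logits, temperature)
    return random.choices(range(len(probs)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.5, -1.0]
token = sample_next(top_k_filter(logits, k=2), temperature=0.8)
```

Masked entries receive probability zero after the softmax, so `token` here can only be index 0 or 1.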

Usage

Use this procedure at inference time, after training, for sequence-to-sequence tasks. Provide the source sequence and a start token (or prefix tokens); the decoder then generates the output sequence token by token.

Theoretical Basis

Conditional generation:

y_t ~ P(y_t | y_{<t}, Encoder(x))

The encoder computes h = Encoder(x) once. At each decoder step, the decoder uses cross-attention to attend to h while generating tokens autoregressively:

h = Encoder(x)
y_0 = start_token
y_1 ~ P(y_1 | y_0, h)
y_2 ~ P(y_2 | y_0, y_1, h)
...
y_t ~ P(y_t | y_{<t}, h)    where y_{<t} = (y_0, ..., y_{t-1})
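
Because each token is drawn from a distribution conditioned on the prefix and on h, the log-probability of a whole output sequence is the sum of the per-step log-probabilities. The sketch below computes that sum for a toy conditional model; `toy_step_probs` and `sequence_log_prob` are hypothetical helpers, and h is represented as a plain list of integers purely for illustration.

```python
import math
from typing import Callable, List, Sequence

def toy_step_probs(prefix: Sequence[int], memory: Sequence[int]) -> List[float]:
    """Hypothetical P(next token | prefix, h) over a 4-token vocabulary.

    The token stored in the memory slot for the current position gets
    probability 0.7; the other three tokens get 0.1 each.
    """
    vocab_size = 4
    favored = memory[(len(prefix) - 1) % len(memory)]
    return [0.7 if tok == favored else 0.1 for tok in range(vocab_size)]

def sequence_log_prob(step_probs: Callable[[Sequence[int], Sequence[int]], List[float]],
                      start_token: int,
                      output: Sequence[int],
                      memory: Sequence[int]) -> float:
    """Sum log P(y_t | prefix, h) over every token of the output."""
    prefix = [start_token]
    total = 0.0
    for token in output:
        probs = step_probs(prefix, memory)   # distribution for this step
        total += math.log(probs[token])      # score the token actually chosen
        prefix.append(token)                 # extend the prefix for the next step
    return total

h = [2, 0]  # stand-in for Encoder(x), computed once
likely = sequence_log_prob(toy_step_probs, 0, [2, 0], h)    # two favored tokens
unlikely = sequence_log_prob(toy_step_probs, 0, [2, 1], h)  # second token is off
```

The same accumulation is what beam search or sequence reranking would use to compare candidate outputs, although neither is described on this page.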

Generation stops when the maximum length is reached or an EOS token is produced.
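
The two stopping conditions can be expressed as a small loop. This is a sketch under stated assumptions: `decode_until_stop` is a hypothetical helper, `next_token` stands for any function that maps the current decoder prefix to the next token (with the encoder output already bound), and the EOS token is dropped from the returned sequence, which is a design choice rather than something the source specifies.

```python
from typing import Callable, List, Optional, Sequence, Tuple

def decode_until_stop(next_token: Callable[[Sequence[int]], int],
                      start_token: int,
                      max_len: int,
                      eos_token: Optional[int] = None) -> Tuple[List[int], str]:
    """Generate until EOS is produced or max_len tokens have been emitted.

    Returns the generated tokens (without the start token or EOS) and the
    reason generation stopped: "eos" or "max_len".
    """
    prefix = [start_token]
    for _ in range(max_len):
        token = next_token(prefix)
        if eos_token is not None and token == eos_token:
            return prefix[1:], "eos"        # stop early, EOS not included
        prefix.append(token)
    return prefix[1:], "max_len"            # length budget exhausted

# Hypothetical scripted "model": emits 5, 6, 7, then EOS (9), then 8.
scripted = [5, 6, 7, 9, 8]
step = lambda prefix: scripted[len(prefix) - 1]

print(decode_until_stop(step, start_token=0, max_len=5, eos_token=9))
print(decode_until_stop(step, start_token=0, max_len=2, eos_token=9))
```

In batched generation, sequences usually finish at different steps, so implementations typically keep decoding until every sequence has produced EOS or the length limit is hit, then mask out anything after each sequence's first EOS.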

Related Pages

Implemented By
