
Implementation:Tensorflow Tfjs GPT2Backbone Constructor

From Leeroopedia


Summary

GPT2Backbone constructs the core GPT-2 transformer architecture in TensorFlow.js, consisting of token embeddings, positional embeddings, N transformer decoder blocks, and final layer normalization. It uses TransformerDecoder layers for the decoder blocks and PositionEmbedding for positional encoding.

API

new GPT2Backbone(args: GPT2BackboneArgs) + TransformerDecoder + PositionEmbedding

Source

  • tfjs-layers/src/layers/nlp/models/gpt2/gpt2_backbone.ts:L125-221 (GPT2Backbone)
  • tfjs-layers/src/layers/nlp/modeling/transformer_decoder.ts:L206-494 (TransformerDecoder)
  • tfjs-layers/src/layers/nlp/modeling/position_embedding.ts:L86-147 (PositionEmbedding)

Type

API Doc

Signatures

GPT2Backbone

interface GPT2BackboneArgs {
  vocabularySize: number;
  numLayers: number;
  numHeads: number;
  hiddenDim: number;
  intermediateDim: number;
  dropout?: number;  // default 0.1
  maxSequenceLength?: number;  // default 1024
}

class GPT2Backbone extends Backbone {
  constructor(args: GPT2BackboneArgs)
  get tokenEmbedding(): Embedding
}

TransformerDecoder

interface TransformerDecoderArgs extends LayerArgs {
  intermediateDim: number;
  numHeads: number;
  dropout?: number;
  activation?: Activation|ActivationIdentifier;
  layerNormEpsilon?: number;
  normalizeFirst?: boolean;
}

class TransformerDecoder extends Layer {
  call(decoderSequence: Tensor, kwargs: TransformerDecoderOptions): Tensor
  callAndReturnCaches(decoderSequence, kwargs): [Tensor, Tensor, Tensor]
}

PositionEmbedding

interface PositionEmbeddingArgs extends LayerArgs {
  sequenceLength: number;
  initializer?: Initializer|InitializerIdentifier;
}

class PositionEmbedding extends Layer {
  call(inputs: Tensor|Tensor[], kwargs?: PositionEmbeddingOptions): Tensor
}

Constructor Parameters

GPT2BackboneArgs

Parameter Type Default Description
vocabularySize number (required) Number of tokens in the vocabulary (e.g., 50257 for GPT-2)
numLayers number (required) Number of transformer decoder blocks
numHeads number (required) Number of attention heads per decoder block
hiddenDim number (required) Dimensionality of the hidden representations
intermediateDim number (required) Inner dimension of the feed-forward network in each block
dropout number 0.1 Dropout rate for regularization
maxSequenceLength number 1024 Maximum sequence length for positional embeddings
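As a sanity check on these hyperparameters, the total parameter count can be estimated from them. The sketch below is a back-of-envelope calculation, not taken from the tfjs source; it assumes the standard GPT-2 layout (learned token and position embeddings, biased QKV/output projections, a biased two-layer FFN, two LayerNorms per block, a final LayerNorm, and a token-embedding matrix tied with the LM head so it is counted once):

```typescript
// Back-of-envelope parameter count for a GPT-2-style backbone.
// Assumes the standard GPT-2 layout described in the lead-in above.
interface Gpt2Config {
  vocabularySize: number;
  numLayers: number;
  hiddenDim: number;
  intermediateDim: number;
  maxSequenceLength: number;
}

function countParams(cfg: Gpt2Config): number {
  const h = cfg.hiddenDim;
  const embeddings =
    cfg.vocabularySize * h +        // token embedding (tied with the LM head)
    cfg.maxSequenceLength * h;      // position embedding
  const attention = 4 * (h * h + h);              // Q, K, V, output projections
  const ffn = h * cfg.intermediateDim + cfg.intermediateDim +  // up-projection
              cfg.intermediateDim * h + h;                     // down-projection
  const layerNorms = 2 * 2 * h;                   // two LayerNorms per block
  const perBlock = attention + ffn + layerNorms;
  return embeddings + cfg.numLayers * perBlock + 2 * h; // + final LayerNorm
}

// The GPT-2 base configuration from the table above:
const gpt2Base = countParams({
  vocabularySize: 50257, numLayers: 12,
  hiddenDim: 768, intermediateDim: 3072, maxSequenceLength: 1024,
});
console.log(gpt2Base); // 124439808, i.e. the familiar ~124M parameters
```

That the formula lands on ~124M for the base configuration is a useful check that the hyperparameters above describe the published GPT-2 architecture.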

TransformerDecoderArgs

Parameter Type Default Description
intermediateDim number (required) Inner dimension of the feed-forward network
numHeads number (required) Number of attention heads
dropout number undefined Dropout rate
activation ActivationIdentifier undefined Activation function for the FFN
layerNormEpsilon number undefined Epsilon value for layer normalization
normalizeFirst boolean undefined Whether to apply layer norm before (pre-norm) or after sub-layers
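The normalizeFirst flag selects between the two standard residual arrangements. A minimal sketch of the difference on plain number arrays (`norm` and `sublayer` are illustrative stand-ins, not the tfjs layers, which operate on Tensors):

```typescript
// Minimal sketch of pre-norm vs post-norm residual ordering.
type Vec = number[];

function residualBlock(
  x: Vec,
  sublayer: (v: Vec) => Vec,
  norm: (v: Vec) => Vec,
  normalizeFirst: boolean,
): Vec {
  if (normalizeFirst) {
    // Pre-norm (GPT-2 style): normalize the input, then add the residual.
    const out = sublayer(norm(x));
    return x.map((xi, i) => xi + out[i]);
  }
  // Post-norm (original Transformer): add the residual, then normalize.
  const out = sublayer(x);
  return norm(x.map((xi, i) => xi + out[i]));
}

// With an identity sublayer and a halving "norm", the two orderings differ:
const identity = (v: Vec) => v.map(x => x);
const halve = (v: Vec) => v.map(x => x / 2);
console.log(residualBlock([2, 4], identity, halve, true));  // [3, 6]: x + x/2
console.log(residualBlock([2, 4], identity, halve, false)); // [2, 4]: (x + x)/2
```

GPT-2 uses the pre-norm arrangement, which is why GPT2Backbone also applies a final layer normalization after the last decoder block.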

PositionEmbeddingArgs

Parameter Type Default Description
sequenceLength number (required) Maximum sequence length for positional encoding
initializer InitializerIdentifier undefined Weight initializer for position embeddings
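Conceptually, PositionEmbedding learns a `[sequenceLength, hiddenDim]` table and, at call time, adds its first `seqLen` rows to the token embeddings. A sketch of that lookup on plain arrays (the real layer works on batched Tensors and trains the table):

```typescript
// Sketch of PositionEmbedding's lookup: take the first `seqLen` rows of a
// learned [sequenceLength, hiddenDim] table and add them elementwise to the
// token embeddings. Plain 2-D arrays stand in for Tensors.
type Matrix = number[][];

function addPositionEmbeddings(tokenEmb: Matrix, positionTable: Matrix): Matrix {
  const seqLen = tokenEmb.length;
  if (seqLen > positionTable.length) {
    throw new Error('input is longer than the configured sequenceLength');
  }
  return tokenEmb.map((row, pos) =>
    row.map((v, d) => v + positionTable[pos][d]));
}

// hiddenDim = 2, sequenceLength = 3, input of length 2:
const posTable: Matrix = [[10, 10], [20, 20], [30, 30]];
console.log(addPositionEmbeddings([[1, 2], [3, 4]], posTable));
// [[11, 12], [23, 24]] -- only the first two table rows are used
```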

Properties

Property Return Type Description
tokenEmbedding Embedding The token embedding layer (used for weight tying in the LM head)
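Weight tying reuses the token-embedding matrix E as the LM head: logits are computed as hidden · Eᵀ, so no separate output projection is learned. A small sketch with plain arrays (the shapes here are illustrative; in tfjs the actual head belongs to the causal-LM task model built on top of the backbone, not to GPT2Backbone itself):

```typescript
// Sketch of weight tying: the LM head multiplies hidden states by the
// transpose of the [vocabSize, hiddenDim] token-embedding matrix E, so the
// same weights serve for input lookup and output projection.
type Matrix = number[][];

// logits[t][v] = hidden[t] . E[v]
function tiedLogits(hidden: Matrix, embedding: Matrix): Matrix {
  return hidden.map(h =>
    embedding.map(e => e.reduce((sum, ev, d) => sum + ev * h[d], 0)));
}

// hiddenDim = 2, vocabSize = 3, one timestep:
const E: Matrix = [[1, 0], [0, 1], [1, 1]];
console.log(tiedLogits([[2, 3]], E)); // [[2, 3, 5]]
```

This is why the backbone exposes tokenEmbedding as a property: the task model needs direct access to those weights.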

Methods

Class Method Description
TransformerDecoder call(decoderSequence, kwargs) Forward pass through a single decoder block
TransformerDecoder callAndReturnCaches(decoderSequence, kwargs) Forward pass that also returns KV caches for autoregressive generation
PositionEmbedding call(inputs, kwargs) Adds positional embeddings to input token embeddings
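callAndReturnCaches exists to support autoregressive decoding: keys and values computed for earlier tokens are cached so that each new token attends over the cache plus itself instead of recomputing the whole prefix. A minimal sketch of the cache update on plain arrays (the real caches are Tensors, and their exact layout in tfjs is not shown here):

```typescript
// Minimal sketch of a per-step KV-cache update for autoregressive decoding:
// each generation step appends the new token's key and value vectors, so
// attention at step t covers all cached positions plus the new one.
type Vec = number[];

interface KVCache { keys: Vec[]; values: Vec[]; }

function appendToCache(cache: KVCache, key: Vec, value: Vec): KVCache {
  // A real implementation writes into a pre-allocated Tensor at the current
  // index rather than growing arrays on every step.
  return { keys: [...cache.keys, key], values: [...cache.values, value] };
}

let cache: KVCache = { keys: [], values: [] };
cache = appendToCache(cache, [1, 0], [0.5, 0.5]); // step 0
cache = appendToCache(cache, [0, 1], [0.2, 0.8]); // step 1
console.log(cache.keys.length); // 2 positions cached
```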

I/O

  • Inputs: Architecture hyperparameters (vocabulary size, number of layers, heads, dimensions)
  • Outputs: A GPT2Backbone model that accepts {token_ids, padding_mask} and produces sequence hidden states of shape [batch, seq_len, hidden_dim]

Example

const backbone = new GPT2Backbone({
  vocabularySize: 50257,
  numLayers: 12,
  numHeads: 12,
  hiddenDim: 768,
  intermediateDim: 3072,
  dropout: 0.1,
  maxSequenceLength: 1024,
});

Implements

Principle:Tensorflow_Tfjs_Transformer_Backbone_Construction

Environment:Tensorflow_Tfjs_Browser_Runtime

Domains

NLP Transformer_Architecture

Sources

TensorFlow.js

Metadata

2026-02-10 00:00 GMT
