Implementation:Deepspeedai DeepSpeed LayerSpec Init

From Leeroopedia


Overview

LayerSpec and TiedLayerSpec are the DeepSpeed classes for deferred layer construction in pipeline-parallel models. LayerSpec stores a layer class and its constructor arguments without instantiating the layer. TiedLayerSpec extends it with a key parameter that identifies a module shared by several positions, which enables weight tying across pipeline stages.

Description

A LayerSpec records the layer class (typename) together with its positional and keyword constructor arguments; the layer itself is not instantiated at that point. When build() is called (by PipelineModule during stage assignment), it constructs the actual nn.Module by calling typename(*module_args, **module_kwargs). The class validates at creation time that typename is a subclass of torch.nn.Module and raises a RuntimeError if it is not.
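
The deferred-construction pattern described above can be sketched in plain Python. This is a simplified stand-in, not the DeepSpeed source: Module, Linear, and SketchLayerSpec are hypothetical names, and Module plays the role of torch.nn.Module so that the snippet runs without PyTorch installed.

```python
class Module:
    """Hypothetical stand-in for torch.nn.Module."""


class Linear(Module):
    """Hypothetical layer that counts how many times it is constructed."""
    instances = 0

    def __init__(self, in_features, out_features, bias=True):
        Linear.instances += 1
        self.in_features = in_features
        self.out_features = out_features
        self.bias = bias


class SketchLayerSpec:
    """Stores a class and its constructor arguments; builds on demand."""

    def __init__(self, typename, *module_args, **module_kwargs):
        # Validate eagerly, mirroring the creation-time check described above.
        if not (isinstance(typename, type) and issubclass(typename, Module)):
            raise RuntimeError("spec only supports Module subclasses")
        self.typename = typename
        self.module_args = module_args
        self.module_kwargs = module_kwargs

    def build(self):
        # Construction happens here, not in __init__.
        return self.typename(*self.module_args, **self.module_kwargs)


spec = SketchLayerSpec(Linear, 1024, 4096, bias=False)
built_before = Linear.instances   # creating the spec built nothing
layer = spec.build()
built_after = Linear.instances    # build() constructed exactly one layer
```

Because creating a spec constructs nothing, a list of specs can describe a whole model while each stage builds only the layers it is assigned.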

TiedLayerSpec extends LayerSpec with additional attributes for weight tying:

  • key: A string identifier; every TiedLayerSpec with the same key refers to the same tied module.
  • forward_fn: An optional callable with signature (module, input), used at this position in place of the module's own forward.
  • tied_weight_attr: The attribute name, or list of names, of the weight(s) to tie (defaults to ['weight']).
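
How these three attributes might be consumed can be sketched as follows. This is an illustrative assumption about the consumer side, not DeepSpeed's PipelineModule code: SketchTiedSpec, Embedding, build_stage, and first_row are hypothetical names, and the plain Embedding class stands in for an nn.Module.

```python
from functools import partial


class Embedding:
    """Hypothetical stand-in for a module that owns a `weight` attribute."""

    def __init__(self, num_embeddings, embedding_dim):
        self.weight = [[0.0] * embedding_dim for _ in range(num_embeddings)]

    def __call__(self, token_ids):
        return [self.weight[i] for i in token_ids]


class SketchTiedSpec:
    def __init__(self, key, typename, *module_args, forward_fn=None,
                 tied_weight_attr=("weight",), **module_kwargs):
        self.key = key
        self.typename = typename
        self.module_args = module_args
        self.module_kwargs = module_kwargs
        self.forward_fn = forward_fn
        # Accept a single name or a list of names (assumed normalization).
        if isinstance(tied_weight_attr, str):
            tied_weight_attr = [tied_weight_attr]
        self.tied_weight_attr = list(tied_weight_attr)

    def build(self):
        return self.typename(*self.module_args, **self.module_kwargs)


def build_stage(specs):
    """Build one module per key and return one callable per position."""
    tied_modules = {}
    forward_funcs = []
    for spec in specs:
        if spec.key not in tied_modules:
            tied_modules[spec.key] = spec.build()
        module = tied_modules[spec.key]
        if spec.forward_fn is None:
            forward_funcs.append(module)
        else:
            # Bind the shared module as the first argument: (module, input).
            forward_funcs.append(partial(spec.forward_fn, module))
    return tied_modules, forward_funcs


def first_row(module, input):
    # Toy forward_fn: ignores its input and returns one row of the weight.
    return module.weight[0]


specs = [
    SketchTiedSpec("embed", Embedding, 8, 4),
    SketchTiedSpec("embed", Embedding, 8, 4, forward_fn=first_row),
]
tied_modules, forward_funcs = build_stage(specs)
```

In this sketch both positions share one object within a single process. Keeping the weights in sync when the positions land on different pipeline stages is left to the pipeline engine and is not modeled here.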

Code Reference

LayerSpec signature:

class LayerSpec:
    def __init__(self, typename, *module_args, **module_kwargs)

TiedLayerSpec signature:

class TiedLayerSpec(LayerSpec):
    def __init__(self, key, typename, *module_args, forward_fn=None,
                 tied_weight_attr=['weight'], **module_kwargs)

Import:

from deepspeed.pipe import LayerSpec, TiedLayerSpec

I/O Contract

LayerSpec Inputs

Parameter        Type        Required  Description
typename         type        Yes       An nn.Module subclass to be constructed
*module_args     positional  No        Positional arguments passed to typename.__init__
**module_kwargs  keyword     No        Keyword arguments passed to typename.__init__

TiedLayerSpec Additional Inputs

Parameter         Type         Required  Description
key               str          Yes       Identifier for the tied weight group
forward_fn        callable     No        Custom forward function with signature (module, input)
tied_weight_attr  str or list  No        Weight attribute name(s) to tie, default ['weight']

Outputs

Output            Type       Description
LayerSpec object  LayerSpec  Specification object; calling .build() returns the constructed nn.Module

Usage Example

import torch
import torch.nn as nn
from deepspeed.pipe import LayerSpec, TiedLayerSpec

vocab_size = 30000
hidden_size = 1024

# Basic layer specification without instantiation
layers = [
    LayerSpec(nn.Embedding, vocab_size, hidden_size),
    LayerSpec(nn.Linear, hidden_size, hidden_size),
    LayerSpec(nn.Linear, hidden_size, hidden_size),
    LayerSpec(nn.Linear, hidden_size, vocab_size),
]

# Weight tying between embedding and output projection
def output_forward(module, input):
    return torch.nn.functional.linear(input, module.weight)

tied_layers = [
    TiedLayerSpec("embed", nn.Embedding, vocab_size, hidden_size),
    LayerSpec(nn.Linear, hidden_size, hidden_size),
    LayerSpec(nn.Linear, hidden_size, hidden_size),
    TiedLayerSpec("embed", nn.Embedding, vocab_size, hidden_size,
                  forward_fn=output_forward),
]
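
Why output_forward works as an output projection can be checked with PyTorch alone. The sketch below assumes small illustrative sizes and does not involve DeepSpeed: torch.nn.functional.linear(x, W) computes x @ W.T, so an embedding weight of shape (vocab_size, hidden_size) maps hidden states back to vocabulary logits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, hidden_size = 50, 8   # illustrative sizes, far smaller than above

embed = nn.Embedding(vocab_size, hidden_size)


def output_forward(module, input):
    # Reuse the embedding matrix as the projection weight (no bias).
    return F.linear(input, module.weight)


token_ids = torch.randint(0, vocab_size, (2, 5))   # (batch, seq)
hidden = embed(token_ids)                          # (batch, seq, hidden_size)
logits = output_forward(embed, hidden)             # (batch, seq, vocab_size)
```

The same parameter tensor serves both the lookup and the projection, which is the sharing that the two TiedLayerSpec entries with key "embed" express.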

Last updated: 2026-02-09 00:00 GMT
