Implementation:Deepspeedai DeepSpeed LayerSpec Init
Overview
LayerSpec is the concrete tool the DeepSpeed library provides for deferred layer construction in pipeline-parallel models. It stores a layer class and its constructor arguments without instantiating the layer. TiedLayerSpec extends it with a key parameter that identifies weights shared across pipeline stages.
Description
LayerSpec stores a layer class and its constructor arguments without instantiating the layer. When build() is called (by PipelineModule during stage assignment), it constructs the actual nn.Module. The class validates at creation time that the provided typename is a subclass of torch.nn.Module, raising a RuntimeError if not.
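The deferred-construction pattern described above can be sketched in plain Python. This is a simplified stand-in rather than DeepSpeed's source: `DeferredSpec`, `Base`, and `Dense` are hypothetical names, and `Base` plays the role of torch.nn.Module so that the sketch runs without PyTorch installed.

```python
class Base:
    """Hypothetical stand-in for torch.nn.Module."""


class Dense(Base):
    """Hypothetical layer that simply records its constructor arguments."""

    def __init__(self, in_features, out_features, bias=True):
        self.in_features = in_features
        self.out_features = out_features
        self.bias = bias


class DeferredSpec:
    """Stores a class and its arguments; instantiates only when build() is called."""

    def __init__(self, typename, *module_args, **module_kwargs):
        # Validate eagerly so a bad spec fails where it is defined,
        # not later when the layer is finally built.
        if not (isinstance(typename, type) and issubclass(typename, Base)):
            raise RuntimeError("DeferredSpec only supports Base subclasses.")
        self.typename = typename
        self.module_args = module_args
        self.module_kwargs = module_kwargs

    def build(self):
        # The layer object is created here and nowhere else.
        return self.typename(*self.module_args, **self.module_kwargs)


spec = DeferredSpec(Dense, 1024, 4096, bias=False)  # no Dense object exists yet
layer = spec.build()                                 # the layer is created here
```

Because a spec holds only a class reference and a few arguments, a long list of specs stays cheap to create on every process, and each process can call build() only for the layers it owns.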
TiedLayerSpec extends LayerSpec with additional attributes for weight tying:
- key: A string identifier shared by all positions where this tied module appears.
- forward_fn: An optional custom forward function with signature (module, input).
- tied_weight_attr: The attribute name(s) of the weight to tie (defaults to ['weight']).
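The three attributes above can be illustrated with a small plain-Python sketch. `TiedSpec`, `FakeEmbedding`, and `group_by_key` are hypothetical names introduced for illustration, and converting a single string to a one-element list is an assumption about one reasonable way to accept both a str and a list, not a statement about DeepSpeed's internals.

```python
from collections import defaultdict


class FakeEmbedding:
    """Hypothetical placeholder for a real layer class."""


class TiedSpec:
    """Hypothetical stand-in carrying key, forward_fn, and tied_weight_attr."""

    def __init__(self, key, typename, *module_args, forward_fn=None,
                 tied_weight_attr=("weight",), **module_kwargs):
        self.key = key
        self.typename = typename
        self.module_args = module_args
        self.module_kwargs = module_kwargs
        self.forward_fn = forward_fn
        # Accept a single name or a sequence of names; store a list either way.
        if isinstance(tied_weight_attr, str):
            tied_weight_attr = [tied_weight_attr]
        self.tied_weight_attr = list(tied_weight_attr)


def group_by_key(specs):
    """Map each tie key to the layer positions that share it."""
    groups = defaultdict(list)
    for index, spec in enumerate(specs):
        if isinstance(spec, TiedSpec):
            groups[spec.key].append(index)
    return dict(groups)


specs = [
    TiedSpec("embed", FakeEmbedding),
    object(),  # placeholder for an untied layer
    TiedSpec("embed", FakeEmbedding,
             forward_fn=lambda module, input: input,
             tied_weight_attr="weight"),
]
positions = group_by_key(specs)
```

Grouping by key makes the intent visible: every position that reports the same key refers to one shared set of weights, while forward_fn lets each position use those weights differently.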
Code Reference
- Repository: https://github.com/deepspeedai/DeepSpeed
- File: deepspeed/runtime/pipe/module.py
- Lines: L30-84
LayerSpec signature:
class LayerSpec:
    def __init__(self, typename, *module_args, **module_kwargs)
TiedLayerSpec signature:
class TiedLayerSpec(LayerSpec):
    def __init__(self, key, typename, *module_args, forward_fn=None,
                 tied_weight_attr=['weight'], **module_kwargs)
Import:
from deepspeed.pipe import LayerSpec, TiedLayerSpec
I/O Contract
LayerSpec Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| typename | type | Yes | An nn.Module subclass to be constructed |
| *module_args | positional | No | Positional arguments passed to typename.__init__ |
| **module_kwargs | keyword | No | Keyword arguments passed to typename.__init__ |
TiedLayerSpec Additional Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| key | str | Yes | Identifier for the tied weight group |
| forward_fn | callable | No | Custom forward function with signature (module, input) |
| tied_weight_attr | str or list | No | Weight attribute name(s) to tie, default ['weight'] |
Outputs
| Output | Type | Description |
|---|---|---|
| LayerSpec object | LayerSpec | Specification object; calling .build() returns the constructed nn.Module |
Usage Example
import torch
import torch.nn as nn
from deepspeed.pipe import LayerSpec, TiedLayerSpec

vocab_size = 30000
hidden_size = 1024

# Basic layer specification without instantiation
layers = [
    LayerSpec(nn.Embedding, vocab_size, hidden_size),
    LayerSpec(nn.Linear, hidden_size, hidden_size),
    LayerSpec(nn.Linear, hidden_size, hidden_size),
    LayerSpec(nn.Linear, hidden_size, vocab_size),
]

# Weight tying between embedding and output projection:
# the last position reuses the embedding matrix as a linear projection.
def output_forward(module, input):
    return torch.nn.functional.linear(input, module.weight)

tied_layers = [
    TiedLayerSpec("embed", nn.Embedding, vocab_size, hidden_size),
    LayerSpec(nn.Linear, hidden_size, hidden_size),
    LayerSpec(nn.Linear, hidden_size, hidden_size),
    TiedLayerSpec("embed", nn.Embedding, vocab_size, hidden_size,
                  forward_fn=output_forward),
]
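To see why output_forward works as an output projection, the following sketch runs the same function on a single nn.Embedding in plain PyTorch, with no DeepSpeed or pipeline involved. The small sizes and the variable names `token_ids`, `hidden`, and `logits` are chosen for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, hidden_size = 100, 16  # small sizes for illustration only

# The embedding weight has shape (vocab_size, hidden_size).
embed = nn.Embedding(vocab_size, hidden_size)


def output_forward(module, input):
    # Reuse the embedding matrix as the output projection: input @ weight.T
    return F.linear(input, module.weight)


token_ids = torch.randint(0, vocab_size, (2, 5))  # (batch, sequence)
hidden = embed(token_ids)                         # (2, 5, hidden_size)
logits = output_forward(embed, hidden)            # (2, 5, vocab_size)
```

F.linear multiplies by the transpose of the weight it is given, so a (vocab_size, hidden_size) embedding matrix maps hidden states back to vocabulary-sized logits without a second parameter matrix.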
Related Pages
- Principle:Deepspeedai_DeepSpeed_Pipeline_Layer_Specification
- Implementation:Deepspeedai_DeepSpeed_PipelineModule_Init
Knowledge Sources
Last updated: 2026-02-09 00:00 GMT