
Principle:LaurentMazare Tch rs Linear Layer

From Leeroopedia


Knowledge Sources
Domains Deep_Learning, Neural_Network_Layers
Last Updated 2026-02-08 14:00 GMT

Overview

Fully-connected layer that applies an affine transformation to input data, mapping from one dimensionality to another.

Description

A linear (fully-connected) layer computes y = xW^T + b, where W is a weight matrix of shape [out_dim, in_dim] and b is an optional bias vector of shape [out_dim]. It is the most fundamental building block in neural networks, used for dimensionality changes, classification heads, and feature projections. By default the weight is initialized with Kaiming uniform and the bias is drawn uniformly from the range ±1/sqrt(in_dim).

Usage

Use this principle whenever you need to transform feature vectors between dimensions: input-to-hidden, hidden-to-hidden, or hidden-to-output projections. Essential for classifier heads, MLP blocks, and any dense connection between layers.

Theoretical Basis

y = xW^T + b

Where:

  • x: Input tensor of shape [batch_size, in_dim]
  • W: Weight matrix of shape [out_dim, in_dim]
  • b: Optional bias vector of shape [out_dim]
  • y: Output tensor of shape [batch_size, out_dim]
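The shape bookkeeping above can be checked with a dependency-free sketch of the affine map; all names and the small test values are illustrative:

```rust
/// Compute y = xW^T + b.
/// x: [batch, in_dim], w: [out_dim][in_dim] (note the transpose convention),
/// b: [out_dim] -> returns y: [batch, out_dim].
fn linear_forward(x: &[Vec<f64>], w: &[Vec<f64>], b: &[f64]) -> Vec<Vec<f64>> {
    x.iter()
        .map(|row| {
            w.iter()
                .zip(b)
                // Each output feature is a dot product with one row of W plus its bias.
                .map(|(w_row, bias)| {
                    row.iter().zip(w_row).map(|(xi, wi)| xi * wi).sum::<f64>() + bias
                })
                .collect()
        })
        .collect()
}

fn main() {
    // batch = 2, in_dim = 3, out_dim = 2.
    let x = vec![vec![1.0, 2.0, 3.0], vec![0.0, 1.0, 0.0]];
    let w = vec![vec![1.0, 0.0, 0.0], vec![0.0, 1.0, 1.0]]; // [out_dim, in_dim]
    let b = vec![0.5, -0.5];
    let y = linear_forward(&x, &w, &b);
    assert_eq!(y, vec![vec![1.5, 4.5], vec![0.5, 0.5]]);
}
```

Storing W as [out_dim, in_dim] (rather than [in_dim, out_dim]) is what makes the formula xW^T rather than xW; this matches the shape convention used throughout this page.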

Default initialization:

  • Weight: Kaiming uniform — U(-√(1/in_dim), √(1/in_dim))
  • Bias: Uniform — U(-√(1/in_dim), √(1/in_dim))
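The initialization bound can be sketched as follows. This is a plain-Rust illustration of the U(-√(1/in_dim), √(1/in_dim)) rule; the hand-rolled LCG random generator is purely illustrative and is not the RNG tch-rs uses:

```rust
/// Fill an [out_dim, in_dim] weight matrix uniformly in [-bound, bound),
/// where bound = 1/sqrt(in_dim). Illustrative only.
fn uniform_init(out_dim: usize, in_dim: usize, seed: &mut u64) -> Vec<Vec<f64>> {
    let bound = 1.0 / (in_dim as f64).sqrt();
    (0..out_dim)
        .map(|_| {
            (0..in_dim)
                .map(|_| {
                    // Linear congruential generator -> u uniform in [0, 1).
                    *seed = seed
                        .wrapping_mul(6364136223846793005)
                        .wrapping_add(1442695040888963407);
                    let u = (*seed >> 11) as f64 / (1u64 << 53) as f64;
                    // Rescale [0, 1) to [-bound, bound).
                    (2.0 * u - 1.0) * bound
                })
                .collect()
        })
        .collect()
}

fn main() {
    let mut seed = 42;
    let w = uniform_init(4, 16, &mut seed);
    let bound = 1.0 / (16f64).sqrt(); // 0.25 for in_dim = 16
    assert!(w.iter().flatten().all(|&v| v >= -bound && v < bound));
}
```

Note that the bound shrinks as in_dim grows, which keeps the variance of each output feature roughly independent of the layer's input width.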

Related Pages

Implemented By
