Principle:Scikit learn Scikit learn Discriminant Analysis

Knowledge Sources	Scikit_learn Scikit-learn Docs
Domains	Supervised Learning, Dimensionality Reduction
Last Updated	2026-02-08 15:00 GMT

Overview

Discriminant analysis methods classify observations by modeling the distribution of features within each class, using these distributions to derive optimal decision boundaries.

Description

Discriminant analysis encompasses generative classification methods that model the class-conditional distributions of features and apply Bayes' theorem to obtain posterior class probabilities. Linear Discriminant Analysis (LDA) assumes all classes share a common covariance matrix, yielding linear decision boundaries; Quadratic Discriminant Analysis (QDA) allows each class its own covariance matrix, producing quadratic boundaries. Beyond classification, LDA also serves as a supervised dimensionality reduction technique by projecting data onto the most discriminative directions. These methods bridge statistical modeling and classification, providing both predictive capability and geometric insight into class structure.

Usage

Use Linear Discriminant Analysis when classes are approximately Gaussian with similar covariance structures, when you need a linear classifier with probabilistic outputs, or when supervised dimensionality reduction is desired (projecting to at most $C - 1$ dimensions for $C$ classes). Use Quadratic Discriminant Analysis when classes have different covariance structures and the dataset is large enough to estimate per-class covariance matrices reliably. LDA is particularly useful as a preprocessing step for visualization or before applying classifiers that benefit from reduced dimensionality.
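The three uses named above (LDA as a probabilistic classifier, LDA as a supervised projection, and QDA as a classifier) might look roughly like this in scikit-learn. This is a minimal sketch: the Iris dataset, the 70/30 split, and the random seed are illustrative choices, not something the source prescribes.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.model_selection import train_test_split

# Illustrative data: Iris has C = 3 classes and 4 features.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# LDA as a linear classifier with probabilistic outputs.
lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)
lda_proba = lda.predict_proba(X_test)  # one row of class probabilities per sample
lda_acc = lda.score(X_test, y_test)

# LDA as supervised dimensionality reduction: at most C - 1 = 2 components here.
reducer = LinearDiscriminantAnalysis(n_components=2)
X_2d = reducer.fit_transform(X_train, y_train)

# QDA estimates one covariance matrix per class.
qda = QuadraticDiscriminantAnalysis()
qda.fit(X_train, y_train)
qda_acc = qda.score(X_test, y_test)
```

The projected array `X_2d` can be passed to a plotting routine or to a downstream classifier, which is the preprocessing use mentioned above.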

Theoretical Basis

Generative Model: Both LDA and QDA assume class-conditional Gaussian distributions:

$p (x | y = c) = 𝒩 (x | μ_{c}, Σ_{c})$

Using Bayes' theorem:

$p (y = c | x) = \frac{p (x | y = c) p (y = c)}{p (x)}$

The predicted class is $\hat{y} = \arg \max_{c} p (y = c | x)$ .
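As a small worked illustration of the Bayes step, the sketch below computes posteriors for a one-dimensional, two-class problem. The function names `gaussian_pdf` and `posterior`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def posterior(x, means, variances, priors):
    """Return p(y = c | x) for every class c via Bayes' theorem."""
    # Numerator of Bayes' theorem: p(x | y = c) * p(y = c)
    joint = [gaussian_pdf(x, m, v) * p for m, v, p in zip(means, variances, priors)]
    # Denominator: p(x) = sum over classes of the joint terms
    evidence = sum(joint)
    return [j / evidence for j in joint]

# Two classes with equal priors and equal variance (illustrative values).
post = posterior(1.5, means=[0.0, 2.0], variances=[1.0, 1.0], priors=[0.5, 0.5])

# Predicted class: the index with the largest posterior.
y_hat = max(range(len(post)), key=post.__getitem__)
```

With equal priors and variances, a point exactly halfway between the two means receives a posterior of 0.5 for each class, which is where the decision boundary lies.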

Linear Discriminant Analysis (LDA) assumes a shared covariance: $Σ_{c} = Σ$ for all classes. After dropping terms that are the same for every class, the log-posterior reduces to a discriminant function that is linear in $x$:

$δ_{c} (x) = x^{T} Σ^{- 1} μ_{c} - \frac{1}{2} μ_{c}^{T} Σ^{- 1} μ_{c} + \log π_{c}$

where $π_{c} = p (y = c)$ is the prior probability. The decision boundary between classes $c$ and $c^{'}$ is the set where $δ_{c} (x) = δ_{c^{'}} (x)$ , which is a hyperplane.
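The discriminant function $δ_{c} (x)$ can be evaluated directly with NumPy. The following is a sketch under stated assumptions, not scikit-learn's implementation: `fit_lda` and `lda_discriminants` are hypothetical helper names, the shared covariance is estimated with an unbiased pooled estimator, and the data are synthetic.

```python
import numpy as np

def fit_lda(X, y):
    """Estimate class means, a pooled covariance, and class priors."""
    classes = np.unique(y)
    n_samples = X.shape[0]
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    priors = np.array([np.mean(y == c) for c in classes])
    # Centre every sample on its own class mean, then pool across classes.
    centered = X - means[np.searchsorted(classes, y)]
    cov = centered.T @ centered / (n_samples - len(classes))
    return classes, means, cov, priors

def lda_discriminants(X, means, cov, priors):
    """Return delta_c(x) for every sample (rows) and class (columns)."""
    # Column c of A holds Sigma^{-1} mu_c; solving avoids an explicit inverse.
    A = np.linalg.solve(cov, means.T)
    # Class-wise constant: -1/2 mu_c^T Sigma^{-1} mu_c + log pi_c
    const = -0.5 * np.sum(means.T * A, axis=0) + np.log(priors)
    return X @ A + const

# Synthetic data: two Gaussian classes sharing an identity covariance.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=1.0, size=(100, 2)),
    rng.normal(loc=[4.0, 4.0], scale=1.0, size=(100, 2)),
])
y = np.repeat([0, 1], 100)

classes, means, cov, priors = fit_lda(X, y)
scores = lda_discriminants(X, means, cov, priors)
y_pred = classes[np.argmax(scores, axis=1)]
accuracy = np.mean(y_pred == y)
```

Because each score is an affine function of $x$, the set where two columns of `scores` are equal is a hyperplane, matching the decision boundary described above.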

LDA for dimensionality reduction finds projections that maximize the ratio of between-class scatter to within-class scatter:

$w^{*} = \arg \max_{w} \frac{w^{T} S_{B} w}{w^{T} S_{W} w}$

where

$S_{B} = \sum_{c = 1}^{C} n_{c} (μ_{c} - μ) (μ_{c} - μ)^{T}$ (between-class scatter)

$S_{W} = \sum_{c = 1}^{C} \sum_{i \in c} (x_{i} - μ_{c}) (x_{i} - μ_{c})^{T}$ (within-class scatter)

Here $n_{c}$ is the number of samples in class $c$ and $μ$ is the overall mean of the data.

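The solution is obtained from the generalized eigenvalue problem $S_{B} w = λ S_{W} w$. Because $S_{B}$ is a sum of $C$ rank-one terms whose weighted mean deviations sum to zero, its rank is at most $C - 1$, so at most $C - 1$ eigenvalues are non-zero and the reduced dimensionality is limited to $C - 1$.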
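The scatter matrices and the generalized eigenvalue problem can be sketched with NumPy and SciPy as follows. The helper names `scatter_matrices` and `lda_directions` are hypothetical, the data are synthetic, and the sketch assumes $S_{W}$ is positive definite (more samples than features, no collinear features); scikit-learn's own solvers may proceed differently.

```python
import numpy as np
from scipy.linalg import eigh

def scatter_matrices(X, y):
    """Return the between-class (S_B) and within-class (S_W) scatter matrices."""
    d = X.shape[1]
    mu = X.mean(axis=0)  # overall mean
    S_B = np.zeros((d, d))
    S_W = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)
        diff = (mu_c - mu)[:, None]
        S_B += len(Xc) * (diff @ diff.T)
        S_W += (Xc - mu_c).T @ (Xc - mu_c)
    return S_B, S_W

def lda_directions(S_B, S_W):
    """Solve S_B w = lambda S_W w; return eigenpairs sorted by decreasing lambda."""
    eigvals, eigvecs = eigh(S_B, S_W)  # symmetric-definite generalized problem
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

# Synthetic data: C = 3 classes in 4 dimensions.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0], [0.0, 3.0, 0.0, 0.0]])
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(50, 4)) for c in centers])
y = np.repeat([0, 1, 2], 50)

S_B, S_W = scatter_matrices(X, y)
eigvals, eigvecs = lda_directions(S_B, S_W)

# Keep the C - 1 = 2 most discriminative directions and project the data.
W = eigvecs[:, :2]
X_proj = X @ W
```

In this example only the first two entries of `eigvals` are meaningfully above zero, reflecting the $C - 1$ limit.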

Quadratic Discriminant Analysis (QDA) allows per-class covariance matrices $Σ_{c}$. Up to a class-independent constant, the log-posterior is a discriminant function with a quadratic term in $x$:

$δ_{c} (x) = - \frac{1}{2} (x - μ_{c})^{T} Σ_{c}^{- 1} (x - μ_{c}) - \frac{1}{2} \log | Σ_{c} | + \log π_{c}$

This produces quadratic (curved) decision boundaries. The added flexibility comes at the cost of estimating a separate covariance matrix, with $d (d + 1) / 2$ free parameters for $d$ features, for every class.
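The QDA discriminant can be evaluated term by term in NumPy. This is a sketch with hypothetical helper names (`fit_qda`, `qda_discriminants`) and synthetic data. The two classes deliberately share the same mean and differ only in covariance, a case that a linear boundary cannot separate.

```python
import numpy as np

def fit_qda(X, y):
    """Estimate a mean, covariance matrix, and prior for every class."""
    classes = np.unique(y)
    means = [X[y == c].mean(axis=0) for c in classes]
    covs = [np.cov(X[y == c], rowvar=False) for c in classes]
    priors = [np.mean(y == c) for c in classes]
    return classes, means, covs, priors

def qda_discriminants(X, means, covs, priors):
    """Return delta_c(x) for every sample (rows) and class (columns)."""
    scores = []
    for mu, cov, prior in zip(means, covs, priors):
        diff = X - mu
        # Squared Mahalanobis distance (x - mu)^T Sigma_c^{-1} (x - mu) per sample.
        maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
        # log |Sigma_c| computed stably via slogdet.
        _, logdet = np.linalg.slogdet(cov)
        scores.append(-0.5 * maha - 0.5 * logdet + np.log(prior))
    return np.column_stack(scores)

# Synthetic data: same mean, very different spread.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(200, 2)),
    rng.normal(loc=0.0, scale=3.0, size=(200, 2)),
])
y = np.repeat([0, 1], 200)

classes, means, covs, priors = fit_qda(X, y)
scores = qda_discriminants(X, means, covs, priors)
y_pred = classes[np.argmax(scores, axis=1)]
accuracy = np.mean(y_pred == y)
```

Here the boundary where the two scores are equal is approximately a circle around the origin, which illustrates the curved boundaries described above.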

Related Pages

Implementation:Scikit_learn_Scikit_learn_LinearDiscriminantAnalysis
