Principle:Scikit learn Scikit learn Discriminant Analysis
| Knowledge Sources | |
|---|---|
| Domains | Supervised Learning, Dimensionality Reduction |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Discriminant analysis methods classify observations by modeling the distribution of features within each class, using these distributions to derive optimal decision boundaries.
Description
Discriminant analysis encompasses generative classification methods that model the class-conditional distributions of features and apply Bayes' theorem to obtain posterior class probabilities. Linear Discriminant Analysis (LDA) assumes all classes share a common covariance matrix, yielding linear decision boundaries; Quadratic Discriminant Analysis (QDA) allows each class its own covariance matrix, producing quadratic boundaries. Beyond classification, LDA also serves as a supervised dimensionality reduction technique by projecting data onto the most discriminative directions. These methods bridge statistical modeling and classification, providing both predictive capability and geometric insight into class structure.
Usage
Use Linear Discriminant Analysis when classes are approximately Gaussian with similar covariance structures, when you need a linear classifier with probabilistic outputs, or when supervised dimensionality reduction is desired (projecting to at most $K-1$ dimensions for $K$ classes). Use Quadratic Discriminant Analysis when classes have different covariance structures and the dataset is large enough to estimate per-class covariance matrices reliably. LDA is particularly useful as a preprocessing step for visualization or before applying classifiers that benefit from reduced dimensionality.
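As a minimal sketch of these use cases, the snippet below fits both estimators from `sklearn.discriminant_analysis` on the Iris dataset. The dataset, the 70/30 split, and the `random_state` value are illustrative assumptions, not part of the original text.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.model_selection import train_test_split

# Illustrative data: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# LDA as a linear classifier with probabilistic outputs.
lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)
lda_proba = lda.predict_proba(X_test)  # one row per sample, one column per class
lda_acc = lda.score(X_test, y_test)

# LDA as supervised dimensionality reduction: 3 classes allow at most 2 components.
lda_2d = LinearDiscriminantAnalysis(n_components=2)
X_train_2d = lda_2d.fit_transform(X_train, y_train)

# QDA estimates one covariance matrix per class.
qda = QuadraticDiscriminantAnalysis()
qda.fit(X_train, y_train)
qda_acc = qda.score(X_test, y_test)
```

Comparing `lda_acc` and `qda_acc` on held-out data is one simple way to judge whether the extra flexibility of QDA is justified for a given dataset.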
Theoretical Basis
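Generative Model: Both LDA and QDA assume class-conditional Gaussian distributions with class mean $\mu_k$ and covariance $\Sigma_k$, for $x \in \mathbb{R}^d$:
$$P(x \mid y = k) = \frac{1}{(2\pi)^{d/2} |\Sigma_k|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k)\right)$$
Using Bayes' theorem:
$$P(y = k \mid x) = \frac{P(x \mid y = k)\, P(y = k)}{\sum_{l} P(x \mid y = l)\, P(y = l)}$$
The predicted class is $\hat{y} = \arg\max_k P(y = k \mid x)$.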
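The generative model can be sketched directly in NumPy. This is an illustrative implementation, not scikit-learn's code; the function names `gaussian_log_density` and `posterior` and the two-class toy parameters are hypothetical.

```python
import numpy as np

def gaussian_log_density(x, mean, cov):
    """Log of the multivariate normal density N(x | mean, cov)."""
    d = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)  # squared Mahalanobis distance
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def posterior(x, means, covs, priors):
    """Posterior P(y=k | x) via Bayes' theorem, computed in log space."""
    log_joint = np.array([
        gaussian_log_density(x, m, c) + np.log(p)
        for m, c, p in zip(means, covs, priors)
    ])
    log_joint -= log_joint.max()   # stabilise the exponentials
    joint = np.exp(log_joint)
    return joint / joint.sum()     # normalise by the evidence P(x)

# Hypothetical two-class problem in 2-D (assumed values for illustration).
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), np.eye(2)]
priors = [0.5, 0.5]

post = posterior(np.array([0.5, 0.2]), means, covs, priors)
predicted = int(np.argmax(post))   # class with the largest posterior
```

Working in log space avoids underflow when the densities are very small, which is common in higher dimensions.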
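Linear Discriminant Analysis (LDA) assumes a shared covariance: $\Sigma_k = \Sigma$ for all classes. The log-posterior simplifies to a linear function of $x$:
$$\log P(y = k \mid x) = x^T \Sigma^{-1} \mu_k - \frac{1}{2} \mu_k^T \Sigma^{-1} \mu_k + \log \pi_k + C$$
where $\pi_k = P(y = k)$ is the prior probability and $C$ collects the terms that do not depend on $k$. The decision boundary between classes $k$ and $l$ is the set where $P(y = k \mid x) = P(y = l \mid x)$, which is a hyperplane.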
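The linear form of the LDA score can be checked with a short NumPy sketch. The helper `lda_scores` and the toy means, covariance, and priors are hypothetical; the class-independent constant is dropped because it does not affect the argmax.

```python
import numpy as np

def lda_scores(X, means, cov, priors):
    """Linear discriminant scores, one column per class.

    score_k(x) = x^T Sigma^{-1} mu_k - 0.5 mu_k^T Sigma^{-1} mu_k + log pi_k
    """
    means = np.asarray(means, dtype=float)      # shape (K, d)
    W = np.linalg.solve(cov, means.T)           # shape (d, K): Sigma^{-1} mu_k
    b = -0.5 * np.sum(means.T * W, axis=0) + np.log(priors)  # shape (K,)
    return X @ W + b

# Assumed toy parameters: two classes sharing one covariance matrix.
means = [[0.0, 0.0], [2.0, 0.0]]
cov = np.array([[1.0, 0.3], [0.3, 1.0]])
priors = np.array([0.5, 0.5])

X = np.array([[0.2, 1.0], [1.9, -0.5], [1.0, 5.0]])
scores = lda_scores(X, means, cov, priors)
labels = scores.argmax(axis=1)

# With equal priors, the midpoint of the two means lies on the boundary.
midpoint_scores = lda_scores(np.array([[1.0, 0.0]]), means, cov, priors)
```

Because the scores are affine in `X`, the set where two columns are equal is a hyperplane; the correlated covariance tilts it away from the perpendicular bisector of the means.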
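LDA for dimensionality reduction finds projections $w$ that maximize the ratio of between-class scatter to within-class scatter:
$$J(w) = \frac{w^T S_B w}{w^T S_W w}$$
where:
$$S_B = \sum_{k=1}^{K} n_k (\mu_k - \mu)(\mu_k - \mu)^T \quad \text{(between-class scatter)}$$
$$S_W = \sum_{k=1}^{K} \sum_{i : y_i = k} (x_i - \mu_k)(x_i - \mu_k)^T \quad \text{(within-class scatter)}$$
Here $n_k$ is the number of samples in class $k$, $\mu_k$ is the mean of class $k$, and $\mu$ is the overall mean.
The solution is obtained from the generalized eigenvalue problem $S_B w = \lambda S_W w$. At most $K-1$ non-zero eigenvalues exist, limiting the reduced dimensionality to $K-1$.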
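The eigenvalue formulation can be illustrated with a NumPy-only sketch. It assumes the within-class scatter matrix is positive definite so that a Cholesky factor exists; the function name `fisher_directions` and the synthetic three-class data are hypothetical. scikit-learn's own solvers differ in detail.

```python
import numpy as np

def fisher_directions(X, y):
    """Solve S_B w = lambda S_W w; return eigenvalues (descending) and directions."""
    d = X.shape[1]
    overall_mean = X.mean(axis=0)
    S_B = np.zeros((d, d))
    S_W = np.zeros((d, d))
    for k in np.unique(y):
        X_k = X[y == k]
        mu_k = X_k.mean(axis=0)
        diff = (mu_k - overall_mean)[:, None]
        S_B += X_k.shape[0] * (diff @ diff.T)   # between-class scatter
        centered = X_k - mu_k
        S_W += centered.T @ centered            # within-class scatter
    # Assumption: S_W is positive definite. With S_W = L L^T, the problem becomes
    # the symmetric eigenproblem (L^{-1} S_B L^{-T}) u = lambda u, with w = L^{-T} u.
    L = np.linalg.cholesky(S_W)
    A = np.linalg.solve(L, S_B)
    M = np.linalg.solve(L, A.T).T
    M = 0.5 * (M + M.T)                         # remove round-off asymmetry
    eigvals, U = np.linalg.eigh(M)              # ascending order
    order = np.argsort(eigvals)[::-1]
    W = np.linalg.solve(L.T, U[:, order])
    return eigvals[order], W

# Synthetic data (assumed): 3 classes in 4 dimensions.
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0, 0], [4, 0, 0, 0], [0, 4, 0, 0]], dtype=float)
X = np.vstack([rng.normal(c, 1.0, size=(50, 4)) for c in centers])
y = np.repeat(np.arange(3), 50)

eigvals, W = fisher_directions(X, y)
n_nonzero = int(np.sum(eigvals > 1e-8 * eigvals[0]))  # numerically non-zero
Z = X @ W[:, :2]                                      # project onto K - 1 = 2 directions
```

With three classes the between-class scatter has rank at most two, so only the first two eigenvalues are numerically non-zero even though the data live in four dimensions.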
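Quadratic Discriminant Analysis (QDA) allows per-class covariance matrices $\Sigma_k$. With class prior $\pi_k = P(y = k)$ and a term $C$ that does not depend on $k$, the log-posterior includes a quadratic term in $x$:
$$\log P(y = k \mid x) = -\frac{1}{2} \log |\Sigma_k| - \frac{1}{2} (x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k) + \log \pi_k + C$$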
This produces quadratic (curved) decision boundaries, providing more flexibility at the cost of estimating more parameters.
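A small NumPy sketch makes the curved boundary concrete. The helper `qda_scores` and the toy setup (two classes with the same mean but different spread) are hypothetical and chosen so that no linear boundary could separate the classes.

```python
import numpy as np

def qda_scores(X, means, covs, priors):
    """Quadratic discriminant scores, one column per class.

    score_k(x) = -0.5 log|Sigma_k| - 0.5 (x - mu_k)^T Sigma_k^{-1} (x - mu_k) + log pi_k
    """
    cols = []
    for mu, cov, pi in zip(means, covs, priors):
        diff = X - mu                                        # shape (n, d)
        _, logdet = np.linalg.slogdet(cov)
        maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
        cols.append(-0.5 * logdet - 0.5 * maha + np.log(pi))
    return np.column_stack(cols)

# Assumed toy parameters: identical means, class 1 is nine times more spread out.
means = [np.zeros(2), np.zeros(2)]
covs = [np.eye(2), 9.0 * np.eye(2)]
priors = [0.5, 0.5]

X = np.array([[0.1, 0.0], [4.0, 0.0], [-4.0, 0.0], [0.0, 4.0]])
labels = qda_scores(X, means, covs, priors).argmax(axis=1)
```

In this setup the boundary is a circle around the origin: points near the centre are assigned to the tight class and points far away to the diffuse class.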