Principle:Scikit learn Scikit learn Density Estimation
| Knowledge Sources | |
|---|---|
| Domains | Unsupervised Learning, Probability Theory |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Density estimation infers the underlying probability distribution of a dataset, enabling assessment of how likely new observations are under the learned distribution.
Description
Density estimation methods construct an approximation of the probability density function from observed data. They solve the fundamental problem of characterizing the distribution of data without assuming a specific parametric form (non-parametric methods) or by fitting a flexible mixture of parametric components (semi-parametric methods). Density estimation underpins anomaly detection (low-density observations are anomalous), generative modeling (sampling from the estimated density), clustering (mixture model components correspond to clusters), and statistical testing. It sits at the core of probabilistic machine learning.
Usage
Use Kernel Density Estimation (KDE) when a non-parametric estimate of the density is needed and the data is low-to-moderate dimensional. Use Gaussian Mixture Models (GMMs) when the data is believed to arise from a mixture of several Gaussian components, and when both cluster assignments and density estimates are desired. Use Bayesian Gaussian Mixture Models when the number of mixture components is uncertain and should be inferred from the data, or when a Bayesian treatment of uncertainty is preferred. KDE is well-suited for visualization and one-dimensional density estimation; GMMs scale better to moderate dimensions and naturally integrate with clustering workflows.
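As a rough illustration of these three choices, the sketch below fits each estimator to the same synthetic two-cluster dataset. The data, the `bandwidth=0.5` value, and the component counts are arbitrary assumptions made for the example, not recommendations.

```python
import numpy as np
from sklearn.neighbors import KernelDensity
from sklearn.mixture import GaussianMixture, BayesianGaussianMixture

# Synthetic 2-D data: two Gaussian blobs (assumed for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=-3.0, scale=1.0, size=(150, 2)),
    rng.normal(loc=3.0, scale=1.0, size=(150, 2)),
])

# Non-parametric estimate; the bandwidth value is an arbitrary choice here.
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X)

# Mixture with a fixed number of components; yields densities and cluster labels.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

# Bayesian variant; n_components acts as an upper bound on the components used.
bgmm = BayesianGaussianMixture(n_components=5, max_iter=500, random_state=0).fit(X)

# score_samples returns the log-density of each query point under the model.
queries = np.array([[-3.0, -3.0], [20.0, 20.0]])  # a typical point and a far outlier
kde_logp = kde.score_samples(queries)
gmm_logp = gmm.score_samples(queries)
bgmm_logp = bgmm.score_samples(queries)

# Hard cluster assignments from the fitted mixture.
labels = gmm.predict(X)
```

Under all three models the outlier should receive a much lower log-density than the typical point, which is the basis for density-based anomaly detection.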
Theoretical Basis
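Kernel Density Estimation (KDE) estimates the density at a point $x$ as:

$$\hat{f}(x) = \frac{1}{n h^{d}} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)$$

where $K$ is a kernel function (typically Gaussian), $h$ is the bandwidth, $n$ is the number of samples, $x_i$ are the observed samples, and $d$ is the dimensionality. The bandwidth controls the smoothness of the estimate: too small a value produces a noisy estimate, too large a value oversmooths.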
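The estimator can be written directly in NumPy. The following is a minimal sketch assuming a Gaussian kernel and a single scalar bandwidth; the function name `gaussian_kde`, the toy data, and the three bandwidth values are hypothetical choices for illustration.

```python
import numpy as np

def gaussian_kde(query, samples, h):
    """Evaluate a Gaussian-kernel KDE at the rows of `query`.

    query:   array of shape (m, d)
    samples: array of shape (n, d)
    h:       scalar bandwidth shared by all dimensions
    """
    n, d = samples.shape
    # Scaled differences u = (x - x_i) / h for every query/sample pair: (m, n, d).
    u = (query[:, None, :] - samples[None, :, :]) / h
    # Normalized d-dimensional Gaussian kernel K(u).
    k = np.exp(-0.5 * np.sum(u ** 2, axis=-1)) / (2.0 * np.pi) ** (d / 2.0)
    # Sum the kernel contributions and rescale by n * h^d.
    return k.sum(axis=1) / (n * h ** d)

rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 1))            # toy data: standard normal draws
grid = np.linspace(-8.0, 8.0, 1601)[:, None]   # evaluation grid
dx = grid[1, 0] - grid[0, 0]

results = {}
for h in (0.05, 0.4, 2.0):                     # small, moderate, large bandwidth
    density = gaussian_kde(grid, samples, h)
    # Riemann sum of the estimate and its maximum value on the grid.
    results[h] = (density.sum() * dx, density.max())
    print(f"h={h}: integral ~ {results[h][0]:.3f}, peak = {results[h][1]:.3f}")
```

Each estimate should integrate to roughly 1 regardless of bandwidth, while the peak height drops as the bandwidth grows and the estimate flattens.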
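Common kernels, written in terms of the scaled offset $u = (x - x_i)/h$ and up to a normalizing constant, include:
- Gaussian: $K(u) \propto \exp\left(-\tfrac{1}{2}\lVert u \rVert^{2}\right)$
- Tophat: $K(u) \propto 1$ if $\lVert u \rVert < 1$, and $0$ otherwise
- Epanechnikov: $K(u) \propto 1 - \lVert u \rVert^{2}$ if $\lVert u \rVert < 1$, and $0$ otherwise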
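For concreteness, the three kernels listed above can be sketched as one-dimensional functions. The normalizing constants used here are the standard 1-D values; in higher dimensions the constants differ.

```python
import numpy as np

# One-dimensional kernels as functions of the scaled offset u = (x - x_i) / h.
def gaussian(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def tophat(u):
    # Constant inside the unit interval, zero outside.
    return np.where(np.abs(u) < 1.0, 0.5, 0.0)

def epanechnikov(u):
    # Parabolic inside the unit interval, zero outside.
    return np.where(np.abs(u) < 1.0, 0.75 * (1.0 - u ** 2), 0.0)

# Numerically check that each kernel integrates to approximately 1.
u = np.linspace(-6.0, 6.0, 120001)
du = u[1] - u[0]
integrals = {f.__name__: float(f(u).sum() * du) for f in (gaussian, tophat, epanechnikov)}
```

In scikit-learn these kernels are selected through the `kernel` parameter of `KernelDensity`, using the strings `"gaussian"`, `"tophat"`, and `"epanechnikov"`.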
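Gaussian Mixture Model (GMM) models the density as a weighted sum of Gaussians:

$$p(x) = \sum_{k=1}^{M} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)$$

where $M$ is the number of components, $\pi_k$ are mixing weights ($\pi_k \ge 0$, $\sum_{k=1}^{M} \pi_k = 1$), and $\mu_k$ and $\Sigma_k$ are the mean and covariance of each component.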
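The mixture density can be evaluated by hand, as in the sketch below. The helper names `gaussian_pdf` and `gmm_density` and the two-component parameter values are hypothetical, chosen only to make the weighted sum concrete.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate normal density N(x | mean, cov) for the rows of x."""
    d = mean.shape[0]
    diff = x - mean
    # Squared Mahalanobis distance; solve cov @ z = diff^T rather than inverting cov.
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

def gmm_density(x, weights, means, covs):
    """Weighted sum of Gaussian component densities."""
    return sum(w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs))

# Hypothetical two-component mixture in 2-D (values chosen for illustration).
weights = np.array([0.3, 0.7])
means = [np.array([0.0, 0.0]), np.array([4.0, 4.0])]
covs = [np.eye(2), np.array([[2.0, 0.5], [0.5, 1.0]])]

# Density at the two component means and at a point far from both.
points = np.array([[0.0, 0.0], [4.0, 4.0], [10.0, -10.0]])
p = gmm_density(points, weights, means, covs)
```

At each component mean the density is dominated by that component's weighted peak, and it falls toward zero far from both means.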
Parameters are estimated via the Expectation-Maximization (EM) algorithm:
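- E-step: Compute responsibilities $\gamma_{ik} = \dfrac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}$
- M-step: Update parameters: with $N_k = \sum_{i=1}^{n} \gamma_{ik}$, set $\pi_k = \dfrac{N_k}{n}$, $\mu_k = \dfrac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik} \, x_i$, and $\Sigma_k = \dfrac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik} \, (x_i - \mu_k)(x_i - \mu_k)^{\top}$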
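The two steps can be sketched in a few lines for the one-dimensional case. This is a simplified illustration and not scikit-learn's implementation: it assumes scalar variances, a quantile-based initialization, and a fixed iteration count instead of a convergence test. The function name `em_gmm_1d` is hypothetical.

```python
import numpy as np

def normal_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def em_gmm_1d(x, n_components, n_iter=100):
    """Plain EM for a 1-D Gaussian mixture (fixed iteration count, for brevity)."""
    n = x.shape[0]
    weights = np.full(n_components, 1.0 / n_components)
    # Assumed initialization: spread the means over evenly spaced quantiles.
    means = np.quantile(x, (np.arange(n_components) + 0.5) / n_components)
    variances = np.full(n_components, x.var())

    for _ in range(n_iter):
        # E-step: responsibilities gamma[i, k]; each row sums to 1.
        joint = weights * normal_pdf(x[:, None], means, variances)
        gamma = joint / joint.sum(axis=1, keepdims=True)

        # M-step: re-estimate parameters from responsibility-weighted samples.
        nk = gamma.sum(axis=0)
        weights = nk / n
        means = (gamma * x[:, None]).sum(axis=0) / nk
        variances = (gamma * (x[:, None] - means) ** 2).sum(axis=0) / nk
        variances = np.maximum(variances, 1e-6)  # guard against variance collapse

    return weights, means, variances

# Toy data: 300 draws around -4 (std 1) and 200 draws around 3 (std 0.5).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-4.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])
weights, means, variances = em_gmm_1d(x, n_components=2)
```

On well-separated data like this, the estimated means and weights should land close to the generating values. A production implementation would also monitor the log-likelihood for convergence and work in log space for numerical stability.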
Bayesian Gaussian Mixture Model places priors on the mixture parameters (a Dirichlet prior on the weights, a Gaussian-Wishart prior on the means and precision matrices, i.e. the inverse covariances). Fitted with variational inference, it can automatically determine the effective number of components by driving the weights of unnecessary components toward zero, so the specified number of components acts as an upper bound.
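A short sketch of this pruning behaviour with scikit-learn's `BayesianGaussianMixture` follows. The synthetic data, the `weight_concentration_prior=0.01` value, and the `0.01` threshold used to count "effective" components are all assumptions made for illustration.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Toy data with two well-separated clusters (assumed for illustration).
rng = np.random.default_rng(0)
X = np.concatenate([
    rng.normal(-5.0, 1.0, size=(200, 1)),
    rng.normal(5.0, 1.0, size=(200, 1)),
])

# Deliberately allow more components than the data needs.
bgmm = BayesianGaussianMixture(
    n_components=6,
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=0.01,  # a small value favours fewer active components
    max_iter=1000,
    random_state=0,
).fit(X)

weights = bgmm.weights_
# Count components whose weight exceeds an arbitrary threshold.
effective = int(np.sum(weights > 0.01))
print("weights:", np.round(weights, 3))
print("effective components:", effective)
```

The number of components that keep a non-negligible weight depends on the data, the prior settings, and the initialization, so it is worth inspecting `weights_` rather than assuming a particular count.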