Implementation:Scikit learn Scikit learn KernelDensity

From Leeroopedia
Revision as of 16:35, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Scikit_learn_Scikit_learn_KernelDensity.md)


Knowledge Sources
Domains: Machine Learning, Density Estimation
Last Updated: 2026-02-08 15:00 GMT

Overview

KernelDensity is the scikit-learn estimator for kernel density estimation, available in the sklearn.neighbors module.

Description

KernelDensity implements kernel density estimation (KDE), a non-parametric method for estimating the probability density function of a random variable. It supports six kernel functions (gaussian, tophat, epanechnikov, exponential, linear, cosine) and two tree-based algorithms (ball_tree, kd_tree) for efficient computation. The bandwidth parameter controls the smoothness of the resulting density estimate: a larger bandwidth gives a smoother estimate, while a smaller one follows the individual data points more closely.
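The effect of bandwidth on smoothness can be checked numerically. The sketch below is illustrative only: the data, seed, bandwidth values, and the `roughness` helper are assumptions introduced here, not part of scikit-learn. It measures how much the estimated density curve bends on a fixed grid.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Illustrative 1-D data; seed, sample size, and bandwidths are arbitrary choices.
rng = np.random.RandomState(0)
X = rng.randn(200, 1)
grid = np.linspace(-4, 4, 200).reshape(-1, 1)


def roughness(bandwidth):
    """Hypothetical helper: sum of absolute second differences of the density."""
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(X)
    density = np.exp(kde.score_samples(grid))  # score_samples returns log-density
    return np.abs(np.diff(density, n=2)).sum()


rough_narrow = roughness(0.1)  # small bandwidth: curve follows individual points
rough_wide = roughness(2.0)    # large bandwidth: curve is heavily smoothed
print(rough_narrow > rough_wide)
```

A wigglier curve accumulates larger second differences, so the narrow bandwidth should score as rougher than the wide one.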

Usage

Use KernelDensity when you need to estimate the underlying probability distribution of data without assuming a specific parametric form. It is commonly used for density visualization, anomaly detection, and generating synthetic samples.
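As a rough sketch of the anomaly detection use case, one option is to flag points whose log-density falls below a low percentile of the training scores. The 1st-percentile rule, the synthetic data, and the variable names are assumptions made for illustration, not a recommended default.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.RandomState(0)
X_train = rng.randn(200, 2)  # synthetic "normal" data

kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X_train)

# Assumed rule: anything less dense than the 1st percentile of the
# training log-densities is treated as an anomaly.
threshold = np.percentile(kde.score_samples(X_train), 1)

X_new = np.array([[0.0, 0.0],   # near the centre of the training data
                  [8.0, 8.0]])  # far from every training point
is_anomaly = kde.score_samples(X_new) < threshold
print(is_anomaly)
```

The threshold percentile trades false alarms against missed anomalies and would normally be tuned on held-out data.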

Code Reference

Source Location

Signature

class KernelDensity(BaseEstimator):
    def __init__(
        self,
        *,
        bandwidth=1.0,
        algorithm="auto",
        kernel="gaussian",
        metric="euclidean",
        atol=0,
        rtol=0,
        breadth_first=True,
        leaf_size=40,
        metric_params=None,
    ):

Import

from sklearn.neighbors import KernelDensity

I/O Contract

Inputs

Name | Type | Required | Description
bandwidth | float or str | No | Bandwidth of the kernel; a float, or 'scott' / 'silverman' to estimate it from the data (default=1.0)
algorithm | str | No | Tree algorithm to use: 'kd_tree', 'ball_tree', or 'auto' (default='auto')
kernel | str | No | Kernel function: 'gaussian', 'tophat', 'epanechnikov', 'exponential', 'linear', 'cosine' (default='gaussian')
metric | str | No | Distance metric (default='euclidean')
atol | float | No | Desired absolute tolerance of the result (default=0)
rtol | float | No | Desired relative tolerance of the result (default=0)
breadth_first | bool | No | If True, use breadth-first tree traversal; otherwise depth-first (default=True)
leaf_size | int | No | Leaf size for the tree (default=40)
metric_params | dict or None | No | Additional parameters for the distance metric (default=None)

Outputs

Name | Type | Description
score_samples(X) | ndarray of shape (n_samples,) | Log of the estimated probability density at each sample in X
sample(n_samples) | ndarray of shape (n_samples, n_features) | Random samples drawn from the fitted density (implemented only for the 'gaussian' and 'tophat' kernels)
n_features_in_ | int | Fitted attribute: number of features seen during fit
tree_ | BallTree or KDTree | Fitted attribute: the tree object used for queries
bandwidth_ | float | Fitted attribute: the bandwidth actually used (estimated from the data if a string was provided)
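Because score_samples returns log values, exponentiating them should give a density that integrates to roughly 1. The sketch below checks this on a 1-D grid with a simple Riemann sum; the grid limits, seed, and bandwidth are arbitrary assumptions.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.RandomState(0)
X = rng.randn(200, 1)
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X)

# Evaluate the log-density on a grid wide enough to cover the data.
grid = np.linspace(-8, 8, 2001).reshape(-1, 1)
log_density = kde.score_samples(grid)
density = np.exp(log_density)

# Riemann-sum approximation of the integral of the density.
dx = grid[1, 0] - grid[0, 0]
area = density.sum() * dx

# Draw new points from the fitted density (gaussian kernel supports sampling).
draws = kde.sample(5, random_state=0)
print(log_density.shape, area, draws.shape, kde.n_features_in_)
```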

Usage Examples

Basic Usage

from sklearn.neighbors import KernelDensity
import numpy as np

# Generate sample data
rng = np.random.RandomState(42)
X = rng.randn(100, 2)

# Fit kernel density estimator
kde = KernelDensity(kernel="gaussian", bandwidth=0.5)
kde.fit(X)

# Log-density of the first five training points (score_samples returns log values)
scores = kde.score_samples(X[:5])
print(scores)

# Generate new samples
samples = kde.sample(10, random_state=42)
print(samples.shape)
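
Bandwidth Selection

A fixed bandwidth such as 0.5 is rarely the best choice for a given dataset. One common approach, sketched here under the assumption that cross-validated log-likelihood is an acceptable selection criterion, is to search a grid of candidate bandwidths with GridSearchCV. The candidate grid and fold count are arbitrary illustrative values.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

rng = np.random.RandomState(42)
X = rng.randn(100, 2)

# Candidate bandwidths: an arbitrary log-spaced grid from 0.1 to 10.
param_grid = {"bandwidth": np.logspace(-1, 1, 10)}

# With no `scoring` argument, GridSearchCV uses KernelDensity.score,
# the total log-likelihood of the held-out fold.
search = GridSearchCV(KernelDensity(kernel="gaussian"), param_grid, cv=5)
search.fit(X)

best_bandwidth = search.best_params_["bandwidth"]
best_kde = search.best_estimator_  # refit on all of X with the best bandwidth
print(best_bandwidth)
```

The refitted `best_kde` can then be used with score_samples and sample in the same way as the estimator in the basic example.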
