Implementation:Scikit learn Scikit learn KNNImputer

From Leeroopedia


Knowledge Sources
Domains Machine Learning, Missing Data, Imputation
Last Updated 2026-02-08 15:00 GMT

Overview

Concrete implementation of k-Nearest Neighbors imputation for missing values provided by scikit-learn.

Description

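The KNNImputer class fills each missing value with the mean (or distance-weighted mean) of that feature across the n_neighbors nearest neighbors found in the training set. Only training samples that have a value for that feature can serve as neighbors. The distance between two samples is computed only over the features that are present in both. Weighting can be uniform or distance-based, and the default metric, nan_euclidean, handles NaN values natively by skipping missing coordinates and scaling up the contribution of the remaining ones.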
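
To make the distance computation concrete, the following is a minimal sketch of the nan_euclidean distance between two samples, compared against scikit-learn's nan_euclidean_distances helper. The two sample rows and the variable names are illustrative assumptions, not part of the library.

```python
import numpy as np
from sklearn.metrics.pairwise import nan_euclidean_distances

# Two illustrative samples; the first is missing its third feature.
a = np.array([[1.0, 2.0, np.nan]])
b = np.array([[3.0, 4.0, 3.0]])

# Manual computation: sum squared differences over the coordinates
# present in both samples, then scale by (total / present) coordinates.
shared = ~np.isnan(a[0]) & ~np.isnan(b[0])
squared_sum = np.sum((a[0, shared] - b[0, shared]) ** 2)
manual = np.sqrt(a.shape[1] / shared.sum() * squared_sum)

# Library computation for comparison.
library = nan_euclidean_distances(a, b)[0, 0]

print(manual, library)
```

Both values should agree: two of three coordinates are shared, so the squared sum of 8 is scaled by 3/2 before the square root is taken.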

Usage

Use KNNImputer when you want to impute missing values based on the similarity of samples, leveraging the local structure of the data. It is particularly effective when similar samples tend to have similar feature values.

Code Reference

Source Location

Signature

class KNNImputer(_BaseImputer):
    def __init__(
        self,
        *,
        missing_values=np.nan,
        n_neighbors=5,
        weights="uniform",
        metric="nan_euclidean",
        copy=True,
        add_indicator=False,
        keep_empty_features=False,
    ):
        ...

    def fit(self, X, y=None):
        ...

    def transform(self, X):
        ...

Import

from sklearn.impute import KNNImputer
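
Because neighbors are drawn from the data seen during fit, fit and transform can be called on different arrays. The sketch below assumes a small, complete training array and a single new sample; the array contents and variable names are hypothetical.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Hypothetical training data with no missing values.
X_train = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# New sample with its first feature missing.
X_new = np.array([[np.nan, 4.0]])

# Fit stores the training samples; transform looks up neighbors among them.
imputer = KNNImputer(n_neighbors=1)
imputer.fit(X_train)
X_new_imputed = imputer.transform(X_new)

print(X_new_imputed)
```

With n_neighbors=1, the new sample matches the training row [3, 4] exactly on its observed feature, so the missing value is taken from that row.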

I/O Contract

Inputs

Name Type Required Description
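X array-like of shape (n_samples, n_features) Yes Data with missing values to impute
n_neighbors int No Number of nearest neighbors to use (default: 5)
weights str or callable No Weight function: "uniform", "distance", or a callable that maps an array of distances to an array of weights of the same shape (default: "uniform")
metric str or callable No Distance metric (default: "nan_euclidean")
missing_values int, float, str, np.nan, or None No Placeholder for missing values (default: np.nan)
copy bool No Whether to create a copy of the input data; if False, imputation is done in place whenever possible (default: True)
add_indicator bool No Whether to append missing-value indicator columns to the output (default: False)
keep_empty_features bool No Whether to keep features that were entirely missing during fit; such features are imputed with 0 (default: False)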
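
As an illustration of the weights parameter, the sketch below runs the same small array through both weighting schemes. The data matches the Basic Usage example on this page; the variable names are illustrative.

```python
import numpy as np
from sklearn.impute import KNNImputer

X = np.array([[1, 2, np.nan], [3, 4, 3], [np.nan, 6, 5], [8, 8, 7]])

# Uniform weighting: plain mean of the two nearest donors.
X_uniform = KNNImputer(n_neighbors=2, weights="uniform").fit_transform(X)

# Distance weighting: closer donors contribute more (weight = 1 / distance).
X_distance = KNNImputer(n_neighbors=2, weights="distance").fit_transform(X)

print(X_uniform[0, 2], X_distance[0, 2])
```

For the first row, the two donors hold the values 3 and 5 and the nearer donor is half as far away as the other, so distance weighting pulls the imputed value from 4 toward 3.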

Outputs

Name Type Description
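X_imputed ndarray of shape (n_samples, n_output_features) Data with missing values imputed using KNN. By default n_output_features equals n_features; features that were entirely missing during fit are dropped unless keep_empty_features=True, and add_indicator=True appends indicator columns.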
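
The effect of add_indicator on the output shape can be checked with a short sketch. It reuses the array from the Basic Usage example; the variable names are illustrative.

```python
import numpy as np
from sklearn.impute import KNNImputer

X = np.array([[1, 2, np.nan], [3, 4, 3], [np.nan, 6, 5], [8, 8, 7]])

# add_indicator=True appends one binary column per feature that had
# missing values during fit (here, the first and third features).
imputer = KNNImputer(n_neighbors=2, add_indicator=True)
X_out = imputer.fit_transform(X)

print(X_out.shape)
print(X_out[:, 3:])  # indicator columns: 1 where the value was missing
```

The first three columns hold the imputed data, and the trailing columns mark which entries were originally missing.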

Usage Examples

Basic Usage

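import numpy as np
from sklearn.impute import KNNImputer

# Four samples, three features; np.nan marks the missing entries.
X = np.array([[1, 2, np.nan], [3, 4, 3], [np.nan, 6, 5], [8, 8, 7]])

# Impute each missing value with the unweighted mean of its 2 nearest neighbors.
imputer = KNNImputer(n_neighbors=2, weights="uniform")
X_imputed = imputer.fit_transform(X)
print(X_imputed)
# [[1.  2.  4. ]
#  [3.  4.  3. ]
#  [5.5 6.  5. ]
#  [8.  8.  7. ]]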
