Implementation:Scikit learn Scikit learn KNNImputer

Knowledge Sources	Scikit_learn Scikit-learn Docs
Domains	Machine Learning, Missing Data, Imputation
Last Updated	2026-02-08 15:00 GMT

Overview

KNNImputer is scikit-learn's concrete implementation of k-nearest-neighbors imputation for missing values.

Description

The KNNImputer class fills each missing value with the mean (or distance-weighted mean) of that feature across the n_neighbors nearest samples in the training set that have a value for it. The distance between two samples is computed only over the features that are present in both. The class supports uniform and distance-based weighting and defaults to the nan_euclidean distance metric, which handles NaN values natively.
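To make the distance rule concrete, the following minimal sketch calls scikit-learn's nan_euclidean_distances helper on two rows and compares the result with a hand computation. The sample values are illustrative only; the scaling by (total features / shared features) follows the helper's documented definition.

```python
import numpy as np
from sklearn.metrics.pairwise import nan_euclidean_distances

# Two illustrative samples with 4 features each; np.nan marks a missing entry.
a = np.array([[3.0, np.nan, np.nan, 6.0]])
b = np.array([[1.0, np.nan, 4.0, 5.0]])

# Only features 0 and 3 are present in both rows.
d = nan_euclidean_distances(a, b)[0, 0]

# Hand computation: squared distance over the shared features,
# scaled up by (total features / shared features) = 4 / 2.
manual = np.sqrt(4 / 2 * ((3 - 1) ** 2 + (6 - 5) ** 2))

print(d, manual)
```

The scaling keeps distances comparable between pairs of samples that share different numbers of observed features.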

Usage

Use KNNImputer when you want to impute missing values based on the similarity of samples, leveraging the local structure of the data. It is particularly effective when similar samples tend to have similar feature values.

Code Reference

Source Location

Repository: scikit-learn
File: sklearn/impute/_knn.py

Signature

class KNNImputer(_BaseImputer):
    def __init__(
        self,
        *,
        missing_values=np.nan,
        n_neighbors=5,
        weights="uniform",
        metric="nan_euclidean",
        copy=True,
        add_indicator=False,
        keep_empty_features=False,
    ):
        ...

    def fit(self, X, y=None):
        ...

    def transform(self, X):
        ...

Import

from sklearn.impute import KNNImputer

I/O Contract

Inputs

Name	Type	Required	Description
X	array-like of shape (n_samples, n_features)	Yes	Data with missing values to impute
n_neighbors	int	No	Number of nearest neighbors to use (default: 5)
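weights	str or callable	No	Weight function: "uniform", "distance", or a callable (default: "uniform")
metric	str or callable	No	Distance metric: "nan_euclidean" or a callable (default: "nan_euclidean")
missing_values	int, float, str, np.nan, or None	No	Placeholder for missing values (default: np.nan)
copy	bool	No	Whether to create a copy of the input data (default: True)
add_indicator	bool	No	Whether to append missing-value indicator columns to the output (default: False)
keep_empty_features	bool	No	Whether to keep features that are entirely missing at fit time (default: False)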
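As an illustration of the weights parameter, the sketch below imputes the same missing entry with "uniform" and with "distance" weighting. The toy array is an assumption chosen so that the two nearest donors sit at clearly different distances from the incomplete row.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Toy data: the last row is missing feature 1.
# On feature 0 it lies 0.2 from row 0 and 0.8 from row 1.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [4.0, 40.0],
              [1.2, np.nan]])

# Uniform weighting: plain mean of the two nearest donors (10 and 20).
uniform = KNNImputer(n_neighbors=2, weights="uniform").fit_transform(X)

# Distance weighting: donors are weighted by the inverse of their distance,
# so the closer donor (row 0) counts four times as much as row 1.
weighted = KNNImputer(n_neighbors=2, weights="distance").fit_transform(X)

print(uniform[3, 1], weighted[3, 1])
```

With these values the uniform estimate is 15, while the distance-weighted estimate is pulled toward the closer donor and comes out near 12.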

Outputs

Name	Type	Description
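X_imputed	ndarray of shape (n_samples, n_output_features)	Data with missing values imputed using KNN. n_output_features equals n_features unless a feature was entirely missing at fit time (dropped unless keep_empty_features=True) or add_indicator=True (indicator columns appended)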

Usage Examples

Basic Usage

import numpy as np
from sklearn.impute import KNNImputer

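# Each row is a sample; np.nan marks a missing entry.
X = np.array([[1, 2, np.nan], [3, 4, 3], [np.nan, 6, 5], [8, 8, 7]])
# Impute each missing entry with the unweighted mean of its 2 nearest donors.
imputer = KNNImputer(n_neighbors=2, weights="uniform")
X_imputed = imputer.fit_transform(X)
print(X_imputed)
# [[1.  2.  4. ]
#  [3.  4.  3. ]
#  [5.5 6.  5. ]
#  [8.  8.  7. ]]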
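Because neighbors are drawn from the data seen during fit, the imputer can also be fitted once and then applied to new samples. A minimal sketch follows; X_new is a hypothetical unseen row, not part of the original example.

```python
import numpy as np
from sklearn.impute import KNNImputer

X_train = np.array([[1, 2, np.nan], [3, 4, 3], [np.nan, 6, 5], [8, 8, 7]])

# Fit stores the training rows; transform looks up neighbors among them.
imputer = KNNImputer(n_neighbors=2, weights="uniform")
imputer.fit(X_train)

# Hypothetical new sample with feature 0 missing.
X_new = np.array([[np.nan, 4.5, 3.5]])
X_new_imputed = imputer.transform(X_new)

print(X_new_imputed)
```

Here the two nearest training rows that have a value for feature 0 are [3, 4, 3] and [1, 2, nan], so the missing entry is filled with their mean, 2.0.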

Related Pages

Principle:Scikit_learn_Scikit_learn_Missing_Data_Imputation
