Implementation: scikit-learn LocalOutlierFactor
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Anomaly Detection |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool for unsupervised outlier detection using the Local Outlier Factor algorithm provided by scikit-learn.
Description
LocalOutlierFactor (LOF) is an unsupervised anomaly detection method that measures the local deviation of the density of a given sample with respect to its neighbors. It identifies samples that have a substantially lower density than their neighbors as outliers. LOF computes the local density using the distance to the k-nearest neighbors, making it effective for detecting outliers in datasets with varying density clusters.
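To make the description concrete, the computation can be sketched by hand with scikit-learn's NearestNeighbors and checked against the built-in estimator. This is an illustrative sketch, not the library's internal code; the small random dataset and variable names are assumptions. Note that calling `kneighbors()` with no argument queries the training set while excluding each point itself, matching how LOF treats training samples.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors, LocalOutlierFactor

rng = np.random.RandomState(0)
X = rng.randn(30, 2)  # small illustrative dataset
k = 5

nn = NearestNeighbors(n_neighbors=k).fit(X)
dist, ind = nn.kneighbors()   # neighbors of each training point, self excluded
k_dist = dist[:, -1]          # k-distance of each sample

# reachability distance from a sample to neighbor b:
# max(k-distance(b), actual distance)
reach = np.maximum(k_dist[ind], dist)

# local reachability density: inverse of mean reachability distance
lrd = 1.0 / reach.mean(axis=1)

# LOF score: average ratio of neighbors' densities to the sample's own density
lof = lrd[ind].mean(axis=1) / lrd

# cross-check: scikit-learn stores the negated scores, so
# ref.negative_outlier_factor_ should approximately equal -lof
ref = LocalOutlierFactor(n_neighbors=k).fit(X)
```

Scores near 1 indicate a density comparable to the neighborhood; substantially larger scores mark local outliers.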
Usage
Use LocalOutlierFactor when you need to identify outliers in an unlabeled dataset based on local density deviation. It is particularly useful when outliers are defined relative to their local neighborhood rather than by global statistics. Set novelty=True to use it for novelty detection on new unseen data.
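The novelty-detection mode mentioned above can be sketched as follows; the training and query data are illustrative assumptions. With novelty=True, the estimator is fit on clean training data and then scores unseen samples via predict and decision_function (fit_predict is unavailable in this mode).

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.RandomState(0)
X_train = rng.randn(200, 2)  # inlier-only training data

# novelty=True enables predict/decision_function on unseen data
lof = LocalOutlierFactor(n_neighbors=20, novelty=True)
lof.fit(X_train)

X_new = np.array([[0.1, -0.2],   # near the training distribution
                  [5.0, 5.0]])   # far outside it
labels = lof.predict(X_new)            # 1 = inlier, -1 = outlier
scores = lof.decision_function(X_new)  # positive values lean inlier
```

In the default mode (novelty=False), only fit_predict on the training data is supported, because LOF scores of training samples are not meaningful for ranking new points.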
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/neighbors/_lof.py
Signature
class LocalOutlierFactor(KNeighborsMixin, OutlierMixin, NeighborsBase):
    def __init__(
        self,
        n_neighbors=20,
        *,
        algorithm="auto",
        leaf_size=30,
        metric="minkowski",
        p=2,
        metric_params=None,
        contamination="auto",
        novelty=False,
        n_jobs=None,
    ):
Import
from sklearn.neighbors import LocalOutlierFactor
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| n_neighbors | int | No | Number of neighbors to use (default=20) |
| algorithm | str | No | Algorithm for computing nearest neighbors: 'auto', 'ball_tree', 'kd_tree', 'brute' |
| leaf_size | int | No | Leaf size for BallTree or KDTree (default=30) |
| metric | str or callable | No | Distance metric (default='minkowski') |
| p | float | No | Power parameter for Minkowski metric (default=2) |
| metric_params | dict or None | No | Additional keyword arguments for the metric function |
| contamination | float or str | No | Expected proportion of outliers, in (0, 0.5], or 'auto' (default='auto') |
| novelty | bool | No | Whether to use LOF for novelty detection (default=False) |
| n_jobs | int or None | No | Number of parallel jobs |
Outputs
| Name | Type | Description |
|---|---|---|
| negative_outlier_factor_ | ndarray of shape (n_samples,) | Opposite of the LOF scores for training samples (negative values; more negative means more abnormal) |
| n_neighbors_ | int | Actual number of neighbors used for kneighbors queries |
| offset_ | float | Offset used to obtain binary labels from the raw LOF scores |
| effective_metric_ | str | The effective metric used |
| effective_metric_params_ | dict | The effective metric parameters |
| n_features_in_ | int | Number of features seen during fit |
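The relationship between negative_outlier_factor_, offset_, and the labels returned by fit_predict can be made concrete with a short sketch (the dataset is an illustrative assumption). With contamination='auto', scikit-learn fixes offset_ at -1.5, and samples whose score falls below it are labeled -1.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.RandomState(0)
X = np.vstack([rng.randn(100, 2), [[6.0, 6.0]]])  # 100 inliers plus one clear outlier

lof = LocalOutlierFactor(n_neighbors=20)  # contamination='auto'
y_pred = lof.fit_predict(X)

# reproduce the binary labels from the stored attributes:
# scores below offset_ are flagged as outliers (-1)
labels_manual = np.where(lof.negative_outlier_factor_ < lof.offset_, -1, 1)
```

Passing a float contamination instead sets offset_ to the corresponding percentile of the training scores, so exactly that fraction of training samples is flagged.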
Usage Examples
Basic Usage
from sklearn.neighbors import LocalOutlierFactor
import numpy as np
# Generate data with outliers
rng = np.random.RandomState(42)
X_inliers = rng.randn(100, 2)
X_outliers = rng.uniform(low=-4, high=4, size=(10, 2))
X = np.vstack([X_inliers, X_outliers])
# Fit and predict
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.1)
y_pred = lof.fit_predict(X)
print(f"Outliers detected: {(y_pred == -1).sum()}")
print(f"LOF scores: {lof.negative_outlier_factor_[:5]}")
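Beyond the binary labels shown above, the raw scores can rank samples by abnormality. A follow-on sketch (the dataset mirrors the basic example; variable names are illustrative):

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.RandomState(42)
X = np.vstack([rng.randn(100, 2), rng.uniform(-4, 4, size=(10, 2))])

lof = LocalOutlierFactor(n_neighbors=20)
lof.fit_predict(X)

# negative_outlier_factor_ is always negative; more negative means more
# abnormal, so an ascending argsort puts the most anomalous samples first
ranking = np.argsort(lof.negative_outlier_factor_)
most_anomalous = ranking[:5]
```

Ranking by score is often more useful than the binary labels when the contamination level is unknown, since the cutoff can be chosen after inspection.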