Implementation:Scikit learn Scikit learn FetchLfw

Knowledge Sources	Scikit_learn Scikit-learn Docs
Domains	Data Loading, Computer Vision
Last Updated	2026-02-08 15:00 GMT

Overview

Concrete tool, provided by scikit-learn, for fetching the Labeled Faces in the Wild (LFW) face recognition dataset.

Description

This module provides two main functions for loading the LFW dataset: fetch_lfw_people for face recognition tasks (person identification) and fetch_lfw_pairs for face verification tasks (deciding whether two images show the same person). The dataset consists of JPEG images of famous people collected from the internet. Both functions can load either the original or the funneled (aligned) version of the images and support configurable cropping and resizing; fetch_lfw_people can additionally drop people who have fewer than a given number of images.
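The minimum face count filtering can be illustrated without downloading anything. The following is a minimal sketch, not scikit-learn's implementation: filter_by_min_faces is a hypothetical helper, and it assumes that a person is kept when they have at least min_faces_per_person images and that integer targets index into the sorted list of remaining names.

```python
from collections import Counter


def filter_by_min_faces(labels, min_faces_per_person=0):
    """Hypothetical helper: keep only labels of people with enough images.

    Returns the sorted names that survive the filter and, for each kept
    image, the integer index of its name in that sorted list.
    """
    counts = Counter(labels)
    # Assumption: the threshold is inclusive ("at least N images").
    kept = [name for name in labels if counts[name] >= min_faces_per_person]
    target_names = sorted(set(kept))
    index = {name: i for i, name in enumerate(target_names)}
    target = [index[name] for name in kept]
    return target_names, target


# Toy labels standing in for one entry per image file.
labels = ["Ann", "Bob", "Ann", "Cy", "Ann", "Bob"]
names, target = filter_by_min_faces(labels, min_faces_per_person=2)
print(names)   # ['Ann', 'Bob']
print(target)  # [0, 1, 0, 0, 1]
```

Raising the threshold shrinks both the number of samples and the number of classes, which is why min_faces_per_person is often set fairly high when LFW is used for classification.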

Usage

Use fetch_lfw_people for training face recognition classifiers and fetch_lfw_pairs for face verification experiments. The dataset is commonly used as a benchmark for face recognition and dimensionality reduction algorithms.

Code Reference

Source Location

Repository: scikit-learn
File: sklearn/datasets/_lfw.py

Signature

def fetch_lfw_people(
    *,
    data_home=None,
    funneled=True,
    resize=0.5,
    min_faces_per_person=0,
    color=False,
    slice_=(slice(70, 195), slice(78, 172)),
    download_if_missing=True,
    return_X_y=False,
    n_retries=3,
    delay=1.0,
)

def fetch_lfw_pairs(
    *,
    subset="train",
    data_home=None,
    funneled=True,
    resize=0.5,
    color=False,
    slice_=(slice(70, 195), slice(78, 172)),
    download_if_missing=True,
    n_retries=3,
    delay=1.0,
)

Import

from sklearn.datasets import fetch_lfw_people, fetch_lfw_pairs

I/O Contract

Inputs

Name	Type	Required	Description
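data_home	str or PathLike or None	No	Custom directory for downloading and caching the data (default: None, which uses the '~/scikit_learn_data' folder)
funneled	bool	No	Whether to use funneled (aligned) images (default: True)
resize	float	No	Ratio for resizing face images (default: 0.5)
min_faces_per_person	int	No	fetch_lfw_people only: keep only people with at least this many images (default: 0)
color	bool	No	Whether to keep the RGB channels instead of converting to gray levels (default: False)
slice_	tuple of slices	No	Crop region (height slice, width slice) for face extraction (default: (slice(70, 195), slice(78, 172)))
download_if_missing	bool	No	If False, raise an OSError when the data is not available locally instead of downloading it (default: True)
return_X_y	bool	No	fetch_lfw_people only: if True, return a (data, target) tuple instead of a Bunch (default: False)
subset	str	No	fetch_lfw_pairs only: 'train', 'test', or '10_folds' (default: 'train')
n_retries	int	No	Number of retries when HTTP errors are encountered (default: 3)
delay	float	No	Number of seconds between retries (default: 1.0)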
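How slice_, resize, and color interact determines the shape of the returned arrays. The sketch below is an assumption-labeled illustration rather than the library's code: expected_image_shape and n_features are hypothetical helpers, and they assume the crop is applied first and the resized dimensions are truncated to integers, which is consistent with the shapes reported in the scikit-learn documentation.

```python
def expected_image_shape(slice_=(slice(70, 195), slice(78, 172)), resize=0.5):
    """Hypothetical helper: (height, width) of one face after crop and resize."""
    h_slice, w_slice = slice_
    # Size of the cropped region, honoring an optional step in each slice.
    h = (h_slice.stop - h_slice.start) // (h_slice.step or 1)
    w = (w_slice.stop - w_slice.start) // (w_slice.step or 1)
    if resize is not None:
        # Assumption: dimensions are truncated, not rounded.
        h = int(resize * h)
        w = int(resize * w)
    return h, w


def n_features(shape, color=False):
    """Hypothetical helper: length of one flattened image."""
    h, w = shape
    # Assumption: color images carry 3 channels that are flattened as well.
    return h * w * (3 if color else 1)


default_shape = expected_image_shape()
print(default_shape, n_features(default_shape))   # (62, 47) 2914

small_shape = expected_image_shape(resize=0.4)
print(small_shape, n_features(small_shape))       # (50, 37) 1850

# A verification pair holds two images, so a flattened pair is twice as long.
print(2 * n_features(default_shape))              # 5828
```

Under these assumptions, the default settings yield 62 x 47 images with 2914 features each, and resize=0.4 yields 50 x 37 images with 1850 features each.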

Outputs

Name	Type	Description
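data	Bunch	Dictionary-like object. For fetch_lfw_people: data (flattened images), images, target (integer person IDs), target_names, DESCR. For fetch_lfw_pairs: data (flattened image pairs), pairs, target (same/different person labels), target_names, DESCR
(X, y)	tuple	Returned by fetch_lfw_people when return_X_y=True; flattened images and person labels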

Usage Examples

Basic Usage

from sklearn.datasets import fetch_lfw_people

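# Keep only people with at least 70 images and resize each face to 40% of the cropped size
lfw = fetch_lfw_people(min_faces_per_person=70, resize=0.4)
print("Data shape:", lfw.data.shape)      # (n_samples, n_features), one flattened image per row
print("Classes:", lfw.target_names)       # names of the people that passed the filter
print("Image shape:", lfw.images.shape)   # (n_samples, height, width)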

Related Pages

Principle:Scikit_learn_Scikit_learn_Dataset_Loading
