Implementation:Scikit learn Scikit learn FetchLfw
| Knowledge Sources | |
|---|---|
| Domains | Data Loading, Computer Vision |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool, provided by scikit-learn, for fetching the Labeled Faces in the Wild (LFW) face recognition dataset.
Description
This module provides two loaders for the LFW dataset: fetch_lfw_people for face recognition (identifying which person an image shows) and fetch_lfw_pairs for face verification (deciding whether two images show the same person). The dataset consists of JPEG images of famous people collected from the internet. Both loaders can fetch either the original or the funneled (aligned) images and can crop and resize them. fetch_lfw_people can additionally drop every person who has fewer than a given number of images.
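The effect of the minimum face count filter can be illustrated on a toy list of labels. This is a minimal sketch, not scikit-learn's implementation (the real loader works on image folders on disk); the helper name `filter_by_min_faces` and the sample labels are hypothetical.

```python
from collections import Counter


def filter_by_min_faces(labels, min_faces_per_person):
    """Return indices of images whose person appears at least
    `min_faces_per_person` times in `labels` (hypothetical helper)."""
    counts = Counter(labels)
    return [i for i, name in enumerate(labels)
            if counts[name] >= min_faces_per_person]


# Hypothetical toy data: one label per image, naming the person shown.
labels = ["A", "B", "A", "C", "A", "B"]

kept = filter_by_min_faces(labels, min_faces_per_person=2)
print(kept)                                # [0, 1, 2, 4, 5]
print(sorted({labels[i] for i in kept}))   # ['A', 'B']
```

Person "C" has a single image and is dropped entirely, which is why raising min_faces_per_person reduces both the number of samples and the number of classes.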
Usage
Use fetch_lfw_people for training face recognition classifiers and fetch_lfw_pairs for face verification experiments. The dataset is commonly used as a benchmark for face recognition and dimensionality reduction algorithms.
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/datasets/_lfw.py
Signature
def fetch_lfw_people(
    *,
    data_home=None,
    funneled=True,
    resize=0.5,
    min_faces_per_person=0,
    color=False,
    slice_=(slice(70, 195), slice(78, 172)),
    download_if_missing=True,
    return_X_y=False,
    n_retries=3,
    delay=1.0,
)

def fetch_lfw_pairs(
    *,
    subset="train",
    data_home=None,
    funneled=True,
    resize=0.5,
    color=False,
    slice_=(slice(70, 195), slice(78, 172)),
    download_if_missing=True,
    n_retries=3,
    delay=1.0,
)
Import
from sklearn.datasets import fetch_lfw_people, fetch_lfw_pairs
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
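| data_home | str or PathLike or None | No | Directory for downloading and caching the data (default: None, which uses ~/scikit_learn_data) |
| funneled | bool | No | Whether to use the funneled (aligned) images (default: True) |
| resize | float or None | No | Ratio used to resize each face image; None disables resizing (default: 0.5) |
| min_faces_per_person | int | No | fetch_lfw_people only: keep only people with at least this many images (default: 0) |
| color | bool | No | Whether to keep the three RGB channels instead of converting to gray levels (default: False) |
| slice_ | tuple of slices | No | (height, width) crop applied before resizing (default: (slice(70, 195), slice(78, 172))) |
| download_if_missing | bool | No | If False, raise OSError instead of downloading when the data is not cached locally (default: True) |
| return_X_y | bool | No | fetch_lfw_people only: if True, return a (data, target) tuple instead of a Bunch (default: False) |
| subset | str | No | fetch_lfw_pairs only: 'train', 'test', or '10_folds' (default: 'train') |
| n_retries | int | No | Number of retries on HTTP errors during download (default: 3) |
| delay | float | No | Seconds to wait between retries (default: 1.0) |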
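The download_if_missing parameter in the signature makes it possible to load the dataset only when it is already cached. The following is a minimal sketch of that pattern: `load_if_cached` and `_fake_fetch_missing` are hypothetical names, and the fake fetcher stands in for fetch_lfw_people so that the block runs without scikit-learn or network access. It relies on the loaders raising OSError when the data is missing and downloading is disabled.

```python
def load_if_cached(fetch, **kwargs):
    """Call a dataset fetcher with downloading disabled.

    Returns the fetcher's result, or None if the data is not cached
    (signalled by OSError). Hypothetical helper, not a scikit-learn API.
    """
    try:
        return fetch(download_if_missing=False, **kwargs)
    except OSError:
        return None


def _fake_fetch_missing(*, download_if_missing=True, **kwargs):
    # Hypothetical stand-in for a fetcher whose data is not on disk.
    if not download_if_missing:
        raise OSError("dataset not found in local cache")
    return {"data": []}


result = load_if_cached(_fake_fetch_missing, min_faces_per_person=70)
print(result)  # None
```

With scikit-learn installed, the same helper could be called as `load_if_cached(fetch_lfw_people, min_faces_per_person=70)`; a None result would then indicate that a download is still needed.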
Outputs
| Name | Type | Description |
|---|---|---|
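| data | Bunch | Dictionary-like object. For fetch_lfw_people: data (flattened images, one row per face), images (the same faces as 2D arrays), target (integer person IDs), target_names, DESCR. For fetch_lfw_pairs: data (flattened pairs), pairs (image pairs), target (same/different person label), target_names, DESCR |
| (X, y) | tuple | fetch_lfw_people only, returned when return_X_y=True; flattened images and person labels |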
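The shape of the returned images follows from the slice_ and resize parameters. The sketch below reproduces that arithmetic under the assumption that the loader crops first and then truncates the scaled height and width to integers; `lfw_image_shape` is a hypothetical helper, not part of scikit-learn.

```python
def lfw_image_shape(slice_=(slice(70, 195), slice(78, 172)), resize=0.5):
    """Estimate the (height, width) of one grayscale face image
    after cropping with `slice_` and scaling by `resize`."""
    h_slice, w_slice = slice_
    # Size of the cropped region along each axis.
    h = (h_slice.stop - h_slice.start) // (h_slice.step or 1)
    w = (w_slice.stop - w_slice.start) // (w_slice.step or 1)
    if resize is not None:
        # Scaled sizes are truncated to whole pixels.
        h = int(resize * h)
        w = int(resize * w)
    return h, w


print(lfw_image_shape())             # (62, 47)
print(lfw_image_shape(resize=0.4))   # (50, 37)
print(lfw_image_shape(resize=None))  # (125, 94)

h, w = lfw_image_shape()
print(h * w)                         # 2914 features per flattened image
```

With the defaults, each row of data therefore holds 62 × 47 = 2914 gray-level values. With color=True, each image carries an additional trailing axis of three RGB channels.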
Usage Examples
Basic Usage
from sklearn.datasets import fetch_lfw_people
# Load faces with at least 70 images per person
lfw = fetch_lfw_people(min_faces_per_person=70, resize=0.4)
print("Shape:", lfw.data.shape)
print("Classes:", lfw.target_names)
print("Image shape:", lfw.images.shape)