Implementation:Scikit learn Scikit learn FetchCaliforniaHousing

From Leeroopedia


Knowledge Sources
Domains Machine Learning, Data Loading
Last Updated 2026-02-08 15:00 GMT

Overview

Concrete tool for fetching and loading the California housing dataset for regression tasks, provided by scikit-learn.

Description

The fetch_california_housing function downloads and caches the California housing dataset, which contains 20,640 observations on 9 variables: the target (median house value) and eight features (median income, median house age, average rooms, average bedrooms, population, average occupancy, latitude, and longitude). The dataset originates from Pace and Barry (1997).

Usage

Use this function when you need a real-world regression dataset for benchmarking or prototyping regression models. It is a standard regression benchmark in scikit-learn tutorials and examples.
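
As a rough illustration of the benchmarking use described above, the following sketch fits a linear baseline and reports its held-out R² score. The helper name baseline_r2, the 80/20 split, and the choice of a scaled LinearRegression are assumptions made for illustration; neither scikit-learn nor this page prescribes them.

```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def baseline_r2(X, y, test_size=0.2, random_state=0):
    """Fit a scaled linear regression and return R^2 on a held-out split.

    Hypothetical helper: the name, split ratio, and model are illustrative.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    model = make_pipeline(StandardScaler(), LinearRegression())
    model.fit(X_train, y_train)
    # Pipeline.score delegates to LinearRegression.score, which returns R^2.
    return model.score(X_test, y_test)
```

With the dataset loaded via X, y = fetch_california_housing(return_X_y=True), calling baseline_r2(X, y) gives a quick reference score against which other regressors can be compared.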

Code Reference

Source Location

Signature

@validate_params(...)
def fetch_california_housing(
    *,
    data_home=None,
    download_if_missing=True,
    return_X_y=False,
    as_frame=False,
    n_retries=3,
    delay=1.0,
):

Import

from sklearn.datasets import fetch_california_housing

I/O Contract

Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data_home | str, PathLike or None | No | Custom directory for caching (default None uses sklearn data home) |
| download_if_missing | bool | No | If True, download data if not cached (default True) |
| return_X_y | bool | No | If True, return (data, target) instead of Bunch (default False) |
| as_frame | bool | No | If True, return data as pandas DataFrame (default False) |
| n_retries | int | No | Number of download retries (default 3) |
| delay | float | No | Delay between retries in seconds (default 1.0) |
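
The data_home and download_if_missing parameters can be combined to load strictly from a local cache, for example in an offline environment. The sketch below assumes that scikit-learn raises OSError when the data is absent and download_if_missing=False, as its documentation describes; load_cached_only is a hypothetical helper name.

```python
from sklearn.datasets import fetch_california_housing


def load_cached_only(data_home=None):
    """Return the dataset from the local cache, or None if it is not cached.

    Hypothetical helper; it never triggers a network download.
    """
    try:
        return fetch_california_housing(
            data_home=data_home, download_if_missing=False
        )
    except OSError:
        # Raised when the cached file is missing and downloading is disabled.
        return None
```

Returning None rather than propagating the exception is a design choice for this sketch; callers that prefer to fail fast can call fetch_california_housing directly with download_if_missing=False.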

Outputs

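| Name | Type | Description |
| --- | --- | --- |
| dataset | Bunch | Dictionary-like object with data, target, feature_names, target_names, DESCR, and frame (frame is populated only when as_frame=True) |
| (X, y) | tuple of ndarray | Feature matrix and target array when return_X_y=True (a DataFrame and a Series when as_frame=True is also set) |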
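
To make the output contract concrete, the sketch below summarizes the fields of a returned object. summarize_dataset is a hypothetical helper, and it assumes only the data, target, and feature_names fields listed above, accessed by key as on any dictionary-like object.

```python
import numpy as np


def summarize_dataset(dataset):
    """Return basic shape information for a Bunch-like dataset object.

    Hypothetical helper; relies only on data, target, and feature_names.
    """
    # np.asarray accepts both ndarrays and pandas objects (as_frame=True).
    data = np.asarray(dataset["data"])
    target = np.asarray(dataset["target"])
    return {
        "n_samples": data.shape[0],
        "n_features": data.shape[1],
        "feature_names": list(dataset["feature_names"]),
        "target_shape": target.shape,
    }
```

For the real dataset, summarize_dataset(fetch_california_housing()) would be expected to report 20,640 samples and 8 features, in line with the dataset description.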

Usage Examples

Basic Usage

from sklearn.datasets import fetch_california_housing

data = fetch_california_housing()
print(data.data.shape)    # (20640, 8)
print(data.target.shape)  # (20640,)
print(data.feature_names)

# As X, y tuple
X, y = fetch_california_housing(return_X_y=True)
