Implementation:Scikit learn contrib Imbalanced learn fetch datasets

From Leeroopedia


Knowledge Sources
Domains Machine_Learning, Benchmarking, Imbalanced_Learning
Last Updated 2026-02-09 03:00 GMT

Overview

A concrete tool, provided by the imbalanced-learn library, for downloading and caching benchmark imbalanced datasets hosted on Zenodo.

Description

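The fetch_datasets function downloads a collection of 27 benchmark imbalanced datasets from Zenodo and caches them locally under data_home, so later calls can read from disk instead of downloading again. It returns an OrderedDict keyed by dataset name; each value is a Bunch object with .data, .target, and .DESCR attributes. Pass filter_data to load only selected datasets, identified by name or ID.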
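The caching and filtering behaviour described above can be wrapped in a small helper. This is a minimal sketch, not part of imbalanced-learn: the name `load_benchmarks`, its `offline` flag, and the `./imblearn_cache` path are hypothetical. Only `fetch_datasets` and its `data_home`, `filter_data`, and `download_if_missing` parameters come from the signature documented on this page.

```python
def load_benchmarks(selection, cache_dir, offline=False):
    """Return the selected datasets, caching downloads under cache_dir.

    selection: dataset names (str) and/or IDs (int).
    offline:   if True, never touch the network and rely on the cache alone.
    """
    # Imported here so that merely defining the helper does not require
    # imbalanced-learn to be installed.
    from imblearn.datasets import fetch_datasets

    return fetch_datasets(
        data_home=str(cache_dir),         # where downloads are stored
        filter_data=tuple(selection),     # names and/or integer IDs
        download_if_missing=not offline,  # False = use the cache only
    )


# Usage (the first call downloads, the second reuses the cache):
#
#     first = load_benchmarks(("ecoli", "satimage"), "./imblearn_cache")
#     again = load_benchmarks(("ecoli", "satimage"), "./imblearn_cache", offline=True)
```

With `offline=True` and an empty cache, the call is expected to fail rather than download. The exact exception type is an assumption worth checking against the installed version.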

Usage

Import this function when you need standardized imbalanced datasets for benchmarking resampling or classification methods.
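A benchmarking run over several datasets might be organized as in the sketch below. The helper names `benchmark` and `majority_baseline_accuracy` are hypothetical and not part of imbalanced-learn; the only library call is `fetch_datasets`, shown in the trailing usage comment. The helpers accept any object exposing `.data` and `.target`, so they can be tried without downloading anything.

```python
from collections import Counter


def majority_baseline_accuracy(X, y):
    """Accuracy of a trivial model that always predicts the most common class.

    On imbalanced data this baseline is already high, which is why plain
    accuracy is a weak yardstick for comparing methods.
    """
    counts = Counter(int(label) for label in y)
    return max(counts.values()) / sum(counts.values())


def benchmark(datasets, evaluate):
    """Apply evaluate(X, y) to every dataset and collect the results by name.

    datasets: mapping of name -> object with .data and .target, such as the
              OrderedDict returned by fetch_datasets.
    evaluate: any callable taking (X, y) and returning a score.
    """
    return {name: evaluate(ds.data, ds.target) for name, ds in datasets.items()}


# Usage with the real datasets (needs imbalanced-learn and, on first call,
# network access):
#
#     from imblearn.datasets import fetch_datasets
#     results = benchmark(fetch_datasets(filter_data=("ecoli",)),
#                         majority_baseline_accuracy)
```

In a real study, `evaluate` would typically wrap a cross-validated score for a resampler plus classifier pipeline; the baseline here only stands in for that step.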

Code Reference

Source Location

Signature

def fetch_datasets(
    *,
    data_home=None,
    filter_data=None,
    download_if_missing=True,
    random_state=None,
    shuffle=False,
    verbose=False,
):
    """
    Args:
        data_home: str or None - Cache directory (default: ~/scikit_learn_data).
        filter_data: tuple of str/int or None - Dataset names or IDs to load.
        download_if_missing: bool - Auto-download if not cached (default: True).
        random_state: int, RandomState, or None - Shuffle seed.
        shuffle: bool - Shuffle data (default: False).
        verbose: bool - Print fetch info (default: False).
    Returns:
        OrderedDict of Bunch objects with .data, .target, .DESCR.
    """

Import

from imblearn.datasets import fetch_datasets

I/O Contract

Inputs

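| Name | Type | Required | Description |
|---|---|---|---|
| data_home | str or None | No | Cache directory path (default: ~/scikit_learn_data) |
| filter_data | tuple of str/int or None | No | Dataset names or IDs to load; None loads all datasets |
| download_if_missing | bool | No | Download if not cached (default: True) |
| random_state | int, RandomState, or None | No | Seed used when shuffling |
| shuffle | bool | No | Shuffle data (default: False) |
| verbose | bool | No | Print fetch info (default: False) |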

Outputs

| Name | Type | Description |
|---|---|---|
| datasets | OrderedDict of Bunch | Keyed by dataset name; each Bunch has .data (ndarray), .target (ndarray), .DESCR (str) |
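The sketch below shows one way to consume a single entry of the returned mapping. It assumes the targets are encoded as -1 for the majority class and 1 for the minority class, which should be verified on the dataset actually loaded. The helpers `to_zero_one` and `describe` are hypothetical names introduced for illustration.

```python
def to_zero_one(target, positive=1):
    """Return labels as a list with 1 for the positive class and 0 otherwise."""
    return [1 if int(label) == positive else 0 for label in target]


def describe(name, bunch):
    """Summarize one entry of the returned mapping in a single line."""
    labels = to_zero_one(bunch.target)
    # len() works for both 2-D ndarrays and plain nested lists.
    n_samples, n_features = len(bunch.data), len(bunch.data[0])
    return f"{name}: {n_samples} samples, {n_features} features, {sum(labels)} positive"


# Usage with the real return value:
#
#     from imblearn.datasets import fetch_datasets
#     for name, bunch in fetch_datasets(filter_data=("ecoli",)).items():
#         print(describe(name, bunch))
```

Relabeling to 0/1 is optional; it is mainly convenient for metrics or libraries that expect the positive class to be labeled 1 and the negative class 0.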

Usage Examples

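from collections import Counter

from imblearn.datasets import fetch_datasets

# Load specific datasets by name (downloaded on first use, then read from the cache)
datasets = fetch_datasets(filter_data=("ecoli", "satimage"))
for name, ds in datasets.items():
    # Imbalance ratio = majority-class count / minority-class count
    counts = Counter(ds.target)
    ratio = max(counts.values()) / min(counts.values())
    print(f"{name}: {ds.data.shape}, imbalance ratio: {ratio:.1f}")

# Load all datasets
all_datasets = fetch_datasets()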

Related Pages

Implements Principle

Requires Environment
