Implementation:DistrictDataLabs Yellowbrick Dataset Download

From Leeroopedia


Knowledge Sources
Domains: Datasets, Utilities
Last Updated: 2026-02-08 05:00 GMT

Overview

Utility function for downloading, verifying, and extracting Yellowbrick example datasets from a hosted store.

Description

The download_data function downloads a zipped dataset from a URL, verifies that the archive's SHA256 hash matches the expected signature, and extracts the archive into the data home directory. It supports incremental downloading and does not re-download a dataset that already exists unless replace=True.
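The download, verify, and extract steps described above can be sketched with the standard library alone. This is a minimal illustration, not Yellowbrick's implementation: the helper name fetch_verified_zip, the chunk size, the choice to hash while streaming, and the decision to return early when the archive already exists are all assumptions made for the example.

```python
import hashlib
import os
import zipfile
from urllib.request import urlopen


def fetch_verified_zip(url, signature, data_home, replace=False,
                       extract=True, chunk_size=65536):
    """Hypothetical helper: download a zip, check its SHA256, optionally extract."""
    os.makedirs(data_home, exist_ok=True)
    archive = os.path.join(data_home, os.path.basename(url))

    # Assumed policy: leave an existing archive alone unless replace=True.
    if os.path.exists(archive) and not replace:
        return archive

    # Stream the response to disk in chunks and hash each chunk as it arrives,
    # so the archive is never held in memory as a whole.
    digest = hashlib.sha256()
    with urlopen(url) as response, open(archive, "wb") as out:
        for chunk in iter(lambda: response.read(chunk_size), b""):
            out.write(chunk)
            digest.update(chunk)

    # Reject (and delete) the archive if the hash is not the expected one.
    if digest.hexdigest() != signature:
        os.remove(archive)
        raise ValueError("signature mismatch for " + archive)

    if extract:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(path=data_home)
    return archive
```

Hashing during the download avoids a second pass over the file; a real implementation may equally well hash the saved archive afterwards.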

Usage

Import this function when building custom dataset loading pipelines or when you need programmatic control over dataset downloads. Most users will use download_all() from yellowbrick.download instead.

Code Reference

Source Location

Signature

def download_data(url, signature, data_home=None, replace=False, extract=True):
    """Downloads zipped dataset, verifies signature, and extracts archive."""

Import

from yellowbrick.datasets.download import download_data

I/O Contract

Inputs

Name | Type | Required | Description
url | str | Yes | URL to download dataset zip from
signature | str | Yes | Expected SHA256 hash of the zip
data_home | str | No | Local directory for datasets
replace | bool | No | Re-download even if exists (default: False)
extract | bool | No | Extract the zip (default: True)
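When pointing the function at a custom archive, the signature argument has to be computed first. The sketch below shows one way to obtain the hex SHA256 hash of a local zip file; the helper name sha256_signature and the chunk size are illustrative choices, and it is an assumption that the expected signature is the lowercase hex digest of the raw archive bytes.

```python
import hashlib


def sha256_signature(path, chunk_size=65536):
    """Hypothetical helper: hex SHA256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large archives do not fill memory.
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()
```

The returned string is 64 hexadecimal characters long and can be passed as the signature value.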

Outputs

Name | Type | Description
(side effect) | Files | Downloaded and extracted dataset on disk
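Because the only output is a side effect on disk, callers may want to check for the files afterwards. The following sketch assumes a particular layout that the page does not spell out: the archive is saved in the data home under the URL's file name, and it extracts to a sibling directory named after the archive without its extension. Both helper names are hypothetical.

```python
import os
from urllib.parse import urlparse


def expected_paths(url, data_home):
    """Hypothetical helper: (archive path, dataset directory) under data_home."""
    basename = os.path.basename(urlparse(url).path)  # e.g. "mushroom.zip"
    name, _ = os.path.splitext(basename)             # e.g. "mushroom"
    home = os.path.expanduser(data_home)
    return os.path.join(home, basename), os.path.join(home, name)


def is_downloaded(url, data_home):
    """True when both the archive and its extracted directory exist (assumed layout)."""
    archive, datadir = expected_paths(url, data_home)
    return os.path.isfile(archive) and os.path.isdir(datadir)
```

If the real layout differs, only expected_paths needs to change.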

Usage Examples

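from yellowbrick.datasets.download import download_data

# Download, verify, and extract the mushroom dataset into ~/.yellowbrick.
download_data(
    url="https://s3.amazonaws.com/ddl-data-lake/yellowbrick/v1.0/mushroom.zip",
    signature="abc123...",  # placeholder: pass the archive's full SHA256 hash
    data_home="~/.yellowbrick",
)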
