Implementation:Online ml River Datasets Bikes

From Leeroopedia


Knowledge Sources
Domains Online_Learning, Datasets, Regression
Last Updated 2026-02-08 16:00 GMT

Overview

Concrete dataset for regression provided by the River library.

Description

Bike sharing station information from the city of Toulouse. The goal is to predict the number of bikes available at 5 different bike stations, using weather conditions and the time of each observation.

This dataset contains 182,470 samples with 8 features for regression tasks.

Usage

This dataset is useful for:

  • Time series forecasting and prediction tasks
  • Bike sharing demand prediction
  • Urban transportation analysis
  • Real-world regression problems with temporal patterns

Code Reference

Source Location

Signature

from river import stream
from river.datasets import base


class Bikes(base.RemoteDataset):
    def __init__(self):
        super().__init__(
            url="https://maxhalford.github.io/files/datasets/toulouse_bikes.zip",
            size=13_125_015,
            n_samples=182_470,
            n_features=8,
            task=base.REG,
            filename="toulouse_bikes.csv",
        )

    def _iter(self):
        return stream.iter_csv(
            self.path,
            target="bikes",
            converters={
                "clouds": int,
                "humidity": int,
                "pressure": float,
                "temperature": float,
                "wind": float,
                "bikes": int,
            },
            parse_dates={"moment": "%Y-%m-%d %H:%M:%S"},
        )
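
To illustrate what the converters and parse_dates arguments in the signature do to each CSV row, here is a minimal standard-library sketch. It is not River's implementation of stream.iter_csv; the helper name iter_rows and the sample rows (including the station names and all values) are hypothetical and exist only for illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows; the values are invented for illustration.
SAMPLE_CSV = (
    "moment,station,clouds,description,humidity,pressure,temperature,wind,bikes\n"
    "2016-04-01 00:00:07,station-a,75,light rain,81,1017.0,6.54,9.3,1\n"
    "2016-04-01 00:00:16,station-b,75,light rain,81,1017.0,6.54,9.3,8\n"
)

# Same mappings as in the signature above.
CONVERTERS = {
    "clouds": int,
    "humidity": int,
    "pressure": float,
    "temperature": float,
    "wind": float,
    "bikes": int,
}
PARSE_DATES = {"moment": "%Y-%m-%d %H:%M:%S"}


def iter_rows(buffer, target):
    """Yield (features, target) pairs, mimicking the shape of the dataset's output."""
    for row in csv.DictReader(buffer):
        # Cast the columns that have a converter; the others stay strings.
        for name, convert in CONVERTERS.items():
            row[name] = convert(row[name])
        # Parse timestamp columns into datetime objects.
        for name, fmt in PARSE_DATES.items():
            row[name] = datetime.strptime(row[name], fmt)
        # Split the target column off from the features.
        y = row.pop(target)
        yield row, y


rows = list(iter_rows(io.StringIO(SAMPLE_CSV), target="bikes"))
x, y = rows[0]
print(x, y)
```

Columns without a converter, such as the hypothetical station column here, remain plain strings, while the target is removed from the features dictionary before the pair is yielded.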

Import

from river import datasets
dataset = datasets.Bikes()

I/O Contract

Inputs

Name Type Required Description
(none) No parameters needed

Outputs

Name Type Description
Iterating the dataset tuple(dict, int) Yields (features_dict, target) pairs, where the features hold the timestamp and weather conditions and the target is the bike count
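
The (features_dict, target) contract lends itself to a test-then-train loop, where each sample is first predicted and only then used for learning. The sketch below shows that pattern with a per-station running mean as a baseline. It uses a small hypothetical in-memory stream rather than the real dataset, and the "station" key and all values are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical stream with the same (features_dict, target) shape as the dataset.
samples = [
    ({"station": "a", "temperature": 6.5}, 4),
    ({"station": "b", "temperature": 6.5}, 10),
    ({"station": "a", "temperature": 7.0}, 6),
    ({"station": "b", "temperature": 7.0}, 12),
]

sums = defaultdict(float)  # running sum of targets per station
counts = defaultdict(int)  # number of targets seen per station
abs_error = 0.0

for x, y in samples:
    key = x["station"]
    # Test: predict with the running mean (0.0 before any observation).
    y_pred = sums[key] / counts[key] if counts[key] else 0.0
    abs_error += abs(y - y_pred)
    # Train: only now reveal the target to the model.
    sums[key] += y
    counts[key] += 1

mae = abs_error / len(samples)
print(f"MAE: {mae}")
```

Because every prediction is made before the corresponding target is seen, the resulting error reflects how a model would have performed on live data.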

Dataset Properties

Property Value
Number of samples 182,470
Number of features 8
Task Regression
Format CSV
Size 13,125,015 bytes

Features

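Each sample carries the following 8 features:

  • moment: Timestamp (datetime, parsed with the format %Y-%m-%d %H:%M:%S)
  • station: Bike station name (string, no converter applied)
  • clouds: Cloud coverage (integer)
  • description: Weather description (string, no converter applied)
  • humidity: Humidity level (integer)
  • pressure: Atmospheric pressure (float)
  • temperature: Temperature (float)
  • wind: Wind speed (float)

The target is returned separately from the features:

  • bikes: Number of bikes available (integer)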
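
Since moment is a datetime, temporal patterns can be exposed to a model by deriving numeric features from it. The following is a minimal sketch of that idea; the helper add_time_features and the sample dictionary are hypothetical and not part of River.

```python
from datetime import datetime


def add_time_features(x):
    """Return a copy of the features with hour and weekday derived from 'moment'."""
    moment = x["moment"]
    enriched = dict(x)  # copy, so the original sample is left untouched
    enriched["hour"] = moment.hour
    enriched["weekday"] = moment.weekday()  # Monday is 0, Sunday is 6
    return enriched


# Hypothetical sample; the values are invented for illustration.
sample = {"moment": datetime(2016, 4, 1, 8, 30, 0), "temperature": 6.5}
enriched = add_time_features(sample)
print(enriched["hour"], enriched["weekday"])
```

Hour of day and day of week are natural candidates for bike availability, which tends to follow commuting rhythms; other encodings are equally possible.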

Usage Examples

from river import datasets

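dataset = datasets.Bikes()

# Iterating streams (features, target) pairs; the archive is fetched on first use.
for x, y in dataset:
    print(x, y)  # x: dict of features, y: bike count
    break  # stop after the first sample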
