Implementation:Online ml River Datasets Bananas

From Leeroopedia


Knowledge Sources
Domains Online_Learning, Datasets, Binary_Classification
Last Updated 2026-02-08 16:00 GMT

Overview

Concrete dataset for binary classification provided by the River library.

Description

An artificial dataset in which instances belong to several banana-shaped clusters. There are two attributes, corresponding to the x and y axes respectively.

This dataset contains 5,300 samples with 2 features for binary classification tasks.

Usage

This dataset is useful for:

  • Testing binary classification algorithms
  • Evaluating clustering algorithms on non-linear cluster shapes
  • Benchmarking performance on synthetically generated data with known properties
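The first use case, testing a binary classifier on a stream, usually follows a test-then-train loop: predict on each sample first, score the prediction, then learn from the sample. The sketch below is a minimal illustration under stated assumptions. `OneNearestNeighbour`, `progressive_accuracy`, and the four-point `stand_in_stream` are hypothetical names and made-up values, not part of River and not taken from the Bananas data. The `predict_one` / `learn_one` method names mirror the naming River models use, so the same loop should accept `datasets.Bananas()` in place of the stand-in stream.

```python
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[Dict[str, float], bool]


class OneNearestNeighbour:
    """Hypothetical toy model: remembers every sample and copies the label of the closest one."""

    def __init__(self) -> None:
        self.memory: List[Sample] = []

    def predict_one(self, x: Dict[str, float]) -> bool:
        if not self.memory:
            return False  # arbitrary default before anything has been learned
        nearest = min(
            self.memory,
            key=lambda item: sum((x[k] - item[0][k]) ** 2 for k in x),
        )
        return nearest[1]

    def learn_one(self, x: Dict[str, float], y: bool) -> None:
        self.memory.append((x, y))


def progressive_accuracy(stream: Iterable[Sample], model: OneNearestNeighbour) -> float:
    """Test-then-train: each sample is scored before the model learns from it."""
    correct = 0
    total = 0
    for x, y in stream:
        correct += model.predict_one(x) == y
        model.learn_one(x, y)
        total += 1
    return correct / total if total else 0.0


# Made-up points standing in for the dataset; replace with datasets.Bananas().
stand_in_stream: List[Sample] = [
    ({"1": 0.0, "2": 0.0}, True),
    ({"1": 1.0, "2": 1.0}, False),
    ({"1": 0.1, "2": 0.1}, True),
    ({"1": 0.9, "2": 1.1}, False),
]

accuracy = progressive_accuracy(stand_in_stream, OneNearestNeighbour())
print(f"progressive accuracy: {accuracy:.2f}")
```

A memory-based model is used here only because it can follow curved class boundaries such as banana-shaped clusters. Any object exposing the same two methods could be dropped in.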

Code Reference

Source Location

Signature

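from river import stream
from river.datasets import base

class Bananas(base.FileDataset):
    def __init__(self):
        # Metadata for the local banana.zip file; the values match the Dataset Properties table.
        super().__init__(filename="banana.zip", n_samples=5300, n_features=2, task=base.BINARY_CLF)

    def __iter__(self):
        # Read the LibSVM file; the raw label "1" becomes True, any other label becomes False.
        return stream.iter_libsvm(self.path, target_type=lambda x: x == "1")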
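To make the `target_type=lambda x: x == "1"` argument concrete, the sketch below parses LibSVM-style lines (`label index:value index:value ...`) in plain Python. It is an illustration only, not River's implementation of `stream.iter_libsvm`. The helper name `iter_libsvm_lines` is hypothetical, and the two sample lines are made up rather than copied from banana.zip.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple


def iter_libsvm_lines(
    lines: Iterable[str],
    target_type: Callable[[str], object] = float,
) -> Iterator[Tuple[Dict[str, float], object]]:
    """Yield (features, target) pairs from LibSVM-formatted text lines."""
    for line in lines:
        line = line.split("#")[0].strip()  # drop trailing comments and whitespace
        if not line:
            continue
        label, *pairs = line.split()
        features: Dict[str, float] = {}
        for pair in pairs:
            name, value = pair.split(":")
            features[name] = float(value)  # feature names stay strings
        yield features, target_type(label)  # the raw label is passed as a string


# Hypothetical sample lines in LibSVM format (not real Bananas rows).
sample_lines = [
    "1 1:1.14 2:-0.114",
    "-1 1:-1.52 2:-1.15",
]

rows = list(iter_libsvm_lines(sample_lines, target_type=lambda s: s == "1"))
for features, target in rows:
    print(features, target)
```

Because the label is compared as a string, only the exact text "1" maps to True. Every other label, whatever its value, maps to False.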

Import

from river import datasets
dataset = datasets.Bananas()

I/O Contract

Inputs

Name Type Required Description
(none) No parameters needed

Outputs

Name Type Description
__iter__() tuple(dict, bool) Yields (features_dict, target) pairs, one per sample; target is True when the raw label is "1" and False otherwise
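Since every item is a `(dict, bool)` pair with two numeric features, a common first step is to group the points by class, for example before plotting them. The helper below is a hypothetical sketch. `split_by_class` is not a River function, and the three points are made-up stand-ins for the dataset. Feature values are read in sorted key order, so the code makes no assumption about the exact feature names.

```python
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[Dict[str, float], bool]


def split_by_class(stream: Iterable[Sample]) -> Dict[bool, List[Tuple[float, ...]]]:
    """Group feature vectors by their boolean target."""
    points: Dict[bool, List[Tuple[float, ...]]] = {True: [], False: []}
    for x, y in stream:
        # Sorted key order gives a stable (x-axis, y-axis) tuple.
        points[y].append(tuple(x[name] for name in sorted(x)))
    return points


# Made-up pairs with the same shape as the dataset's output.
stand_in_stream: List[Sample] = [
    ({"1": 0.0, "2": 0.5}, True),
    ({"1": 1.0, "2": -0.5}, False),
    ({"1": 0.2, "2": 0.7}, True),
]

points = split_by_class(stand_in_stream)
print(len(points[True]), len(points[False]))
```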

Dataset Properties

Property Value
Number of samples 5,300
Number of features 2
Task Binary classification
Format LibSVM

Usage Examples

from river import datasets

dataset = datasets.Bananas()
for x, y in dataset:
    print(x, y)
    break
