Implementation:Online ml River Datasets Bananas
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Datasets, Binary_Classification |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Concrete dataset for binary classification provided by the River library.
Description
An artificial dataset in which instances belong to several banana-shaped clusters. The two attributes correspond to the x and y axes, respectively.
This dataset contains 5,300 samples with 2 features for binary classification tasks.
Usage
This dataset is useful for:
- Testing binary classification algorithms
- Evaluating clustering algorithms on non-linear cluster shapes
- Benchmarking performance on synthetically generated data with known properties
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/datasets/bananas.py
Signature
class Bananas(base.FileDataset):
    def __init__(self):
        super().__init__(filename="banana.zip", n_samples=5300, n_features=2, task=base.BINARY_CLF)

    def __iter__(self):
        return stream.iter_libsvm(self.path, target_type=lambda x: x == "1")
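The `target_type=lambda x: x == "1"` argument in the signature turns the raw label string into a boolean. The following standard-library sketch shows that idea on LibSVM-style lines. It is a simplified stand-in, not River's implementation: `parse_libsvm_line` is a hypothetical helper, the sample lines are made up, and the `-1` negative label is an assumption.

```python
from typing import Callable, Dict, Tuple


def parse_libsvm_line(
    line: str,
    target_type: Callable[[str], object] = float,
) -> Tuple[Dict[str, float], object]:
    """Parse one 'label index:value ...' line into (features, target).

    Hypothetical helper for illustration; not part of River.
    """
    # Drop any trailing '#' comment and surrounding whitespace
    line = line.split("#", 1)[0].strip()
    label, *pairs = line.split()
    features = {}
    for pair in pairs:
        index, value = pair.split(":", 1)
        features[index] = float(value)
    # The label string is converted by the caller-supplied function
    return features, target_type(label)


# Made-up sample lines, not taken from the real file
sample_lines = ["1 1:0.5 2:-1.25", "-1 1:-0.75 2:2.0"]
parsed = [
    parse_libsvm_line(line, target_type=lambda s: s == "1")
    for line in sample_lines
]
print(parsed)
```

With this conversion, any label other than the string "1" maps to False, which is why the dataset yields boolean targets.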
Import
from river import datasets
dataset = datasets.Bananas()
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| (none) | — | — | No parameters needed |
Outputs
| Name | Type | Description |
|---|---|---|
| __iter__() | tuple(dict, bool) | Yields (features_dict, target) pairs; target is True when the raw label is "1" and False otherwise |
Dataset Properties
| Property | Value |
|---|---|
| Number of samples | 5,300 |
| Number of features | 2 |
| Task | Binary classification |
| Format | LibSVM |
Usage Examples
from river import datasets
dataset = datasets.Bananas()
# Print the first (features, target) pair, then stop
for x, y in dataset:
    print(x, y)
    break