Implementation:Online ml River Datasets Music

Knowledge Sources
Domains Online_Learning, Datasets, Multi_Output_Classification, Multi_Label
Last Updated 2026-02-08 16:00 GMT

Overview

Concrete dataset for multi-output binary classification (multi-label) provided by the River library.

Description

Multi-label music mood prediction. The goal is to predict which moods a song evokes. Each song can belong to several mood categories at once.

This dataset contains 593 samples with 72 features and 6 binary output labels for multi-label classification tasks.

Usage

This dataset is useful for:

  • Multi-label classification tasks
  • Music information retrieval
  • Emotion/mood prediction from audio features
  • Testing algorithms that handle multiple simultaneous binary outputs
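
One way to test an algorithm on simultaneous binary outputs is progressive (test-then-train) evaluation: predict the labels of each sample, score the prediction, then update the model. The sketch below is a minimal pure-Python illustration under stated assumptions. `MajorityPerLabel`, `hamming_loss`, `make_labels`, and the three-sample `toy_stream` are hypothetical stand-ins, not part of River or of the Music data. With River installed, iterating over `datasets.Music()` would take the place of `toy_stream`.

```python
from collections import defaultdict

# Label names copied verbatim from the dataset's target columns.
LABELS = [
    "amazed-suprised", "happy-pleased", "relaxing-clam",
    "quiet-still", "sad-lonely", "angry-aggresive",
]


class MajorityPerLabel:
    """Hypothetical baseline: predicts each label's majority value so far."""

    def __init__(self, labels):
        self.labels = labels
        self.pos = defaultdict(int)  # how often each label was True
        self.seen = 0                # number of samples learned

    def predict_one(self, x):
        # True only if the label was True in more than half of the seen samples.
        return {name: 2 * self.pos[name] > self.seen for name in self.labels}

    def learn_one(self, x, y):
        self.seen += 1
        for name in self.labels:
            self.pos[name] += int(y[name])


def hamming_loss(y_true, y_pred):
    """Fraction of labels predicted incorrectly for one sample."""
    wrong = sum(y_true[name] != y_pred[name] for name in y_true)
    return wrong / len(y_true)


def make_labels(*active):
    """Build a full label dict with the given moods set to True."""
    return {name: name in active for name in LABELS}


# Hypothetical toy stream; the feature values are made up for illustration.
toy_stream = [
    ({"f0": 0.1}, make_labels("happy-pleased")),
    ({"f0": 0.4}, make_labels("happy-pleased", "relaxing-clam")),
    ({"f0": 0.9}, make_labels("sad-lonely")),
]

model = MajorityPerLabel(LABELS)
losses = []
for x, y in toy_stream:
    y_pred = model.predict_one(x)            # test first ...
    losses.append(hamming_loss(y, y_pred))
    model.learn_one(x, y)                    # ... then train

mean_loss = sum(losses) / len(losses)
print(f"Mean Hamming loss: {mean_loss:.3f}")
```

Hamming loss is used here because it gives partial credit when only some of the six labels are wrong, which an exact-match score would not.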

Code Reference

Source Location

Signature

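from river import stream

from . import base  # relative import: the class lives inside river's datasets package


class Music(base.RemoteDataset):
    def __init__(self):
        super().__init__(
            task=base.MO_BINARY_CLF,  # multi-output binary classification
            n_samples=593,
            n_features=72,
            n_outputs=6,
            url="https://raw.githubusercontent.com/scikit-multiflow/streaming-datasets/master/music.csv",
            size=378_980,  # file size in bytes
            unpack=False,  # the download is a plain CSV, not an archive
        )

    def _iter(self):
        # The label names below are passed to iter_csv as column names,
        # so their spelling must match the CSV header exactly.
        return stream.iter_csv(
            self.path,
            target=[
                "amazed-suprised",
                "happy-pleased",
                "relaxing-clam",
                "quiet-still",
                "sad-lonely",
                "angry-aggresive",
            ],
            converters={
                # Each label column holds "0" or "1"; convert it to a bool.
                "amazed-suprised": lambda x: x == "1",
                "happy-pleased": lambda x: x == "1",
                "relaxing-clam": lambda x: x == "1",
                "quiet-still": lambda x: x == "1",
                "sad-lonely": lambda x: x == "1",
                "angry-aggresive": lambda x: x == "1",
                # ... MFCC and other audio features ...
            },
        )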
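
The `target` and `converters` arguments in `_iter` can be read as: convert every CSV cell with its column's converter, then move the target columns into a separate labels dictionary. The sketch below imitates that behaviour with the standard `csv` module on a hypothetical two-row CSV. It is a rough approximation for illustration, not the actual implementation of `stream.iter_csv`; `iter_rows` is a made-up helper, and treating `BH_LowPeakAmp` as a `float` column is an assumption.

```python
import csv
import io

# Two of the six label columns, kept short for the example.
TARGETS = ["happy-pleased", "sad-lonely"]

CONVERTERS = {
    "happy-pleased": lambda v: v == "1",  # "1" -> True, anything else -> False
    "sad-lonely": lambda v: v == "1",
    "BH_LowPeakAmp": float,               # assumption: numeric feature column
}

# Hypothetical CSV content; the values are invented for illustration.
raw = "BH_LowPeakAmp,happy-pleased,sad-lonely\n0.25,1,0\n0.5,0,1\n"


def iter_rows(text, targets, converters):
    """Yield (features, labels) pairs from CSV text."""
    for row in csv.DictReader(io.StringIO(text)):
        # Apply the per-column converter to every raw string value.
        parsed = {key: converters[key](value) for key, value in row.items()}
        # Move the target columns out of the feature dict.
        labels = {key: parsed.pop(key) for key in targets}
        yield parsed, labels


rows = list(iter_rows(raw, TARGETS, CONVERTERS))
for features, labels in rows:
    print(features, labels)
```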

Import

from river import datasets
dataset = datasets.Music()

I/O Contract

Inputs

Name Type Required Description
(none) No parameters needed

Outputs

Name        Type               Description
__iter__()  tuple(dict, dict)  Yields (features_dict, labels_dict) pairs; labels_dict holds the 6 boolean mood labels
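
To make the output contract concrete, the sketch below builds one hand-written `(x, y)` pair in the documented shape and lists the moods whose label is `True`. The feature names and values are hypothetical placeholders (a real sample carries 72 features), and `active_moods` is an illustrative helper, not a River function.

```python
from typing import Dict, List, Tuple

Sample = Tuple[Dict[str, float], Dict[str, bool]]

# Hypothetical sample in the documented (features_dict, labels_dict) shape.
sample: Sample = (
    {"feature_a": 0.13, "feature_b": 1.7},  # placeholder feature names
    {
        "amazed-suprised": False,
        "happy-pleased": True,
        "relaxing-clam": True,
        "quiet-still": False,
        "sad-lonely": False,
        "angry-aggresive": False,
    },
)


def active_moods(y: Dict[str, bool]) -> List[str]:
    """Return the names of the labels set to True, in dictionary order."""
    return [name for name, flag in y.items() if flag]


x, y = sample
moods = active_moods(y)
print(moods)
```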

Dataset Properties

Property            Value
Number of samples   593
Number of features  72
Number of outputs   6
Task                Multi-output binary classification (multi-label)
Format              CSV
Size                378,980 bytes

Features

The dataset includes 72 audio features:

  • Mean and standard deviation of Mel-Frequency Cepstral Coefficients (MFCCs) 0-12
  • Spectral features: Centroid, Rolloff, Flux
  • Beat histogram features: BH_LowPeakAmp, BH_LowPeakBPM, BH_HighPeakAmp, BH_HighPeakBPM, BH_HighLowRatio
  • Summary features: BHSUM1, BHSUM2, BHSUM3

Target Labels

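Six mood categories (each is a binary label). The label names are dictionary keys and must be used exactly as spelled here, including the irregular spellings "suprised", "clam", and "aggresive":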

  • amazed-suprised: Excited, surprised emotional state
  • happy-pleased: Positive, joyful emotional state
  • relaxing-clam: Calm, peaceful emotional state
  • quiet-still: Tranquil, silent emotional state
  • sad-lonely: Melancholic emotional state
  • angry-aggresive: Intense, aggressive emotional state
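
Because the six labels are separate binary flags, a stream of label dictionaries can be summarized by its label cardinality (the mean number of active labels per song) and its label density (cardinality divided by the number of labels). The sketch below computes both incrementally. The three label dictionaries are hypothetical, and `LabelStats` is an illustrative helper rather than a River class.

```python
LABELS = [
    "amazed-suprised", "happy-pleased", "relaxing-clam",
    "quiet-still", "sad-lonely", "angry-aggresive",
]


class LabelStats:
    """Running label cardinality and density over a stream of label dicts."""

    def __init__(self):
        self.n_samples = 0  # label dicts seen so far
        self.n_active = 0   # total number of True flags seen so far
        self.n_labels = 0   # labels per sample (6 for this dataset)

    def update(self, y):
        self.n_samples += 1
        self.n_active += sum(bool(flag) for flag in y.values())
        self.n_labels = len(y)

    @property
    def cardinality(self):
        return self.n_active / self.n_samples if self.n_samples else 0.0

    @property
    def density(self):
        return self.cardinality / self.n_labels if self.n_labels else 0.0


def with_active(*active):
    """Build a full label dict with the given moods set to True."""
    return {name: name in active for name in LABELS}


# Hypothetical label dicts with 1, 2, and 3 active moods.
label_stream = [
    with_active("sad-lonely"),
    with_active("happy-pleased", "amazed-suprised"),
    with_active("relaxing-clam", "quiet-still", "sad-lonely"),
]

stats = LabelStats()
for y in label_stream:
    stats.update(y)

print(f"cardinality={stats.cardinality:.2f}, density={stats.density:.2f}")
```

Running the same update loop over the labels yielded by `datasets.Music()` would give the corresponding statistics for the real data.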

Usage Examples

from river import datasets

dataset = datasets.Music()
for x, y in dataset:
    print(f"Features: {list(x.keys())[:5]}...")  # Show first 5 feature names
    print(f"Labels: {y}")
    break

References

  • Read, J., Reutemann, P., Pfahringer, B. and Holmes, G., 2016. MEKA: a multi-label/multi-target extension to WEKA. The Journal of Machine Learning Research, 17(1), pp.667-671.
