Implementation:Online ml River Datasets SolarFlare

From Leeroopedia


Knowledge Sources
Domains: Online_Learning, Datasets, Multi_Output_Regression
Last Updated: 2026-02-08 16:00 GMT

Overview

Concrete dataset for multi-output regression provided by the River library.

Description

Solar flare multi-output regression dataset. The goal is to predict the number of C-class, M-class, and X-class solar flares simultaneously, based on the characteristics of a solar region.

This dataset contains 1,066 samples with 10 features and 3 output targets for multi-output regression tasks.

Usage

This dataset is useful for:

  • Multi-output regression problems
  • Solar activity prediction
  • Astronomical data analysis
  • Evaluating algorithms that predict multiple numeric outputs at once (the three targets here are integer counts)
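To make the evaluation use case concrete, here is a minimal sketch of a test-then-train loop that scores each target separately. It uses only the Python standard library, not River's own evaluation utilities. The function name `progressive_mae`, the running-mean baseline, and the toy stream are assumptions made up for illustration. Only the (features, targets) shape mirrors what SolarFlare yields.

```python
from collections import defaultdict


def progressive_mae(stream):
    """Predict each target with its running mean, score it, then update."""
    sums = defaultdict(float)     # running sum of each target seen so far
    abs_err = defaultdict(float)  # accumulated absolute error per target
    n = 0                         # number of samples already learned from
    for _x, y in stream:
        for name, true_value in y.items():
            pred = sums[name] / n if n else 0.0  # predict before learning
            abs_err[name] += abs(true_value - pred)
            sums[name] += true_value             # learn from the sample
        n += 1
    return {name: err / n for name, err in abs_err.items()} if n else {}


# Made-up (features, targets) pairs, not real rows from the dataset.
toy_stream = [
    ({"activity": 1}, {"c-class-flares": 0, "m-class-flares": 0, "x-class-flares": 0}),
    ({"activity": 2}, {"c-class-flares": 2, "m-class-flares": 1, "x-class-flares": 0}),
    ({"activity": 1}, {"c-class-flares": 1, "m-class-flares": 0, "x-class-flares": 0}),
]
per_target_mae = progressive_mae(toy_stream)
```

Reporting one error per target, rather than a single pooled number, shows whether a model does well on frequent flare classes while missing rare ones.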

Code Reference

Source Location

Signature

from river import stream
from river.datasets import base


class SolarFlare(base.FileDataset):
    def __init__(self):
        super().__init__(
            n_samples=1_066,
            n_features=10,
            n_outputs=3,
            task=base.MO_REG,
            filename="solar-flare.csv.zip",
        )

    def __iter__(self):
        return stream.iter_csv(
            self.path,
            target=["c-class-flares", "m-class-flares", "x-class-flares"],
            converters={
                "zurich-class": str,
                "largest-spot-size": str,
                "spot-distribution": str,
                "activity": int,
                "evolution": int,
                "previous-24h-flare-activity": int,
                "hist-complex": int,
                "hist-complex-this-pass": int,
                "area": int,
                "largest-spot-area": int,
                "c-class-flares": int,
                "m-class-flares": int,
                "x-class-flares": int,
            },
        )
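The effect of the `target` and `converters` arguments can be approximated with the standard library. The sketch below is not River's implementation of `stream.iter_csv`. The helper `iter_rows` is hypothetical, the CSV text is made up, and only a subset of the columns is shown. It illustrates how each row is type-converted and then split into a features dict and a targets dict.

```python
import csv
import io

TARGETS = ["c-class-flares", "m-class-flares", "x-class-flares"]
# Subset of the converters from the signature above, for brevity.
CONVERTERS = {
    "zurich-class": str,
    "activity": int,
    "area": int,
    "c-class-flares": int,
    "m-class-flares": int,
    "x-class-flares": int,
}


def iter_rows(buffer, targets, converters):
    """Yield (features, targets) pairs from a CSV buffer (hypothetical helper)."""
    for row in csv.DictReader(buffer):
        # Apply the per-column converter; unlisted columns stay strings.
        row = {key: converters.get(key, str)(value) for key, value in row.items()}
        # Move the target columns out of the feature dict.
        y = {name: row.pop(name) for name in targets}
        yield row, y


# Made-up rows for illustration, not taken from solar-flare.csv.
sample = io.StringIO(
    "zurich-class,activity,area,c-class-flares,m-class-flares,x-class-flares\n"
    "C,1,1,0,0,0\n"
    "D,2,1,1,0,0\n"
)
rows = list(iter_rows(sample, TARGETS, CONVERTERS))
```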

Import

from river import datasets
dataset = datasets.SolarFlare()

I/O Contract

Inputs

Name | Type | Required | Description
(none) | | | No parameters needed

Outputs

Name | Type | Description
__iter__() | tuple(dict, dict) | Yields (features_dict, targets_dict) pairs; targets_dict holds the three flare counts as integers

Dataset Properties

Property | Value
Number of samples | 1,066
Number of features | 10
Number of outputs | 3
Task | Multi-output regression
Format | CSV (compressed)

Features

The dataset includes 10 features describing solar regions:

  • zurich-class: Modified Zurich class (string)
  • largest-spot-size: Largest spot size (string)
  • spot-distribution: Spot distribution (string)
  • activity: Activity level (integer)
  • evolution: Evolution over time (integer)
  • previous-24h-flare-activity: Flare activity in past 24 hours (integer)
  • hist-complex: Historical complexity (integer)
  • hist-complex-this-pass: Historical complexity for this pass (integer)
  • area: Area of solar region (integer)
  • largest-spot-area: Area of largest spot (integer)
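Because three of the features are strings and seven are integers, a model that expects numeric inputs needs the string features encoded first. The following is a minimal sketch of one-hot encoding a single feature dict in plain Python. The function `encode_features` is a hypothetical helper, not part of River, and the example values are made up for illustration.

```python
def encode_features(x):
    """Turn a mixed feature dict into an all-numeric one (hypothetical helper)."""
    encoded = {}
    for name, value in x.items():
        if isinstance(value, str):
            # One indicator key per (feature, category) pair that is present.
            encoded[f"{name}_{value}"] = 1
        else:
            encoded[name] = value  # integers pass through unchanged
    return encoded


# Made-up feature values, not a real row from the dataset.
example = {
    "zurich-class": "D",
    "largest-spot-size": "A",
    "spot-distribution": "O",
    "activity": 1,
    "evolution": 2,
    "area": 1,
}
encoded = encode_features(example)
```

Keeping features as a dict means categories that were never seen before simply add a new key, which suits a streaming setting where the full set of categories is not known up front.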

Target Outputs

Three simultaneous regression targets:

  • c-class-flares: Number of C-class flares (integer)
  • m-class-flares: Number of M-class flares (integer)
  • x-class-flares: Number of X-class flares (integer)

Usage Examples

from river import datasets

dataset = datasets.SolarFlare()
for x, y in dataset:
    print(f"Features: {x}")
    print(f"Targets: {y}")
    break
