Implementation:Online ml River Stream Iter Libsvm

From Leeroopedia


Knowledge Sources
Domains: Online_Learning, Data_Streaming, File_Formats, Sparse_Data
Last Updated: 2026-02-08 16:00 GMT

Overview

Iterates over datasets stored in LIBSVM format, a popular sparse data representation used in machine learning.

Description

The iter_libsvm function reads files in LIBSVM format, where each line represents one sample: a target value followed by space-separated feature:value pairs for the non-zero features. The format is widely used for storing large sparse datasets, especially in text classification and recommender systems. Only numerical feature values are supported; feature names are kept as strings, so an index such as 3 becomes the key '3'.
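To make the line layout concrete, here is a minimal pure-Python sketch of how one LIBSVM line could be split into a sparse feature dict and a target. The helper name parse_libsvm_line is hypothetical and this is not River's actual implementation; it only illustrates the format described above.

```python
def parse_libsvm_line(line, target_type=float):
    """Split 'target name:value name:value ...' into (features, target).

    Hypothetical helper for illustration only; not part of River.
    """
    target, *pairs = line.split()
    features = {}
    for pair in pairs:
        name, value = pair.split(":")
        features[name] = float(value)  # values are numeric, names stay strings
    return features, target_type(target)


x, y = parse_libsvm_line("+1 3:0.5 7:1.25", target_type=int)
print(x, y)
```

Features that do not appear on the line are simply absent from the dict, which is what keeps the representation sparse.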

Usage

Use this function for datasets in LIBSVM/SVMlight format, particularly large sparse datasets from text mining or recommender systems, or datasets downloaded from the LIBSVM repository. Because only non-zero features are stored, the format saves memory and disk space when most feature values are zero.
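As a rough illustration of the saving, the sketch below compares a dense row with its sparse equivalent and the LIBSVM line it would produce. The 10,000-feature width and the helper name to_sparse are assumptions chosen for the example, not values from any particular dataset.

```python
def to_sparse(dense_row):
    """Keep only non-zero entries, keyed by 1-based index as a string."""
    return {str(i): v for i, v in enumerate(dense_row, start=1) if v != 0}


# A mostly-zero row: 10,000 features, only three of them set (assumed sizes).
dense_row = [0.0] * 10_000
dense_row[0], dense_row[41], dense_row[9_999] = 0.5, 1.2, 0.8

sparse_row = to_sparse(dense_row)
libsvm_line = "+1 " + " ".join(f"{k}:{v}" for k, v in sparse_row.items())

print(len(dense_row), len(sparse_row))
print(libsvm_line)
```

The dense row stores every position, while the sparse form and the LIBSVM line store only the three non-zero entries.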

Code Reference

Source Location

Signature

def iter_libsvm(
    filepath_or_buffer: str,
    target_type=float,
    compression="infer"
) -> base.typing.Stream:
    ...

Import

from river import stream

I/O Contract

Parameter | Type | Description
filepath_or_buffer | str or buffer | Path to a file, or a buffer with a read method
target_type | type | Type the target value is cast to (default: float)
compression | str | Decompression method: 'infer' (default), 'gz', or 'zip'
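Since filepath_or_buffer is documented as accepting any buffer with a read method, one option for compressed data is to open the file yourself in text mode and pass the handle in. The sketch below only prepares such a handle with the standard-library gzip module. The file name is hypothetical, and the call to iter_libsvm is left as a comment because it assumes River is installed.

```python
import gzip
import os
import tempfile

# Hypothetical file name, written to a temporary directory for the example.
path = os.path.join(tempfile.mkdtemp(), "sparse_data.libsvm.gz")

with gzip.open(path, "wt") as f:
    f.write("+1 1:0.5 3:1.2\n")
    f.write("-1 2:0.3 4:0.7\n")

# Text mode ("rt") yields str lines, which is what a line-based parser expects.
with gzip.open(path, "rt") as buffer:
    lines = [line.rstrip("\n") for line in buffer]
    # With River installed, the open handle could instead be passed along:
    # for x, y in stream.iter_libsvm(buffer, target_type=int): ...

print(lines)
os.remove(path)
```

Passing the path directly with compression left at 'infer' should be the simpler route when the file extension is one the function recognizes.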

Returns:

Type | Description
Iterator[(dict, Any)] | Stream of (sparse features dict, target) tuples

Usage Examples

import io
from river import stream

# Create LIBSVM format data
# Format: target feature:value feature:value ...
libsvm_data = io.StringIO('''+1 x:-134.26 y:0.2563
1 x:-12 z:0.3
-1 y:.25
''')

# Iterate with integer targets
for x, y in stream.iter_libsvm(libsvm_data, target_type=int):
    print(f"Target: {y}, Features: {x}")
# Output:
# Target: 1, Features: {'x': -134.26, 'y': 0.2563}
# Target: 1, Features: {'x': -12.0, 'z': 0.3}
# Target: -1, Features: {'y': 0.25}

# Example with file
with open('sparse_data.libsvm', 'w') as f:
    f.write("+1 1:0.5 3:1.2 5:0.8\n")
    f.write("-1 2:0.3 4:0.7\n")
    f.write("+1 1:0.9 2:0.1 3:0.4\n")

# Read and process
for features, label in stream.iter_libsvm('sparse_data.libsvm', target_type=int):
    print(f"Label: {label:+d}")
    print(f"Active features: {list(features.keys())}")
    print(f"Values: {features}")
    print()

# Cleanup
import os
os.remove('sparse_data.libsvm')

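# Note: trailing '#' comments on a data line are stripped before parsing.
# Whole-line comments and blank lines may not be skipped in every River
# version, so check your installed version before relying on them.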
