Implementation:Online ml River Compose TransformerProduct

Knowledge Sources	Online_ml_River
Domains	Online_Learning, Feature_Engineering, Interaction_Terms, Data_Transformation
Last Updated	2026-02-08 16:00 GMT

Overview

TransformerProduct computes interaction terms between the outputs of multiple transformers by creating cross-products of their features.

Description

TransformerProduct extends TransformerUnion to create multiplicative interaction features between different groups of transformed features. Instead of just merging transformer outputs like TransformerUnion, it computes the Cartesian product of features from each transformer and multiplies their values.

For each combination of features (one from each transformer), the product creates a new feature with a name formed by joining the original feature names with asterisks (e.g., "a*x"). The value is the product of the corresponding feature values. This is useful for capturing non-linear interactions between feature groups.
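The naming and multiplication rule above can be sketched in plain Python. This is a minimal illustration, not River's source: `product_of_outputs` is a hypothetical helper, and the input dicts stand in for what each transformer's `transform_one` would return.

```python
import functools
import itertools
import operator


def product_of_outputs(*outputs):
    """Multiply one feature from each output dict, for every combination.

    Hypothetical helper for illustration only; not part of River.
    Each element of `outputs` is a dict mapping feature name to value.
    """
    return {
        # Join the chosen feature names with "*" to name the new feature.
        "*".join(names): functools.reduce(
            operator.mul, (out[name] for out, name in zip(outputs, names))
        )
        # Iterating a dict yields its keys, so this is the Cartesian
        # product of the feature names of every group.
        for names in itertools.product(*outputs)
    }


group_1 = {"a": 0, "b": 1}
group_2 = {"x": 2, "y": 3}

interactions = product_of_outputs(group_1, group_2)
print(interactions)
# {'a*x': 0, 'a*y': 0, 'b*x': 2, 'b*y': 3}
```

With n features in one group and m in the other, the result has n × m entries; adding a third group multiplies the count again.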

For mini-batch processing, transform_many includes fast paths for sparse columns: one for Sparse[uint8] data (typical of binary features) and one for general sparse arrays. Mixed sparse and dense inputs are handled through pandas sparse dtypes.
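As a rough picture of what the mini-batch path computes, the sketch below multiplies columns pairwise with pandas. It is an assumption-laden illustration: `product_of_frames` is a hypothetical helper, it works on dense columns only, and it does not reproduce the sparse fast paths of the real implementation.

```python
import itertools

import pandas as pd


def product_of_frames(*frames):
    """Multiply one column from each frame, for every combination of columns.

    Hypothetical helper, not River's implementation: dense columns only,
    no sparse-specific optimizations.
    """
    columns = {}
    for names in itertools.product(*(frame.columns for frame in frames)):
        # Start from the first frame's column, then multiply in the rest.
        values = frames[0][names[0]]
        for frame, name in zip(frames[1:], names[1:]):
            values = values * frame[name]
        columns["*".join(names)] = values
    return pd.DataFrame(columns, index=frames[0].index)


X = pd.DataFrame({"a": [1, 2], "b": [2, 3], "x": [3, 4], "y": [4, 5]})
out = product_of_frames(X[["a", "b"]], X[["x", "y"]])
print(out)
```

Each output column is an element-wise product of whole columns, so the work is done per column pair rather than per row.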

Usage

Use TransformerProduct to model interactions between distinct groups of features, such as user features × item features in a recommender system. Because it generates only cross-group combinations, it is more targeted than feature_extraction.PolyExtender, which builds polynomial combinations of all features.
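To make "more targeted" concrete, the following sketch counts feature names only. It compares cross-group pairs with an exhaustive degree-2 expansion over the same four features. The feature names are illustrative, and the exhaustive list is a generic enumeration of degree-2 terms, not the output of any particular River class.

```python
import itertools

user = ["user_age", "user_gender"]
item = ["item_price", "item_category"]

# Targeted: exactly one feature from each group.
cross_group = ["*".join(pair) for pair in itertools.product(user, item)]

# Exhaustive: every degree-2 term over all four features, squares included.
all_degree_2 = [
    "*".join(pair)
    for pair in itertools.combinations_with_replacement(user + item, 2)
]

print(len(cross_group))   # 4
print(len(all_degree_2))  # 10
```

The exhaustive expansion also contains within-group terms such as user_age*user_gender and squares such as item_price*item_price, which the cross-group product leaves out.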

Code Reference

Source Location

Repository: Online_ml_River
File: river/compose/product.py

Signature

class TransformerProduct(union.TransformerUnion):
    def __init__(self, *transformers):
        ...

Import

from river import compose

I/O Contract

Input
Parameter	Type	Description
transformers	tuple	Variable number of transformers to create products from
x	dict	Feature dictionary for single-sample methods
X	DataFrame	Feature dataframe for mini-batch methods

Output
Method	Return Type	Description
transform_one(x)	dict	Dictionary of interaction features (products)
transform_many(X)	DataFrame	DataFrame of interaction features (products)

Key Methods
Method	Parameters	Description
transform_one(x)	x: dict	Computes all feature products
transform_many(X)	X: DataFrame	Computes all feature products (optimized for sparse)
__mul__(other)	other: transformer	Adds transformer to product (* operator)
__add__(other)	other: transformer	Creates union with other (+ operator)

Usage Examples

from pprint import pprint
from river.compose import Select, TransformerProduct

# Define feature groups
x = dict(
    a=0, b=1,  # group 1
    x=2, y=3   # group 2
)

# Create interaction terms between groups
product = TransformerProduct(
    Select('a', 'b'),
    Select('x', 'y')
)

pprint(product.transform_one(x))
# {'a*x': 0, 'a*y': 0, 'b*x': 2, 'b*y': 3}

# Shorthand using * operator
product = Select('a', 'b') * Select('x', 'y')
pprint(product.transform_one(x))
# {'a*x': 0, 'a*y': 0, 'b*x': 2, 'b*y': 3}

# Include original features using + operator
group_1 = Select('a', 'b')
group_2 = Select('x', 'y')
product = group_1 + group_2 + group_1 * group_2

pprint(product.transform_one(x))
# {'a': 0, 'a*x': 0, 'a*y': 0, 'b': 1, 'b*x': 2, 'b*y': 3, 'x': 2, 'y': 3}

# Real-world example: User-Item interactions
user_features = Select('user_age', 'user_gender')
item_features = Select('item_price', 'item_category')

# Model interactions between user and item features
model_features = (
    user_features +                    # User features alone
    item_features +                    # Item features alone
    user_features * item_features      # User × Item interactions
)

sample = {
    'user_age': 25,
    'user_gender': 1,
    'item_price': 50,
    'item_category': 3
}

pprint(model_features.transform_one(sample))
# pprint sorts keys alphabetically:
# {'item_category': 3,
#  'item_price': 50,
#  'user_age': 25,
#  'user_age*item_category': 75,
#  'user_age*item_price': 1250,
#  'user_gender': 1,
#  'user_gender*item_category': 3,
#  'user_gender*item_price': 50}

# Use in complete pipeline
from river import linear_model
from river import preprocessing

pipeline = (
    (user_features + item_features + user_features * item_features) |
    preprocessing.StandardScaler() |
    linear_model.LinearRegression()
)

# Three-way products
group_3 = Select('z')
three_way = group_1 * group_2 * group_3

x_extended = {**x, 'z': 5}
pprint(three_way.transform_one(x_extended))
# {'a*x*z': 0, 'a*y*z': 0, 'b*x*z': 10, 'b*y*z': 15}

# Mini-batch processing
import pandas as pd

X = pd.DataFrame([
    {'a': 1, 'b': 2, 'x': 3, 'y': 4},
    {'a': 2, 'b': 3, 'x': 4, 'y': 5},
])

product = Select('a', 'b') * Select('x', 'y')
print(product.transform_many(X))
#    a*x  a*y  b*x  b*y
# 0    3    4    6    8
# 1    8   10   12   15
