Implementation:Online ml River Compose TransformerProduct
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Feature_Engineering, Interaction_Terms, Data_Transformation |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
TransformerProduct computes interaction terms between the outputs of multiple transformers by creating cross-products of their features.
Description
TransformerProduct extends TransformerUnion to create multiplicative interaction features between different groups of transformed features. Where TransformerUnion merges the outputs of its transformers into a single feature set, TransformerProduct takes the Cartesian product of the features produced by each transformer and multiplies the values within each combination.
For each combination of features (one from each transformer), the product creates a new feature with a name formed by joining the original feature names with asterisks (e.g., "a*x"). The value is the product of the corresponding feature values. This is useful for capturing non-linear interactions between feature groups.
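The naming and multiplication rule described above can be sketched in plain Python. This is a minimal illustration rather than River's source: `product_features` is a hypothetical helper, and the two input dicts stand in for the outputs of two transformers.

```python
import itertools
import math


def product_features(*outputs):
    """Hypothetical helper: cross the feature dicts produced by each transformer."""
    return {
        # Join one feature name per group with "*" and multiply the matching values.
        "*".join(names): math.prod(out[name] for out, name in zip(outputs, names))
        for names in itertools.product(*outputs)
    }


group_1 = {"a": 0, "b": 1}  # stand-in for the first transformer's output
group_2 = {"x": 2, "y": 3}  # stand-in for the second transformer's output

interactions = product_features(group_1, group_2)
print(interactions)
# {'a*x': 0, 'a*y': 0, 'b*x': 2, 'b*y': 3}
```

Because the helper accepts any number of dicts, the same rule extends to three or more groups, with one name per group in each joined feature name.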
The implementation includes optimizations for mini-batch processing, with special fast-paths for sparse data types including sparse[uint8] (for binary features) and general sparse arrays. It handles various combinations of sparse and dense data efficiently using pandas sparse dtypes.
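One way to see why binary sparse columns lend themselves to a fast path: the product of two 0/1 columns is 1 only on rows where both are 1, so it can be derived from the non-zero positions alone. The sketch below shows that idea in plain Python using sets of row indices. It is a conceptual illustration under that assumption, not the pandas-based code in river/compose/product.py, and `binary_product` and `to_dense` are hypothetical names.

```python
def binary_product(rows_a, rows_b):
    """Hypothetical helper: rows where the product of two 0/1 columns equals 1.

    Each argument is the set of row indices at which a binary column is non-zero.
    """
    return rows_a & rows_b


def to_dense(rows, n_rows):
    """Expand a set of non-zero row indices into a dense 0/1 list."""
    return [1 if i in rows else 0 for i in range(n_rows)]


n_rows = 6
a = {0, 2, 5}  # column a is 1 on rows 0, 2 and 5
b = {2, 3, 5}  # column b is 1 on rows 2, 3 and 5

ab = binary_product(a, b)

# Cross-check against the element-wise product of the dense columns.
dense_check = [x * y for x, y in zip(to_dense(a, n_rows), to_dense(b, n_rows))]
print(to_dense(ab, n_rows) == dense_check)
```

The work scales with the number of non-zero entries rather than the number of rows, which is the usual motivation for treating sparse inputs separately.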
Usage
Use TransformerProduct when you need to model interactions between different groups of features, such as user features × item features in recommendation systems, or when you want specific feature groups to interact without creating all possible polynomial combinations. It is more targeted than PolynomialExtender, which builds polynomial terms over every input feature rather than crossing only the groups you choose.
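To make the difference in feature count concrete, the sketch below compares cross-group interactions with a full degree-2 expansion for two groups of two features each. It assumes an expansion that keeps every unordered pair over all features, squares included; that is an assumption for illustration and may not match PolynomialExtender's defaults.

```python
import itertools

group_1 = ["user_age", "user_gender"]
group_2 = ["item_price", "item_category"]

# Cross-group interactions only: one feature per (group_1, group_2) pair.
targeted = ["*".join(pair) for pair in itertools.product(group_1, group_2)]

# Assumed degree-2 expansion: every unordered pair over all features, squares included.
all_pairs = [
    "*".join(pair)
    for pair in itertools.combinations_with_replacement(group_1 + group_2, 2)
]

print(len(targeted), len(all_pairs))
```

The gap widens as the groups grow, since the cross-group count is the product of the group sizes while the full expansion grows quadratically in the total number of features.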
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/compose/product.py
Signature
class TransformerProduct(union.TransformerUnion):
    def __init__(self, *transformers):
        ...
Import
from river import compose
I/O Contract
Inputs

| Parameter | Type | Description |
|---|---|---|
| transformers | tuple | Variable number of transformers to create products from |
| x | dict | Feature dictionary for single-sample methods |
| X | DataFrame | Feature dataframe for mini-batch methods |

Outputs

| Method | Return Type | Description |
|---|---|---|
| transform_one(x) | dict | Dictionary of interaction features (products) |
| transform_many(X) | DataFrame | DataFrame of interaction features (products) |

Methods

| Method | Parameters | Description |
|---|---|---|
| transform_one(x) | x: dict | Computes all feature products |
| transform_many(X) | X: DataFrame | Computes all feature products (optimized for sparse) |
| __mul__(other) | other: transformer | Adds transformer to product (* operator) |
| __add__(other) | other: transformer | Creates union with other (+ operator) |
Usage Examples
from pprint import pprint
from river.compose import Select, TransformerProduct
# Define feature groups
x = dict(
    a=0, b=1,  # group 1
    x=2, y=3   # group 2
)
# Create interaction terms between groups
product = TransformerProduct(
    Select('a', 'b'),
    Select('x', 'y')
)
pprint(product.transform_one(x))
# {'a*x': 0, 'a*y': 0, 'b*x': 2, 'b*y': 3}
# Shorthand using * operator
product = Select('a', 'b') * Select('x', 'y')
pprint(product.transform_one(x))
# {'a*x': 0, 'a*y': 0, 'b*x': 2, 'b*y': 3}
# Include original features using + operator
group_1 = Select('a', 'b')
group_2 = Select('x', 'y')
product = group_1 + group_2 + group_1 * group_2
pprint(product.transform_one(x))
# {'a': 0, 'a*x': 0, 'a*y': 0, 'b': 1, 'b*x': 2, 'b*y': 3, 'x': 2, 'y': 3}
# Real-world example: User-Item interactions
user_features = Select('user_age', 'user_gender')
item_features = Select('item_price', 'item_category')
# Model interactions between user and item features
model_features = (
    user_features +                 # User features alone
    item_features +                 # Item features alone
    user_features * item_features   # User × Item interactions
)
sample = {
    'user_age': 25,
    'user_gender': 1,
    'item_price': 50,
    'item_category': 3
}
pprint(model_features.transform_one(sample))
# pprint sorts keys alphabetically:
# {'item_category': 3,
#  'item_price': 50,
#  'user_age': 25,
#  'user_age*item_category': 75,
#  'user_age*item_price': 1250,
#  'user_gender': 1,
#  'user_gender*item_category': 3,
#  'user_gender*item_price': 50}
# Use in complete pipeline
from river import linear_model
from river import preprocessing
pipeline = (
    (user_features + item_features + user_features * item_features) |
    preprocessing.StandardScaler() |
    linear_model.LinearRegression()
)
# Three-way products
group_3 = Select('z')
three_way = group_1 * group_2 * group_3
x_extended = {**x, 'z': 5}
pprint(three_way.transform_one(x_extended))
# {'a*x*z': 0, 'a*y*z': 0, 'b*x*z': 10, 'b*y*z': 15}
# Mini-batch processing
import pandas as pd
X = pd.DataFrame([
    {'a': 1, 'b': 2, 'x': 3, 'y': 4},
    {'a': 2, 'b': 3, 'x': 4, 'y': 5},
])
product = Select('a', 'b') * Select('x', 'y')
print(product.transform_many(X))
#    a*x  a*y  b*x  b*y
# 0    3    4    6    8
# 1    8   10   12   15