Implementation:Online ml River Stats SEM
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Statistics |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
SEM computes the running standard error of the mean using Welford's algorithm.
Description
This statistic calculates the standard error of the mean (SEM), which measures how precisely the sample mean estimates the population mean. It is computed as the sample standard deviation divided by the square root of the sample size n. The implementation extends the Var class and uses Welford's algorithm for numerical stability. The ddof parameter sets the delta degrees of freedom: the variance is divided by n - ddof, so the default ddof=1 yields the unbiased sample variance.
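The computation described above can be sketched in plain Python. This is a minimal illustration rather than River's source code: the class name `RunningSEM` and its attributes `n`, `mean`, and `m2` are hypothetical, and returning 0.0 when n - ddof is not positive is an assumption chosen to agree with the example output on this page.

```python
import math


class RunningSEM:
    """Hypothetical stand-in for illustration; not River's implementation."""

    def __init__(self, ddof=1):
        self.ddof = ddof
        self.n = 0       # number of observations seen so far
        self.mean = 0.0  # running mean
        self.m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        # Welford's update: adjust the mean first, then accumulate the
        # squared deviation using both the old and the new mean.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def get(self):
        if self.n == 0:
            return None  # no data yet
        if self.n - self.ddof <= 0:
            return 0.0   # assumption: variance is reported as 0 here
        variance = self.m2 / (self.n - self.ddof)
        # SEM = sqrt(variance / n) = standard deviation / sqrt(n)
        return math.sqrt(variance / self.n)


sem = RunningSEM()
results = []
for x in [3, 5, 4, 7, 10, 12]:
    sem.update(x)
    results.append(sem.get())
```

Each update is O(1) in time and memory, which is why this formulation suits streaming data: no past observations need to be stored.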
Usage
Use SEM when you need to quantify the uncertainty in the estimated mean of streaming data. Common applications include confidence interval construction, hypothesis testing, quality control, A/B testing analysis, and any scenario where you need to assess how precisely you have estimated the true mean.
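For the confidence-interval use case, the sketch below shows how a mean and an SEM combine into an interval. It uses only the Python standard library on a small batch so that it runs anywhere; in a streaming setting the two inputs would come from running statistics instead. The constant `T_CRIT_DF9` is a hand-entered value from a standard Student t table, not something computed here.

```python
import math
import statistics

data = [23, 25, 27, 22, 26, 24, 28, 25, 23, 26]
n = len(data)

mean_val = statistics.fmean(data)
# Sample standard deviation (ddof=1) divided by sqrt(n)
sem_val = statistics.stdev(data) / math.sqrt(n)

# Normal-approximation multiplier for a two-sided 95% interval (about 1.96)
z = statistics.NormalDist().inv_cdf(0.975)

# Two-sided 95% Student t critical value for n - 1 = 9 degrees of freedom,
# copied from a standard table (assumption: not computed by this snippet)
T_CRIT_DF9 = 2.262

z_interval = (mean_val - z * sem_val, mean_val + z * sem_val)
t_interval = (mean_val - T_CRIT_DF9 * sem_val, mean_val + T_CRIT_DF9 * sem_val)

print(f"z-based 95% CI: [{z_interval[0]:.2f}, {z_interval[1]:.2f}]")
print(f"t-based 95% CI: [{t_interval[0]:.2f}, {t_interval[1]:.2f}]")
```

With small samples the t-based interval is wider than the normal approximation, so a multiplier of about 2 understates the uncertainty slightly when n is small.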
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/stats/sem.py
Signature
class SEM(var.Var):
    # Inherits from Var class
    # Constructor uses parent Var.__init__(ddof)
    # Overrides get() method to compute SEM
    pass
Import
from river import stats
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| x | numbers.Number | Yes | Value to update the statistic with |
| ddof | int | No (init) | Delta Degrees of Freedom (default: 1) |
Outputs
| Name | Type | Description |
|---|---|---|
| get() | float or None | Current standard error of the mean (None if n=0) |
Usage Examples
from river import stats
# Basic standard error of mean
X = [3, 5, 4, 7, 10, 12]
sem = stats.SEM()
for x in X:
    sem.update(x)
    print(f"Value: {x}, SEM: {sem.get():.6f}")
# Output:
# Value: 3, SEM: 0.000000
# Value: 5, SEM: 1.000000
# Value: 4, SEM: 0.577350
# Value: 7, SEM: 0.853913
# Value: 10, SEM: 1.240967
# Value: 12, SEM: 1.447220
# Rolling SEM
from river import utils
X = [1, 4, 2, -4, -8, 0]
rolling_sem = utils.Rolling(stats.SEM(ddof=1), window_size=3)
for x in X:
    rolling_sem.update(x)
    print(f"Value: {x}, Rolling SEM: {rolling_sem.get():.6f}")
# Output:
# Value: 1, Rolling SEM: 0.000000
# Value: 4, Rolling SEM: 1.500000
# Value: 2, Rolling SEM: 0.881917
# Value: -4, Rolling SEM: 2.403700
# Value: -8, Rolling SEM: 2.905933
# Value: 0, Rolling SEM: 2.309401
# Use case: confidence interval estimation
mean = stats.Mean()
sem_stat = stats.SEM()
data = [23, 25, 27, 22, 26, 24, 28, 25, 23, 26]
for x in data:
    mean.update(x)
    sem_stat.update(x)
# Approximate 95% confidence interval (mean ± 2*SEM, a normal approximation)
mean_val = mean.get()
sem_val = sem_stat.get()
ci_lower = mean_val - 2 * sem_val
ci_upper = mean_val + 2 * sem_val
print(f"Mean: {mean_val:.2f}")
print(f"SEM: {sem_val:.2f}")
print(f"95% CI: [{ci_lower:.2f}, {ci_upper:.2f}]")
# Comparing two groups
group_a_sem = stats.SEM()
group_b_sem = stats.SEM()
for x in [10, 12, 11, 13, 9]:
    group_a_sem.update(x)
for x in [15, 17, 16, 18, 14]:
    group_b_sem.update(x)
print(f"Group A SEM: {group_a_sem.get():.3f}")
print(f"Group B SEM: {group_b_sem.get():.3f}")
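Two per-group SEMs are typically combined into a standard error of the difference between the means, which is the building block of an A/B comparison. The following is a minimal sketch under stated assumptions: it recomputes each group's mean and SEM with the standard library so that it is self-contained, the helper `mean_and_sem` is hypothetical, and the statistic shown is a Welch-style t statistic for independent groups, not something the SEM class provides.

```python
import math
import statistics


def mean_and_sem(values):
    """Hypothetical helper: batch mean and standard error of the mean (ddof=1)."""
    n = len(values)
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(n)


mean_a, sem_a = mean_and_sem([10, 12, 11, 13, 9])
mean_b, sem_b = mean_and_sem([15, 17, 16, 18, 14])

# For independent groups, the variances of the two means add.
se_diff = math.sqrt(sem_a**2 + sem_b**2)

# Difference in means measured in units of its standard error
t_stat = (mean_b - mean_a) / se_diff

print(f"SE of difference: {se_diff:.3f}")
print(f"t statistic: {t_stat:.3f}")
```

A large absolute value of the statistic suggests the group means differ by more than sampling noise would explain. Turning it into a p-value requires the t distribution with the appropriate degrees of freedom, which is outside the scope of this sketch.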