
Implementation:Online ml River Metrics CohenKappa

From Leeroopedia


Knowledge Sources
Domains Online_Learning, Evaluation_Metrics
Last Updated 2026-02-08 16:00 GMT

Overview

CohenKappa computes Cohen's Kappa coefficient, a measure of inter-annotator agreement adjusted for chance agreement.

Description

CohenKappa measures the level of agreement between two annotators (or between predictions and ground truth) on a classification problem, correcting for agreement that would occur by chance. The formula is κ = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement (equal to accuracy when one annotator is the ground truth) and p_e is the agreement expected by chance, computed from how often each annotator uses each class. Values lie between -1 and 1: 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.
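As a worked illustration of the formula above, the following is a minimal batch sketch in plain Python. The helper name cohen_kappa is hypothetical and is not part of river; returning 0.0 when p_e equals 1 (where the formula is undefined) is a choice made for this sketch only.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Batch Cohen's Kappa (hypothetical helper, not river's implementation)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    # Observed agreement: fraction of positions where both labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Expected agreement: sum over classes of the product of the two
    # marginal frequencies (a class missing from y_pred counts as 0).
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if p_e == 1.0:
        # Undefined case (both sides use one identical class); sketch returns 0.0.
        return 0.0
    return (p_o - p_e) / (1 - p_e)


y_true = ['cat', 'ant', 'cat', 'cat', 'ant', 'bird']
y_pred = ['ant', 'ant', 'cat', 'cat', 'ant', 'cat']

# p_o = 4/6, p_e = (3*3 + 2*3 + 1*0) / 36 = 15/36, so kappa = 3/7.
kappa = cohen_kappa(y_true, y_pred)
print(f"{kappa:.2%}")  # 42.86%
```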

Usage

Use Cohen's Kappa when you want to evaluate classification performance while accounting for the possibility of correct predictions occurring by random chance. It's particularly valuable when evaluating classifiers on imbalanced datasets or when comparing different annotation strategies, as it provides a more robust measure than raw accuracy by adjusting for chance agreement.
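The point about imbalanced datasets can be illustrated with a small plain-Python sketch. The data here is hypothetical: a stream of 95 negatives and 5 positives scored against a degenerate classifier that always predicts the majority class. Accuracy looks strong, while kappa shows no agreement beyond chance.

```python
# Hypothetical imbalanced data: 95 negatives, 5 positives.
y_true = ["neg"] * 95 + ["pos"] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = ["neg"] * 100

n = len(y_true)
classes = set(y_true) | set(y_pred)

# Observed agreement (this is the raw accuracy).
p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

# Chance agreement from the class frequencies of each side.
p_e = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in classes)

kappa = (p_o - p_e) / (1 - p_e)
print(f"accuracy={p_o:.2f}, kappa={kappa:.2f}")  # accuracy=0.95, kappa=0.00
```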

Code Reference

Source Location

Signature

class CohenKappa(metrics.base.MultiClassMetric):
    def __init__(self, cm=None):
        pass

Import

from river import metrics

I/O Contract

Method | Parameters          | Returns | Description
update | y_true, y_pred, [w] | None    | Updates the metric with a true label and a predicted label ([w] is optional)
get    | -                   | float   | Returns Cohen's Kappa coefficient (-1.0 to 1.0)
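To show how an update/get contract like the one above can be satisfied one observation at a time, here is a minimal sketch of a streaming kappa. The class StreamingKappa and its attributes are hypothetical and illustrate the idea only; this is not river's implementation. It assumes w acts as a sample weight with a default of 1.0, and it returns 0.0 when kappa is undefined.

```python
from collections import defaultdict


class StreamingKappa:
    """Hypothetical online Cohen's Kappa built on running weighted counts."""

    def __init__(self):
        self.total = 0.0                 # total weight seen so far
        self.agree = 0.0                 # weight where y_true == y_pred
        self.true_w = defaultdict(float)  # weight per true label
        self.pred_w = defaultdict(float)  # weight per predicted label

    def update(self, y_true, y_pred, w=1.0):
        # Constant-time update: only running totals are stored, not the stream.
        self.total += w
        if y_true == y_pred:
            self.agree += w
        self.true_w[y_true] += w
        self.pred_w[y_pred] += w

    def get(self):
        if self.total == 0:
            return 0.0  # nothing observed yet (sketch convention)
        p_o = self.agree / self.total
        p_e = sum(
            tw * self.pred_w.get(c, 0.0) for c, tw in self.true_w.items()
        ) / self.total ** 2
        if p_e == 1.0:
            return 0.0  # undefined case (sketch convention)
        return (p_o - p_e) / (1 - p_e)


metric = StreamingKappa()
stream = zip(['cat', 'ant', 'cat', 'cat', 'ant', 'bird'],
             ['ant', 'ant', 'cat', 'cat', 'ant', 'cat'])
for yt, yp in stream:
    metric.update(yt, yp)

print(f"{metric.get():.2%}")  # 42.86%
```

Keeping only per-class totals means memory grows with the number of distinct labels rather than with the length of the stream.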

Usage Examples

from river import metrics

y_true = ['cat', 'ant', 'cat', 'cat', 'ant', 'bird']
y_pred = ['ant', 'ant', 'cat', 'cat', 'ant', 'cat']

metric = metrics.CohenKappa()

for yt, yp in zip(y_true, y_pred):
    metric.update(yt, yp)

print(metric)
# CohenKappa: 42.86%

# Interpretation (commonly cited Landis and Koch rule of thumb):
# κ < 0:         Less than chance agreement (poor)
# κ = 0:         Chance agreement only
# 0 < κ ≤ 0.2:   Slight agreement
# 0.2 < κ ≤ 0.4: Fair agreement
# 0.4 < κ ≤ 0.6: Moderate agreement (our result: 0.4286)
# 0.6 < κ ≤ 0.8: Substantial agreement
# 0.8 < κ < 1:   Almost perfect agreement
# κ = 1:         Perfect agreement

# Compare with raw accuracy:
accuracy = metrics.Accuracy()
# Reuse the same label lists defined above.
for yt, yp in zip(y_true, y_pred):
    accuracy.update(yt, yp)

print(f"Accuracy: {accuracy.get():.2%}")
# Accuracy: 66.67%
# Cohen's Kappa (42.86%) is lower than accuracy because it
# adjusts for chance agreement
