
Implementation:Online ml River NaiveBayes MultinomialNB

From Leeroopedia
Revision as of 16:09, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Online_ml_River_NaiveBayes_MultinomialNB.md)


Knowledge Sources
Domains: Online_Learning, Naive_Bayes, Text_Classification
Last Updated: 2026-02-08 16:00 GMT

Overview

Multinomial Naive Bayes models feature occurrences with a per-class multinomial distribution, which makes it well suited to text classification over word counts or TF-IDF values.

Description

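This implementation treats each feature value as a count drawn from a per-class multinomial distribution. It maintains three sets of counters: the number of instances seen per class, the occurrences of each feature per class, and the total feature count per class. The probability of a feature f given a class c is computed as P(f|c) = (count_f_in_c + alpha) / (total_c + alpha * n_features), where n_features is the number of distinct features seen so far and alpha is the additive (Laplace) smoothing term that keeps unseen features from receiving zero probability. During prediction, the joint log-likelihood of class c is the log prior plus the count-weighted log probabilities: log(P(c)) + sum(count_f * log(P(f|c))), summed over the features present in the input. Because only observed features are stored, the model handles sparse inputs naturally, and it supports both single-instance and mini-batch updates.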
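The counting scheme and formula above can be sketched in plain Python. This is a minimal illustration rather than River's source: the class name `TinyMultinomialNB`, its `vocabulary` attribute, and the way counters are nested are hypothetical choices, and n_features is assumed to be the number of distinct features seen so far.

```python
import collections
import math


class TinyMultinomialNB:
    """Illustrative multinomial naive Bayes; a sketch, not River's implementation."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = collections.Counter()  # instances seen per class
        self.feature_counts = collections.defaultdict(collections.Counter)  # class -> feature -> count
        self.class_totals = collections.Counter()  # total feature count per class
        self.vocabulary = set()  # distinct features seen so far (hypothetical helper attribute)

    def learn_one(self, x, y):
        self.class_counts[y] += 1
        for f, count in x.items():
            self.feature_counts[y][f] += count
            self.class_totals[y] += count
            self.vocabulary.add(f)

    def p_feature_given_class(self, f, c):
        # (count_f_in_c + alpha) / (total_c + alpha * n_features)
        num = self.feature_counts[c][f] + self.alpha
        den = self.class_totals[c] + self.alpha * len(self.vocabulary)
        return num / den

    def joint_log_likelihood(self, x):
        n = sum(self.class_counts.values())
        return {
            c: math.log(self.class_counts[c] / n)
            + sum(count * math.log(self.p_feature_given_class(f, c)) for f, count in x.items())
            for c in self.class_counts
        }

    def predict_proba_one(self, x):
        jll = self.joint_log_likelihood(x)
        top = max(jll.values())  # subtract the max before exponentiating for stability
        unnorm = {c: math.exp(v - top) for c, v in jll.items()}
        z = sum(unnorm.values())
        return {c: v / z for c, v in unnorm.items()}


docs = [
    ("Chinese Beijing Chinese", "yes"),
    ("Chinese Chinese Shanghai", "yes"),
    ("Chinese Macao", "maybe"),
    ("Tokyo Japan Chinese", "no"),
]

model = TinyMultinomialNB(alpha=1.0)
for text, label in docs:
    model.learn_one(collections.Counter(text.split()), label)

proba = model.predict_proba_one({"test": 1})
# yes = 12/29, maybe = 9/29, no = 8/29
```

Trained on the four sentences from the usage examples on this page, the sketch assigns the unseen token "test" probabilities of 12/29, 9/29, and 8/29 to "yes", "maybe", and "no", which agrees with the single-instance output shown there.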

Usage

Use Multinomial NB for text classification when word frequencies carry signal. Because it accounts for repeated occurrences of a term, it is generally more suitable than Bernoulli NB for longer documents. It works with count vectors or TF-IDF features, provided all values are non-negative. It is a strong baseline for document classification, spam detection, and sentiment analysis, and it supports efficient mini-batch processing of sparse matrices for large-scale text data.
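To see why word frequencies matter, the sketch below scores two documents that differ only in how often one word repeats. The probabilities are made-up illustrative values, and `multinomial_score` and `binarize` are hypothetical helpers. The count-weighted multinomial term changes with repetition, whereas a presence/absence view of the same documents, which is what a Bernoulli model works with, cannot tell them apart.

```python
import math

# Made-up smoothed probabilities P(f|c) for a single class; illustrative only.
p_given_class = {"refund": 0.2, "hello": 0.05}


def multinomial_score(x, probs):
    """Count-weighted log-likelihood term: sum(count_f * log P(f|c))."""
    return sum(count * math.log(probs[f]) for f, count in x.items())


def binarize(x):
    """Presence/absence view of a count vector."""
    return {f: 1 for f, count in x.items() if count > 0}


short_doc = {"refund": 1, "hello": 1}
long_doc = {"refund": 5, "hello": 1}

score_short = multinomial_score(short_doc, p_given_class)
score_long = multinomial_score(long_doc, p_given_class)

# The two documents become identical once counts are reduced to presence.
same_when_binarized = binarize(short_doc) == binarize(long_doc)
```

The two scores differ by exactly 4 * log(0.2), the contribution of the four extra occurrences of "refund".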

Code Reference

Source Location

Signature

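class MultinomialNB(base.BaseNB):
    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing term
        self.class_counts = collections.Counter()  # instances seen per class
        self.feature_counts = collections.defaultdict(collections.Counter)  # feature occurrence counts per class
        self.class_totals = collections.Counter()  # total feature counts per class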

Import

from river import naive_bayes

I/O Contract

Parameters

Parameter | Type | Default | Description
alpha | float | 1.0 | Additive (Laplace/Lidstone) smoothing parameter
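As a rough illustration of what alpha does, the snippet below evaluates the smoothed estimate for an unseen feature and for a frequent one under several alpha values. The counts correspond to the "yes" class of the usage example on this page (6 feature occurrences in the class, 6 distinct terms overall); `smoothed` is a hypothetical helper, not part of River.

```python
def smoothed(count_f_in_c, total_c, n_features, alpha):
    """(count_f_in_c + alpha) / (total_c + alpha * n_features)."""
    return (count_f_in_c + alpha) / (total_c + alpha * n_features)


for alpha in (0.0, 0.1, 1.0, 10.0):
    unseen = smoothed(0, 6, 6, alpha)    # a term never observed in the class
    frequent = smoothed(4, 6, 6, alpha)  # a term observed 4 times in the class
    print(f"alpha={alpha:<5} unseen={unseen:.4f} frequent={frequent:.4f}")
```

With alpha = 0 the unseen feature receives probability 0, whose logarithm is undefined, so some smoothing is needed for unseen words. Larger alpha values pull every estimate toward the uniform value 1 / n_features.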

Attributes

Attribute | Type | Description
class_counts | Counter | Instance counts per class
feature_counts | defaultdict | Feature occurrence counts per class
class_totals | Counter | Total feature counts per class

Properties

Property | Type | Description
classes_ | list | List of known classes
n_terms | int | Total number of unique features

Input/Output

Method | Input | Output
learn_one | x: dict, y: Any | None
learn_many | X: DataFrame, y: Series | None
predict_proba_one | x: dict | dict
predict_proba_many | X: DataFrame | DataFrame

Usage Examples

import pandas as pd
from river import compose
from river import feature_extraction
from river import naive_bayes

docs = [
    ("Chinese Beijing Chinese", "yes"),
    ("Chinese Chinese Shanghai", "yes"),
    ("Chinese Macao", "maybe"),
    ("Tokyo Japan Chinese", "no")
]

# Single instance learning
model = compose.Pipeline(
    ("tokenize", feature_extraction.BagOfWords(lowercase=False)),
    ("nb", naive_bayes.MultinomialNB(alpha=1))
)

for sentence, label in docs:
    model.learn_one(sentence, label)

model["nb"].p_class("yes")
# 0.5

model["nb"].p_class("no")
# 0.25

model["nb"].p_class("maybe")
# 0.25

model.predict_proba_one("test")
# {'yes': 0.413..., 'maybe': 0.310..., 'no': 0.275...}  (values truncated)

model.predict_one("test")
# 'yes'

# Mini-batch learning
X = pd.Series([
    "Chinese Beijing Chinese",
    "Chinese Chinese Shanghai",
    "Chinese Macao",
    "Tokyo Japan Chinese"
])

y = pd.Series(["yes", "yes", "maybe", "no"])

model = compose.Pipeline(
    ("tokenize", feature_extraction.BagOfWords(lowercase=False)),
    ("nb", naive_bayes.MultinomialNB(alpha=1))
)

model.learn_many(X, y)

unseen = pd.Series(["Taiwanese Taipei", "Chinese Shanghai"])

model.predict_proba_many(unseen)
#       maybe        no       yes
# 0  0.373272  0.294931  0.331797
# 1  0.160396  0.126733  0.712871

model.predict_many(unseen)
# 0    maybe
# 1      yes
# dtype: object
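
The second row of the mini-batch output can be checked by hand. As a worked sketch, assuming alpha = 1 and the six distinct training terms (so n_features = 6), the unnormalized joint probabilities for "Chinese Shanghai" are:

```latex
\begin{align*}
P(\text{yes})\,P(\text{Chinese}\mid\text{yes})\,P(\text{Shanghai}\mid\text{yes})
  &= \tfrac{1}{2}\cdot\tfrac{4+1}{6+6}\cdot\tfrac{1+1}{6+6} = \tfrac{5}{144} \approx 0.03472 \\
P(\text{maybe})\,P(\text{Chinese}\mid\text{maybe})\,P(\text{Shanghai}\mid\text{maybe})
  &= \tfrac{1}{4}\cdot\tfrac{1+1}{2+6}\cdot\tfrac{0+1}{2+6} = \tfrac{1}{128} \approx 0.00781 \\
P(\text{no})\,P(\text{Chinese}\mid\text{no})\,P(\text{Shanghai}\mid\text{no})
  &= \tfrac{1}{4}\cdot\tfrac{1+1}{3+6}\cdot\tfrac{0+1}{3+6} = \tfrac{1}{162} \approx 0.00617
\end{align*}
```

Dividing each term by their sum (about 0.04871) gives roughly 0.7129 for "yes", 0.1604 for "maybe", and 0.1267 for "no", in line with the predict_proba_many output above.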
