Overview
Multinomial Naive Bayes models feature occurrences with a multinomial distribution per class, which makes it well suited to text classification on word counts or TF-IDF values.
Description
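This implementation treats features as counts drawn from a multinomial distribution. It maintains feature occurrence counts per class and total feature counts per class. The probability of a feature given a class is computed as P(f|c) = (count_f_in_c + alpha) / (total_c + alpha * n_features), where alpha is the additive smoothing parameter (alpha = 1 gives Laplace smoothing; other positive values give Lidstone smoothing) and n_features is the number of distinct features seen so far. During prediction, the joint log-likelihood of a class is its log prior plus the count-weighted sum of log probabilities: log(P(c)) + sum(count_f * log(P(f|c))). Because only the features present in an instance contribute to the sum, the model handles sparse inputs naturally. It supports both single-instance and mini-batch updates.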
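As a rough illustration of the formula above, the following pure-Python sketch computes the smoothed probabilities and class scores from hand-specified counts. The helper names (`smoothed_prob`, `joint_log_likelihood`, `normalize`) and the count tables are made up for this example and are not the library's API. The sketch also adds the log class prior to each class score, as is standard for Naive Bayes.

```python
import math
from collections import Counter

# Counts assumed for illustration: four short documents, three classes.
feature_counts = {
    "yes": Counter({"Chinese": 4, "Beijing": 1, "Shanghai": 1}),
    "maybe": Counter({"Chinese": 1, "Macao": 1}),
    "no": Counter({"Tokyo": 1, "Japan": 1, "Chinese": 1}),
}
class_counts = Counter({"yes": 2, "maybe": 1, "no": 1})
class_totals = {c: sum(fc.values()) for c, fc in feature_counts.items()}
n_features = len({f for fc in feature_counts.values() for f in fc})
alpha = 1.0


def smoothed_prob(f, c):
    """P(f|c) = (count_f_in_c + alpha) / (total_c + alpha * n_features)."""
    return (feature_counts[c][f] + alpha) / (class_totals[c] + alpha * n_features)


def joint_log_likelihood(x):
    """Log prior plus sum(count_f * log(P(f|c))) for every class."""
    n = sum(class_counts.values())
    return {
        c: math.log(class_counts[c] / n)
        + sum(count * math.log(smoothed_prob(f, c)) for f, count in x.items())
        for c in class_counts
    }


def normalize(jll):
    """Turn joint log-likelihoods into probabilities that sum to 1."""
    top = max(jll.values())  # subtract the maximum for numerical stability
    exp = {c: math.exp(v - top) for c, v in jll.items()}
    total = sum(exp.values())
    return {c: v / total for c, v in exp.items()}


probs = normalize(joint_log_likelihood({"Chinese": 1, "Shanghai": 1}))
# roughly {'yes': 0.713, 'maybe': 0.160, 'no': 0.127}
```

A feature that was never seen with a class still gets a non-zero probability of alpha / (total_c + alpha * n_features), which is what keeps the logarithm defined.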
Usage
Use Multinomial NB for text classification when word frequencies matter; this makes it more suitable than Bernoulli NB for longer documents. It works with count vectors or TF-IDF features, which must be non-negative. It is a strong baseline for document classification, spam detection, and sentiment analysis, and it supports efficient mini-batch processing with sparse matrices for large-scale text data.
Code Reference
Source Location
Signature
class MultinomialNB(base.BaseNB):
    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = collections.Counter()
        self.feature_counts = collections.defaultdict(collections.Counter)
        self.class_totals = collections.Counter()
Import
from river import naive_bayes
I/O Contract
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| alpha | float | 1.0 | Additive Laplace/Lidstone smoothing |
Attributes
| Attribute | Type | Description |
|---|---|---|
| class_counts | Counter | Instance counts per class |
| feature_counts | defaultdict | Feature occurrence counts per class |
| class_totals | Counter | Total feature counts per class |
Properties
| Property | Type | Description |
|---|---|---|
| classes_ | list | List of known classes |
| n_terms | int | Total number of unique features |
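The attributes and properties listed above can be pictured with a small sketch of how the three counters might evolve as labelled instances arrive. The `update` function is hypothetical, and the feature-then-class nesting of `feature_counts` is an assumption of this sketch rather than a statement about the library's internals.

```python
import collections

# The same three containers that appear in the signature.
class_counts = collections.Counter()
feature_counts = collections.defaultdict(collections.Counter)
class_totals = collections.Counter()


def update(x, y):
    """Hypothetical single-instance update; x maps feature -> count, y is the label."""
    class_counts[y] += 1
    for feature, count in x.items():
        feature_counts[feature][y] += count  # assumed layout: feature -> class -> count
        class_totals[y] += count


update({"Chinese": 2, "Beijing": 1}, "yes")
update({"Chinese": 1, "Macao": 1}, "maybe")

classes = list(class_counts)   # known classes, in first-seen order
n_terms = len(feature_counts)  # number of distinct features seen so far

# class_totals["yes"] is 3 and class_totals["maybe"] is 2; n_terms is 3.
```

Keeping running totals per class means the denominator of the smoothed probability never has to be recomputed from the per-feature counts.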
Input/Output
| Method | Input | Output |
|---|---|---|
| learn_one | x: dict, y: Any | None |
| learn_many | X: DataFrame, y: Series | None |
| predict_proba_one | x: dict | dict |
| predict_proba_many | X: DataFrame | DataFrame |
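To make the `x: dict` contract concrete, here is a minimal sketch of the kind of mapping the single-instance methods expect: feature names as keys and non-negative counts as values. The `to_counts` helper is hypothetical and uses plain whitespace splitting; in practice a feature extractor in the pipeline would typically produce this mapping.

```python
import collections


def to_counts(text):
    """Hypothetical whitespace tokenizer: text -> {token: count}."""
    return dict(collections.Counter(text.split()))


x = to_counts("Chinese Beijing Chinese")
# x == {"Chinese": 2, "Beijing": 1}
```

The `dict` returned by `predict_proba_one` goes the other way: class labels as keys and probabilities as values.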
Usage Examples
import pandas as pd
from river import compose
from river import feature_extraction
from river import naive_bayes

docs = [
    ("Chinese Beijing Chinese", "yes"),
    ("Chinese Chinese Shanghai", "yes"),
    ("Chinese Macao", "maybe"),
    ("Tokyo Japan Chinese", "no")
]

# Single instance learning
model = compose.Pipeline(
    ("tokenize", feature_extraction.BagOfWords(lowercase=False)),
    ("nb", naive_bayes.MultinomialNB(alpha=1))
)
for sentence, label in docs:
    model.learn_one(sentence, label)

model["nb"].p_class("yes")
# 0.5
model["nb"].p_class("no")
# 0.25
model["nb"].p_class("maybe")
# 0.25
model.predict_proba_one("test")
# {'yes': 0.413, 'maybe': 0.310, 'no': 0.275}
model.predict_one("test")
# 'yes'

# Mini-batch learning
X = pd.Series([
    "Chinese Beijing Chinese",
    "Chinese Chinese Shanghai",
    "Chinese Macao",
    "Tokyo Japan Chinese"
])
y = pd.Series(["yes", "yes", "maybe", "no"])
model = compose.Pipeline(
    ("tokenize", feature_extraction.BagOfWords(lowercase=False)),
    ("nb", naive_bayes.MultinomialNB(alpha=1))
)
model.learn_many(X, y)

unseen = pd.Series(["Taiwanese Taipei", "Chinese Shanghai"])
model.predict_proba_many(unseen)
#       maybe        no       yes
# 0  0.373272  0.294931  0.331797
# 1  0.160396  0.126733  0.712871
model.predict_many(unseen)
# 0    maybe
# 1      yes
# dtype: object
Related Pages