Principle:Online ml River Active Learning Classification
| Knowledge Sources | Machine Learning Active Learning |
|---|---|
| Domains | Online_Learning Active_Learning Classification |
| Last Updated | 2026-02-08 18:00 GMT |
Overview
Active learning for classification is a learning paradigm in which the model selectively queries an oracle (e.g., a human annotator) for labels on the most informative instances, rather than passively receiving labels for all observations. The goal is to achieve high classification accuracy with as few labeled examples as possible.
Description
In many real-world streaming scenarios, acquiring labels is expensive or time-consuming. Active learning addresses this by allowing the learner to decide which instances are worth labeling. The model observes each incoming instance, computes an informativeness score, and only requests the true label when that score exceeds a threshold or satisfies a selection criterion.
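The observe, score, and query loop described above can be sketched as follows. This is a minimal illustration, not a reference implementation: `TinyLogistic`, `run_stream`, the `oracle` function, and the threshold value of 0.3 are all hypothetical names and settings chosen for this example, and the informativeness score is assumed to be the least-confidence measure.

```python
import math
import random


class TinyLogistic:
    """Hypothetical one-feature online logistic regression (illustration only)."""

    def __init__(self, lr=0.5):
        self.w = 0.0
        self.b = 0.0
        self.lr = lr

    def predict_proba_one(self, x):
        p1 = 1.0 / (1.0 + math.exp(-(self.w * x + self.b)))
        return {0: 1.0 - p1, 1: p1}

    def learn_one(self, x, y):
        # One stochastic gradient step on the log loss.
        err = self.predict_proba_one(x)[1] - y
        self.w -= self.lr * err * x
        self.b -= self.lr * err


def run_stream(stream, oracle, model, threshold=0.3):
    """Request a label only when the uncertainty score exceeds the threshold."""
    n_seen = n_queried = 0
    for x in stream:
        n_seen += 1
        proba = model.predict_proba_one(x)
        score = 1.0 - max(proba.values())  # least-confidence score
        if score > threshold:
            y = oracle(x)  # the expensive step: ask for the true label
            model.learn_one(x, y)
            n_queried += 1
        # Otherwise the instance is discarded without a label.
    return n_seen, n_queried


rng = random.Random(0)
stream = [rng.uniform(-3, 3) for _ in range(2000)]
oracle = lambda x: int(x > 0)  # hypothetical ground-truth labeler
model = TinyLogistic()
n_seen, n_queried = run_stream(stream, oracle, model)
print(f"queried {n_queried} of {n_seen} instances")
```

In this toy setup the model starts out maximally uncertain, so early instances are queried; as it grows confident away from the decision boundary, queries concentrate on instances near the boundary.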
Common query strategies include:
- Uncertainty sampling: Query instances where the model is least confident in its prediction.
- Entropy-based sampling: Query instances where the predicted class distribution has maximum entropy.
- Query-by-committee: Maintain multiple models and query instances where they disagree most.
- Random sampling: Query a fixed fraction of instances uniformly at random (baseline).
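Two of the strategies listed above can be sketched in a few lines. The example assumes that committee disagreement is measured by vote entropy (one common choice among several) and that the random baseline queries each instance independently with a fixed probability; `vote_entropy` and `random_query` are hypothetical helper names.

```python
import math
import random
from collections import Counter


def vote_entropy(votes):
    """Committee disagreement, measured as the entropy of the members' hard votes."""
    counts = Counter(votes)
    n = len(votes)
    return sum(-(c / n) * math.log(c / n) for c in counts.values())


def random_query(rng, fraction):
    """Baseline: query with a fixed probability, ignoring the instance itself."""
    return rng.random() < fraction


# A unanimous committee carries no disagreement; an even split carries the most.
unanimous = vote_entropy(["spam", "spam", "spam"])
split = vote_entropy(["spam", "ham", "spam", "ham"])

rng = random.Random(42)
n_random = sum(random_query(rng, 0.1) for _ in range(10_000))
print(unanimous, split, n_random)
```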
In the streaming setting, active learning is particularly important because the data arrives continuously and labeling every instance is often impractical. The learner must make irrevocable decisions about whether to query each instance as it arrives.
Usage
Use active learning for classification when:
- Labeling cost is high relative to data acquisition cost.
- The data stream is too fast or voluminous for exhaustive labeling.
- You want to maximize classification performance under a fixed labeling budget.
- Certain regions of the feature space are more informative than others.
Theoretical Basis
Uncertainty sampling selects the instance whose predicted class distribution has the highest uncertainty:
x* = argmax_x U(P(y | x))
where U is an uncertainty measure. Common choices:
Least confidence:
U_lc(P) = 1 - max_y P(y | x)
Entropy:
U_entropy(P) = - sum_y P(y | x) * log P(y | x)
For binary classification, entropy is maximized at P(y = 1 | x) = 0.5 and minimized at the extremes, P(y = 1 | x) = 0 or 1. The entropy criterion generalizes naturally to multi-class problems.
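The two uncertainty measures can be written directly from their definitions. The sketch below assumes class probabilities arrive as a dictionary mapping labels to probabilities; the function names are illustrative, and terms with zero probability are skipped under the usual convention that 0 * log 0 = 0.

```python
import math


def least_confidence(proba):
    """U_lc(P) = 1 - max_y P(y | x)."""
    return 1.0 - max(proba.values())


def entropy(proba):
    """U_entropy(P) = - sum_y P(y | x) * log P(y | x), with 0 * log 0 taken as 0."""
    return sum(-p * math.log(p) for p in proba.values() if p > 0.0)


def binary_entropy(p):
    """Entropy of a two-class distribution with P(y = 1 | x) = p."""
    return entropy({0: 1.0 - p, 1: p})


# Scan P(y = 1 | x) over a grid and find where binary entropy peaks.
grid = [i / 100 for i in range(101)]
peak = max(grid, key=binary_entropy)

three_way = {"a": 0.5, "b": 0.3, "c": 0.2}
print(peak, least_confidence(three_way), entropy(three_way))
```

The same `entropy` function handles any number of classes; for a uniform distribution over k classes it returns log k, its largest possible value.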
Budget management: In the streaming setting, the learner typically operates under a budget constraint that limits the fraction of instances for which labels are requested. The selection threshold may be adapted dynamically to stay within budget.
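One simple way to combine a budget with a dynamically adapted threshold is sketched below. The scheme is an assumption made for illustration, not something prescribed here: `BudgetedSelector`, its parameters, and the multiplicative update rule are hypothetical. The selector refuses to query whenever one more label would push the labeled fraction above the budget, raises the threshold after each query, and lowers it after each skipped instance.

```python
import random


class BudgetedSelector:
    """Hypothetical selector: adaptive uncertainty threshold under a label budget."""

    def __init__(self, budget=0.1, threshold=0.3, step=0.01):
        self.budget = budget        # maximum fraction of instances to label
        self.threshold = threshold  # current bar on the uncertainty score
        self.step = step            # multiplicative adjustment rate
        self.n_seen = 0
        self.n_queried = 0

    def should_query(self, uncertainty):
        self.n_seen += 1
        # Refuse if one more label would exceed the budgeted fraction.
        if self.n_queried + 1 > self.budget * self.n_seen:
            return False
        if uncertainty > self.threshold:
            self.n_queried += 1
            self.threshold *= 1.0 + self.step  # a label was spent: be pickier
            return True
        self.threshold *= 1.0 - self.step      # nothing queried: lower the bar
        return False


rng = random.Random(1)
selector = BudgetedSelector(budget=0.1)
for _ in range(5000):
    u = rng.uniform(0.0, 0.5)  # stand-in for a model's uncertainty score
    selector.should_query(u)

print(selector.n_queried, selector.n_seen, round(selector.threshold, 3))
```

Because the budget check runs before every query, the labeled fraction never exceeds `budget` at any point in the stream, whatever the uncertainty scores look like.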
Theoretical guarantee: Under certain distributional assumptions (for example, noise-free, separable data), active learning can achieve exponentially faster convergence in classification error than passive learning, requiring O(log(1/ε)) labels instead of O(1/ε) for error rate ε.
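A classic toy case that exhibits this kind of exponential saving is learning a one-dimensional decision threshold on [0, 1] with noise-free labels. The sketch below assumes the commonly cited rates of O(log(1/ε)) labels for active learning versus O(1/ε) for passive learning, and it lets the learner choose which point to query, which is a simplification relative to the streaming setting. The function name, the oracle, and the value of `true_threshold` are hypothetical.

```python
def active_threshold_search(oracle, eps):
    """Locate a 1-D decision threshold on [0, 1] to within eps by bisection."""
    lo, hi = 0.0, 1.0
    n_labels = 0
    while hi - lo > eps:
        mid = (lo + hi) / 2.0
        n_labels += 1              # each oracle call costs one label
        if oracle(mid) == 1:       # mid lies at or above the threshold
            hi = mid
        else:                      # mid lies below the threshold
            lo = mid
    return (lo + hi) / 2.0, n_labels


true_threshold = 0.3712  # hypothetical ground truth, unknown to the learner
oracle = lambda x: int(x >= true_threshold)

results = {}
for eps in (1e-1, 1e-2, 1e-3):
    estimate, n_labels = active_threshold_search(oracle, eps)
    results[eps] = (estimate, n_labels)
    print(f"eps={eps}: {n_labels} labels, estimate={estimate:.4f}")
```

Each query halves the interval that can contain the threshold, so the label count grows with log2(1/ε). A passive learner labeling points drawn uniformly at random would need on the order of 1/ε labels to pin the threshold down to the same precision.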