Implementation:DistrictDataLabs Yellowbrick FrequencyVisualizer
| Knowledge Sources | |
|---|---|
| Domains | NLP, Visualization |
| Last Updated | 2026-02-08 05:00 GMT |
Overview
Concrete tool for visualizing the frequency distribution of terms in a text corpus as a bar chart, provided by the Yellowbrick text module.
Description
The FrequencyVisualizer displays the most frequently occurring terms from a pre-vectorized document-term matrix as a bar chart. The `orient` parameter selects horizontal ("h") or vertical ("v") bars, and `n` limits how many top terms are shown. The visualizer does not tokenize raw text itself: the feature names are passed to the constructor, and the document-term matrix (e.g., from CountVectorizer or TfidfVectorizer) is passed to `fit`.
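The quantity the chart displays can be illustrated without any plotting: sum each column of the document-term matrix to get a corpus-wide count per term, then keep the `n` largest. The following is a minimal pure-Python sketch of that idea, not Yellowbrick's actual implementation. The helper name `top_terms`, the toy vocabulary, and the small dense matrix are all assumptions made for illustration.

```python
def top_terms(features, X, n=50):
    """Return the n most frequent terms as (term, total_count) pairs.

    Hypothetical helper for illustration; X is a dense list of rows,
    one row per document and one column per term.
    """
    # Sum each column to get the corpus-wide count of every term.
    totals = [sum(column) for column in zip(*X)]
    # Pair terms with their totals and sort by descending count.
    ranked = sorted(zip(features, totals), key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


# Toy document-term matrix: 3 documents, 4 terms (made-up data).
features = ["ball", "game", "recipe", "oven"]
X = [
    [2, 1, 0, 0],
    [1, 3, 0, 0],
    [0, 0, 5, 1],
]

print(top_terms(features, X, n=2))  # [('recipe', 5), ('game', 4)]
```

With a real vectorizer the matrix is sparse, so the column sums would normally be taken on the sparse matrix directly rather than on nested lists.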
Usage
Import this visualizer when exploring the most common terms in a corpus after vectorization. It works with the output of scikit-learn text vectorizers.
Code Reference
Source Location
- Repository: DistrictDataLabs_Yellowbrick
- File: yellowbrick/text/freqdist.py
- Lines: 1-317
Signature
class FrequencyVisualizer(TextVisualizer):
    def __init__(
        self,
        features,
        ax=None,
        n=50,
        orient="h",
        color=None,
        **kwargs,
    ):
        """Frequency distribution bar chart for text terms."""

def freqdist(
    features, X, y=None, ax=None, n=50, orient="h", color=None, show=True, **kwargs,
):
    """Quick method for one-off frequency distribution visualization."""

FreqDistVisualizer = FrequencyVisualizer  # Backwards compatibility alias
Import
from yellowbrick.text import FreqDistVisualizer
from yellowbrick.text.freqdist import freqdist
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| features | list of str | Yes | Feature/term names from vectorizer |
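| X | sparse matrix | Yes | Document-term matrix, passed to `fit` |
| n | int | No | Number of top terms to display (default: 50) |
| orient | str | No | Bar orientation: "h" (horizontal) or "v" (vertical) (default: "h") |
| ax | matplotlib.axes.Axes | No | Axes to draw on (default: None) |
| color | str | No | Bar color (default: None) |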
Outputs
| Name | Type | Description |
|---|---|---|
| ax | matplotlib.axes.Axes | Axes with frequency bar chart |
Usage Examples
from sklearn.feature_extraction.text import CountVectorizer
from yellowbrick.datasets import load_hobbies
from yellowbrick.text import FreqDistVisualizer

# Load the hobbies sample corpus
corpus = load_hobbies()

# Build the document-term matrix of raw token counts
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus.data)

# Plot the 20 most frequent terms
viz = FreqDistVisualizer(vectorizer.get_feature_names_out(), n=20)
viz.fit(X)
viz.show()
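For a one-off plot, the `freqdist` quick method from the signature above takes the same arguments in a single call. The sketch below wraps it in a hypothetical helper, `plot_top_terms`, which is not part of Yellowbrick. It assumes Yellowbrick and a scikit-learn version that provides `get_feature_names_out` are installed, and it adds English stop-word filtering as an illustrative choice so that function words do not dominate the chart.

```python
def plot_top_terms(docs, n=10, orient="h", show=True):
    """Vectorize raw documents and plot their n most frequent terms.

    Hypothetical convenience wrapper, not part of Yellowbrick.
    """
    # Local imports keep the plotting dependencies optional until the call.
    from sklearn.feature_extraction.text import CountVectorizer
    from yellowbrick.text.freqdist import freqdist

    # Drop English stop words so words like "the" do not dominate the chart.
    vectorizer = CountVectorizer(stop_words="english")
    X = vectorizer.fit_transform(docs)
    features = vectorizer.get_feature_names_out()

    # Pass features, matrix, and display options straight to the quick method.
    return freqdist(features, X, n=n, orient=orient, show=show)
```

A call such as `plot_top_terms(corpus.data, n=20, orient="v")` would draw vertical bars for the hobbies corpus loaded above. Passing `show=False` is useful when the figure will be customized or saved afterwards rather than displayed immediately.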