Implementation:DistrictDataLabs Yellowbrick PosTagVisualizer
| Knowledge Sources | |
|---|---|
| Domains | NLP, Visualization |
| Last Updated | 2026-02-08 05:00 GMT |
Overview
Concrete tool for visualizing part-of-speech tag frequencies in text documents, supporting both Penn Treebank and Universal tagsets via NLTK or spaCy parsers.
Description
The PosTagVisualizer counts part-of-speech (POS) tags across a corpus and displays their distribution as a bar chart. It supports two tagsets (Penn Treebank and Universal), two NLP parsers (NLTK and spaCy) for tagging raw text, and both single-bar and stacked-bar display modes, where a stacked bar has one segment per class label. Bars show raw tag counts and can optionally be ordered from most to least frequent.
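The counting step behind such a chart can be sketched in plain Python. This is an illustrative sketch, not Yellowbrick's implementation: the helper name `count_tags` and the toy corpus are hypothetical, and the input is assumed to be already tagged, with each document a list of sentences and each sentence a list of (token, tag) tuples.

```python
from collections import Counter


def count_tags(tagged_docs):
    """Count POS tags across documents -> sentences -> (token, tag) pairs."""
    counts = Counter()
    for doc in tagged_docs:
        for sentence in doc:
            for _token, tag in sentence:
                counts[tag] += 1
    return counts


# Hand-tagged toy corpus (Penn Treebank tags), for illustration only.
tagged_docs = [
    [[("The", "DT"), ("fox", "NN"), ("jumped", "VBD")]],
    [[("She", "PRP"), ("sells", "VBZ"), ("shells", "NNS"), ("by", "IN"),
      ("the", "DT"), ("shore", "NN")]],
]

counts = count_tags(tagged_docs)
# Ordering by descending count mirrors a frequency-sorted bar chart.
ordered = counts.most_common()
print(ordered)
```

Each entry in `ordered` corresponds to one bar: the tag is the x-axis label and the count is the bar height.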
Usage
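Import this visualizer when analyzing the grammatical structure of text documents. NLTK or spaCy must be installed when the visualizer is asked to tag raw text through the parser argument; documents that are already POS-tagged can be passed without a parser.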
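Before choosing a parser value, it can help to check which tagging library is importable in the current environment. The sketch below uses only the standard library; the helper name `available_parsers` is hypothetical and not part of Yellowbrick.

```python
from importlib.util import find_spec


def available_parsers():
    """Return the names of supported tagging libraries that can be imported."""
    candidates = ("nltk", "spacy")
    return [name for name in candidates if find_spec(name) is not None]


found = available_parsers()
if found:
    print("Parsers available for tagging raw text:", ", ".join(found))
else:
    print("No parser installed; install nltk or spacy, or pass pre-tagged documents.")
```

Note that an importable library is not sufficient on its own: NLTK and spaCy each need their tokenizer and tagger data or models downloaded separately.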
Code Reference
Source Location
- Repository: DistrictDataLabs_Yellowbrick
- File: yellowbrick/text/postag.py
- Lines: 1-676
Signature
class PosTagVisualizer(TextVisualizer):
    def __init__(
        self,
        ax=None,
        tagset="penn_treebank",
        colormap=None,
        colors=None,
        frequency=False,
        stack=False,
        parser=None,
        **kwargs,
    ):
        """Part-of-speech tag frequency visualizer."""

def postag(
    X, y=None, ax=None, tagset="penn_treebank", colormap=None, colors=None,
    frequency=False, stack=False, parser=None, show=True, **kwargs,
):
    """Quick method for one-off POS tag visualization."""
Import
from yellowbrick.text import PosTagVisualizer
from yellowbrick.text.postag import postag
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
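| X | list of str, or pre-tagged documents | Yes | Raw documents to tag when parser is set; otherwise documents already tagged as lists of sentences of (token, tag) tuples |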
| tagset | str | No | "penn_treebank" or "universal" (default: "penn_treebank") |
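| parser | str | No | "nltk" or "spacy"; when None, X must already be tagged (default: None) |
| frequency | bool | No | Order bars from most to least frequent tag (default: False) |
| stack | bool | No | Use a stacked bar chart with one segment per class label; requires y at fit time (default: False) |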
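The stacked mode can be pictured as keeping one tag counter per class label. The following is a minimal sketch under that assumption, not the library's code: `count_tags_by_label`, the toy documents, and the label names are all hypothetical, and the input uses the pre-tagged shape of documents, sentences, and (token, tag) tuples with Universal tags.

```python
from collections import Counter, defaultdict


def count_tags_by_label(tagged_docs, labels):
    """Return {label: Counter of tags}, one Counter per class label."""
    per_label = defaultdict(Counter)
    for doc, label in zip(tagged_docs, labels):
        for sentence in doc:
            per_label[label].update(tag for _token, tag in sentence)
    return dict(per_label)


# Toy pre-tagged corpus (Universal tags) with one class label per document.
tagged_docs = [
    [[("Dogs", "NOUN"), ("bark", "VERB")]],
    [[("Cats", "NOUN"), ("sleep", "VERB"), ("quietly", "ADV")]],
    [[("Birds", "NOUN"), ("sing", "VERB")]],
]
labels = ["news", "fiction", "news"]

stacked = count_tags_by_label(tagged_docs, labels)
for label, counts in stacked.items():
    print(label, dict(counts))
```

In a stacked chart, each label's counter would supply one layer of every bar, so the total bar height for a tag is the sum of that tag's counts across labels.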
Outputs
| Name | Type | Description |
|---|---|---|
| ax | matplotlib.Axes | Axes holding the POS tag bar chart, available as the visualizer's ax attribute |
Usage Examples
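from yellowbrick.text import PosTagVisualizer

documents = [
    "The quick brown fox jumped over the lazy dog.",
    "She sells sea shells by the sea shore.",
]

# parser="nltk" tags the raw strings (requires NLTK and its tokenizer/tagger data).
# Without a parser, fit() expects documents that are already POS-tagged.
viz = PosTagVisualizer(tagset="penn_treebank", parser="nltk")
viz.fit(documents)
viz.show()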