Implementation:DistrictDataLabs Yellowbrick MissingValuesDispersion

From Leeroopedia


Knowledge Sources
Domains Data_Quality, Visualization
Last Updated 2026-02-08 05:00 GMT

Overview

Concrete tool for visualizing the spatial distribution of missing values across features and samples, provided by the Yellowbrick contrib module.

Description

The MissingValuesDispersion visualizer renders a scatter-style plot showing the exact locations of missing (NaN) values in a dataset. Each feature occupies one row of the plot, and markers along that row indicate which sample indices have missing values for that feature. This reveals patterns in missingness, such as contiguous blocks of missing samples or randomly scattered gaps.
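To make the plotted coordinates concrete, the following minimal sketch computes, for each feature, the sample indices at which a value is missing. It uses plain Python rather than Yellowbrick, and the helper name `missing_locations` is hypothetical; the library's internal computation may differ.

```python
import math


def missing_locations(columns):
    """Map each feature name to the sample indices where its value is NaN.

    Hypothetical helper for illustration only; not part of Yellowbrick.
    """
    return {
        name: [i for i, value in enumerate(values) if math.isnan(value)]
        for name, values in columns.items()
    }


nan = math.nan
data = {
    "A": [1.0, nan, 3.0, nan, 5.0, 6.0, nan, 8.0],
    "B": [nan, 2.0, 3.0, 4.0, 5.0, nan, 7.0, 8.0],
    "C": [1.0, 2.0, nan, 4.0, nan, 6.0, 7.0, nan],
}

# Each (feature, index) pair corresponds to one marker in a dispersion plot.
locations = missing_locations(data)
for feature, indices in locations.items():
    print(feature, indices)
```

Each feature maps to one row of the plot, and each index in its list corresponds to one marker on that row.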

Usage

Import this visualizer when you need to understand the spatial pattern of missing data, not just the counts. It complements MissingValuesBar by showing where values are missing rather than how many.
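The difference between "how many" and "where" can be sketched without any plotting. In the illustration below, two columns have the same number of missing values, so a count-based view cannot tell them apart, while their positions differ sharply. The helpers `missing_indices` and `longest_run` are hypothetical names used only for this example and are not part of Yellowbrick.

```python
import math


def missing_indices(values):
    """Return the sample indices at which a value is NaN."""
    return [i for i, v in enumerate(values) if math.isnan(v)]


def longest_run(indices):
    """Length of the longest run of consecutive indices (0 if empty)."""
    best = current = 0
    previous = None
    for i in indices:
        # Extend the current run if this index directly follows the last one.
        if previous is not None and i == previous + 1:
            current += 1
        else:
            current = 1
        best = max(best, current)
        previous = i
    return best


nan = math.nan
block = [1.0, 2.0, nan, nan, nan, 6.0, 7.0, 8.0, 9.0]      # contiguous gap
scattered = [nan, 2.0, 3.0, 4.0, nan, 6.0, 7.0, 8.0, nan]  # isolated gaps

block_idx = missing_indices(block)
scattered_idx = missing_indices(scattered)

# Same count, different structure.
print(len(block_idx), longest_run(block_idx))
print(len(scattered_idx), longest_run(scattered_idx))
```

A dispersion plot makes the same distinction visually: the first column would show a tight cluster of markers, the second would show markers spread along the row.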

Code Reference

Source Location

Signature

class MissingValuesDispersion(MissingDataVisualizer):
    def __init__(self, alpha=0.5, marker="|", classes=None, **kwargs):
        """Missing values dispersion plot visualizer."""

def missing_dispersion(X, y=None, ax=None, classes=None, alpha=0.5, marker="|", **kwargs):
    """Quick method for one-off missing values dispersion visualization."""

Import

from yellowbrick.contrib.missing import MissingValuesDispersion
from yellowbrick.contrib.missing.dispersion import missing_dispersion

I/O Contract

Inputs

Name Type Required Description
X array-like or DataFrame Yes Feature data with potential NaN values
y array-like No Target labels for coloring
alpha float No Marker transparency (default: 0.5)
marker str No Marker style (default: "|")

Outputs

Name Type Description
ax matplotlib.Axes Axes with dispersion scatter plot

Usage Examples

import numpy as np
import pandas as pd
from yellowbrick.contrib.missing import MissingValuesDispersion

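# Small example frame with NaN values scattered across three features
df = pd.DataFrame({
    "A": [1, np.nan, 3, np.nan, 5, 6, np.nan, 8],
    "B": [np.nan, 2, 3, 4, 5, np.nan, 7, 8],
    "C": [1, 2, np.nan, 4, np.nan, 6, 7, np.nan],
})

# Fit the visualizer on the raw array, then render the plot
viz = MissingValuesDispersion()
viz.fit(df.values)
viz.show()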
