Principle:DistrictDataLabs Yellowbrick Intercluster Distance Mapping

Knowledge Sources	Yellowbrick Docs; Yellowbrick; Caliński and Harabasz 1974; van der Maaten and Hinton 2008 (t-SNE)
Domains	Machine_Learning, Clustering, Model_Evaluation
Last Updated	2026-02-08 00:00 GMT

Overview

Intercluster Distance Mapping is a technique for visualizing the relative positions and sizes of clusters by embedding their high-dimensional centroids into a two-dimensional space while preserving inter-centroid distances.

Description

When working with clustering algorithms in high-dimensional feature spaces, it is difficult to assess whether clusters are well-separated or overlapping. Intercluster Distance Mapping addresses this by projecting cluster centers from their original high-dimensional space into two dimensions using a dimensionality reduction technique. The projection preserves the relative distances between cluster centers: clusters that are close together in the original feature space appear close in the 2D embedding, and those that are far apart appear distant.

In the resulting visualization, each cluster is represented as a circle positioned at its embedded center. The size of each circle encodes a scoring metric, typically membership (the count of data points assigned to that cluster). This gives an immediate sense of both the spatial relationships between clusters and their relative importance. Large, well-separated circles indicate a healthy clustering with distinct, well-populated groups. Overlapping circles may suggest that those clusters are difficult to distinguish in the feature space. However, overlap in the 2D embedding does not necessarily imply overlap in the original feature space, because dimensionality reduction discards information.

Two embedding algorithms are commonly used: Multidimensional Scaling (MDS), which directly minimizes the stress between pairwise distances in the original and embedded spaces, and t-SNE, which preserves local neighborhood structure through a probabilistic approach. MDS is the default choice because its output is reproducible given a fixed random state and because it emphasizes global distance preservation, making it well-suited for showing how cluster centers relate to one another.
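The projection step can be sketched with scikit-learn directly. This is a minimal illustration, not the Yellowbrick implementation: the synthetic data, the choice of 5 clusters, and the helper name `embed_centers` are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import MDS

# Synthetic data: 600 points in 10 dimensions (illustrative values only).
X, _ = make_blobs(n_samples=600, n_features=10, centers=5, random_state=0)

# Fit k-means so that explicit cluster centers are available.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)
centers = kmeans.cluster_centers_  # shape (5, 10)


def embed_centers(centers, seed=0):
    """Project cluster centers to 2D with MDS (hypothetical helper)."""
    return MDS(n_components=2, random_state=seed).fit_transform(centers)


embedded = embed_centers(centers)
print(embedded.shape)
```

Calling `embed_centers` twice with the same `seed` should return the same coordinates, which is the sense in which the MDS embedding is reproducible.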

Usage

Use Intercluster Distance Mapping when:

You want to understand the spatial relationships between cluster centers in a high-dimensional space.
You need to identify clusters that may be too close together and potentially redundant or poorly separated.
You want to visualize the relative sizes (memberships) of clusters to assess cluster balance.
You have already determined a value of k and want to evaluate the resulting clustering structure.

Limitations:

The 2D embedding inevitably loses information from the original high-dimensional space. Overlap in the visualization does not prove overlap in feature space.
Requires the clustering algorithm to produce explicit cluster_centers_ (e.g., k-means, mini-batch k-means). Hierarchical or density-based methods may not be directly supported.
The embedding can be sensitive to the random state, especially with t-SNE.

Theoretical Basis

Dimensionality Reduction of Cluster Centers

Given $k$ cluster centers $\{\mu_{1}, \mu_{2}, \dots, \mu_{k}\}$ in $\mathbb{R}^{d}$, the goal is to find a mapping $f : \mathbb{R}^{d} \to \mathbb{R}^{2}$ such that pairwise distances are approximately preserved:

$\| f(\mu_{i}) - f(\mu_{j}) \|_{2} \approx \| \mu_{i} - \mu_{j} \|_{2} \quad \forall i, j$

Multidimensional Scaling (MDS)

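MDS achieves this by minimizing a stress function. Metric MDS (as opposed to classical MDS, which is solved by an eigendecomposition rather than by stress minimization) minimizes a normalized stress of the form known as Kruskal's Stress-1:

$\text{Stress} = \sqrt{\frac{\sum_{i < j} (d_{ij} - \hat{d}_{ij})^{2}}{\sum_{i < j} d_{ij}^{2}}}$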

where $d_{i j}$ is the distance between centers $i$ and $j$ in the original space and ${\hat{d}}_{i j}$ is the distance in the embedded 2D space. The result is a set of 2D coordinates that best preserves the original inter-centroid distances.
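The stress formula above can be evaluated with a few lines of NumPy. This is a sketch for illustration only; the function names `pairwise_distances` and `stress` and the toy coordinates are assumptions, not part of any library API.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix between the rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def stress(original, embedded):
    """Normalized stress between original and embedded center coordinates."""
    d = pairwise_distances(original)
    d_hat = pairwise_distances(embedded)
    upper = np.triu_indices(len(original), k=1)  # each pair i < j once
    return np.sqrt(((d[upper] - d_hat[upper]) ** 2).sum() / (d[upper] ** 2).sum())


# Three centers in R^3 that all lie in the plane z = 0.
flat = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 4.0, 0.0]])

perfect = flat[:, :2]          # dropping z keeps every distance intact
collapsed = np.zeros((3, 2))   # all centers mapped to the same point

print(stress(flat, perfect))    # 0.0
print(stress(flat, collapsed))  # 1.0
```

A stress of 0 means every inter-centroid distance is reproduced exactly. Collapsing all centers onto one point gives a stress of 1 under this normalization.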

Cluster Sizing by Membership

Each cluster's visual size is determined by its membership count (the number of data points assigned to it):

$\text{size}_{j} = |C_{j}| = \sum_{i=1}^{n} \mathbf{1}[\text{label}_{i} = j]$

These counts are scaled to marker areas using a proportional sizing function that maps raw scores to a range between a minimum and maximum marker size, providing an intuitive representation of relative cluster populations.
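The counting and scaling steps can be sketched as follows. The linear min-max mapping, the helper names `membership_counts` and `scale_sizes`, and the size bounds are assumptions for illustration; the exact scaling function used by a given implementation may differ.

```python
import numpy as np


def membership_counts(labels, k):
    """Number of points assigned to each of the k clusters."""
    return np.bincount(labels, minlength=k)


def scale_sizes(scores, min_size=100.0, max_size=1000.0):
    """Map raw scores linearly onto [min_size, max_size] (assumed scaling)."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        # All clusters have the same score: use one mid-range size for all.
        return np.full(scores.shape, (min_size + max_size) / 2.0)
    return min_size + (scores - scores.min()) / span * (max_size - min_size)


labels = np.array([0, 0, 0, 1, 1, 2])    # toy cluster assignments
counts = membership_counts(labels, k=3)  # [3, 2, 1]
sizes = scale_sizes(counts)              # [1000., 550., 100.]
print(counts, sizes)
```

The largest cluster receives the maximum marker size and the smallest receives the minimum, so the plot conveys relative rather than absolute populations.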

Related Pages

Implemented By

Implementation:DistrictDataLabs_Yellowbrick_InterclusterDistance_Visualizer

Related Principles
