Implementation: DistrictDataLabs Yellowbrick ValidationCurve Visualizer
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Model_Selection, Visualization |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool for diagnosing the effect of a single hyperparameter on model performance via validation curve visualization, provided by the Yellowbrick library.
Description
The ValidationCurve visualizer wraps scikit-learn's sklearn.model_selection.validation_curve utility and produces a plot showing how training and cross-validated test scores vary across a range of values for a single hyperparameter. For each hyperparameter value, it computes k-fold cross-validated scores, then plots the mean score with a shaded region representing one standard deviation of variability. Two curves are drawn: one for the training score and one for the cross-validation score.
The class extends ModelVisualizer from the Yellowbrick base module. On calling fit(X, y), the visualizer delegates to scikit-learn's validation_curve function, stores the resulting training and test score arrays, computes their means and standard deviations, and calls draw() to render the plot. The x-axis can optionally use a logarithmic scale via the logx parameter, which is useful for hyperparameters that span several orders of magnitude (e.g. regularization coefficients).
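The delegation described above can be sketched directly with the wrapped scikit-learn utility. This is a minimal illustration of what fit() computes internally (score matrices, then per-tick means and standard deviations), not the visualizer itself; the iris dataset and SVC gamma range are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import validation_curve
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_range = np.logspace(-6, -1, 5)

# scikit-learn returns one (n_ticks, n_cv_folds) score matrix per curve
train_scores, test_scores = validation_curve(
    SVC(), X, y, param_name="gamma", param_range=param_range, cv=5
)

# the visualizer then aggregates each matrix into a mean curve with a
# one-standard-deviation shaded band per hyperparameter value
train_mean = train_scores.mean(axis=1)
train_std = train_scores.std(axis=1)
test_mean = test_scores.mean(axis=1)
test_std = test_scores.std(axis=1)
```

The shaded regions in the rendered plot correspond to `mean ± std` at each tick, which is why wide bands indicate unstable scores across folds.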
Usage
Use this visualizer when you want to visually tune a single hyperparameter for any scikit-learn estimator that implements fit and predict (or score). It is appropriate for classifiers, regressors, and clusterers with a valid scoring metric.
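Because the underlying utility only requires a fit/score interface, the same diagnostic works for regressors. A brief sketch using the wrapped scikit-learn function with a decision tree regressor; the diabetes dataset, depth range, and cv=3 are assumptions for illustration, and with scoring left unset the estimator's default scorer (R² for regressors) is used.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)
depths = np.arange(1, 6)

# vary max_depth; each column of the result is one CV fold's scores
train_scores, test_scores = validation_curve(
    DecisionTreeRegressor(random_state=0), X, y,
    param_name="max_depth", param_range=depths, cv=3,
)
```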
Code Reference
Source Location
- Repository: yellowbrick
- File: yellowbrick/model_selection/validation_curve.py
- Class Lines: L34-293 (class), L157-171 (__init__), L196-246 (fit)
- Quick Method Lines: L300-431
Signature
class ValidationCurve(ModelVisualizer):
def __init__(
self,
estimator,
param_name,
param_range,
ax=None,
logx=False,
groups=None,
cv=None,
scoring=None,
n_jobs=1,
pre_dispatch="all",
markers='-d',
**kwargs
):
Import
from yellowbrick.model_selection import ValidationCurve
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| estimator | scikit-learn estimator | Yes | An object implementing fit and predict. Cloned for each validation run. |
| param_name | string | Yes | Name of the hyperparameter to vary. |
| param_range | array-like, shape (n_values,) | Yes | The values of the hyperparameter to evaluate. |
| ax | matplotlib.Axes | No | The axes object to plot the figure on. |
| logx | boolean | No | If True, plots x-axis with logarithmic scale. Default: False. |
| groups | array-like, shape (n_samples,) | No | Group labels for samples used in train/test splitting. |
| cv | int, CV generator, or iterable | No | Cross-validation splitting strategy. Default: None (3-fold). |
| scoring | string, callable, or None | No | Scoring metric. Default: None (estimator's default scorer). |
| n_jobs | integer | No | Number of parallel jobs. Default: 1. |
| pre_dispatch | integer or string | No | Number of predispatched jobs. Default: "all". |
| markers | string | No | Matplotlib marker style. Default: '-d'. |
The fit(X, y) method accepts:
| Name | Type | Required | Description |
|---|---|---|---|
| X | array-like, shape (n_samples, n_features) | Yes | Training feature matrix. |
| y | array-like, shape (n_samples,) | No | Target values. None for unsupervised learning. |
Outputs
| Name | Type | Description |
|---|---|---|
| train_scores_ | array, shape (n_ticks, n_cv_folds) | Raw scores on training sets for each param value and fold. |
| train_scores_mean_ | array, shape (n_ticks,) | Mean training score for each hyperparameter value. |
| train_scores_std_ | array, shape (n_ticks,) | Standard deviation of training scores for each hyperparameter value. |
| test_scores_ | array, shape (n_ticks, n_cv_folds) | Raw scores on test sets for each param value and fold. |
| test_scores_mean_ | array, shape (n_ticks,) | Mean cross-validated test score for each hyperparameter value. |
| test_scores_std_ | array, shape (n_ticks,) | Standard deviation of cross-validated test scores for each hyperparameter value. |
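After fitting, the arrays in the table above can be inspected to pick a hyperparameter value programmatically, e.g. by locating the peak of the cross-validation curve. A sketch using illustrative score values shaped like the `test_scores_` attribute (the numbers are invented for the example, not real library output):

```python
import numpy as np

# hypothetical results for 5 gamma values x 3 CV folds, shaped like the
# visualizer's test_scores_ attribute after fit()
param_range = np.logspace(-6, -1, 5)
test_scores = np.array([
    [0.34, 0.31, 0.33],
    [0.55, 0.52, 0.58],
    [0.81, 0.79, 0.84],
    [0.92, 0.90, 0.93],
    [0.88, 0.85, 0.87],
])

# these mirror test_scores_mean_ and test_scores_std_
test_mean = test_scores.mean(axis=1)
test_std = test_scores.std(axis=1)

# the peak of the cross-validation curve suggests the best value
best_gamma = param_range[np.argmax(test_mean)]
```

Note that the scores beyond the peak fall while training scores typically keep rising, which is the overfitting signature the plot is designed to reveal.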
Usage Examples
Basic Usage
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from yellowbrick.model_selection import ValidationCurve

# Load a dataset so the example is self-contained
X_train, y_train = load_iris(return_X_y=True)

# Define the hyperparameter range (log-spaced gamma values)
param_range = np.logspace(-6, -1, 5)

# Create and fit the visualizer; logx=True suits a log-spaced range
viz = ValidationCurve(
    SVC(), param_name="gamma", param_range=param_range,
    logx=True, cv=5, scoring="accuracy",
)
viz.fit(X_train, y_train)
viz.show()
Quick Method
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from yellowbrick.model_selection import validation_curve

# Load a dataset so the example is self-contained
X_train, y_train = load_iris(return_X_y=True)

# One call builds, fits, and shows the plot, returning the visualizer
viz = validation_curve(
    SVC(), X_train, y_train,
    param_name="gamma", param_range=np.logspace(-6, -1, 5), cv=5,
)