Workflow:DistrictDataLabs Yellowbrick Model Selection and Tuning
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Model_Selection, Hyperparameter_Tuning |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
End-to-end process for visually comparing machine learning models, tuning hyperparameters, and selecting optimal feature subsets using Yellowbrick's model selection visualizers.
Description
This workflow covers the model selection and hyperparameter tuning phase of the machine learning pipeline. It uses Yellowbrick's model selection visualizers to diagnose bias-variance tradeoffs, evaluate how model performance scales with training data, assess feature importance, and perform recursive feature elimination. The process helps data scientists make informed decisions about which model to deploy and which features to retain.
Key outputs:
- Validation curve showing performance across hyperparameter values
- Learning curve showing performance as training data increases
- Feature importances bar chart ranking features by contribution
- RFECV plot showing performance across feature subset sizes
- Cross-validation score comparison across folds
- Feature dropping curve showing degradation as features are removed
Usage
Execute this workflow after initial feature analysis and model prototyping, when you need to decide between multiple model candidates, tune hyperparameters, or reduce the feature set. This is the final step before committing to a model for production deployment.
Execution Steps
Step 1: Prepare Candidate Models
Identify a set of candidate models from appropriate scikit-learn model families. Apply the required preprocessing (encoding, scaling) to the dataset, hold out a test set, and define the cross-validation folds that every candidate will be scored on.
Key considerations:
- Select models from different families (linear, tree-based, SVM, ensemble) for diversity
- Use consistent preprocessing across all candidates
- Establish a consistent cross-validation strategy (e.g., StratifiedKFold for classification)
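The setup described above can be sketched as follows. This is a minimal illustration, assuming a binary classification problem; the breast cancer dataset, the three estimators, and the names `candidates` and `cv` are placeholders chosen for the example, not part of the workflow itself.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative dataset (assumption); substitute your own features and target.
X, y = load_breast_cancer(return_X_y=True)

# Hold out a test set that stays untouched until the final step.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# One shared CV strategy so every candidate is scored on the same folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Identical preprocessing for every candidate; the scaler sits inside the
# pipeline so it is refit on each training fold.
candidates = {
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "svc": make_pipeline(StandardScaler(), SVC()),
    "forest": make_pipeline(
        StandardScaler(), RandomForestClassifier(n_estimators=100, random_state=42)
    ),
}
```

Keeping the scaler inside each pipeline, rather than scaling the whole dataset up front, avoids leaking statistics from validation folds into training.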
Step 2: Evaluate Hyperparameter Sensitivity
Use the ValidationCurve visualizer to assess how a single hyperparameter affects model performance. The visualizer trains the model across a range of parameter values using cross-validation and plots training and validation scores.
What to look for:
- A wide gap between a high training score and a lower validation score indicates overfitting
- Curves that converge and flatten suggest that further changes to the parameter will not improve the model
- The optimal parameter value lies where the validation score peaks
- Wide shaded bands around a curve (one standard deviation of the fold scores) indicate high variance across CV folds
Step 3: Assess Data Sufficiency with Learning Curves
Use the LearningCurve visualizer to evaluate how model performance scales with training set size. This reveals whether the model would benefit from more data or if performance has plateaued.
What to look for:
- Converging train and validation curves indicate sufficient data
- A large gap that persists as the training set grows suggests overfitting (high variance)
- If the validation score is still rising at the largest training size, more data may help
- Flat learning curves with low scores indicate underfitting
Step 4: Rank Feature Importances
Use the FeatureImportances visualizer to display the relative importance of each feature as determined by the model (using coef_ or feature_importances_ attributes). This helps identify which features drive predictions.
Key considerations:
- Works with models that expose coef_ (linear models) or feature_importances_ (tree-based)
- Features are sorted by importance and drawn as a horizontal bar chart
- The relative option (enabled by default) expresses each importance as a percentage of the strongest feature, so the top feature reads 100%
- Low-importance features may be candidates for removal
Step 5: Perform Recursive Feature Elimination
Use the RFECV visualizer to systematically remove features and evaluate model performance at each subset size using cross-validation. This identifies the minimum feature set that maintains model quality.
What to look for:
- The optimal number of features where cross-validated score peaks
- Steep performance drops when removing certain features indicate critical features
- Plateau regions suggest redundant features that can be safely removed
Step 6: Compare Cross-validation Scores
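Use the CVScores visualizer to compare cross-validated performance across model candidates. Each visualizer wraps a single model and draws a bar chart with one bar per fold and a horizontal line at the mean score, so fit one visualizer per candidate with the same folds and scoring metric and compare the charts side by side.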
Key considerations:
- Compare mean scores and fold-to-fold variance across models
- Models with a high mean score but high variance across folds may be unreliable
- Low-variance models with competitive scores are preferred for production
Step 7: Select and Finalize Model
Based on the combined evidence from validation curves, learning curves, feature importances, RFECV, and cross-validation scores, select the final model and feature set. Retrain on the full training set and validate on the held-out test set.
Key considerations:
- Balance model complexity against performance gains
- Consider interpretability requirements for the deployment context
- Document the rationale for model and feature selection decisions
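- Use Yellowbrick's quick methods (one-line functions such as validation_curve, learning_curve, cv_scores, rfecv, and feature_importances) for rapid final comparisons if needed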
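The final retrain-and-validate step might look like the sketch below. It assumes, purely for illustration, that a scaled logistic regression was the selected model; the dataset, the split, and the `test_f1` variable are placeholders.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Same held-out split as in the preparation step (same seed, same stratification).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hypothetical winner of the selection process.
final_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Retrain on the full training set, then score once on the held-out test set.
final_model.fit(X_train, y_train)
test_f1 = f1_score(y_test, final_model.predict(X_test))
print(f"held-out F1: {test_f1:.3f}")
```

Scoring the test set only once, after all selection decisions are made, keeps the reported figure an honest estimate of generalization.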