Implementation:Scikit learn Scikit learn ModelSelectionModule
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Model Selection |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool for cross-validation, hyperparameter tuning, and model selection, provided by scikit-learn.
Description
The sklearn.model_selection module aggregates tools for model selection: cross-validation splitters (KFold, StratifiedKFold, GroupKFold, TimeSeriesSplit, LeaveOneOut, ShuffleSplit), hyperparameter search (GridSearchCV, RandomizedSearchCV), validation utilities (cross_val_score, cross_validate, learning_curve, validation_curve), train/test splitting (train_test_split), and decision-threshold classifiers (FixedThresholdClassifier, TunedThresholdClassifierCV).
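As a minimal sketch of how the splitters behave, the snippet below compares KFold and StratifiedKFold on a small imbalanced toy dataset. The data, fold count, and variable names are assumptions made for illustration only.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Toy data (illustrative): 10 samples, 8 of class 0 and 2 of class 1
X = np.arange(20).reshape(10, 2)
y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])

# KFold ignores the labels; every sample lands in exactly one test fold
kf = KFold(n_splits=2, shuffle=True, random_state=0)
kf_test_folds = [test_idx for _, test_idx in kf.split(X)]

# StratifiedKFold uses the labels to keep the class ratio similar in each fold
skf = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)
skf_minority_counts = [int(y[test_idx].sum()) for _, test_idx in skf.split(X, y)]

print(kf_test_folds)        # two disjoint arrays of test indices
print(skf_minority_counts)  # number of class-1 samples in each test fold
```

Each call to split() yields a (train_indices, test_indices) pair, so the same loop works with any of the splitters listed above; GroupKFold additionally expects a groups argument.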
Usage
Use this module for splitting data into train/test sets, performing cross-validation, tuning hyperparameters, and evaluating model performance across different configurations.
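The workflow described above (split, cross-validate, tune, evaluate) might look like the following sketch. The parameter grid and the choice of SVC are assumptions for illustration, not a recommended configuration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hold out a stratified 20% test set before any tuning takes place
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Hypothetical grid, chosen only for illustration
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# Every grid point is scored with 5-fold cross-validation on the training set
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_)

# With the default refit=True, score() uses the refit best_estimator_
test_accuracy = search.score(X_test, y_test)
print(f"Held-out accuracy: {test_accuracy:.2f}")
```

Keeping the test set out of the search means the final accuracy is not biased by the hyperparameter selection itself.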
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/model_selection/__init__.py
Signature
# Module-level imports (selected):
from sklearn.model_selection._split import KFold, StratifiedKFold, train_test_split
from sklearn.model_selection._search import GridSearchCV, RandomizedSearchCV
from sklearn.model_selection._validation import cross_val_score, cross_validate
Import
from sklearn.model_selection import train_test_split, GridSearchCV, cross_val_score
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| (varies) | N/A | N/A | Each function/class has its own parameters; see individual documentation |
Outputs
| Name | Type | Description |
|---|---|---|
| splits | iterator of (train, test) tuples | Train/test index arrays yielded by a splitter's split() method |
| scores | ndarray | Cross-validation scores from cross_val_score, one per fold |
| best_estimator_ | estimator | Estimator refit with the best parameters found by GridSearchCV/RandomizedSearchCV (available unless refit=False) |
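To make the output shapes in the table concrete, here is a small sketch that inspects what a splitter, cross_val_score, and cross_validate return. The estimator (LogisticRegression) and the scoring names are illustrative choices, not part of the original page.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score, cross_validate

X, y = load_iris(return_X_y=True)  # 150 samples
clf = LogisticRegression(max_iter=1000)
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# A splitter yields (train_indices, test_indices) tuples of integer arrays
train_idx, test_idx = next(cv.split(X))

# cross_val_score returns a 1-D array with one score per fold
scores = cross_val_score(clf, X, y, cv=cv)

# cross_validate returns a dict of per-fold arrays, including timings
results = cross_validate(clf, X, y, cv=cv, scoring=["accuracy", "f1_macro"])

print(train_idx.shape, test_idx.shape)
print(scores.shape)
print(sorted(results))
```

When several metrics are requested, cross_validate prefixes each metric name with test_ in the returned dictionary.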
Usage Examples
Basic Usage
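from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC
# Load the iris data as a feature matrix X and a label vector y
X, y = load_iris(return_X_y=True)
# Hold out 20% of the samples as a final test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = SVC(kernel="linear")
# 5-fold cross-validation on the training portion only
scores = cross_val_score(clf, X_train, y_train, cv=5)
print(f"CV accuracy: {scores.mean():.2f} (+/- {scores.std():.2f})")
# Fit on the full training set, then score on the held-out test set
clf.fit(X_train, y_train)
print(f"Test accuracy: {clf.score(X_test, y_test):.2f}")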
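Beyond a single cross-validation run, validation_curve can show how one hyperparameter affects training and validation scores. The sketch below assumes an arbitrary range of C values for a linear SVC; the range is illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import validation_curve
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hypothetical range of C values, picked only for illustration
param_range = np.logspace(-2, 2, 5)

# Both score arrays have shape (len(param_range), n_folds)
train_scores, test_scores = validation_curve(
    SVC(kernel="linear"), X, y, param_name="C", param_range=param_range, cv=5
)

# Average over folds to get one train/validation score per C value
for C, tr, te in zip(param_range, train_scores.mean(axis=1), test_scores.mean(axis=1)):
    print(f"C={C:g}: train={tr:.2f}, validation={te:.2f}")
```

A widening gap between the training and validation scores as C grows would suggest overfitting, while low scores on both would suggest underfitting.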