Implementation:Scikit learn Scikit learn StackingClassifier Init
Overview
Concrete tool, provided by scikit-learn, for creating a stacking ensemble with a meta-learner. StackingClassifier stacks the outputs of the individual base estimators and uses a final classifier to compute the prediction. The base estimators are fitted on the full training data, while the final estimator is trained on cross-validated predictions of the base estimators (generated via cross_val_predict).
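The two-stage procedure described above can be sketched by hand. This is a minimal illustration of the idea, not the library's internal code: the choice of base estimators, the 5-fold setting, and the variable names (base_estimators, meta_features, fitted_base) are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = load_iris(return_X_y=True)

# Illustrative base estimators (any classifiers with predict_proba would do).
base_estimators = [
    RandomForestClassifier(n_estimators=10, random_state=0),
    LogisticRegression(max_iter=1000),
]

# Step 1: out-of-fold class probabilities from each base estimator.
# Every row is predicted by a model that did not see that row during training.
meta_features = np.hstack([
    cross_val_predict(est, X, y, cv=5, method="predict_proba")
    for est in base_estimators
])

# Step 2: the final estimator is trained on the stacked predictions.
final_estimator = LogisticRegression(max_iter=1000).fit(meta_features, y)

# Step 3: the base estimators are refitted on the full training data,
# so that they can produce meta-features for unseen samples at predict time.
fitted_base = [est.fit(X, y) for est in base_estimators]
```

Training the final estimator on out-of-fold predictions keeps it from learning on outputs that the base estimators produced for samples they had already memorized.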
Constructor Signature
from sklearn.ensemble import StackingClassifier

StackingClassifier(
    estimators,
    final_estimator=None,
    *,
    cv=None,
    stack_method="auto",
    n_jobs=None,
    passthrough=False,
    verbose=0,
)
Parameters
- estimators (list of (str, estimator) tuples) -- Base estimators to be stacked. Each element is a tuple of a name string and an estimator instance. An estimator can be set to "drop" using set_params. Each estimator is generally expected to be a classifier, though regressors can be passed for use cases such as ordinal regression.
- final_estimator (estimator, default=None) -- A classifier used to combine the base estimators. The default is LogisticRegression.
- cv (int, cross-validation generator, iterable, or "prefit", default=None) -- Determines the cross-validation splitting strategy used in cross_val_predict to train the final estimator. Possible inputs:
- None: default 5-fold cross-validation.
- integer: number of folds in a (Stratified) KFold.
- An object to be used as a cross-validation generator.
- An iterable yielding (train, test) splits.
- "prefit": assumes the base estimators are already fitted; they will not be refitted. The final estimator is trained on the base estimators' predictions on the full training set, which risks overfitting if the base estimators were fitted on that same data.
- stack_method ({"auto", "predict_proba", "decision_function", "predict"}, default="auto") -- The method called on each base estimator to generate meta-features. If "auto", tries predict_proba, then decision_function, then predict, in that order.
- n_jobs (int, default=None) -- Number of jobs to run in parallel for fit of all base estimators. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
- passthrough (bool, default=False) -- When True, the final estimator is trained on both the base estimators' predictions and the original training data. When False, only the predictions are used as meta-features.
- verbose (int, default=0) -- Verbosity level.
Fitted Attributes
- classes_ -- Class labels (ndarray of shape (n_classes,) or list of ndarray for multilabel).
- estimators_ -- The elements of the estimators parameter, having been fitted on the training data. Estimators set to "drop" are excluded. When cv="prefit", these are set to the provided estimators without refitting.
- named_estimators_ -- A Bunch object allowing access to fitted sub-estimators by name.
- n_features_in_ -- Number of features seen during fit (only defined if the underlying estimators expose this attribute).
- feature_names_in_ -- Names of features seen during fit (only defined if the underlying estimators expose this attribute).
- final_estimator_ -- The classifier fit on the output of estimators_, responsible for final predictions.
- stack_method_ -- The method used by each base estimator to generate meta-features (list of str).
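As a brief illustration of the attributes listed above, the sketch below fits a small stack and reads them back. The estimator names "rf" and "lr" and the dataset are assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

clf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=10, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
).fit(X, y)

print(clf.classes_)            # class labels seen during fit
print(clf.stack_method_)       # one method name per retained base estimator
print(clf.n_features_in_)      # number of input features

# Fitted base estimators are available by position and by name.
rf_fitted = clf.named_estimators_["rf"]
print(rf_fitted is clf.estimators_[0])

# The final estimator was trained on the stacked predictions, so its
# coefficient matrix has one column per meta-feature.
print(clf.final_estimator_.coef_.shape)
```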
Example Usage
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Base estimators: each entry is a (name, estimator) tuple.
estimators = [
    ("rf", RandomForestClassifier(n_estimators=10, random_state=42)),
    ("svr", make_pipeline(StandardScaler(), LinearSVC(random_state=42))),
]

# The final estimator combines the base estimators' predictions.
clf = StackingClassifier(
    estimators=estimators,
    final_estimator=LogisticRegression(),
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42
)

# Fit the stack and report mean accuracy on the held-out split.
clf.fit(X_train, y_train).score(X_test, y_test)
# 0.9...
Source Location
sklearn/ensemble/_stacking.py, class StackingClassifier (lines 422-839).