| Property | Value |
| Implementation Name | Sklearn_Pipeline_Train_And_Serialize |
| Type | API Doc |
| Overview | Concrete tool for training and serializing sklearn models provided by the scikit-learn and joblib libraries. |
| Implements Principle | SeldonIO_Seldon_core_Model_Artifact_Preparation |
| Workflow | Model_Deployment |
| Domains | MLOps, Model_Serialization |
| Source | samples/scripts/models/iris/train.py:L1-25 |
| External Dependencies | sklearn, joblib, mlserver_sklearn |
| Last Updated | 2026-02-13 00:00 GMT |
Description
This implementation demonstrates how to train a scikit-learn pipeline and serialize it using joblib for deployment on Seldon Core 2 with MLServer. The training script creates a LogisticRegression classifier wrapped in a Pipeline, fits it on the Iris dataset, and persists the fitted pipeline to a .joblib file. The resulting artifact, combined with a model-settings.json configuration, is ready for upload to a model storage location and deployment via the Seldon Model CRD.
Code Reference
Source: samples/scripts/models/iris/train.py:L1-25
import joblib
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn import datasets


def main():
    clf = LogisticRegression(solver="liblinear", multi_class='ovr')
    p = Pipeline([("clf", clf)])
    p.fit(X, y)

    filename_p = "model.joblib"
    joblib.dump(p, filename_p)


if __name__ == "__main__":
    iris = datasets.load_iris()
    X, y = iris.data, iris.target
    main()
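The script above reads `X` and `y` inside `main()` as module-level globals, which works only because they are assigned before `main()` is called. A minimal sketch of a variant that passes the data explicitly follows. The name `train_and_save` and its parameters are hypothetical and not part of `train.py`. The sketch also relies on scikit-learn's default solver rather than `solver="liblinear", multi_class="ovr"`, because newer scikit-learn releases are understood to deprecate the `multi_class` argument; check this against the installed version.

```python
import os
import tempfile

import joblib
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def train_and_save(X, y, path="model.joblib"):
    """Fit a single-step pipeline on (X, y) and persist it with joblib.

    Hypothetical helper, not part of train.py.
    """
    # Assumption for this sketch: default solver instead of the original
    # solver="liblinear", multi_class="ovr" configuration.
    clf = LogisticRegression(max_iter=1000)
    pipeline = Pipeline([("clf", clf)])
    pipeline.fit(X, y)
    joblib.dump(pipeline, path)
    return pipeline


if __name__ == "__main__":
    iris = datasets.load_iris()
    # Write to a temporary directory so the demo leaves no files behind.
    with tempfile.TemporaryDirectory() as tmp:
        artifact = os.path.join(tmp, "model.joblib")
        fitted = train_and_save(iris.data, iris.target, artifact)
        reloaded = joblib.load(artifact)
        # The reloaded pipeline should reproduce the in-memory predictions.
        print((reloaded.predict(iris.data) == fitted.predict(iris.data)).all())
```

Returning the fitted pipeline makes it easy to compare in-memory and reloaded predictions before the artifact is uploaded.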
model-settings.json:
{
    "name": "iris",
    "implementation": "mlserver_sklearn.SKLearnModel",
    "parameters": {
        "uri": "./model.joblib",
        "version": "v0.1.0"
    }
}
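The settings file can also be generated alongside the artifact instead of being written by hand. The sketch below writes the same fields shown above using only the Python standard library. `write_model_settings` is a hypothetical helper name, and its default values are copied from this page.

```python
import json
import tempfile
from pathlib import Path


def write_model_settings(directory, name="iris", uri="./model.joblib",
                         version="v0.1.0"):
    """Write model-settings.json into `directory` and return the settings dict.

    Hypothetical helper; field names and defaults mirror the file shown above.
    """
    settings = {
        "name": name,
        "implementation": "mlserver_sklearn.SKLearnModel",
        "parameters": {"uri": uri, "version": version},
    }
    target = Path(directory) / "model-settings.json"
    target.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings


if __name__ == "__main__":
    # Demo in a temporary directory so nothing is left on disk.
    with tempfile.TemporaryDirectory() as tmp:
        write_model_settings(tmp)
        print((Path(tmp) / "model-settings.json").read_text(encoding="utf-8"))
```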
Key Parameters
| Parameter | Value | Description |
| solver | "liblinear" | Optimization algorithm for LogisticRegression; suitable for small datasets |
| multi_class | "ovr" | One-vs-rest strategy for multi-class classification |
| implementation | "mlserver_sklearn.SKLearnModel" | MLServer runtime class for scikit-learn models |
| uri | "./model.joblib" | Relative path to the serialized model artifact |
| version | "v0.1.0" | Model version identifier for tracking |
I/O Contract
Inputs
| Input | Type | Description |
| Raw training data | sklearn Iris dataset | X shape [n_samples, 4] (sepal length, sepal width, petal length, petal width), y shape [n_samples] (target class: 0, 1, or 2) |
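The input contract above can be checked before fitting or predicting. This is a minimal sketch, assuming rows arrive as plain Python sequences. `validate_features` and `validate_targets` are hypothetical names, not part of the training script.

```python
from numbers import Real

N_FEATURES = 4  # sepal length, sepal width, petal length, petal width
VALID_CLASSES = {0, 1, 2}


def validate_features(rows):
    """Raise ValueError unless every row holds exactly four real numbers."""
    for i, row in enumerate(rows):
        if len(row) != N_FEATURES:
            raise ValueError(
                f"row {i}: expected {N_FEATURES} features, got {len(row)}"
            )
        # bool is a subclass of int, so it is rejected explicitly.
        if not all(isinstance(v, Real) and not isinstance(v, bool) for v in row):
            raise ValueError(f"row {i}: all features must be real numbers")


def validate_targets(targets, n_samples):
    """Raise ValueError unless targets match n_samples and use classes 0, 1, 2."""
    if len(targets) != n_samples:
        raise ValueError(f"expected {n_samples} targets, got {len(targets)}")
    unknown = set(targets) - VALID_CLASSES
    if unknown:
        raise ValueError(f"unexpected target classes: {sorted(unknown)}")
```

Calling both helpers before `fit` surfaces a malformed row or an unexpected label as a clear error instead of a shape mismatch inside scikit-learn.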
Outputs
| Output | Type | Description |
| model.joblib | Serialized artifact | Joblib-serialized sklearn Pipeline containing the fitted LogisticRegression classifier |
| model-settings.json | Configuration file | MLServer model settings specifying runtime implementation and artifact URI |
Usage Examples
Training and Serializing the Model
# Run the training script to produce model.joblib
python samples/scripts/models/iris/train.py
# Verify the artifact was created
ls -la model.joblib
Uploading to Remote Storage
# Upload model artifact and settings to GCS
gsutil cp model.joblib gs://seldon-models/mlserver/iris/
gsutil cp model-settings.json gs://seldon-models/mlserver/iris/
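Before uploading, it can help to confirm that the directory holds the settings file and that its `uri` points at an existing artifact. The sketch below does this with the standard library. `check_model_dir` is a hypothetical helper, and resolving `uri` relative to the settings file is an assumption made for this local check.

```python
import json
import tempfile
from pathlib import Path


def check_model_dir(directory):
    """Return a list of problems found in a model directory (empty if none)."""
    root = Path(directory)
    settings_path = root / "model-settings.json"
    if not settings_path.is_file():
        return ["model-settings.json is missing"]

    settings = json.loads(settings_path.read_text(encoding="utf-8"))
    problems = []
    uri = settings.get("parameters", {}).get("uri")
    if uri is None:
        problems.append("parameters.uri is not set")
    elif not (root / uri).is_file():
        # Assumption: uri is resolved relative to the settings file.
        problems.append(f"artifact not found: {uri}")
    return problems


if __name__ == "__main__":
    # An empty temporary directory reports the missing settings file.
    with tempfile.TemporaryDirectory() as tmp:
        print(check_model_dir(tmp))
```

An empty list means both files are in place and consistent with each other, so the two `gsutil cp` commands above can be run from that directory.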
Loading the Serialized Model Locally
import joblib
# Load the serialized pipeline
pipeline = joblib.load("model.joblib")
# Verify predictions work
predictions = pipeline.predict([[5.1, 3.5, 1.4, 0.2]])
print(predictions) # [0]