Implementation:SeldonIO Seldon core Sklearn Pipeline Train And Serialize
| Property | Value |
|---|---|
| Implementation Name | Sklearn_Pipeline_Train_And_Serialize |
| Type | API Doc |
| Overview | Concrete tool for training and serializing sklearn models provided by the scikit-learn and joblib libraries. |
| Implements Principle | SeldonIO_Seldon_core_Model_Artifact_Preparation |
| Workflow | Model_Deployment |
| Domains | MLOps, Model_Serialization |
| Source | samples/scripts/models/iris/train.py:L1-25 |
| External Dependencies | sklearn, joblib, mlserver_sklearn |
| Last Updated | 2026-02-13 00:00 GMT |
Description
This implementation demonstrates how to train a scikit-learn pipeline and serialize it using joblib for deployment on Seldon Core 2 with MLServer. The training script creates a LogisticRegression classifier wrapped in a Pipeline, fits it on the Iris dataset, and persists the fitted pipeline to a .joblib file. The resulting artifact, combined with a model-settings.json configuration, is ready for upload to a model storage location and deployment via the Seldon Model CRD.
Code Reference
Source: samples/scripts/models/iris/train.py:L1-25
import joblib
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn import datasets


def main():
    clf = LogisticRegression(solver="liblinear", multi_class='ovr')
    p = Pipeline([("clf", clf)])
    # X and y are module-level names set in the __main__ block below
    p.fit(X, y)

    filename_p = "model.joblib"
    joblib.dump(p, filename_p)


if __name__ == "__main__":
    iris = datasets.load_iris()
    X, y = iris.data, iris.target
    main()
model-settings.json:
{
    "name": "iris",
    "implementation": "mlserver_sklearn.SKLearnModel",
    "parameters": {
        "uri": "./model.joblib",
        "version": "v0.1.0"
    }
}
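The settings file can also be generated from Python so that it stays in step with the training script. The following is a minimal sketch, not part of the source repository: the helper name `write_model_settings` and its arguments are hypothetical, and the field values simply mirror the JSON shown above.

```python
import json
from pathlib import Path


def write_model_settings(target_dir, name="iris", uri="./model.joblib", version="v0.1.0"):
    """Write a model-settings.json into target_dir and return its path.

    Hypothetical helper for illustration; the keys mirror the example file above.
    """
    settings = {
        "name": name,
        "implementation": "mlserver_sklearn.SKLearnModel",
        "parameters": {"uri": uri, "version": version},
    }
    path = Path(target_dir) / "model-settings.json"
    path.write_text(json.dumps(settings, indent=4) + "\n", encoding="utf-8")
    return path
```

Calling `write_model_settings(".")` next to `model.joblib` would produce a file equivalent to the one listed above.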
Key Parameters
| Parameter | Value | Description |
|---|---|---|
| solver | "liblinear" | Optimization algorithm for LogisticRegression; suitable for small datasets |
| multi_class | "ovr" | One-vs-rest strategy for multi-class classification |
| implementation | "mlserver_sklearn.SKLearnModel" | MLServer runtime class for scikit-learn models |
| uri | "./model.joblib" | Relative path to the serialized model artifact |
| version | "v0.1.0" | Model version identifier for tracking |
I/O Contract
Inputs
| Input | Type | Description |
|---|---|---|
| Raw training data | sklearn Iris dataset | X shape [n_samples, 4] (sepal length, sepal width, petal length, petal width), y shape [n_samples] (target class: 0, 1, or 2) |
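The input contract above can be checked directly before training. This short sketch only inspects the dataset bundled with scikit-learn; the variable names are illustrative.

```python
from sklearn import datasets

# Load the bundled Iris dataset, as the training script does
iris = datasets.load_iris()
X, y = iris.data, iris.target

# X is [n_samples, 4]; y is [n_samples] with integer class labels
n_samples, n_features = X.shape
classes = sorted(set(y.tolist()))
print(X.shape, y.shape, classes)
```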
Outputs
| Output | Type | Description |
|---|---|---|
| model.joblib | Serialized artifact | Joblib-serialized sklearn Pipeline containing the fitted LogisticRegression classifier |
| model-settings.json | Configuration file | MLServer model settings specifying runtime implementation and artifact URI |
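Because the two outputs are deployed together, it can help to confirm that the settings file points at an artifact that actually exists before uploading. The sketch below is one way to do that with the standard library only. The function name `missing_artifacts` is hypothetical, and it assumes a relative `uri` is resolved against the directory that holds `model-settings.json`.

```python
import json
from pathlib import Path


def missing_artifacts(model_dir):
    """Return a list of problems found in model_dir; an empty list means it looks complete."""
    model_dir = Path(model_dir)
    settings_path = model_dir / "model-settings.json"
    if not settings_path.is_file():
        return ["model-settings.json not found"]

    settings = json.loads(settings_path.read_text(encoding="utf-8"))
    uri = settings.get("parameters", {}).get("uri")
    if not uri:
        return ["parameters.uri is not set"]

    # Assumption: a relative uri is resolved against the settings file's directory
    if not (model_dir / uri).is_file():
        return [f"artifact {uri} not found"]
    return []
```

An empty result from `missing_artifacts(".")` suggests the directory is ready to copy to model storage; it does not validate that the artifact itself loads.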
Usage Examples
Training and Serializing the Model
# Run the training script to produce model.joblib
python samples/scripts/models/iris/train.py
# Verify the artifact was created
ls -la model.joblib
Uploading to Remote Storage
# Upload model artifact and settings to GCS
gsutil cp model.joblib gs://seldon-models/mlserver/iris/
gsutil cp model-settings.json gs://seldon-models/mlserver/iris/
Loading the Serialized Model Locally
import joblib
# Load the serialized pipeline
pipeline = joblib.load("model.joblib")
# Verify predictions work
predictions = pipeline.predict([[5.1, 3.5, 1.4, 0.2]])
print(predictions) # [0]
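A slightly stronger local check is to confirm that a reloaded pipeline predicts exactly what the in-memory one does. The following is a sketch under stated assumptions rather than part of the source script: `roundtrip_matches` is a hypothetical helper, and the classifier uses scikit-learn's default solver purely for illustration instead of the `liblinear` and `ovr` settings configured in train.py.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def roundtrip_matches(pipeline, X):
    """Dump pipeline to a temporary .joblib file, reload it, and compare predictions."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "model.joblib"
        joblib.dump(pipeline, path)
        reloaded = joblib.load(path)
    return bool(np.array_equal(pipeline.predict(X), reloaded.predict(X)))


iris = datasets.load_iris()
# Illustrative model only: default solver, not the liblinear/ovr configuration from train.py
fitted = Pipeline([("clf", LogisticRegression(max_iter=1000))]).fit(iris.data, iris.target)
print(roundtrip_matches(fitted, iris.data))
```

Note that joblib artifacts are generally expected to be loaded with library versions compatible with those used for training, so a check like this is most meaningful when run in the serving environment.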
Knowledge Sources
- Repository: https://github.com/SeldonIO/seldon-core
- Documentation: https://mlserver.readthedocs.io
Related Pages
- SeldonIO_Seldon_core_Sklearn_Pipeline_Train_And_Serialize implements SeldonIO_Seldon_core_Model_Artifact_Preparation
- SeldonIO_Seldon_core_Seldon_Model_CRD consumes output of SeldonIO_Seldon_core_Sklearn_Pipeline_Train_And_Serialize
- SeldonIO_Seldon_core_Model_Resource_Definition references artifacts from SeldonIO_Seldon_core_Sklearn_Pipeline_Train_And_Serialize
- Environment:SeldonIO_Seldon_core_Docker_Compose_Local_Environment
- Environment:SeldonIO_Seldon_core_Python_ML_Dependencies_Environment