Implementation:Mlflow Mlflow User Training Code
| Knowledge Sources | |
|---|---|
| Domains | ML_Ops, Experiment_Tracking |
| Last Updated | 2026-02-13 20:00 GMT |
Overview
Concrete pattern for the user-defined model training step that executes within an MLflow run context.
Description
User training code is the practitioner-authored logic that performs the actual machine learning computation within an MLflow experiment tracking workflow. This is not an MLflow API; rather, it is the user's own code that invokes a machine learning framework (scikit-learn, PyTorch, TensorFlow, XGBoost, or any other library) to train a model. The code runs inside an active MLflow run context, consuming the parameters that were logged and producing the trained model object and performance metrics that will be logged in subsequent steps.
MLflow is intentionally agnostic to the training framework. The only requirement is that the training code executes within the scope of an active run (typically inside a `with mlflow.start_run():` block) so that subsequent logging calls have a target run.
Usage
Place training code after parameter logging and before metric/artifact logging within the active run context. Use any ML framework appropriate for the task. For iterative training (neural networks, boosted trees with staged output), log intermediate metrics at regular intervals to enable live monitoring. The training code should produce a model object and computed metrics that can be passed to MLflow logging functions in the next workflow steps.
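The advice on logging intermediate metrics at regular intervals can be sketched as follows. This is a minimal, framework-free illustration rather than a prescribed pattern: `fit_line`, `log_every`, and `record` are hypothetical names introduced here, and the logger is passed in as a callable so the sketch runs without a tracking server.

```python
from typing import Callable, List, Tuple


def fit_line(
    xs: List[float],
    ys: List[float],
    learning_rate: float,
    num_epochs: int,
    log_every: int,
    log_metric: Callable[..., None],
) -> Tuple[float, float]:
    """Fit y = w * x by gradient descent on mean squared error (toy example)."""
    w = 0.0
    loss = 0.0
    n = len(xs)
    for epoch in range(num_epochs):
        # Loss and gradient are measured before this epoch's weight update.
        loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
        # Log every `log_every` epochs, and always on the final epoch.
        if epoch % log_every == 0 or epoch == num_epochs - 1:
            log_metric("train_loss", loss, step=epoch)
    return w, loss


# Hypothetical in-memory logger standing in for the real logging call.
history = []


def record(key, value, step=None):
    history.append((step, key, value))


w, final_loss = fit_line(
    [1.0, 2.0, 3.0], [2.0, 4.0, 6.0],
    learning_rate=0.05, num_epochs=50, log_every=10, log_metric=record,
)
```

Inside a `with mlflow.start_run():` block, `mlflow.log_metric` could be passed as `log_metric` in place of `record`, since it accepts a key, a value, and a `step` keyword argument.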
Code Reference
Source Location
- Repository: N/A (user-defined code)
- File: User's training script
- Lines: N/A
Signature
# User-defined training -- no fixed signature.
# Common patterns include:

# scikit-learn
model.fit(X_train, y_train)

# PyTorch training loop
for epoch in range(num_epochs):
    for batch in dataloader:
        loss = criterion(model(batch.inputs), batch.targets)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# TensorFlow / Keras
model.fit(X_train, y_train, epochs=num_epochs, validation_data=(X_val, y_val))
Import
# Framework-dependent; examples:
from sklearn.ensemble import RandomForestClassifier
import torch
import tensorflow as tf
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| Training data | array-like, DataLoader, Dataset, etc. | Yes | The dataset on which the model is trained. Format depends on the ML framework. |
| Hyperparameters | Various (int, float, str, etc.) | Yes | Configuration values such as learning rate, number of estimators, regularization strength, batch size, and epoch count. These should match the parameters previously logged to the active MLflow run. |
| Model object | Framework-specific model class | Yes | An initialized (untrained or partially trained) model instance. |
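
One way to keep the hyperparameters used for training identical to those logged is to feed both steps from a single mapping. The sketch below assumes this approach; `ToyModel` and `log_and_build` are hypothetical names, and the logging function is injected so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Mapping


@dataclass
class ToyModel:
    """Hypothetical stand-in for a framework model class."""
    n_estimators: int
    max_depth: int


def log_and_build(
    params: Mapping[str, Any],
    log_params: Callable[[Dict[str, Any]], None],
    model_cls: Callable[..., Any],
) -> Any:
    """Log the hyperparameters, then build the model from the same mapping."""
    frozen = dict(params)  # one copy feeds both logging and construction
    log_params(frozen)
    return model_cls(**frozen)


# In-memory dict standing in for the tracking backend.
logged: Dict[str, Any] = {}
model = log_and_build({"n_estimators": 100, "max_depth": 5}, logged.update, ToyModel)
```

Within an active run, `mlflow.log_params` could be passed as `log_params` and a real estimator class as `model_cls`, provided every key in the mapping is a valid constructor argument for that class.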
Outputs
| Name | Type | Description |
|---|---|---|
| Trained model | Framework-specific model object | The model with learned parameters, ready for evaluation, serialization, and artifact logging. |
| Training metrics | float values | Performance measurements (loss, accuracy, F1 score, etc.) computed during or after training, ready for metric logging. |
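
The input and output tables above can be read as an informal function contract: data, hyperparameters, and an initialized model go in; a trained model and a dictionary of float metrics come out. The following is a sketch of that reading, not a required interface. `TrainingResult`, `MeanRegressor`, and `train` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping, Sequence


@dataclass
class TrainingResult:
    """Bundle of the two outputs described in the I/O contract."""
    model: Any
    metrics: Dict[str, float] = field(default_factory=dict)


@dataclass
class MeanRegressor:
    """Hypothetical toy model: predicts a (optionally shrunken) mean of the targets."""
    shrinkage: float = 0.0
    mean_: float = 0.0

    def fit(self, targets: Sequence[float]) -> "MeanRegressor":
        self.mean_ = (1.0 - self.shrinkage) * sum(targets) / len(targets)
        return self


def train(
    model: MeanRegressor,
    targets: Sequence[float],
    hyperparameters: Mapping[str, Any],
) -> TrainingResult:
    """Apply hyperparameters, fit the model, and compute a training metric."""
    model.shrinkage = hyperparameters["shrinkage"]
    model.fit(targets)
    mse = sum((t - model.mean_) ** 2 for t in targets) / len(targets)
    return TrainingResult(model=model, metrics={"train_mse": mse})


result = train(MeanRegressor(), [1.0, 2.0, 3.0], {"shrinkage": 0.0})
```

Returning the metrics as a plain dictionary of floats means they could be handed to `mlflow.log_metrics(result.metrics)` in the next workflow step, while `result.model` goes to the framework-specific model logging call.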
Usage Examples
Basic Usage
import mlflow
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

with mlflow.start_run():
    # Log parameters
    n_estimators = 100
    max_depth = 5
    mlflow.log_param("n_estimators", n_estimators)
    mlflow.log_param("max_depth", max_depth)

    # === User training code (this step) ===
    model = RandomForestClassifier(
        n_estimators=n_estimators, max_depth=max_depth, random_state=42
    )
    model.fit(X_train, y_train)
    accuracy = model.score(X_test, y_test)
    # === End training code ===

    # Log results
    mlflow.log_metric("accuracy", accuracy)
    mlflow.sklearn.log_model(model, name="rf_model")
PyTorch Training Loop
import mlflow
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

# MyNeuralNetwork (an nn.Module subclass) and train_loader (a DataLoader
# yielding (inputs, targets) batches) are assumed to be defined elsewhere.
learning_rate = 0.001
num_epochs = 10

with mlflow.start_run():
    # Log the same values that the training code uses below
    mlflow.log_param("learning_rate", learning_rate)
    mlflow.log_param("epochs", num_epochs)

    model = MyNeuralNetwork()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    criterion = nn.CrossEntropyLoss()

    # === User training code ===
    for epoch in range(num_epochs):
        running_loss = 0.0
        for batch_inputs, batch_targets in train_loader:
            optimizer.zero_grad()
            outputs = model(batch_inputs)
            loss = criterion(outputs, batch_targets)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        # Log the mean batch loss once per epoch
        avg_loss = running_loss / len(train_loader)
        mlflow.log_metric("train_loss", avg_loss, step=epoch)
    # === End training code ===