Implementation:Mlflow Mlflow User Training Code

Knowledge Sources	MLflow MLflow API
Domains	ML_Ops, Experiment_Tracking
Last Updated	2026-02-13 20:00 GMT

Overview

Concrete pattern for the user-defined model training step that executes within an MLflow run context.

Description

User training code is the practitioner-authored logic that performs the actual machine learning computation within an MLflow experiment tracking workflow. This is not an MLflow API; rather, it is the user's own code that invokes a machine learning framework (scikit-learn, PyTorch, TensorFlow, XGBoost, or any other library) to train a model. The code runs inside an active MLflow run context, consuming the parameters that were logged and producing the trained model object and performance metrics that will be logged in subsequent steps.

MLflow is intentionally agnostic to the training framework. The only requirement is that the training code executes within the scope of an active run (typically inside a with mlflow.start_run(): block) so that subsequent logging calls attach to the intended run and the run is ended automatically when the block exits.

Usage

Place training code after parameter logging and before metric/artifact logging within the active run context. Use any ML framework appropriate for the task. For iterative training (neural networks, boosted trees with staged output), log intermediate metrics at regular intervals to enable live monitoring. The training code should produce a model object and computed metrics that can be passed to MLflow logging functions in the next workflow steps.
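
The interval-logging advice for iterative training can be sketched without any ML framework. This is a minimal illustration under stated assumptions, not an MLflow API: train_linear, log_every, and record_metric are hypothetical names, and the toy gradient-descent loop stands in for a real training loop. Inside an active run, mlflow.log_metric (which accepts a key, a value, and a step) could be passed as the log_metric callback instead of the in-memory recorder.

```python
from typing import Callable, List, Tuple


def train_linear(
    xs: List[float],
    ys: List[float],
    learning_rate: float,
    epochs: int,
    log_metric: Callable[..., None],
    log_every: int = 10,
) -> Tuple[float, float, float]:
    """Fit y ~ w*x + b by batch gradient descent; return (w, b, final_mse)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for epoch in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        mse = sum(e * e for e in errors) / n
        # Log only every `log_every` epochs so monitoring stays cheap
        if epoch % log_every == 0:
            log_metric("train_mse", mse, step=epoch)
        # Gradients of the mean squared error with respect to w and b
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    final_mse = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
    return w, b, final_mse


# Hypothetical stand-in for mlflow.log_metric that records calls in memory
history = []


def record_metric(key, value, step=None):
    history.append((key, value, step))


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
w, b, final_mse = train_linear(
    xs, ys, learning_rate=0.05, epochs=500,
    log_metric=record_metric, log_every=50,
)
```

Passing the logger in as a callback keeps the training function free of any MLflow import, which makes it easy to test on its own; whether that separation is worth it depends on the project.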

Code Reference

Source Location

Repository: N/A (user-defined code)
File: User's training script
Lines: N/A

Signature

# User-defined training -- no fixed signature.
# Common patterns include:

# scikit-learn
model.fit(X_train, y_train)

# PyTorch training loop
for epoch in range(num_epochs):
    for batch in dataloader:
        loss = criterion(model(batch.inputs), batch.targets)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# TensorFlow / Keras
model.fit(X_train, y_train, epochs=num_epochs, validation_data=(X_val, y_val))

Import

# Framework-dependent; examples:
from sklearn.ensemble import RandomForestClassifier
import torch
import tensorflow as tf

I/O Contract

Inputs

Name	Type	Required	Description
Training data	array-like, DataLoader, Dataset, etc.	Yes	The dataset on which the model is trained. Format depends on the ML framework.
Hyperparameters	Various (int, float, str, etc.)	Yes	Configuration values such as learning rate, number of estimators, regularization strength, batch size, and epoch count. These should match the parameters previously logged to the active MLflow run.
Model object	Framework-specific model class	Yes	An initialized (untrained or partially trained) model instance.

Outputs

Name	Type	Description
Trained model	Framework-specific model object	The model with learned parameters, ready for evaluation, serialization, and artifact logging.
Training metrics	float values	Performance measurements (loss, accuracy, F1 score, etc.) computed during or after training, ready for metric logging.
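
One way to make this contract explicit is to wrap the training step in a function that takes the three inputs and returns the two outputs. The sketch below is illustrative only: TrainingResult, run_training, and MeanPredictor are hypothetical names, and the toy model stands in for a real framework estimator.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Sequence


@dataclass
class TrainingResult:
    """Outputs of the training step: the fitted model and its metrics."""
    model: Any
    metrics: Dict[str, float]


class MeanPredictor:
    """Toy model (hypothetical): predicts a shrunken mean of the training targets."""

    def __init__(self, shrinkage: float = 0.0):
        self.shrinkage = shrinkage
        self.mean_ = None

    def fit(self, X: Sequence[Any], y: Sequence[float]) -> "MeanPredictor":
        self.mean_ = (1.0 - self.shrinkage) * sum(y) / len(y)
        return self

    def predict(self, X: Sequence[Any]) -> List[float]:
        return [self.mean_ for _ in X]


def run_training(
    model: Any,
    X: Sequence[Any],
    y: Sequence[float],
    hyperparameters: Dict[str, Any],
) -> TrainingResult:
    # Apply the same hyperparameter values that were (or will be) logged
    for name, value in hyperparameters.items():
        setattr(model, name, value)
    model.fit(X, y)
    # Compute a metric as a plain float so it is ready for metric logging
    predictions = model.predict(X)
    mse = sum((p - t) ** 2 for p, t in zip(predictions, y)) / len(y)
    return TrainingResult(model=model, metrics={"train_mse": float(mse)})


params = {"shrinkage": 0.0}
result = run_training(
    MeanPredictor(), X=[[0], [1], [2]], y=[1.0, 2.0, 3.0], hyperparameters=params
)
```

Inside an active run, params could be passed to mlflow.log_params before the call and result.metrics to mlflow.log_metrics after it, so the logged values and the values actually used come from the same dictionaries.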

Usage Examples

Basic Usage

import mlflow
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

with mlflow.start_run():
    # Log parameters
    n_estimators = 100
    max_depth = 5
    mlflow.log_param("n_estimators", n_estimators)
    mlflow.log_param("max_depth", max_depth)

    # === User training code (this step) ===
    model = RandomForestClassifier(
        n_estimators=n_estimators, max_depth=max_depth, random_state=42
    )
    model.fit(X_train, y_train)
    accuracy = model.score(X_test, y_test)
    # === End training code ===

    # Log results
    mlflow.log_metric("accuracy", accuracy)
    # `name` is the MLflow 3.x keyword; MLflow 2.x releases use `artifact_path` instead
    mlflow.sklearn.log_model(model, name="rf_model")

PyTorch Training Loop

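import mlflow
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

# Define hyperparameters once so the logged values match the values used
learning_rate = 0.001
num_epochs = 10

with mlflow.start_run():
    mlflow.log_param("learning_rate", learning_rate)
    mlflow.log_param("epochs", num_epochs)

    # MyNeuralNetwork (an nn.Module subclass) and train_loader (a DataLoader
    # yielding (inputs, targets) batches) are user-defined and not shown here
    model = MyNeuralNetwork()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    criterion = nn.CrossEntropyLoss()

    # === User training code ===
    for epoch in range(num_epochs):
        running_loss = 0.0
        for batch_inputs, batch_targets in train_loader:
            optimizer.zero_grad()
            outputs = model(batch_inputs)
            loss = criterion(outputs, batch_targets)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()

        # Log the mean batch loss once per epoch, indexed by epoch number
        avg_loss = running_loss / len(train_loader)
        mlflow.log_metric("train_loss", avg_loss, step=epoch)
    # === End training code ===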

Related Pages

Implements Principle

Principle:Mlflow_Mlflow_Training_Execution
