Implementation:Mlflow Mlflow User Training Code

From Leeroopedia
Knowledge Sources
Domains ML_Ops, Experiment_Tracking
Last Updated 2026-02-13 20:00 GMT

Overview

Concrete pattern for the user-defined model training step that executes within an MLflow run context.

Description

User training code is the practitioner-authored logic that performs the actual machine learning computation within an MLflow experiment tracking workflow. This is not an MLflow API; rather, it is the user's own code that invokes a machine learning framework (scikit-learn, PyTorch, TensorFlow, XGBoost, or any other library) to train a model. The code runs inside an active MLflow run context, consuming the parameters that were logged and producing the trained model object and performance metrics that will be logged in subsequent steps.

MLflow is intentionally agnostic to the training framework. The only requirement is that the training code executes within the scope of an active run (typically inside a with mlflow.start_run() block) so that subsequent logging calls have a target run.

Usage

Place training code after parameter logging and before final metric and artifact logging within the active run context. Use any ML framework appropriate for the task. For iterative training (neural networks, boosted trees with staged output), log intermediate metrics from inside the training loop at regular intervals, such as once per epoch, to enable live monitoring. The training code should produce a model object and computed metrics that can be passed to MLflow logging functions in the next workflow steps.
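The interval-logging advice above can be sketched as follows. This is only an illustration under stated assumptions, not an MLflow API: `train_linear`, `MetricLogger`, and the `log_every` parameter are hypothetical names, and the model is a one-variable least-squares fit written in pure Python so the block runs without any ML framework. The logging function is passed in as a callback so the training code stays independent of where the metrics end up.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical callback type: (metric name, value, step) -> None.
MetricLogger = Callable[[str, float, int], None]


def train_linear(
    xs: Sequence[float],
    ys: Sequence[float],
    learning_rate: float,
    num_epochs: int,
    log_metric: MetricLogger,
    log_every: int = 10,
) -> Tuple[Tuple[float, float], float]:
    """Fit y = w * x + b by gradient descent, logging the loss periodically."""
    w, b = 0.0, 0.0
    n = len(xs)
    loss = float("inf")
    for epoch in range(num_epochs):
        # Mean squared error of the current parameters (before this update).
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        # Log every `log_every` epochs, and always on the last epoch.
        if epoch % log_every == 0 or epoch == num_epochs - 1:
            log_metric("train_loss", loss, epoch)
    return (w, b), loss


# Stand-in logger that records calls in a list instead of a tracking server.
logged: List[Tuple[str, float, int]] = []
(w, b), final_loss = train_linear(
    xs=[0.0, 1.0, 2.0, 3.0],
    ys=[1.0, 3.0, 5.0, 7.0],
    learning_rate=0.05,
    num_epochs=200,
    log_metric=lambda key, value, step: logged.append((key, value, step)),
    log_every=50,
)
```

Inside an active run, the list-based logger could be swapped for `lambda key, value, step: mlflow.log_metric(key, value, step=step)`, which keeps the per-epoch `step` argument used in the PyTorch example on this page.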

Code Reference

Source Location

  • Repository: N/A (user-defined code)
  • File: User's training script
  • Lines: N/A

Signature

# User-defined training -- no fixed signature.
# Common patterns include:

# scikit-learn
model.fit(X_train, y_train)

# PyTorch training loop
for epoch in range(num_epochs):
    for batch in dataloader:
        loss = criterion(model(batch.inputs), batch.targets)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# TensorFlow / Keras
model.fit(X_train, y_train, epochs=num_epochs, validation_data=(X_val, y_val))

Import

# Framework-dependent; examples:
from sklearn.ensemble import RandomForestClassifier
import torch
import tensorflow as tf

I/O Contract

Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| Training data | array-like, DataLoader, Dataset, etc. | Yes | The dataset on which the model is trained. Format depends on the ML framework. |
| Hyperparameters | Various (int, float, str, etc.) | Yes | Configuration values such as learning rate, number of estimators, regularization strength, batch size, and epoch count. These should match the parameters previously logged to the active MLflow run. |
| Model object | Framework-specific model class | Yes | An initialized (untrained or partially trained) model instance. |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| Trained model | Framework-specific model object | The model with learned parameters, ready for evaluation, serialization, and artifact logging. |
| Training metrics | float values | Performance measurements (loss, accuracy, F1 score, etc.) computed during or after training, ready for metric logging. |
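The contract above can be read as a function shape: hyperparameters and data go in, a trained model and a dictionary of float metrics come out. The sketch below is one possible arrangement, not a prescribed one. `MeanPredictor` and `run_training_step` are hypothetical names, the "model" is a toy pure-Python stand-in for a framework model, and the two logging callbacks are assumed to accept a dictionary. Keeping the hyperparameters in a single dictionary that is both logged and passed to the model is one way to ensure the logged values match the values actually used.

```python
from typing import Any, Callable, Dict, Optional, Sequence, Tuple

Params = Dict[str, Any]
Metrics = Dict[str, float]


class MeanPredictor:
    """Toy stand-in for a framework model: predicts a shrunken training mean."""

    def __init__(self, shrinkage: float) -> None:
        self.shrinkage = shrinkage
        self.mean_: Optional[float] = None  # learned by fit()

    def fit(self, ys: Sequence[float]) -> "MeanPredictor":
        self.mean_ = (1.0 - self.shrinkage) * sum(ys) / len(ys)
        return self

    def predict(self) -> float:
        if self.mean_ is None:
            raise RuntimeError("call fit() before predict()")
        return self.mean_


def run_training_step(
    params: Params,
    train_ys: Sequence[float],
    log_params: Callable[[Params], None],
    log_metrics: Callable[[Metrics], None],
) -> Tuple[MeanPredictor, Metrics]:
    """Log params, train with those same params, then log the resulting metrics."""
    log_params(params)  # parameter logging comes first
    model = MeanPredictor(**params).fit(train_ys)  # user training code
    mse = sum((model.predict() - y) ** 2 for y in train_ys) / len(train_ys)
    metrics: Metrics = {"train_mse": float(mse)}
    log_metrics(metrics)  # metric logging comes after training
    return model, metrics


# Dictionaries stand in for a tracking backend in this demonstration.
logged_params: Params = {}
logged_metrics: Metrics = {}
params: Params = {"shrinkage": 0.0}
model, metrics = run_training_step(
    params, [1.0, 2.0, 3.0], logged_params.update, logged_metrics.update
)
```

Within a `with mlflow.start_run():` block, `mlflow.log_params` and `mlflow.log_metrics` both accept a dictionary and could be passed as the two callbacks.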

Usage Examples

Basic Usage

import mlflow
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

with mlflow.start_run():
    # Log parameters
    n_estimators = 100
    max_depth = 5
    mlflow.log_param("n_estimators", n_estimators)
    mlflow.log_param("max_depth", max_depth)

    # === User training code (this step) ===
    model = RandomForestClassifier(
        n_estimators=n_estimators, max_depth=max_depth, random_state=42
    )
    model.fit(X_train, y_train)
    accuracy = model.score(X_test, y_test)
    # === End training code ===

    # Log results
    mlflow.log_metric("accuracy", accuracy)
    # `name=` is the MLflow 3.x keyword; on MLflow 2.x use artifact_path="rf_model"
    mlflow.sklearn.log_model(model, name="rf_model")

PyTorch Training Loop

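import mlflow
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


class MyNeuralNetwork(nn.Module):
    """Placeholder two-layer classifier; replace with your own architecture."""

    def __init__(self, in_features=20, num_classes=3):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(in_features, 64), nn.ReLU(), nn.Linear(64, num_classes)
        )

    def forward(self, x):
        return self.layers(x)


# Synthetic placeholder data so the example runs end to end; use your own dataset
features = torch.randn(256, 20)
labels = torch.randint(0, 3, (256,))
train_loader = DataLoader(TensorDataset(features, labels), batch_size=32, shuffle=True)

# Define hyperparameters once so the logged values match the values used
learning_rate = 0.001
num_epochs = 10

with mlflow.start_run():
    mlflow.log_param("learning_rate", learning_rate)
    mlflow.log_param("epochs", num_epochs)

    model = MyNeuralNetwork()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    criterion = nn.CrossEntropyLoss()

    # === User training code ===
    model.train()
    for epoch in range(num_epochs):
        running_loss = 0.0
        for batch_inputs, batch_targets in train_loader:
            optimizer.zero_grad()
            outputs = model(batch_inputs)
            loss = criterion(outputs, batch_targets)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()

        # Mean per-batch loss for this epoch, logged with the epoch as the step
        avg_loss = running_loss / len(train_loader)
        mlflow.log_metric("train_loss", avg_loss, step=epoch)
    # === End training code ===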

Related Pages

Implements Principle
