
Heuristic: MLflow Batch Logging Size Limits

From Leeroopedia
Domains Optimization, Experiment_Tracking
Last Updated 2026-02-13 20:00 GMT

Overview

Performance and correctness heuristic defining the maximum batch sizes for logging metrics, parameters, and tags to MLflow in a single API call.

Description

MLflow enforces strict size limits on batch logging operations. The `log_batch()` method internally splits large payloads into smaller chunks that respect these limits. Understanding these limits helps developers design efficient logging strategies — particularly when logging thousands of metrics or parameters from hyperparameter sweeps or distributed training.

Usage

Use this heuristic when you are logging large numbers of metrics, parameters, or tags and need to understand throughput limits. This is critical when building custom logging pipelines, autologging integrations, or when debugging "request too large" errors from the tracking server.
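One way to anticipate "request too large" errors before they reach the tracking server is to estimate the payload against the 1 MB cap. The helper below is a hypothetical sketch, not an MLflow API; it approximates the JSON body size of a parameter batch:

```python
import json

# Mirrors MAX_BATCH_LOG_REQUEST_SIZE from mlflow/utils/validation.py (1 MB)
MAX_BATCH_LOG_REQUEST_SIZE = int(1e6)

def estimate_request_size(params: dict) -> int:
    """Rough JSON-encoded size in bytes of a param payload (hypothetical helper)."""
    body = {"params": [{"key": k, "value": v} for k, v in params.items()]}
    return len(json.dumps(body).encode("utf-8"))

def exceeds_limit(params: dict) -> bool:
    """True if the estimated request body would exceed the 1 MB cap."""
    return estimate_request_size(params) > MAX_BATCH_LOG_REQUEST_SIZE

small = {"lr": "0.001", "batch_size": "32"}
print(exceeds_limit(small))  # False for a tiny payload
```

This is only an approximation: the real request adds run IDs, metrics, tags, and protocol overhead, so treat the estimate as a lower bound.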

The Insight (Rule of Thumb)

  • Action: Use `log_batch()` instead of individual `log_metric()`/`log_param()` calls when logging more than a few items.
  • Value:
    • Max 100 params/tags per batch
    • Max 1000 metrics per batch
    • Max 1000 total entities per batch
    • Max 1MB request size per batch
    • Max 6000 characters per parameter value
    • Max 8000 characters per tag value
  • Trade-off: `log_batch()` with `synchronous=True` blocks until all chunked batches complete; with `synchronous=False`, it returns a `RunOperations` future immediately and logging proceeds in the background.
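To make the per-batch caps concrete, here is a hedged sketch of how many chunked requests a workload implies. The `count_batches` helper is illustrative, not part of MLflow, and it ignores both the 1 MB size cap and any cross-entity packing the real client may perform:

```python
import math

# Per-batch entity caps from mlflow/utils/validation.py
MAX_PARAMS_TAGS_PER_BATCH = 100  # params and tags share this cap
MAX_METRICS_PER_BATCH = 1000

def count_batches(n_metrics: int, n_params: int, n_tags: int) -> int:
    """Minimum chunked requests needed, counting each entity type separately."""
    return (
        math.ceil(n_metrics / MAX_METRICS_PER_BATCH)
        + math.ceil(n_params / MAX_PARAMS_TAGS_PER_BATCH)
        + math.ceil(n_tags / MAX_PARAMS_TAGS_PER_BATCH)
    )

# A sweep logging 2,500 metrics, 250 params, and 40 tags:
print(count_batches(2500, 250, 40))  # 3 + 3 + 1 = 7 requests
```

The takeaway: a hyperparameter sweep with hundreds of params costs several round trips even with batching, which is still far cheaper than one call per item.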

Reasoning

These limits exist to protect the tracking server from oversized requests that could cause timeouts or memory issues. The batch chunking logic in the MLflow client automatically handles splitting:

Code evidence from `mlflow/utils/validation.py:54-60`:

MAX_PARAMS_TAGS_PER_BATCH = 100
MAX_METRICS_PER_BATCH = 1000
MAX_DATASETS_PER_BATCH = 1000
MAX_ENTITIES_PER_BATCH = 1000
MAX_BATCH_LOG_REQUEST_SIZE = int(1e6)  # 1 MB
MAX_PARAM_VAL_LENGTH = 6000
MAX_TAG_VAL_LENGTH = 8000
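The per-value length caps can be checked client-side before logging. The sketch below is a hypothetical pre-flight check; MLflow itself raises an `MlflowException` on violation, with different wording:

```python
# Value-length caps from mlflow/utils/validation.py
MAX_PARAM_VAL_LENGTH = 6000
MAX_TAG_VAL_LENGTH = 8000

def check_value_lengths(params: dict, tags: dict) -> list:
    """Return human-readable limit violations (hypothetical helper)."""
    problems = []
    for key, value in params.items():
        if len(str(value)) > MAX_PARAM_VAL_LENGTH:
            problems.append(f"param '{key}' exceeds {MAX_PARAM_VAL_LENGTH} chars")
    for key, value in tags.items():
        if len(str(value)) > MAX_TAG_VAL_LENGTH:
            problems.append(f"tag '{key}' exceeds {MAX_TAG_VAL_LENGTH} chars")
    return problems

# A serialized config blob that is too long to log as a param:
print(check_value_lengths({"config": "x" * 7000}, {"note": "ok"}))
```

Oversized values (full config dumps, long text blobs) are usually better logged as artifacts than as params or tags.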

Automatic chunking from `mlflow/tracking/_tracking_service/client.py:508-557`:

def log_batch(self, run_id, metrics=(), params=(), tags=(), synchronous=True):
    param_batches = chunk_list(params, MAX_PARAMS_TAGS_PER_BATCH)
    tag_batches = chunk_list(tags, MAX_PARAMS_TAGS_PER_BATCH)
    # When data is split into multiple batches, waits for all batches
    # when synchronous=True. Each batch returns run_operations
    # which are merged into a single RunOperations object.
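The `chunk_list` utility referenced above can be sketched as a simple slicer. This is an illustrative reimplementation, assuming semantics matching MLflow's internal helper:

```python
def chunk_list(items, chunk_size):
    """Split items into consecutive chunks of at most chunk_size elements."""
    items = list(items)
    return [items[i : i + chunk_size] for i in range(0, len(items), chunk_size)]

# 250 params split under the 100 params/tags-per-batch cap:
batches = chunk_list(range(250), 100)
print([len(b) for b in batches])  # [100, 100, 50]
```

Each resulting chunk becomes one request to the tracking server, and their per-batch results are merged back into a single `RunOperations` object.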

Additional validation limits from `mlflow/utils/validation.py:61-69`:

MAX_EXPERIMENT_NAME_LENGTH = 500
MAX_EXPERIMENT_TAG_KEY_LENGTH = 250
MAX_EXPERIMENT_TAG_VAL_LENGTH = 5000
MAX_ENTITY_KEY_LENGTH = 250
MAX_MODEL_REGISTRY_TAG_KEY_LENGTH = 250
MAX_MODEL_REGISTRY_TAG_VALUE_LENGTH = 100_000
MAX_DATASET_NAME_SIZE = 500
MAX_DATASET_DIGEST_SIZE = 36
