Heuristic: MLflow Batch Logging Size Limits
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Experiment_Tracking |
| Last Updated | 2026-02-13 20:00 GMT |
Overview
Performance and correctness heuristic defining the maximum batch sizes for logging metrics, parameters, and tags to MLflow in a single API call.
Description
MLflow enforces strict size limits on batch logging operations. The `log_batch()` method internally splits large payloads into smaller chunks that respect these limits. Understanding these limits helps developers design efficient logging strategies — particularly when logging thousands of metrics or parameters from hyperparameter sweeps or distributed training.
Usage
Use this heuristic when you are logging large numbers of metrics, parameters, or tags and need to understand throughput limits. This is critical when building custom logging pipelines, autologging integrations, or when debugging "request too large" errors from the tracking server.
The Insight (Rule of Thumb)
- Action: Use `log_batch()` instead of individual `log_metric()`/`log_param()` calls when logging more than a few items.
- Value:
- Max 100 params/tags per batch
- Max 1000 metrics per batch
- Max 1000 total entities per batch
- Max 1MB request size per batch
- Max 6000 characters per parameter value
- Max 8000 characters per tag value
- Trade-off: with `synchronous=True`, `log_batch()` blocks until every chunked batch completes; with `synchronous=False`, it returns a `RunOperations` future immediately.
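As a back-of-the-envelope planning aid, the count limits above let you estimate how many REST requests a large payload will fan out into. The sketch below is illustrative only (`min_request_count` is not an MLflow API), and it assumes each entity type is chunked and sent separately; the real client may pack mixed entity types into one request, and the 1 MB payload limit can force additional splits.

```python
import math

# Batch-count limits as published in mlflow/utils/validation.py
MAX_PARAMS_TAGS_PER_BATCH = 100
MAX_METRICS_PER_BATCH = 1000

def min_request_count(n_metrics: int, n_params: int, n_tags: int) -> int:
    """Rough estimate of the number of REST calls log_batch() will issue,
    assuming each entity type is chunked independently. Ignores the 1 MB
    request-size limit, which can force further splits."""
    return (
        math.ceil(n_metrics / MAX_METRICS_PER_BATCH)
        + math.ceil(n_params / MAX_PARAMS_TAGS_PER_BATCH)
        + math.ceil(n_tags / MAX_PARAMS_TAGS_PER_BATCH)
    )

# 2500 metrics -> 3 batches, 150 params -> 2 batches, 30 tags -> 1 batch
print(min_request_count(2500, 150, 30))  # 6
```

This kind of estimate is useful when deciding whether to log raw per-step metrics or pre-aggregate them before calling the tracking server.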
Reasoning
These limits exist to protect the tracking server from oversized requests that could cause timeouts or memory issues. The batch chunking logic in the MLflow client automatically handles splitting:
Code evidence from `mlflow/utils/validation.py:54-60`:
```python
MAX_PARAMS_TAGS_PER_BATCH = 100
MAX_METRICS_PER_BATCH = 1000
MAX_DATASETS_PER_BATCH = 1000
MAX_ENTITIES_PER_BATCH = 1000
MAX_BATCH_LOG_REQUEST_SIZE = int(1e6)  # 1 MB
MAX_PARAM_VAL_LENGTH = 6000
MAX_TAG_VAL_LENGTH = 8000
```
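The value-length limits are worth checking client-side before submitting a batch, since an oversized value fails the whole request. The helper below is a hypothetical pre-check (not an MLflow API) that mirrors the two value-length constants:

```python
# Client-side pre-check mirroring MLflow's value-length limits.
# check_value_lengths is illustrative; MLflow performs its own validation.
MAX_PARAM_VAL_LENGTH = 6000
MAX_TAG_VAL_LENGTH = 8000

def check_value_lengths(params: dict, tags: dict) -> list:
    """Return the keys whose stringified values exceed the server limits."""
    offenders = [k for k, v in params.items() if len(str(v)) > MAX_PARAM_VAL_LENGTH]
    offenders += [k for k, v in tags.items() if len(str(v)) > MAX_TAG_VAL_LENGTH]
    return offenders

print(check_value_lengths({"cfg": "x" * 7000}, {"note": "ok"}))  # ['cfg']
```

Running a check like this before `log_batch()` turns a server-side "request too large" error into an actionable list of offending keys.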
Automatic chunking from `mlflow/tracking/_tracking_service/client.py:508-557`:
```python
def log_batch(self, run_id, metrics=(), params=(), tags=(), synchronous=True):
    param_batches = chunk_list(params, MAX_PARAMS_TAGS_PER_BATCH)
    tag_batches = chunk_list(tags, MAX_PARAMS_TAGS_PER_BATCH)
    # When data is split into multiple batches, waits for all batches
    # when synchronous=True. Each batch returns run_operations
    # which are merged into a single RunOperations object.
```
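The chunking helper itself is simple; a minimal reimplementation of the `chunk_list` behavior assumed by the excerpt above (splitting a sequence into consecutive chunks of at most `chunk_size` items) looks like this:

```python
def chunk_list(items, chunk_size):
    """Split a sequence into consecutive chunks of at most chunk_size items.
    Minimal sketch of the helper log_batch() uses; MLflow's own version
    lives in its utils module."""
    return [list(items[i : i + chunk_size]) for i in range(0, len(items), chunk_size)]

# 250 params at a 100-per-batch limit yield three chunks.
print([len(c) for c in chunk_list(list(range(250)), 100)])  # [100, 100, 50]
```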
Additional validation limits from `mlflow/utils/validation.py:61-69`:
```python
MAX_EXPERIMENT_NAME_LENGTH = 500
MAX_EXPERIMENT_TAG_KEY_LENGTH = 250
MAX_EXPERIMENT_TAG_VAL_LENGTH = 5000
MAX_ENTITY_KEY_LENGTH = 250
MAX_MODEL_REGISTRY_TAG_KEY_LENGTH = 250
MAX_MODEL_REGISTRY_TAG_VALUE_LENGTH = 100_000
MAX_DATASET_NAME_SIZE = 500
MAX_DATASET_DIGEST_SIZE = 36
```
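Beyond entity counts, the 1 MB request-size cap can also trigger splitting. A rough way to sanity-check a payload before sending it is to measure its serialized size; the sketch below is an approximation only (`payload_fits` is not an MLflow API, and MLflow computes entity sizes internally rather than by JSON-encoding the whole body):

```python
import json

MAX_BATCH_LOG_REQUEST_SIZE = int(1e6)  # 1 MB, as in mlflow/utils/validation.py

def payload_fits(metrics, params, tags):
    """Approximate whether a JSON-encoded batch body stays under the 1 MB
    request limit. Illustrative only; MLflow's size accounting differs."""
    body = json.dumps({"metrics": metrics, "params": params, "tags": tags})
    return len(body.encode("utf-8")) <= MAX_BATCH_LOG_REQUEST_SIZE

# Ten small metric dicts are far below the limit.
small = [{"key": f"m{i}", "value": float(i), "timestamp": 0, "step": 0}
         for i in range(10)]
print(payload_fits(small, [], []))  # True
```

A pre-check like this is most useful when metric keys or param values are long, since the 1 MB cap can be hit well before the 1000-entity count limit.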