Implementation:Mlflow Mlflow Search Traces
| Knowledge Sources | |
|---|---|
| Domains | ML_Ops, LLM_Observability |
| Last Updated | 2026-02-13 20:00 GMT |
Overview
Concrete tool, provided by the MLflow library, for querying, retrieving, and analyzing captured MLflow traces.
Description
MLflow provides two primary APIs for accessing trace data: mlflow.get_trace for retrieving a single trace by ID and mlflow.search_traces for querying across trace collections with filtering, ordering, and field extraction.
The mlflow.get_trace function retrieves a single trace by its trace ID. It first checks the in-memory buffer for recently created traces and falls back to the tracking store if the trace is not found in memory. If the trace does not exist in either location, it returns None (optionally suppressing the warning via the silent parameter). This two-tier lookup ensures that traces are accessible immediately after creation, even before they are fully persisted.
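The memory-first, store-second lookup described above can be illustrated with a toy model. This is a minimal sketch, not MLflow's implementation: the class `TwoTierTraceLookup`, its plain-dict "buffer" and "store", and the `record` method are all hypothetical stand-ins chosen for illustration.

```python
from __future__ import annotations

from typing import Optional


class TwoTierTraceLookup:
    """Toy model of a two-tier lookup (hypothetical; not MLflow code)."""

    def __init__(self, store: dict[str, dict]):
        # Stand-in for the in-memory buffer of recently created traces.
        self._buffer: dict[str, dict] = {}
        # Stand-in for the persistent tracking store.
        self._store = store

    def record(self, trace_id: str, trace: dict) -> None:
        # A freshly created trace is visible in memory before it is persisted.
        self._buffer[trace_id] = trace

    def get(self, trace_id: str) -> Optional[dict]:
        # Tier 1: check the in-memory buffer first.
        trace = self._buffer.get(trace_id)
        if trace is not None:
            return trace
        # Tier 2: fall back to the store; returns None if absent there too.
        return self._store.get(trace_id)


lookup = TwoTierTraceLookup(store={"tr-old": {"status": "OK"}})
lookup.record("tr-new", {"status": "IN_PROGRESS"})
```

With this model, `lookup.get("tr-new")` is served from memory, `lookup.get("tr-old")` from the store, and an unknown ID yields `None`, which mirrors the return contract described for `mlflow.get_trace`.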
The mlflow.search_traces function provides full query capabilities over trace collections. It supports SQL-like filter strings for predicate-based search, ordering clauses, maximum result limits, and field extraction (via the deprecated extract_fields parameter) for pulling specific span inputs and outputs into DataFrame columns. Results can be returned as either a pandas DataFrame (the default when pandas is installed) or a list of Trace objects. The function paginates automatically, iterating through pages from the tracking store and collecting matching results until max_results is reached or no pages remain.
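The automatic pagination described above follows a common token-based pattern. The sketch below shows that pattern in isolation, assuming a hypothetical `fetch_page(limit, token)` callable that returns `(items, next_token)`; the names `collect_all`, `fetch_page`, and `fake_fetch` are illustrative and do not appear in MLflow.

```python
from __future__ import annotations

from typing import Callable, Optional


def collect_all(
    fetch_page: Callable[[int, Optional[str]], tuple[list, Optional[str]]],
    max_results: Optional[int] = None,
    page_size: int = 100,
) -> list:
    """Gather items page by page until max_results is hit or pages run out."""
    results: list = []
    token: Optional[str] = None
    while True:
        remaining = None if max_results is None else max_results - len(results)
        if remaining is not None and remaining <= 0:
            break
        # Never request more than is still needed.
        limit = page_size if remaining is None else min(page_size, remaining)
        items, token = fetch_page(limit, token)
        results.extend(items)
        if token is None:  # no further pages
            break
    return results


# Hypothetical in-memory backend standing in for a tracking store.
_DATA = [f"tr-{i}" for i in range(250)]


def fake_fetch(limit: int, token: Optional[str]) -> tuple[list, Optional[str]]:
    start = int(token or 0)
    end = start + limit
    next_token = str(end) if end < len(_DATA) else None
    return _DATA[start:end], next_token
```

Calling `collect_all(fake_fetch, max_results=120, page_size=50)` issues three page requests (50, 50, then 20 items), while omitting `max_results` drains every page.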
Search can be scoped to specific experiments via locations, filtered by run ID or model ID, and configured to include or exclude span data via include_spans. When searching without explicit locations, the function defaults to the currently active experiment. For Databricks environments, locations can also reference Unity Catalog schemas in the format catalog_name.schema_name.
Usage
Use mlflow.get_trace when you have a specific trace ID and need to inspect its full span tree, such as during debugging or when following up on an alert. Use mlflow.search_traces when you need to find traces matching certain criteria, build evaluation datasets, compute aggregate metrics, or export trace data for analysis. For large result sets, consider using the MlflowClient.search_traces API directly for manual pagination control.
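For the manual pagination case, one possible shape is a generator that yields one page at a time. This is a sketch under assumptions: it expects a client whose `search_traces` method accepts `max_results` and `page_token` and returns a list-like page exposing the next token as a `token` attribute. The helper name `iter_trace_pages` is hypothetical, and all other search arguments are simply forwarded, so check them against the MLflow version in use.

```python
from typing import Any, Iterator


def iter_trace_pages(
    client: Any, page_size: int = 100, **search_kwargs: Any
) -> Iterator[list]:
    """Yield pages of traces one at a time (hypothetical helper).

    Assumes client.search_traces(...) accepts max_results and page_token
    and returns a list-like page with a `token` attribute.
    """
    page_token = None
    while True:
        page = client.search_traces(
            max_results=page_size,
            page_token=page_token,
            **search_kwargs,
        )
        yield list(page)
        # A missing or empty token means there are no further pages.
        page_token = getattr(page, "token", None)
        if not page_token:
            break
```

Because each page is yielded before the next one is requested, a caller can process or persist traces incrementally and stop early without fetching the rest. With a real client this might be invoked as `iter_trace_pages(MlflowClient(), filter_string="status = 'OK'")`, plus whichever experiment-scoping argument that client version accepts.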
Code Reference
Source Location
- Repository: mlflow
- File: mlflow/tracing/fluent.py
- Lines (get_trace): L699-743
- Lines (search_traces): L761-973
Signature
# Single trace retrieval
mlflow.get_trace(
    trace_id: str,
    silent: bool = False,
) -> Trace | None

# Trace search
mlflow.search_traces(
    experiment_ids: list[str] | None = None,
    filter_string: str | None = None,
    max_results: int | None = None,
    order_by: list[str] | None = None,
    extract_fields: list[str] | None = None,
    run_id: str | None = None,
    return_type: Literal["pandas", "list"] | None = None,
    model_id: str | None = None,
    include_spans: bool = True,
    locations: list[str] | None = None,
) -> DataFrame | list[Trace]
Import
import mlflow
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| trace_id | str | Yes (get_trace) | The unique ID of the trace to retrieve. |
| silent | bool | No | If True, suppresses warnings when a trace is not found. Default False. |
| experiment_ids | list[str] | No | (Deprecated in favor of locations) List of experiment IDs to scope the search. |
| filter_string | str | No | SQL-like filter expression for predicate-based search. |
| max_results | int | No | Maximum number of traces to return. If None, returns all matching traces. |
| order_by | list[str] | No | List of ordering clauses, e.g., ["timestamp DESC"]. |
| extract_fields | list[str] | No | (Deprecated) Fields to extract from spans into DataFrame columns, e.g., ["span_name.outputs.field"]. |
| run_id | str | No | Filter traces by the associated MLflow run ID. |
| return_type | "pandas" or "list" | No | Return format. Defaults to "pandas" if pandas is installed, otherwise "list". |
| model_id | str | No | Filter traces associated with a specific model ID. |
| include_spans | bool | No | If True (default), includes full span data. If False, returns only trace metadata. |
| locations | list[str] | No | List of experiment IDs or UC schema references (catalog.schema) to search over. |
Outputs
| Name | Type | Description |
|---|---|---|
| (get_trace) | Trace or None | The Trace object matching the given ID, or None if not found. |
| (search_traces, pandas) | pandas.DataFrame | A DataFrame where each row is a trace with columns for trace fields and optionally extracted span fields. |
| (search_traces, list) | list[Trace] | A list of Trace objects matching the search criteria. |
Usage Examples
Basic Usage
import mlflow

# Retrieve a specific trace by ID
trace = mlflow.get_trace(trace_id="tr-abc123")
if trace:
    # Print the overall trace status, then each span's name and status
    print(trace.info.status)
    for span in trace.data.spans:
        print(span.name, span.status)
Search Traces as DataFrame
import mlflow
# Search all traces in the active experiment
df = mlflow.search_traces(return_type="pandas")
print(df.columns)
print(df.head())
Filtered Search with Ordering
import mlflow

# Search traces by run ID, newest first, and return them as a list
traces = mlflow.search_traces(
    run_id="run-xyz789",
    max_results=50,
    order_by=["timestamp DESC"],
    return_type="list",
)
for trace in traces:
    print(trace.info.trace_id, trace.info.status)
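Once traces are returned as a list, simple aggregate metrics can be computed in plain Python. The following is a minimal sketch that counts traces per status; it assumes each trace exposes `info.status` as in the example above, and uses `SimpleNamespace` objects as hypothetical stand-ins for real Trace objects. The helper name `count_by_status` is illustrative.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_status(traces) -> Counter:
    """Count traces per status value (hypothetical helper)."""
    counts: Counter = Counter()
    for trace in traces:
        status = trace.info.status
        # Accept either an enum-like object with .value or a plain string.
        counts[getattr(status, "value", status)] += 1
    return counts


# Stand-in objects shaped like the traces used above (not real Trace objects).
sample_traces = [
    SimpleNamespace(info=SimpleNamespace(trace_id="tr-1", status="OK")),
    SimpleNamespace(info=SimpleNamespace(trace_id="tr-2", status="ERROR")),
    SimpleNamespace(info=SimpleNamespace(trace_id="tr-3", status="OK")),
]
status_counts = count_by_status(sample_traces)
```

The same function could be applied to the `traces` list from the search above to get a quick error-rate overview.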
Search with Field Extraction
import mlflow

# Extract specific span fields into DataFrame columns
# (extract_fields is marked deprecated in the inputs table above)
df = mlflow.search_traces(
    extract_fields=[
        "retriever.inputs.query",
        "retriever.outputs",
        "llm.outputs.response",
    ],
    return_type="pandas",
)
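Since extract_fields is marked deprecated, the same information can also be pulled from a list of traces by walking their spans directly. This is a sketch under assumptions: it relies on spans exposing `name`, `inputs`, and `outputs` attributes, and the helper `extract_span_field` plus the `SimpleNamespace` sample trace are hypothetical, used only for illustration.

```python
from types import SimpleNamespace
from typing import Any, Optional


def extract_span_field(
    trace: Any, span_name: str, section: str, key: Optional[str] = None
) -> Any:
    """Return inputs/outputs (or one key of them) from the first matching span.

    `section` is expected to be "inputs" or "outputs". Returns None when the
    span, the section, or the key is missing.
    """
    for span in trace.data.spans:
        if span.name != span_name:
            continue
        payload = getattr(span, section, None)
        if key is None:
            return payload
        if isinstance(payload, dict):
            return payload.get(key)
        return None
    return None


# Stand-in trace with the span layout assumed by the example above.
sample_trace = SimpleNamespace(
    data=SimpleNamespace(
        spans=[
            SimpleNamespace(
                name="retriever",
                inputs={"query": "what is mlflow?"},
                outputs=["doc-1", "doc-2"],
            ),
            SimpleNamespace(
                name="llm",
                inputs={"prompt": "..."},
                outputs={"response": "MLflow is an ML platform."},
            ),
        ]
    )
)
query = extract_span_field(sample_trace, "retriever", "inputs", "query")
```

Applied to each trace returned with `return_type="list"`, a helper like this yields the values that `"retriever.inputs.query"` would have placed in a DataFrame column.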
Search Across Specific Locations
import mlflow

# Search across multiple experiments, keeping only successful traces
traces = mlflow.search_traces(
    locations=["exp-001", "exp-002"],
    filter_string="status = 'OK'",
    max_results=100,
    return_type="list",
)