Overview
The Span Helpers module provides utility functions for regenerating OpenTelemetry span and trace IDs and for converting DataFrames to span objects. It also re-exports the RAG data extraction helpers.
Description
This module serves as the main entry point for span manipulation utilities in the Phoenix client SDK. It defines the Span type alias (equal to v1.Span) and provides three core functions:
uniquify_spans() regenerates span and trace IDs for a sequence of Span objects while preserving parent-child relationships. It generates new valid OpenTelemetry-compliant 128-bit trace IDs and 64-bit span IDs, mapping old IDs to new ones consistently across the entire span collection. This is essential when creating spans via the client to avoid ID collisions.
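The ID remapping described above can be illustrated with a minimal, standard-library sketch. It is not the Phoenix implementation: the function name remap_ids is hypothetical, spans are assumed to be plain dictionaries with a nested context holding trace_id and span_id plus a top-level parent_id, and leaving a parent_id untouched when its parent is not in the collection is an assumption.

```python
import copy
import secrets


def _new_hex_id(num_bytes: int) -> str:
    """Return a random, non-zero hex ID (OpenTelemetry treats all-zero IDs as invalid)."""
    while True:
        value = secrets.token_hex(num_bytes)
        if int(value, 16) != 0:
            return value


def remap_ids(spans):
    """Hypothetical helper: return deep copies of `spans` with fresh trace/span IDs."""
    result = copy.deepcopy(list(spans))
    trace_map = {}  # old trace_id -> new 128-bit trace_id (32 hex chars)
    span_map = {}   # old span_id  -> new 64-bit span_id  (16 hex chars)

    # First pass: assign one new ID per distinct old ID.
    for span in result:
        ctx = span["context"]
        if ctx["trace_id"] not in trace_map:
            trace_map[ctx["trace_id"]] = _new_hex_id(16)
        if ctx["span_id"] not in span_map:
            span_map[ctx["span_id"]] = _new_hex_id(8)

    # Second pass: rewrite IDs and re-point parent_id through the same mapping.
    for span in result:
        ctx = span["context"]
        ctx["trace_id"] = trace_map[ctx["trace_id"]]
        ctx["span_id"] = span_map[ctx["span_id"]]
        parent = span.get("parent_id")
        if parent in span_map:  # assumption: unknown parents are left unchanged
            span["parent_id"] = span_map[parent]
    return result


original = [
    {"name": "root", "context": {"trace_id": "a" * 32, "span_id": "1" * 16}, "parent_id": None},
    {"name": "child", "context": {"trace_id": "a" * 32, "span_id": "2" * 16}, "parent_id": "1" * 16},
]
remapped = remap_ids(original)
```

Because both passes share one mapping, spans that belonged to the same trace still share a trace ID afterwards, and the child still points at its (renamed) parent.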
uniquify_spans_dataframe() performs the same ID regeneration operation on a pandas DataFrame (as returned by get_spans_dataframe()), handling the flattened column format (e.g., context.trace_id, context.span_id, parent_id) and the DataFrame index.
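For the flattened DataFrame layout, the same idea can be sketched with column-wise mapping. This is an illustration under assumptions rather than the library's code: remap_dataframe_ids is a hypothetical name, the index is assumed to hold span IDs, and parents that are missing from the frame are assumed to be left as they are.

```python
import secrets

import pandas as pd


def _new_hex_id(num_bytes: int) -> str:
    """Return a random, non-zero hex ID of the given byte length."""
    while True:
        value = secrets.token_hex(num_bytes)
        if int(value, 16) != 0:
            return value


def remap_dataframe_ids(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: copy `df` and regenerate IDs in the flattened columns."""
    out = df.copy(deep=True)
    span_map = {old: _new_hex_id(8) for old in out["context.span_id"].unique()}
    trace_map = {old: _new_hex_id(16) for old in out["context.trace_id"].unique()}

    out["context.span_id"] = out["context.span_id"].map(span_map)
    out["context.trace_id"] = out["context.trace_id"].map(trace_map)
    # Root spans have no parent; values not found in the mapping pass through unchanged.
    out["parent_id"] = out["parent_id"].map(lambda p: span_map.get(p, p))
    # Assumption: the index carries span IDs, so it goes through the same mapping.
    out.index = out.index.map(lambda i: span_map.get(i, i))
    return out


frame = pd.DataFrame(
    {
        "name": ["root", "child"],
        "context.trace_id": ["a" * 32, "a" * 32],
        "context.span_id": ["1" * 16, "2" * 16],
        "parent_id": [None, "1" * 16],
    },
    index=["1" * 16, "2" * 16],
)
remapped_frame = remap_dataframe_ids(frame)
```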
dataframe_to_spans() converts a pandas DataFrame back into a list of Span objects, reconstructing the nested dictionary structure from flattened columns. It handles context fields, direct fields (name, span_kind, parent_id, status_code, start_time, end_time), events, and attributes. It requires timezone-aware timestamps and raises ValueError for naive timestamps.
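As a rough illustration of the reconstruction step, the sketch below turns one flattened row (a plain dictionary keyed by column name) into a nested span dictionary and rejects naive timestamps. The name row_to_span is hypothetical, attribute keys are kept flat inside the attributes dictionary as a simplifying assumption, and the real function may handle events and attributes differently.

```python
from datetime import datetime, timezone


def row_to_span(row: dict) -> dict:
    """Hypothetical helper: rebuild a nested span dict from one flattened row."""
    for field in ("start_time", "end_time"):
        ts = row[field]
        if ts.utcoffset() is None:  # naive datetime: no usable timezone information
            raise ValueError(f"{field} must be timezone-aware, got {ts!r}")

    span = {"context": {}, "attributes": {}}
    for key, value in row.items():
        if key.startswith("context."):
            span["context"][key[len("context."):]] = value
        elif key.startswith("attributes."):
            # Assumption: attribute keys stay flat (e.g. "llm.model_name").
            span["attributes"][key[len("attributes."):]] = value
        else:
            # Direct fields such as name, span_kind, parent_id, status_code, events.
            span[key] = value
    return span


example_row = {
    "name": "chat",
    "span_kind": "LLM",
    "parent_id": None,
    "status_code": "OK",
    "start_time": datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    "end_time": datetime(2024, 1, 1, 12, 0, 1, tzinfo=timezone.utc),
    "context.trace_id": "a" * 32,
    "context.span_id": "1" * 16,
    "attributes.llm.model_name": "example-model",
}
example_span = row_to_span(example_row)
```

Rows in this shape could come from a call such as df.to_dict("records") on a spans DataFrame.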
The module also re-exports the RAG helpers get_retrieved_documents(), get_input_output_context(), async_get_retrieved_documents(), and async_get_input_output_context() from the rag submodule.
Usage
Use uniquify_spans() or uniquify_spans_dataframe() before inserting spans into Phoenix to guarantee unique OpenTelemetry IDs. Use dataframe_to_spans() when you need to convert a filtered or modified DataFrame of spans back into Span objects for use with the client API.
Code Reference
Source Location
Signature
Span = v1.Span

def uniquify_spans(
    spans: Sequence[Span],
    *,
    in_place: bool = False,
) -> list[Span]: ...

def uniquify_spans_dataframe(
    df: pd.DataFrame,
    *,
    in_place: bool = False,
) -> pd.DataFrame: ...

def dataframe_to_spans(
    df: pd.DataFrame,
) -> list[Span]: ...
Import
from phoenix.client.helpers.spans import (
    uniquify_spans,
    uniquify_spans_dataframe,
    dataframe_to_spans,
    get_input_output_context,
    get_retrieved_documents,
    async_get_input_output_context,
    async_get_retrieved_documents,
)
I/O Contract
uniquify_spans()
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| spans | Sequence[Span] | Yes | A sequence of Span objects to regenerate IDs for |
| in_place | bool | No | If True, modifies original spans; if False (default), creates deep copies |
Outputs
| Name | Type | Description |
|---|---|---|
| return | list[Span] | Spans with regenerated trace and span IDs; parent-child relationships preserved |
uniquify_spans_dataframe()
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| df | pd.DataFrame | Yes | DataFrame with span data (typically from get_spans_dataframe) |
| in_place | bool | No | If True, modifies original DataFrame; if False (default), creates a deep copy |
Outputs
| Name | Type | Description |
|---|---|---|
| return | pd.DataFrame | DataFrame with regenerated IDs in context.trace_id, context.span_id, parent_id columns and index |
dataframe_to_spans()
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| df | pd.DataFrame | Yes | DataFrame with span data; start_time and end_time must be timezone-aware |
Outputs
| Name | Type | Description |
|---|---|---|
| return | list[Span] | Reconstructed Span objects with nested context, attributes, and events |
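Since dataframe_to_spans() rejects naive timestamps, a DataFrame whose time columns lost their timezone (for example after a CSV round trip) needs to be localized first. The sketch below shows one way to do that with pandas; ensure_utc is a hypothetical helper, and treating naive values as UTC is an assumption that must match how the data was recorded.

```python
import pandas as pd


def ensure_utc(df: pd.DataFrame, columns=("start_time", "end_time")) -> pd.DataFrame:
    """Hypothetical helper: return a copy whose time columns are timezone-aware."""
    out = df.copy()
    for col in columns:
        series = pd.to_datetime(out[col])
        if series.dt.tz is None:
            # Assumption: naive timestamps were recorded in UTC.
            series = series.dt.tz_localize("UTC")
        out[col] = series
    return out


naive_df = pd.DataFrame(
    {
        "start_time": [pd.Timestamp("2024-01-01 12:00:00")],
        "end_time": [pd.Timestamp("2024-01-01 12:00:01")],
    }
)
aware_df = ensure_utc(naive_df)
```

Columns that are already timezone-aware pass through unchanged, so the helper is safe to apply unconditionally before conversion.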
Usage Examples
from phoenix.client import Client
from phoenix.client.helpers.spans import (
    uniquify_spans,
    uniquify_spans_dataframe,
    dataframe_to_spans,
)

client = Client()

# Regenerate IDs for span objects
spans = [...]  # existing spans
new_spans = uniquify_spans(spans)
client.spans.create_spans(project_identifier="my-project", spans=new_spans)

# Regenerate IDs for a DataFrame
df = client.spans.get_spans_dataframe(project_identifier="source-project")
new_df = uniquify_spans_dataframe(df)

# Filter a DataFrame, convert it to span objects, regenerate IDs, and re-insert
filtered_df = df[df["span_kind"] == "LLM"]
spans = dataframe_to_spans(filtered_df)
new_spans = uniquify_spans(spans)
client.spans.create_spans(project_identifier="target-project", spans=new_spans)
Related Pages