Implementation:Arize ai Phoenix Get Span Annotations Dataframe
| Knowledge Sources | |
|---|---|
| Domains | AI Observability, Data Retrieval, Span Evaluation |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Concrete tool for retrieving span annotations as pandas DataFrames or typed lists, provided by the arize-phoenix-client package.
Description
The Spans.get_span_annotations_dataframe() method fetches span annotations from the v1/projects/{project_identifier}/span_annotations REST endpoint and returns them as a pandas DataFrame. Span IDs can be provided directly, extracted from a spans DataFrame (via the context.span_id column), or extracted from a list of Span objects. The method handles pagination transparently using cursor-based iteration and batches span IDs into groups of 100 per request.
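The batching step described above can be sketched as follows. This is an illustration rather than the client's source code: `batch_span_ids` is a hypothetical helper, and only the batch size of 100 and the de-duplication come from the description.

```python
from itertools import islice
from typing import Iterable, Iterator


def batch_span_ids(span_ids: Iterable[str], batch_size: int = 100) -> Iterator[list[str]]:
    """Yield de-duplicated span IDs in lists of at most `batch_size` items."""
    # dict.fromkeys drops duplicates while keeping first-seen order.
    unique_ids = iter(dict.fromkeys(span_ids))
    # islice pulls the next chunk; an empty list ends the loop.
    while batch := list(islice(unique_ids, batch_size)):
        yield batch


# 250 distinct IDs plus one duplicate (placeholder values, not real span IDs).
ids = [f"span_{i:03d}" for i in range(250)] + ["span_000"]
batches = list(batch_span_ids(ids))
print([len(b) for b in batches])  # [100, 100, 50]
```

Each batch would then be sent as one request, so 250 unique span IDs result in three requests in this sketch.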
The Spans.get_span_annotations() method provides the same functionality but returns a list[SpanAnnotation] instead of a DataFrame. It accepts span_ids or spans as input (but not a spans DataFrame).
Both methods support filtering by annotation name inclusion/exclusion lists. By default, annotations named "note" are excluded since notes are reserved for UI-added commentary.
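The filter rules above can be sketched as a small helper. `resolve_name_filters` is hypothetical and not part of the client. Because the two lists cannot be combined, the sketch assumes the default "note" exclusion applies only when neither list is passed; the source does not spell out this interaction, and the `ValueError` is likewise an assumption.

```python
from typing import Optional, Sequence


def resolve_name_filters(
    include: Optional[Sequence[str]] = None,
    exclude: Optional[Sequence[str]] = None,
) -> tuple[Optional[list[str]], Optional[list[str]]]:
    """Return the (include, exclude) lists that would be sent with a request."""
    if include is not None and exclude is not None:
        # Assumption: combining both filters is rejected up front.
        raise ValueError("Pass include or exclude annotation names, not both.")
    if include is None and exclude is None:
        # Notes are reserved for UI-added commentary, so skip them by default.
        exclude = ["note"]
    return (
        list(include) if include is not None else None,
        list(exclude) if exclude is not None else None,
    )


print(resolve_name_filters())                       # (None, ['note'])
print(resolve_name_filters(include=["relevance"]))  # (['relevance'], None)
```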
Usage
Use get_span_annotations_dataframe() when:
- Performing data analysis on annotation results using pandas (aggregations, pivots, joins).
- Chaining with get_spans_dataframe() to combine span data and annotation data.
- Exporting annotation data for visualization or reporting.
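The analysis patterns listed above (aggregation, pivot, join) might look like the following sketch. It does not call Phoenix: the sample frames are invented, and only the column names (`annotation_name`, `label`, `score`) and the `span_id` index follow the output contract documented on this page. The spans table and its `name` column are hypothetical.

```python
import pandas as pd

# Invented sample shaped like the documented output: indexed by span_id.
annotations_df = pd.DataFrame(
    {
        "annotation_name": ["relevance", "correctness", "relevance", "correctness"],
        "label": ["relevant", "correct", "irrelevant", "correct"],
        "score": [1.0, 1.0, 0.0, 1.0],
    },
    index=pd.Index(["span_001", "span_001", "span_002", "span_002"], name="span_id"),
)

# Aggregation: mean score per annotation name.
mean_scores = annotations_df.groupby("annotation_name")["score"].mean()

# Pivot: one row per span, one column per annotation name.
scores_wide = annotations_df.reset_index().pivot(
    index="span_id", columns="annotation_name", values="score"
)

# Join: attach the wide scores to a hypothetical spans table keyed by span ID.
spans_df = pd.DataFrame(
    {"name": ["retrieve", "generate"]},
    index=pd.Index(["span_001", "span_002"], name="span_id"),
)
joined = spans_df.join(scores_wide)
print(joined)
```

The pivot assumes at most one annotation per span and name; with repeated annotations, `pivot_table` with an aggregation function would be needed instead.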
Use get_span_annotations() when:
- Processing annotations programmatically in application logic.
- Building pipelines that consume individual annotation objects.
- Working in environments where pandas is not available or desired.
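Where pandas is not available, the same kind of summary can be computed directly on the annotation dictionaries. The sketch below uses invented sample data shaped like the SpanAnnotation output described on this page (`span_id`, `name`, and a nested `result` with optional `label` and `score`); it does not call the client.

```python
from collections import defaultdict
from statistics import mean

# Invented sample data; a real list would come from get_span_annotations().
annotations = [
    {"span_id": "span_001", "name": "relevance", "result": {"label": "relevant", "score": 1.0}},
    {"span_id": "span_002", "name": "relevance", "result": {"label": "irrelevant", "score": 0.5}},
    {"span_id": "span_001", "name": "correctness", "result": {"label": "correct"}},
]

scores_by_name = defaultdict(list)
for anno in annotations:
    score = anno.get("result", {}).get("score")
    if score is not None:  # a result may carry a label without a score
        scores_by_name[anno["name"]].append(score)

mean_scores = {name: mean(scores) for name, scores in scores_by_name.items()}
print(mean_scores)  # {'relevance': 0.75}
```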
Code Reference
Source Location
- Repository: Phoenix
- File: packages/phoenix-client/src/phoenix/client/resources/spans/__init__.py
- Lines: 194-318 (get_span_annotations_dataframe), 320-418 (get_span_annotations)
Signature
def get_span_annotations_dataframe(
    self,
    *,
    spans_dataframe: Optional["pd.DataFrame"] = None,
    span_ids: Optional[Iterable[str]] = None,
    spans: Optional[Iterable[v1.Span]] = None,
    project_identifier: str = "default",
    include_annotation_names: Optional[Sequence[str]] = None,
    exclude_annotation_names: Optional[Sequence[str]] = None,
    limit: int = 1000,
    timeout: Optional[int] = DEFAULT_TIMEOUT_IN_SECONDS,
) -> "pd.DataFrame":
    ...

def get_span_annotations(
    self,
    *,
    span_ids: Optional[Iterable[str]] = None,
    spans: Optional[Iterable[v1.Span]] = None,
    project_identifier: str,
    include_annotation_names: Optional[Sequence[str]] = None,
    exclude_annotation_names: Optional[Sequence[str]] = None,
    limit: int = 1000,
    timeout: Optional[int] = DEFAULT_TIMEOUT_IN_SECONDS,
) -> list[SpanAnnotation]:
    ...
Import
from phoenix.client import Client
client = Client()
# Access via: client.spans.get_span_annotations_dataframe(...)
# Access via: client.spans.get_span_annotations(...)
I/O Contract
Inputs (get_span_annotations_dataframe)
| Name | Type | Required | Description |
|---|---|---|---|
| spans_dataframe | Optional[pd.DataFrame] | Exactly one of spans_dataframe, span_ids, or spans | A DataFrame (typically from get_spans_dataframe()) with a context.span_id or span_id column. Span IDs are extracted and deduplicated. |
| span_ids | Optional[Iterable[str]] | Exactly one of spans_dataframe, span_ids, or spans | An iterable of OpenTelemetry span ID strings. Deduplicated internally. |
| spans | Optional[Iterable[v1.Span]] | Exactly one of spans_dataframe, span_ids, or spans | An iterable of Span objects (typically from get_spans()). Span IDs are extracted from context.span_id. |
| project_identifier | str | No | The project name or ID for the API path. Defaults to "default". |
| include_annotation_names | Optional[Sequence[str]] | No | If provided, only annotations with these names are returned. Cannot be combined with exclude_annotation_names. |
| exclude_annotation_names | Optional[Sequence[str]] | No | Annotation names to exclude. Defaults to ["note"] when not provided. |
| limit | int | No | Maximum number of annotations per pagination page. Defaults to 1000. |
| timeout | Optional[int] | No | Request timeout in seconds. Defaults to 5 seconds. |
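Because `limit` sets the page size, the number of requests depends on how many pages the server returns. The loop below is a generic cursor-pagination sketch against an in-memory stand-in, not the client's code: `fetch_page`, `iter_annotations`, and the `data` / `next_cursor` field names are assumptions made for illustration.

```python
from typing import Iterator, Optional

# Hypothetical in-memory stand-in for the REST endpoint: five annotations in total.
_FAKE_ANNOTATIONS = [{"id": i} for i in range(5)]


def fetch_page(cursor: Optional[str], limit: int) -> dict:
    """Return one page plus the cursor for the next page (None when exhausted)."""
    start = int(cursor) if cursor is not None else 0
    end = start + limit
    next_cursor = str(end) if end < len(_FAKE_ANNOTATIONS) else None
    return {"data": _FAKE_ANNOTATIONS[start:end], "next_cursor": next_cursor}


def iter_annotations(limit: int) -> Iterator[dict]:
    """Follow cursors until the stand-in reports that no further page exists."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor, limit)
        yield from page["data"]
        cursor = page["next_cursor"]
        if cursor is None:
            break


collected = list(iter_annotations(limit=2))
print(len(collected))  # 5, gathered over three pages of at most 2 items each
```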
Inputs (get_span_annotations)
| Name | Type | Required | Description |
|---|---|---|---|
| span_ids | Optional[Iterable[str]] | Exactly one of span_ids or spans | An iterable of span ID strings. |
| spans | Optional[Iterable[v1.Span]] | Exactly one of span_ids or spans | An iterable of Span objects. |
| project_identifier | str | Yes | The project name or ID for the API path. |
| include_annotation_names | Optional[Sequence[str]] | No | Filter to include only these annotation names. |
| exclude_annotation_names | Optional[Sequence[str]] | No | Filter to exclude these annotation names. Defaults to ["note"]. |
| limit | int | No | Pagination page size. Defaults to 1000. |
| timeout | Optional[int] | No | Request timeout in seconds. Defaults to 5. |
Outputs (get_span_annotations_dataframe)
| Name | Type | Description |
|---|---|---|
| (return) | pd.DataFrame | A DataFrame indexed by span_id with columns: annotation_name, annotator_kind, label, score, explanation, metadata, created_at, updated_at. The nested result dict is flattened into top-level columns. Returns an empty DataFrame if no span IDs are provided. |
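The flattening of the nested result dict can be pictured with a plain-Python sketch. `flatten_annotation` is a hypothetical helper, the sample values are invented, and the real method builds a DataFrame rather than a dict; the sketch only shows how `result` fields become top-level columns and how `name` maps to `annotation_name`.

```python
from typing import Any


def flatten_annotation(annotation: dict[str, Any]) -> dict[str, Any]:
    """Lift the nested `result` fields to the top level of a single row."""
    row = {key: value for key, value in annotation.items() if key != "result"}
    row["annotation_name"] = row.pop("name")
    # `result` may be missing or None; treat both as "no fields to lift".
    row.update(annotation.get("result") or {})
    return row


sample = {
    "span_id": "span_001",
    "name": "relevance",
    "annotator_kind": "LLM",
    "result": {"label": "relevant", "score": 0.9, "explanation": "On topic."},
}
row = flatten_annotation(sample)
print(sorted(row))
# ['annotation_name', 'annotator_kind', 'explanation', 'label', 'score', 'span_id']
```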
Outputs (get_span_annotations)
| Name | Type | Description |
|---|---|---|
| (return) | list[SpanAnnotation] | A list of SpanAnnotation typed dictionaries. Each contains: span_id, name, annotator_kind, result (with label, score, explanation), metadata, created_at, updated_at. Returns an empty list if no span IDs are provided. |
Usage Examples
Retrieve Annotations for Specific Spans
from phoenix.client import Client
client = Client()
df = client.spans.get_span_annotations_dataframe(
    span_ids=["span_001", "span_002", "span_003"],
    project_identifier="my-project",
)
print(df[["annotation_name", "label", "score"]])
Chain with Spans DataFrame
from phoenix.client import Client
client = Client()
# First, get spans
spans_df = client.spans.get_spans_dataframe(
    project_identifier="my-project",
    limit=500,
)
# Then, get annotations for those spans
annotations_df = client.spans.get_span_annotations_dataframe(
    spans_dataframe=spans_df,
    project_identifier="my-project",
)
print(f"Retrieved {len(annotations_df)} annotations for {len(spans_df)} spans")
Filter by Annotation Name
from phoenix.client import Client
client = Client()
# Only retrieve "relevance" and "correctness" annotations
df = client.spans.get_span_annotations_dataframe(
    span_ids=["span_001", "span_002"],
    project_identifier="my-project",
    include_annotation_names=["relevance", "correctness"],
)
Retrieve Annotations as a List
from phoenix.client import Client
client = Client()
annotations = client.spans.get_span_annotations(
    span_ids=["span_001", "span_002"],
    project_identifier="my-project",
)
# Each annotation is a typed dictionary; label and score live under "result"
for anno in annotations:
    print(
        f"Span {anno['span_id']}: "
        f"{anno['name']} = {anno['result'].get('label')} "
        f"(score: {anno['result'].get('score')})"
    )
Retrieve from Span Objects
from phoenix.client import Client
client = Client()
# Get spans as objects first
spans = client.spans.get_spans(
    project_identifier="my-project",
    limit=100,
)
# Pass span objects directly to get their annotations
annotations = client.spans.get_span_annotations(
    spans=spans,
    project_identifier="my-project",
    include_annotation_names=["quality"],
)