Principle:Arize ai Phoenix Batch Span Annotation

Knowledge Sources	Phoenix Batch Annotations API
Domains	AI Observability, Batch Processing, Span Evaluation
Last Updated	2026-02-14 00:00 GMT

Overview

Batch span annotation is the practice of submitting multiple structured quality assessments to an observability platform in a single operation, enabling efficient annotation of spans at scale.

Description

While individual span annotation suits one-at-a-time human review or real-time evaluation callbacks, many practical workflows need to annotate hundreds or thousands of spans at once. Batch span annotation addresses this need by accepting collections of annotations -- either as typed dictionaries or as pandas DataFrames -- and submitting them to the server in chunks.

Key characteristics of batch annotation include:

Bulk Submission: Multiple annotations are sent in a single HTTP request, reducing network overhead compared to individual calls.
Chunked Processing: Large collections (especially DataFrames) are automatically split into manageable chunks (typically 100 rows) to avoid exceeding server limits and to provide incremental progress.
Flexible Input Formats: Annotations can be provided as iterables of typed dictionaries (SpanAnnotationData) for programmatic pipelines, or as pandas DataFrames for data-science-oriented workflows.
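Synchronous and Asynchronous Modes: Batch operations support two modes. In asynchronous (fire-and-forget) mode, the call returns without waiting for the annotations to be inserted. In synchronous mode, the call waits for processing to finish and returns the IDs of all inserted annotations.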
Document-Level Annotations: Beyond span-level annotations, batch operations also support annotating individual retrieved documents within a span (e.g., scoring each document in a RAG retrieval step).

Usage

Use batch span annotation when:

Running automated evaluation pipelines that produce annotations for an entire project or time window at once.
Importing pre-computed evaluation results from offline analysis (e.g., LLM judge scores computed in a notebook).
Processing DataFrames that combine span IDs with evaluation scores, labels, and explanations.
Annotating retrieved documents within retrieval spans to assess individual document relevance.
Migrating or backfilling annotations from an external system into Phoenix.

Theoretical Basis

Batch annotation follows the bulk write pattern common in database and API design. The core data structure is:

SpanAnnotationData = {
    "span_id": str,              # Target span
    "name": str,                 # Annotation dimension
    "annotator_kind": str,       # "HUMAN" | "LLM" | "CODE"
    "result": {                  # Assessment values
        "label": str?,
        "score": float?,
        "explanation": str?
    },
    "metadata": dict?,           # Optional key-value metadata
    "identifier": str?           # Optional dedup key
}

The server processes each annotation using the same upsert semantics as individual annotations: the composite key (span_id, name, identifier) determines whether to insert a new record or update an existing one.
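The upsert rule can be illustrated with a small in-memory model. This is a minimal sketch, not the server's implementation: the `store` dictionary and the `upsert_key` and `upsert_all` helpers are hypothetical, and treating a missing `identifier` as the empty string is an assumption made for the example.

```python
from typing import Any, Dict, Iterable, Tuple

# Hypothetical composite key: (span_id, name, identifier).
Key = Tuple[str, str, str]


def upsert_key(annotation: Dict[str, Any]) -> Key:
    """Build the composite key that decides insert vs. update."""
    # Assumption: an absent or empty identifier is treated as "".
    identifier = annotation.get("identifier") or ""
    return (annotation["span_id"], annotation["name"], identifier)


def upsert_all(
    store: Dict[Key, Dict[str, Any]],
    annotations: Iterable[Dict[str, Any]],
) -> Dict[str, int]:
    """Apply a batch to the store and count inserts and updates."""
    counts = {"inserted": 0, "updated": 0}
    for annotation in annotations:
        key = upsert_key(annotation)
        counts["updated" if key in store else "inserted"] += 1
        store[key] = annotation  # a later record replaces an earlier one
    return counts
```

In this model, two annotations that share `span_id` and `name` and both omit `identifier` collapse into one record, while giving one of them a distinct `identifier` keeps both.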

For DataFrame inputs, the mapping from DataFrame columns to SpanAnnotationData fields is:

DataFrame Column	SpanAnnotationData Field	Notes
`span_id` (column or index)	`span_id`	Required; can come from column or DataFrame index
`name` or `annotation_name`	`name`	Required unless global `annotation_name` is provided
`annotator_kind`	`annotator_kind`	Required unless global `annotator_kind` is provided
`label`	`result.label`	Optional
`score`	`result.score`	Optional
`explanation`	`result.explanation`	Optional
`metadata`	`metadata`	Optional
`identifier`	`identifier`	Optional
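The column mapping in the table can be sketched as a function over one row at a time. This is an illustrative sketch rather than the library's code: `row_to_annotation` is a hypothetical helper, rows are plain dictionaries (for a DataFrame they could come from `df.reset_index().to_dict("records")`), and letting a global `annotation_name` or `annotator_kind` take precedence over the per-row column is an assumption.

```python
from typing import Any, Dict, Optional

RESULT_FIELDS = ("label", "score", "explanation")


def _is_missing(value: Any) -> bool:
    """True for None and for NaN (NaN is the only value unequal to itself)."""
    return value is None or value != value


def row_to_annotation(
    row: Dict[str, Any],
    annotation_name: Optional[str] = None,
    annotator_kind: Optional[str] = None,
) -> Dict[str, Any]:
    """Map one row (column name -> value) to a SpanAnnotationData-shaped dict."""
    span_id = row.get("span_id")
    if _is_missing(span_id):
        raise ValueError("span_id is required")

    # Assumption: a global value, when given, wins over the row's column.
    name = annotation_name
    if name is None:
        name = row.get("name", row.get("annotation_name"))
    if _is_missing(name):
        raise ValueError("name is required unless annotation_name is given")

    kind = annotator_kind if annotator_kind is not None else row.get("annotator_kind")
    if _is_missing(kind):
        raise ValueError("annotator_kind is required unless given globally")

    annotation: Dict[str, Any] = {
        "span_id": str(span_id),
        "name": str(name),
        "annotator_kind": str(kind),
    }
    # Optional result fields are copied only when present and not NaN.
    result = {f: row[f] for f in RESULT_FIELDS if f in row and not _is_missing(row[f])}
    if result:
        annotation["result"] = result
    for field in ("metadata", "identifier"):
        if field in row and not _is_missing(row[field]):
            annotation[field] = row[field]
    return annotation
```

Skipping NaN values matters for DataFrame inputs, where an empty optional cell typically surfaces as NaN rather than as a missing key.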

The chunking strategy (100 rows per request) balances throughput against server memory pressure and request timeout constraints.
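As a rough illustration of the chunking step, the sketch below splits any iterable into lists of at most 100 items. The `chunked` helper is hypothetical and is not the library's internal function; each yielded list stands in for the payload of one request.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return  # input exhausted
        yield chunk
```

With this scheme, 250 annotations produce three requests of 100, 100, and 50 items, that is, ceil(250 / 100) requests in total.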

Related Pages

Implemented By

Implementation:Arize_ai_Phoenix_Log_Span_Annotations
