Principle:Arize ai Phoenix Batch Span Annotation

From Leeroopedia
Revision as of 17:59, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/Arize_ai_Phoenix_Batch_Span_Annotation.md)
Knowledge Sources
Domains: AI Observability, Batch Processing, Span Evaluation
Last Updated: 2026-02-14 00:00 GMT

Overview

Batch span annotation is the practice of submitting multiple structured quality assessments to an observability platform in a single operation, enabling efficient annotation of spans at scale.

Description

Individual span annotation suits one-at-a-time human review and real-time evaluation callbacks, but many practical workflows need to annotate hundreds or thousands of spans at once. Batch span annotation addresses this need by accepting a collection of annotations, given either as typed dictionaries or as a pandas DataFrame, and submitting it to the server in fixed-size chunks.

Key characteristics of batch annotation include:

  • Bulk Submission: Multiple annotations are sent in a single HTTP request, reducing network overhead compared to individual calls.
  • Chunked Processing: Large collections (especially DataFrames) are automatically split into manageable chunks (typically 100 rows) to avoid exceeding server limits and to provide incremental progress.
  • Flexible Input Formats: Annotations can be provided as iterables of typed dictionaries (SpanAnnotationData) for programmatic pipelines, or as pandas DataFrames for data-science-oriented workflows.
  • Synchronous and Asynchronous Modes: Batch operations can run in fire-and-forget (asynchronous) mode or in synchronous mode; the synchronous mode returns the IDs of all inserted annotations.
  • Document-Level Annotations: Beyond span-level annotations, batch operations also support annotating individual retrieved documents within a span (e.g., scoring each document in a RAG retrieval step).

Usage

Use batch span annotation when:

  • Running automated evaluation pipelines that produce annotations for an entire project or time window at once.
  • Importing pre-computed evaluation results from offline analysis (e.g., LLM judge scores computed in a notebook).
  • Processing DataFrames that combine span IDs with evaluation scores, labels, and explanations.
  • Annotating retrieved documents within retrieval spans to assess individual document relevance.
  • Migrating or backfilling annotations from an external system into Phoenix.

Theoretical Basis

Batch annotation follows the bulk write pattern common in database and API design. The core data structure is:

SpanAnnotationData = {
    "span_id": str,              # Target span
    "name": str,                 # Annotation dimension
    "annotator_kind": str,       # "HUMAN" | "LLM" | "CODE"
    "result": {                  # Assessment values
        "label": str?,
        "score": float?,
        "explanation": str?
    },
    "metadata": dict?,           # Optional key-value metadata
    "identifier": str?           # Optional dedup key
}
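The schema above can be written down as Python typed dictionaries. The following is a minimal sketch for illustration, not the library's own definition: the class names AnnotationResult and _RequiredFields and the helper make_annotation are hypothetical, and the choice of which fields are required follows the comments in the schema.

```python
from typing import Any, Dict, Optional, TypedDict


class AnnotationResult(TypedDict, total=False):
    # All three assessment values are optional in the schema.
    label: str
    score: float
    explanation: str


class _RequiredFields(TypedDict):
    span_id: str          # Target span
    name: str             # Annotation dimension
    annotator_kind: str   # "HUMAN" | "LLM" | "CODE"


class SpanAnnotationData(_RequiredFields, total=False):
    result: AnnotationResult
    metadata: Dict[str, Any]  # Optional key-value metadata
    identifier: str           # Optional dedup key


ANNOTATOR_KINDS = {"HUMAN", "LLM", "CODE"}


def make_annotation(
    span_id: str,
    name: str,
    annotator_kind: str,
    *,
    label: Optional[str] = None,
    score: Optional[float] = None,
    explanation: Optional[str] = None,
    metadata: Optional[Dict[str, Any]] = None,
    identifier: Optional[str] = None,
) -> SpanAnnotationData:
    """Hypothetical helper: build one annotation, omitting unset optional fields."""
    if annotator_kind not in ANNOTATOR_KINDS:
        raise ValueError(f"annotator_kind must be one of {sorted(ANNOTATOR_KINDS)}")
    result: AnnotationResult = {}
    if label is not None:
        result["label"] = label
    if score is not None:
        result["score"] = score
    if explanation is not None:
        result["explanation"] = explanation
    annotation: SpanAnnotationData = {
        "span_id": span_id,
        "name": name,
        "annotator_kind": annotator_kind,
        "result": result,
    }
    if metadata is not None:
        annotation["metadata"] = metadata
    if identifier is not None:
        annotation["identifier"] = identifier
    return annotation
```

A batch is then simply a list of such dictionaries, one per (span, annotation name) pair produced by the evaluation pipeline.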

The server processes each annotation using the same upsert semantics as individual annotations: the composite key (span_id, name, identifier) determines whether to insert a new record or update an existing one.
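The upsert rule can be illustrated with an in-memory dictionary keyed by the composite key. This is a sketch of the semantics only, not the server's implementation: the function upsert_all is hypothetical, and treating a missing identifier as an empty string is an assumption made for the example.

```python
from typing import Any, Dict, Iterable, Tuple

Key = Tuple[str, str, str]


def upsert_all(
    store: Dict[Key, Dict[str, Any]],
    annotations: Iterable[Dict[str, Any]],
) -> Tuple[int, int]:
    """Apply a batch to `store`; return (inserted, updated) counts."""
    inserted = updated = 0
    for annotation in annotations:
        # Assumption: an absent identifier behaves like an empty string.
        key = (
            annotation["span_id"],
            annotation["name"],
            annotation.get("identifier") or "",
        )
        if key in store:
            updated += 1   # same composite key: overwrite the existing record
        else:
            inserted += 1  # new composite key: create a record
        store[key] = annotation
    return inserted, updated
```

Under this rule, resubmitting a batch with the same keys overwrites earlier values instead of duplicating them, while a distinct identifier lets several annotations with the same name coexist on one span.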

For DataFrame inputs, the mapping from DataFrame columns to SpanAnnotationData fields is:

DataFrame Column          | SpanAnnotationData Field | Notes
span_id (column or index) | span_id                  | Required; can come from a column or the DataFrame index
name or annotation_name   | name                     | Required unless a global annotation_name is provided
annotator_kind            | annotator_kind           | Required unless a global annotator_kind is provided
label                     | result.label             | Optional
score                     | result.score             | Optional
explanation               | result.explanation       | Optional
metadata                  | metadata                 | Optional
identifier                | identifier               | Optional
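The column mapping can be sketched as a function over one row at a time. To stay self-contained, the example takes a plain mapping per row (for instance, one record from a DataFrame converted to dictionaries) plus the row's index value. The function row_to_annotation and the helper _present are hypothetical, and letting a global annotation_name or annotator_kind take precedence over the row's own value is an assumption, since the mapping only states when each is required.

```python
import math
from typing import Any, Dict, Mapping, Optional


def _present(value: Any) -> bool:
    """True unless the value is None or a float NaN (a common missing-value marker)."""
    return value is not None and not (isinstance(value, float) and math.isnan(value))


def row_to_annotation(
    row: Mapping[str, Any],
    *,
    index_value: Optional[Any] = None,
    annotation_name: Optional[str] = None,
    annotator_kind: Optional[str] = None,
) -> Dict[str, Any]:
    """Map one row to a SpanAnnotationData-shaped dictionary."""
    # span_id: prefer the column, fall back to the index value.
    span_id = row.get("span_id") if _present(row.get("span_id")) else index_value
    if not _present(span_id):
        raise ValueError("span_id must come from a column or the index")

    # name: global value first (assumption), then the two accepted columns.
    name = annotation_name
    if name is None:
        for column in ("name", "annotation_name"):
            if _present(row.get(column)):
                name = row[column]
                break
    if name is None:
        raise ValueError("name is required unless annotation_name is given")

    kind = annotator_kind if annotator_kind is not None else row.get("annotator_kind")
    if not _present(kind):
        raise ValueError("annotator_kind is required unless given globally")

    annotation: Dict[str, Any] = {
        "span_id": str(span_id),
        "name": str(name),
        "annotator_kind": str(kind),
        "result": {},
    }
    # label, score, and explanation are nested under "result".
    for field in ("label", "score", "explanation"):
        if _present(row.get(field)):
            annotation["result"][field] = row[field]
    # metadata and identifier stay at the top level.
    for field in ("metadata", "identifier"):
        if _present(row.get(field)):
            annotation[field] = row[field]
    return annotation
```

Skipping missing cells rather than sending them as nulls keeps each payload limited to the fields the evaluator actually produced.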

The chunking strategy (100 rows per request) balances throughput against server memory pressure and request timeout constraints.
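The chunking step can be sketched with the standard library alone. In this illustration the names chunked and submit_in_chunks are hypothetical, and send stands in for whatever callable performs one HTTP request; only the default size of 100 comes from the text above.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def submit_in_chunks(
    annotations: Iterable[T],
    send: Callable[[List[T]], None],
    size: int = 100,
) -> int:
    """Call `send` once per chunk; return the number of annotations sent."""
    sent = 0
    for chunk in chunked(annotations, size):
        send(chunk)  # one request per chunk
        sent += len(chunk)
    return sent
```

Because the input is consumed lazily, a generator of annotations never has to be held in memory all at once, and a failure partway through leaves the earlier chunks already submitted.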
