Implementation:Arize ai Phoenix Evals Utils

From Leeroopedia
Revision as of 12:03, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Arize_ai_Phoenix_Evals_Utils.md)

Overview

The Evals Utils module provides utility decorators, input remapping functions, data formatting helpers, and re-exports of legacy utility symbols for the Phoenix evaluator framework. It resides at phoenix.evals.utils and serves as the primary utility layer for both the new evaluator framework (evals 2.0) and backward compatibility with the legacy system (evals 1.0). The module handles deprecated API pattern migration, JSONPath-based data extraction, evaluation input remapping, and annotation dataframe formatting.

Description

The module contains several functional areas:

Deprecation Decorators

  • _deprecate_positional_args(func_name): A decorator factory that issues a DeprecationWarning whenever the decorated function is called with positional arguments. Used to migrate APIs from positional to keyword-only arguments.
  • _deprecate_source_and_heuristic(func): A decorator that handles the migration from the deprecated source parameter to kind, and from the deprecated "heuristic" kind value to "code". Prevents silent override when both source and kind are provided with differing values.
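The two migration patterns described above can be sketched with the standard library alone. This is a minimal illustration, not the Phoenix source: the names deprecate_positional_args, deprecate_source_and_heuristic, and make_score are hypothetical, the warning messages are invented, and raising ValueError on conflicting source and kind values is an assumption (the description only says a silent override is prevented).

```python
import functools
import warnings
from typing import Any, Callable


def deprecate_positional_args(func_name: str) -> Callable:
    """Hypothetical stand-in: warn when the wrapped function gets positional args."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            if args:
                warnings.warn(
                    f"Positional arguments to {func_name} are deprecated; "
                    "use keyword arguments instead.",
                    DeprecationWarning,
                    stacklevel=2,
                )
            return func(*args, **kwargs)

        return wrapper

    return decorator


def deprecate_source_and_heuristic(func: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical stand-in: migrate source -> kind and "heuristic" -> "code"."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        if "source" in kwargs:
            source = kwargs.pop("source")
            warnings.warn("'source' is deprecated; use 'kind'.", DeprecationWarning, stacklevel=2)
            kind = kwargs.get("kind")
            # Assumption: differing values are rejected rather than silently overridden.
            if kind is not None and kind != source:
                raise ValueError(f"Conflicting values: source={source!r}, kind={kind!r}")
            kwargs["kind"] = source
        if kwargs.get("kind") == "heuristic":
            warnings.warn("kind='heuristic' is deprecated; use 'code'.", DeprecationWarning, stacklevel=2)
            kwargs["kind"] = "code"
        return func(*args, **kwargs)

    return wrapper


@deprecate_positional_args("make_score")
@deprecate_source_and_heuristic
def make_score(name: str, kind: str = "llm") -> dict:
    # Toy function used only to exercise the two decorators.
    return {"name": name, "kind": kind}
```

In this sketch, calling make_score("accuracy", source="heuristic") still succeeds and yields kind "code", while emitting one DeprecationWarning for each deprecated pattern it hits.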

Input Remapping

  • _bind_mapping_function(mapping_function, eval_input): Binds evaluation input values to a mapping function's parameters by name. A function with zero or one parameter receives the entire eval_input dict (legacy behavior); a function with two or more parameters receives individual values matched by parameter name.
  • remap_eval_input(eval_input, required_fields, input_mapping): The core input remapping function that transforms evaluation inputs according to a mapping specification. Supports:
    • String mappings: Direct key lookups or JSONPath expressions
    • Callable mappings: Custom extraction functions
    • Pass-through: Unmapped keys from the input are passed through to the output
    • Required field validation: Ensures all required fields are present
  • extract_with_jsonpath(data, path, match_all): Extracts values from nested JSON structures using the jsonpath-ng library. Supports both single-match and multi-match modes.
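The name-based binding rule described for _bind_mapping_function can be sketched with inspect.signature. The function bind_mapping_function below is a hypothetical reimplementation for illustration only; in particular, silently skipping parameters that are absent from eval_input is an assumption, not documented behavior.

```python
import inspect
from typing import Any, Callable, Mapping


def bind_mapping_function(
    mapping_function: Callable[..., Any], eval_input: Mapping[str, Any]
) -> Any:
    """Hypothetical sketch of binding eval_input values by parameter name."""
    parameters = inspect.signature(mapping_function).parameters
    if len(parameters) <= 1:
        # Legacy behavior: the function receives the whole input mapping.
        return mapping_function(eval_input)
    # Two or more parameters: pass each value whose key matches a parameter name.
    # Assumption: parameters missing from eval_input are left to their defaults.
    kwargs = {name: eval_input[name] for name in parameters if name in eval_input}
    return mapping_function(**kwargs)


eval_input = {"question": "What?", "context": "Paris is...", "extra": 1}

# One parameter: the lambda sees the full mapping.
keys = bind_mapping_function(lambda row: sorted(row), eval_input)

# Two parameters: values are matched by name, and "extra" is ignored.
combined = bind_mapping_function(
    lambda question, context: f"Q: {question}\nC: {context}", eval_input
)
```

Matching by name keeps multi-parameter extractors readable, since they declare exactly the fields they depend on instead of indexing into a dict.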

Annotation Data Formatting

  • _merge_metadata_with_direction(score_data): Merges score metadata with a direction field from score data.
  • _format_score_data(dataframe, span_id_cols, score_name, score_display_name): Parses JSON-encoded score columns, extracts score/label/explanation/kind fields, infers the annotator kind (LLM, CODE, or HUMAN), and formats the data for Phoenix annotation logging.
  • to_annotation_dataframe(dataframe, score_names): Top-level function that converts evaluation results into a Phoenix-compatible annotation dataframe. Auto-detects score columns (those ending with _score) and span ID columns, handles both column-based and index-based span IDs, and concatenates results across multiple score names.
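The formatting steps above can be illustrated without pandas by treating each row as a plain dict. This is a simplified sketch under several assumptions: to_annotation_rows is a hypothetical name, the JSON payload shape is invented for the example, the annotation name is derived by stripping the _score suffix, and the annotator kind is taken to be the upper-cased kind field. The real functions operate on DataFrames and may differ in all of these details.

```python
import json
from typing import Any, Dict, List


def to_annotation_rows(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical, pandas-free sketch of the annotation formatting steps."""
    annotations: List[Dict[str, Any]] = []
    for row in rows:
        for column, raw in row.items():
            # Auto-detect score columns by their "_score" suffix.
            if not column.endswith("_score") or raw is None:
                continue
            data = json.loads(raw) if isinstance(raw, str) else raw
            # Merge the score's direction into its metadata.
            metadata = dict(data.get("metadata") or {})
            if "direction" in data:
                metadata["direction"] = data["direction"]
            annotations.append(
                {
                    "span_id": row["span_id"],
                    "annotation_name": column[: -len("_score")],  # assumption
                    "score": data.get("score"),
                    "label": data.get("label"),
                    "explanation": data.get("explanation"),
                    "metadata": metadata,
                    "annotator_kind": str(data.get("kind", "")).upper(),  # assumption
                }
            )
    return annotations


# Made-up evaluation result with one JSON-encoded score column.
example_rows = [
    {
        "span_id": "abc123",
        "input": "not a score column",
        "hallucination_score": json.dumps(
            {
                "score": 1.0,
                "label": "factual",
                "explanation": "Supported by context.",
                "kind": "llm",
                "direction": "maximize",
                "metadata": {"model": "demo"},
            }
        ),
    }
]
annotations = to_annotation_rows(example_rows)
```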

Type Definitions

  • InputMappingType: Type alias Optional[Mapping[str, Union[str, Callable[[Mapping[str, Any]], Any]]]] used for evaluation input mapping specifications.
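As a small illustration of what a value of this type looks like, the snippet below redeclares the alias locally (copied from the definition above so the example is self-contained) and builds a mapping that mixes the three kinds of entries. The field names and the JSONPath string are made up for the example.

```python
from typing import Any, Callable, Mapping, Optional, Union

# Local copy of the alias, so this snippet runs without phoenix installed.
InputMappingType = Optional[Mapping[str, Union[str, Callable[[Mapping[str, Any]], Any]]]]

example_mapping: InputMappingType = {
    "query": "question",                                 # direct key lookup
    "answer": "response.choices[0].text",                # JSONPath expression
    "context": lambda row: "\n".join(row["documents"]),  # callable extractor
}

# None is also valid, meaning "no remapping".
no_mapping: InputMappingType = None
```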

Legacy Re-Exports

The module re-exports symbols from phoenix.evals.legacy.utils for backward compatibility, including: NOT_PARSABLE, SUPPORTED_AUDIO_FORMATS, SUPPORTED_IMAGE_FORMATS, snap_to_rail, printif, parse_openai_function_call, openai_function_call_kwargs, get_tqdm_progress_bar_formatter, get_image_format_from_base64, get_audio_format_from_base64, emoji_guard, and download_benchmark_dataset.

Usage

from phoenix.evals.utils import remap_eval_input

# Simple key remapping
eval_input = {"context": "Paris is the capital of France.", "question": "What is the capital?"}
remapped = remap_eval_input(
    eval_input=eval_input,
    required_fields={"input_text", "query"},
    input_mapping={"input_text": "context", "query": "question"},
)
# remapped["input_text"] == "Paris is the capital of France."
# remapped["query"] == "What is the capital?"

# JSONPath extraction
from phoenix.evals.utils import extract_with_jsonpath

data = {"response": {"choices": [{"text": "Hello"}]}}
value = extract_with_jsonpath(data, "response.choices[0].text")
# value == "Hello"

# Converting eval results to an annotation dataframe
# (eval_results_df is assumed to be the DataFrame returned by evaluate_dataframe())
from phoenix.evals.utils import to_annotation_dataframe

annotations = to_annotation_dataframe(eval_results_df, score_names=["hallucination"])
# Can then be logged: client.spans.log_span_annotations_dataframe(dataframe=annotations)

Code Reference

| Symbol | Kind | Location | Lines |
|---|---|---|---|
| InputMappingType | Type Alias | packages/phoenix-evals/src/phoenix/evals/utils.py | 29 |
| _deprecate_positional_args() | Decorator Factory | packages/phoenix-evals/src/phoenix/evals/utils.py | 32-57 |
| _deprecate_source_and_heuristic() | Decorator | packages/phoenix-evals/src/phoenix/evals/utils.py | 60-105 |
| _bind_mapping_function() | Function | packages/phoenix-evals/src/phoenix/evals/utils.py | 109-146 |
| remap_eval_input() | Function | packages/phoenix-evals/src/phoenix/evals/utils.py | 149-246 |
| extract_with_jsonpath() | Function | packages/phoenix-evals/src/phoenix/evals/utils.py | 249-269 |
| _format_score_data() | Function | packages/phoenix-evals/src/phoenix/evals/utils.py | 298-382 |
| to_annotation_dataframe() | Function | packages/phoenix-evals/src/phoenix/evals/utils.py | 385-466 |
| default_tqdm_progress_bar_formatter() | Function | packages/phoenix-evals/src/phoenix/evals/utils.py | 469-481 |

I/O Contract

remap_eval_input()

| Direction | Name | Type | Description |
|---|---|---|---|
| Input | eval_input | Mapping[str, Any] | The evaluation input dictionary to remap |
| Input | required_fields | Set[str] | Required field names that must be present in the output |
| Input | input_mapping | Optional[InputMappingType] | Mapping specification: field name to source key (str) or extraction function (callable) |
| Output | - | Dict[str, Any] | Remapped dictionary with required fields populated and unmapped keys passed through |
| Raises | - | ValueError | If a required field is missing from the input |
| Raises | - | TypeError | If a mapping value is not a string or callable |
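The contract above can be made concrete with a simplified, standard-library-only sketch. remap_sketch is a hypothetical name and not the Phoenix implementation: it handles plain key lookups and callables but no JSONPath, it calls every callable with the whole input mapping, and it assumes that "passed through" means any input key not already present in the output is copied over.

```python
from typing import Any, Callable, Dict, Mapping, Optional, Set, Union

Extractor = Union[str, Callable[[Mapping[str, Any]], Any]]


def remap_sketch(
    eval_input: Mapping[str, Any],
    required_fields: Set[str],
    input_mapping: Optional[Mapping[str, Extractor]] = None,
) -> Dict[str, Any]:
    """Hypothetical sketch of the remapping contract (no JSONPath support)."""
    mapping = input_mapping or {}
    remapped: Dict[str, Any] = {}
    for field in set(required_fields) | set(mapping):
        # A field without an explicit mapping is looked up under its own name.
        extractor = mapping.get(field, field)
        if callable(extractor):
            remapped[field] = extractor(eval_input)
        elif isinstance(extractor, str):
            if extractor in eval_input:
                remapped[field] = eval_input[extractor]
            elif field in required_fields:
                raise ValueError(f"Required field {field!r} is missing from the input")
            # Assumption: optional mapped fields with no source are skipped.
        else:
            raise TypeError(f"Mapping for {field!r} must be a string or callable")
    # Assumption: input keys not already in the output are passed through.
    for key, value in eval_input.items():
        remapped.setdefault(key, value)
    return remapped


result = remap_sketch(
    eval_input={"context": "Paris", "question": "Capital?"},
    required_fields={"input_text", "query"},
    input_mapping={"input_text": "context", "query": "question"},
)
```

Validating required fields inside the remapping step means an evaluator fails early with a clear message, rather than later with a KeyError inside a prompt template.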

to_annotation_dataframe()

| Direction | Name | Type | Description |
|---|---|---|---|
| Input | dataframe | pd.DataFrame | DataFrame returned by evaluate_dataframe() with score columns ending in _score |
| Input | score_names | Optional[List[str]] | Score names to process. If None, auto-detects all _score columns. |
| Output | - | pd.DataFrame | Annotation dataframe with columns: span_id(s), score, label, explanation, metadata, annotation_name, annotator_kind |
| Raises | - | ValueError | If no column containing span_id is found, or if a score column is missing |

extract_with_jsonpath()

| Direction | Name | Type | Description |
|---|---|---|---|
| Input | data | Mapping[str, Any] | The data dictionary to extract from |
| Input | path | str | A JSONPath expression |
| Input | match_all | bool | If True, return all matches as a list; by default, return only the first match |
| Output | - | Any | The extracted value (or list of values if match_all=True) |
| Raises | - | ValueError | If the JSONPath matches no elements |
| Raises | - | JsonPathParserError | If the JSONPath expression has invalid syntax |

Usage Examples

Callable Input Mapping

from phoenix.evals.utils import remap_eval_input

# Using a callable to compute derived fields
eval_input = {"documents": ["doc1", "doc2", "doc3"]}
remapped = remap_eval_input(
    eval_input=eval_input,
    required_fields={"context"},
    # A single-parameter callable receives the entire eval_input mapping
    input_mapping={"context": lambda inputs: "\n".join(inputs["documents"])},
)
# remapped = {"context": "doc1\ndoc2\ndoc3", "documents": [...]}

Multi-Parameter Mapping Functions

from phoenix.evals.utils import remap_eval_input

# Functions with two or more named parameters are bound by name from eval_input
def combine_fields(question: str, context: str) -> str:
    return f"Q: {question}\nC: {context}"

eval_input = {"question": "What?", "context": "Paris is..."}
remapped = remap_eval_input(
    eval_input=eval_input,
    required_fields={"combined"},
    input_mapping={"combined": combine_fields},
)
# remapped["combined"] == "Q: What?\nC: Paris is..."

Annotation Dataframe Pipeline

from phoenix.client import Client
from phoenix.evals.utils import to_annotation_dataframe

# Full pipeline: evaluate, then log annotations.
# results_df is assumed to be the DataFrame returned by evaluate_dataframe().
annotations = to_annotation_dataframe(results_df, score_names=["hallucination", "relevance"])
client = Client()
client.spans.log_span_annotations_dataframe(dataframe=annotations)
