Workflow: Arize AI Phoenix Prompt Management Pipeline
| Knowledge Sources | |
|---|---|
| Domains | AI_Observability, Prompt_Engineering, LLM_Ops |
| Last Updated | 2026-02-14 06:00 GMT |
Overview
End-to-end process for creating, versioning, tagging, and retrieving LLM prompts using the Phoenix prompt management system.
Description
This workflow covers the complete prompt management lifecycle in Phoenix. Prompts are versioned, collaborative entities that store message templates, model configurations, and tool definitions. The system supports both f-string and Mustache template formats with variable interpolation. Each prompt maintains a linear revision history (similar to git commits), and versions can be tagged with human-readable labels (e.g., "production", "staging") for environment-based retrieval. Prompts can be created and managed via the Phoenix UI or the Python client SDK.
Key capabilities:
- Version-controlled prompt templates with linear revision history
- Support for Chat and String prompt types
- F-string and Mustache template variable formats
- Optional model configuration (name, provider, parameters) stored alongside templates
- Tool definitions and structured output schemas
- Human-readable tags for environment-based versioning (e.g., "dev", "prod")
- Client-side caching for resilient production usage
- Analytics tracking (usage counts, linked traces)
Usage
Execute this workflow when you need a centralized, version-controlled system for managing LLM prompts. Common scenarios include: collaborating on prompt development with a team, managing prompt versions across development and production environments, tracking which prompt versions are used in production traces, and systematically iterating on prompts using the Phoenix playground and experiments.
Execution Steps
Step 1: Create a Prompt
Create a new prompt in Phoenix with a human-readable name, template content, and optional model configuration. The template can contain variables in either f-string format (e.g., {name}) or Mustache format (e.g., {{name}}). For Chat prompt types, the template is a list of messages with roles and content.
Key considerations:
- Choose a descriptive, unique name that serves as the prompt identifier
- Select the appropriate prompt type: Chat (most common) or String
- Define template variables that will be filled at runtime
- Optionally associate a model configuration (model name, provider, temperature, max_tokens)
- Optionally define tool schemas and structured output schemas
- The first version is automatically created when the prompt is created
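The step above can be sketched with the Phoenix Python client. This is a minimal sketch, assuming the `arize-phoenix-client` package is installed and a Phoenix server is reachable; the `client.prompts.create(...)` call shape and the prompt name `topic-summarizer` are assumptions for illustration, not verbatim from this document.

```python
# Sketch: creating a Chat prompt with an f-string template.
# Assumption: create() accepts a name plus a PromptVersion built from
# role/content messages, per the Phoenix client SDK.

def build_template(system_text: str, user_text: str) -> list[dict]:
    """Assemble a Chat prompt template as role/content messages."""
    return [
        {"role": "system", "content": system_text},
        {"role": "user", "content": user_text},
    ]

# F-string variables like {topic} are filled at render time, not here.
template = build_template(
    "You are a concise assistant.",
    "Summarize the following topic in two sentences: {topic}",
)

def create_prompt(template: list[dict]):
    # Imported lazily so the sketch stays importable without the SDK.
    from phoenix.client import Client
    from phoenix.client.types import PromptVersion

    client = Client()  # connects to the Phoenix server
    return client.prompts.create(
        name="topic-summarizer",       # unique, human-readable identifier
        version=PromptVersion(
            template,
            model_name="gpt-4o-mini",  # optional model configuration
        ),
    )
```

Creating the prompt also creates its first version, so no separate "initial commit" call is needed.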
Step 2: Iterate and Version
Make changes to the prompt and save new versions. Each save creates a new revision in the linear history, allowing you to track changes over time and revert to previous versions if needed. Use the Phoenix playground to test prompt variations against different models and inputs.
Key considerations:
- Each modification creates a new version (commit) in the linear history
- Previous versions remain accessible by version ID
- The playground allows testing prompts against multiple models and datasets
- Compare output quality between versions before promoting changes
- Prompt templates can be forked to create independent copies with shared history
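A new revision can be saved the same way the prompt was created. A minimal sketch, assuming (per the description of linear history above) that creating a prompt under an existing name appends a new version to that prompt's history; the helper function and prompt name are illustrative.

```python
# Sketch: producing a revised template and saving it as a new version
# in the same prompt's linear history.

def revise_template(template: list[dict], new_user_text: str) -> list[dict]:
    """Return a copy of the template with the last (user) message replaced."""
    revised = [dict(m) for m in template]  # shallow-copy each message
    revised[-1]["content"] = new_user_text
    return revised

v1 = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize: {topic}"},
]
v2 = revise_template(v1, "Summarize {topic} in exactly three bullet points.")

def save_new_version(template: list[dict]):
    from phoenix.client import Client
    from phoenix.client.types import PromptVersion

    client = Client()
    # Same name as the existing prompt -> new version, same history.
    return client.prompts.create(
        name="topic-summarizer",
        version=PromptVersion(template, model_name="gpt-4o-mini"),
    )
```

Keeping `v1` untouched mirrors the version model: earlier revisions remain retrievable after a new one is saved.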
Step 3: Tag Versions
Apply human-readable tags to specific prompt versions to create named references. Tags provide stable identifiers that remain constant even as new versions are created, making them ideal for environment-based deployment (e.g., "production" always points to the vetted version).
Key considerations:
- Common tag patterns: "dev", "staging", "production", "latest"
- Tags can be moved to point to different versions as you promote changes
- Retrieving a prompt by tag returns the tagged version regardless of newer versions
- Tags provide a safety mechanism for production systems: update the tag only after validation
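Tag promotion can be wrapped in a small helper. This is a hedged sketch: the `client.prompts.tags.create(...)` call is an assumption modeled on the Phoenix client SDK, and the version id is a placeholder.

```python
# Sketch: promoting a validated prompt version by (re)pointing a tag at it.
# Assumption: creating a tag with an existing name against a new version
# moves that tag, which is how promotion works in this workflow.

PRODUCTION_TAG = "production"

def promote(version_id: str, tag: str = PRODUCTION_TAG) -> None:
    from phoenix.client import Client

    client = Client()
    client.prompts.tags.create(
        prompt_version_id=version_id,
        name=tag,
        description="Vetted in staging before promotion",
    )

# Usage (after validating the version in staging):
# promote("version-123")  # "version-123" is a placeholder id
```

Because retrieval by tag ignores newer untagged versions, running `promote` only after validation is what makes the tag a safety mechanism.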
Step 4: Retrieve Prompts in Code
Use the Phoenix client SDK to retrieve prompt versions programmatically. Prompts can be fetched by name (latest version), by specific version ID, or by tag. The client supports caching for resilient production usage.
Key considerations:
- Retrieve by name: client.prompts.get(prompt_identifier="my-prompt") returns latest version
- Retrieve by tag: client.prompts.get(prompt_identifier="my-prompt", tag="production")
- Retrieve by version: client.prompts.get(prompt_version_id="version-123")
- Client-side caching provides fallback when the server is temporarily unavailable
- The returned PromptVersion object contains the template, model config, and metadata
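The three retrieval modes above can be sketched together. The call shapes mirror the examples in this step; the prompt name and version id are placeholders, and a running Phoenix server is assumed.

```python
# Sketch: the three ways to fetch a PromptVersion with the Phoenix client.

def fetch_prompts():
    from phoenix.client import Client

    client = Client()

    # 1. By name: returns the latest version.
    latest = client.prompts.get(prompt_identifier="topic-summarizer")

    # 2. By tag: returns whichever version the tag currently points to.
    prod = client.prompts.get(
        prompt_identifier="topic-summarizer",
        tag="production",
    )

    # 3. By version id: pins an exact revision ("version-123" is a placeholder).
    pinned = client.prompts.get(prompt_version_id="version-123")

    return latest, prod, pinned
```

Production code typically uses mode 2 so that deployments follow the tag, while experiments and debugging use mode 3 to pin an exact revision.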
Step 5: Use Prompts in Applications
Integrate the retrieved prompt template into your LLM application. Render the template by filling in variable values, then pass the rendered messages to the LLM. Phoenix automatically tracks prompt usage through trace context attributes, linking traces back to the specific prompt version used.
Key considerations:
- F-string templates are rendered using Python string formatting
- Mustache templates support conditionals, loops, and nested keys for complex rendering
- The rendered prompt is passed to the LLM provider as chat messages or a completion prompt
- Set context attributes on spans to link traces to the prompt version for analytics
- Monitor prompt performance over time through linked trace metrics
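The f-string rendering described above can be shown with plain Python. A standalone sketch: the real SDK's PromptVersion object handles this rendering itself, but the function below demonstrates the same substitution with standard string formatting and no dependencies.

```python
# Minimal sketch of rendering an f-string Chat template locally.

def render_fstring(template: list[dict], variables: dict) -> list[dict]:
    """Fill {variable} placeholders in every message's content."""
    return [
        {"role": m["role"], "content": m["content"].format(**variables)}
        for m in template
    ]

template = [
    {"role": "system", "content": "You are a concise assistant."},
    {
        "role": "user",
        "content": "Summarize the following topic in two sentences: {topic}",
    },
]

messages = render_fstring(template, {"topic": "prompt versioning"})
# `messages` is now a plain role/content list, ready to pass to a
# chat-completions API as the request messages.
```

Note that `str.format` treats every `{...}` as a placeholder, which is why Mustache (with its `{{...}}` syntax plus conditionals and loops) is the better choice when prompt content itself contains literal braces or needs complex rendering.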