Implementation:Mlflow Mlflow Load Prompt

From Leeroopedia
Revision as of 13:18, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Mlflow_Mlflow_Load_Prompt.md)
Knowledge Sources
Domains ML_Ops, Prompt_Engineering
Last Updated 2026-02-13 20:00 GMT

Overview

Concrete tool for loading a versioned prompt from the MLflow Prompt Registry and formatting its template with variable substitution, provided by the MLflow library.

Description

The mlflow.genai.load_prompt() function retrieves a specific prompt version from the MLflow Prompt Registry. It supports multiple addressing modes: by name with an explicit version, by URI (prompts:/name/version), by alias URI (prompts:/name@alias), or by the special @latest alias. The function returns a PromptVersion entity whose format() method performs variable substitution to produce the final prompt text.
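
The addressing modes above can be distinguished from the argument values alone. The following is a minimal sketch of that classification, not MLflow's own parser: PromptRef and parse_prompt_ref are hypothetical names introduced for illustration, and the sketch assumes prompt names contain neither "/" nor "@".

```python
from dataclasses import dataclass
from typing import Optional, Union

PREFIX = "prompts:/"


@dataclass(frozen=True)
class PromptRef:
    """Hypothetical container for a resolved prompt reference."""
    name: str
    version: Optional[int] = None
    alias: Optional[str] = None


def parse_prompt_ref(
    name_or_uri: str, version: Optional[Union[int, str]] = None
) -> PromptRef:
    """Illustrative only: classify a prompt reference by addressing mode."""
    if not name_or_uri.startswith(PREFIX):
        # Plain name: the version has to be supplied separately.
        if version is None:
            raise ValueError("a plain name needs an explicit version")
        return PromptRef(name=name_or_uri, version=int(version))

    if version is not None:
        raise ValueError("do not pass version together with a URI")

    body = name_or_uri[len(PREFIX):]
    if "@" in body:
        # prompts:/name@alias (including the special @latest alias)
        name, alias = body.split("@", 1)
        return PromptRef(name=name, alias=alias)

    # prompts:/name/version
    name, sep, ver = body.partition("/")
    if not sep:
        raise ValueError("URI must end in /<version> or @<alias>")
    return PromptRef(name=name, version=int(ver))
```

In this sketch, an alias reference leaves version unset because the alias is resolved to a concrete version only when the registry is queried.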

The load_prompt() function caches loaded prompts with a configurable TTL. For alias-based loads, the default TTL is 60 seconds (controlled by the MLFLOW_ALIAS_PROMPT_CACHE_TTL_SECONDS environment variable). For version-based loads, there is no default TTL, so cached entries do not expire, because a specific version is immutable. Setting cache_ttl_seconds=0 bypasses the cache entirely.
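
The three TTL behaviours described above (a finite TTL, no TTL, and a TTL of 0) can be sketched with a small cache. This is an illustration of the semantics only, not MLflow's internal cache: TTLCache and get_or_load are hypothetical names, and treating a TTL of None as "never expires" is an assumption drawn from the description.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Illustrative TTL cache; not MLflow's internal implementation."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (value, expiry timestamp or None for "never expires")
        self._entries: Dict[str, Tuple[Any, Optional[float]]] = {}

    def get_or_load(
        self, key: str, loader: Callable[[], Any], ttl_seconds: Optional[float]
    ) -> Any:
        # A TTL of 0 bypasses the cache: always load, never store.
        if ttl_seconds == 0:
            return loader()

        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None:
            value, expires_at = hit
            if expires_at is None or now < expires_at:
                return value

        # Miss or expired entry: reload and store with a fresh expiry.
        value = loader()
        expires_at = None if ttl_seconds is None else now + ttl_seconds
        self._entries[key] = (value, expires_at)
        return value
```

The injectable clock is a design choice that keeps the sketch testable without sleeping; a real caller would rely on the default time.monotonic.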

The PromptVersion.format() method handles variable substitution. For templates using double-brace syntax ({{var}}), it performs direct replacement. For templates using Jinja2 control flow ({% %}), it invokes the Jinja2 rendering engine, sandboxed by default (use_jinja_sandbox=True). When allow_partial=True, missing variables are preserved as placeholders in the returned PromptVersion rather than raising an error.
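
The double-brace replacement and the allow_partial behaviour can be illustrated on plain strings. The sketch below is not MLflow's code: substitute is a hypothetical helper, the regular expression (including its tolerance for whitespace inside the braces) is an assumption, and it returns a string in the partial case, whereas PromptVersion.format() returns a new PromptVersion.

```python
import re
from typing import Any

# Assumed placeholder pattern: {{name}}, optionally padded with whitespace.
_VAR = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def substitute(template: str, allow_partial: bool = False, **values: Any) -> str:
    """Illustrative only: fill {{var}} placeholders in a text template."""
    wanted = {m.group(1) for m in _VAR.finditer(template)}
    missing = wanted - set(values)
    if missing and not allow_partial:
        raise KeyError(f"missing template variables: {sorted(missing)}")

    def repl(match: "re.Match[str]") -> str:
        name = match.group(1)
        # Keep the placeholder untouched when no value was supplied.
        return str(values[name]) if name in values else match.group(0)

    return _VAR.sub(repl, template)
```

Because unfilled placeholders survive verbatim, the partial result can be passed through the same function again once the remaining values are known.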

Usage

Use load_prompt() in application code to retrieve prompts at runtime. Use the format() method on the returned PromptVersion to substitute variables before passing the result to an LLM API.

Code Reference

Source Location

  • Repository: mlflow
  • File (load_prompt): mlflow/genai/prompts/__init__.py
  • Lines: L155-218
  • File (format): mlflow/entities/model_registry/prompt_version.py
  • Lines: L450-573

Signature

# load_prompt
def load_prompt(
    name_or_uri: str,
    version: str | int | None = None,
    allow_missing: bool = False,
    link_to_model: bool = True,
    model_id: str | None = None,
    cache_ttl_seconds: float | None = None,
) -> PromptVersion:
    ...

# PromptVersion.format
def format(
    self,
    allow_partial: bool = False,
    use_jinja_sandbox: bool = True,
    **kwargs,
) -> PromptVersion | str | list[dict[str, Any]]:
    ...

Import

import mlflow.genai

# Then call:
# prompt = mlflow.genai.load_prompt(...)
# result = prompt.format(...)

I/O Contract

Inputs (load_prompt)

Name Type Required Description
name_or_uri str Yes The prompt name (e.g., "my_prompt") or a URI (e.g., "prompts:/my_prompt/1", "prompts:/my_prompt@production", "prompts:/my_prompt@latest").
version str | int | None No The version number. Required when using a plain name; not allowed when using a URI that already includes the version or alias.
allow_missing bool No If True, return None instead of raising an exception when the prompt is not found. Defaults to False.
link_to_model bool No If True, link the prompt to the current model. Defaults to True.
model_id str | None No The ID of the model to link the prompt to. Only used if link_to_model is True.
cache_ttl_seconds float | None No Time-to-live in seconds for the cached prompt. Defaults to 60s for alias-based loads, None (no TTL) for version-based loads. Set to 0 to bypass cache.
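
Because allow_missing=True yields None rather than an exception, callers need a None check before calling format(). A minimal sketch of that pattern follows; render_with_fallback is a hypothetical helper, and the loader is passed in as an argument so the sketch does not depend on a running registry.

```python
from typing import Any, Callable, Optional


def render_with_fallback(
    load: Callable[..., Optional[Any]],
    name: str,
    version: int,
    fallback_text: str,
    **variables: Any,
) -> Any:
    """Render a registered prompt, or return fallback_text if it is absent."""
    # allow_missing=True makes the loader return None instead of raising.
    prompt = load(name, version=version, allow_missing=True)
    if prompt is None:
        return fallback_text
    return prompt.format(**variables)
```

In application code the loader argument would be mlflow.genai.load_prompt, for example render_with_fallback(mlflow.genai.load_prompt, "my_prompt", 1, "You are a helpful assistant.", style="friendly").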

Inputs (format)

Name Type Required Description
allow_partial bool No If True, return a new PromptVersion with remaining placeholders when variables are missing. Defaults to False (raises error on missing variables).
use_jinja_sandbox bool No If True, use Jinja2 SandboxedEnvironment for templates with control flow syntax. Defaults to True.
**kwargs Any No Keyword arguments providing values for the template variables.

Outputs

Name Type Description
load_prompt return PromptVersion The loaded prompt version entity with template, metadata, variables, tags, and aliases.
format() return (complete) str | list[dict[str, Any]] For text prompts, a fully formatted string. For chat prompts, a list of formatted message dictionaries.
format() return (partial) PromptVersion When allow_partial=True and variables are missing, a new PromptVersion with the supplied variables filled and remaining placeholders intact.
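
Since format() can return either a string or a list of message dictionaries, code that forwards the result to a chat API may want to normalise both cases. The helper below is a hedged sketch: as_messages is a hypothetical name, and placing text prompts in a "system" message is an assumption that mirrors the OpenAI example on this page.

```python
from typing import Any, Dict, List, Union

Message = Dict[str, Any]


def as_messages(formatted: Union[str, List[Message]]) -> List[Message]:
    """Illustrative only: turn a format() result into a chat message list."""
    if isinstance(formatted, str):
        # Text prompt: wrap the string in a single system message.
        return [{"role": "system", "content": formatted}]
    if isinstance(formatted, list):
        # Chat prompt: already a message list; copy so callers can append.
        return list(formatted)
    # Anything else (for example a partially formatted prompt) is not ready.
    raise TypeError(
        "expected a fully formatted prompt (str or list), got "
        + type(formatted).__name__
    )
```

Rejecting other types makes it harder to send a partially formatted prompt to a model by accident.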

Usage Examples

Load by Name and Version

import mlflow.genai

# Load a specific version
prompt = mlflow.genai.load_prompt("my_prompt", version=1)
result = prompt.format(style="friendly")
print(result)  # Formatted text with {{style}} replaced by "friendly"

Load by URI with Alias

import mlflow.genai

# Load the production version via alias
prompt = mlflow.genai.load_prompt("prompts:/my_prompt@production")

# Load the latest version
prompt = mlflow.genai.load_prompt("prompts:/my_prompt@latest")

# Load a specific version by URI
prompt = mlflow.genai.load_prompt("prompts:/my_prompt/3")

Format and Use with LLM

import mlflow.genai
import openai

# Load and format
prompt = mlflow.genai.load_prompt("prompts:/greeting@production")
system_message = prompt.format(style="friendly")

# Use with OpenAI
client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": "Hello!"},
    ],
)

Partial Formatting

import mlflow.genai

prompt = mlflow.genai.load_prompt("my_prompt", version=1)

# Fill system-level variables first
partial = prompt.format(system_context="You are helpful.", allow_partial=True)

# Fill user-specific variables later
final = partial.format(user_query="What is MLflow?")

Custom Cache TTL

import mlflow.genai

# Cache for 5 minutes
prompt = mlflow.genai.load_prompt(
    "prompts:/my_prompt@production",
    cache_ttl_seconds=300,
)

# Bypass cache entirely
prompt = mlflow.genai.load_prompt(
    "prompts:/my_prompt@production",
    cache_ttl_seconds=0,
)

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
