Implementation:Explodinggradients Ragas DSPyOptimizer Class

DSPyOptimizer Class

DSPyOptimizer is the concrete implementation of DSPy Prompt Optimization in the Ragas evaluation toolkit. It wraps DSPy's MIPROv2 teleprompter to optimize evaluation metric prompts using instruction and demonstration search.

Source Location

  • File: src/ragas/optimizers/dspy_optimizer.py
  • Class definition: Lines 18-314
  • optimize method: Lines 117-237

Import

from ragas.optimizers import DSPyOptimizer

Or directly:

from ragas.optimizers.dspy_optimizer import DSPyOptimizer

Note: The import from ragas.optimizers is wrapped in a try/except block. If dspy is not installed, DSPyOptimizer will not be available.
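
Downstream code can mirror this pattern with a guarded import of its own; a minimal sketch (the `None` fallback is a convention for this example, not part of the ragas API):

```python
# Guarded import: DSPyOptimizer is only importable when the optional
# dspy dependency (and ragas itself) is installed.
try:
    from ragas.optimizers import DSPyOptimizer
except ImportError:
    DSPyOptimizer = None  # optimizer unavailable in this environment
```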

External Dependency

This class requires the dspy package. Install it with:

uv add 'ragas[dspy]'
# or
pip install 'ragas[dspy]'

The import is validated in __post_init__ (lines 76-85); an ImportError with installation instructions is raised if dspy is not found.
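
The fail-fast check amounts to probing for the package and raising with an install hint; a generic sketch (`ensure_installed` is a hypothetical helper, not the actual method):

```python
import importlib.util

def ensure_installed(package: str, hint: str) -> None:
    """Raise ImportError with installation instructions if `package` is absent."""
    if importlib.util.find_spec(package) is None:
        raise ImportError(
            f"{package} is required but not installed. Install it with: {hint}"
        )

# __post_init__ performs an equivalent check, roughly:
# ensure_installed("dspy", "pip install 'ragas[dspy]'")
```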

Class Hierarchy

Optimizer (ABC, dataclass)
  └── DSPyOptimizer (dataclass)

Constructor

optimizer = DSPyOptimizer(
    metric=my_metric,                # Optional[MetricWithLLM], inherited from Optimizer
    llm=my_llm,                      # Optional[BaseRagasLLM], inherited from Optimizer
    num_candidates=10,               # int
    max_bootstrapped_demos=5,        # int
    max_labeled_demos=5,             # int
    init_temperature=1.0,            # float
    auto="light",                    # Optional[Literal["light", "medium", "heavy"]]
    num_threads=None,                # Optional[int]
    max_errors=None,                 # Optional[int]
    seed=9,                          # int
    verbose=False,                   # bool
    track_stats=True,                # bool
    log_dir=None,                    # Optional[str]
    metric_threshold=None,           # Optional[float]
    cache=None,                      # Optional[CacheInterface]
)

Constructor Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| metric | Optional[MetricWithLLM] | None | The metric whose prompts will be optimized (inherited). |
| llm | Optional[BaseRagasLLM] | None | The language model used for optimization (inherited). |
| num_candidates | int | 10 | Number of prompt variants to try during optimization. |
| max_bootstrapped_demos | int | 5 | Maximum auto-generated examples to use as demonstrations. |
| max_labeled_demos | int | 5 | Maximum human-annotated examples to use as demonstrations. |
| init_temperature | float | 1.0 | Exploration temperature for optimization. |
| auto | Optional[Literal["light", "medium", "heavy"]] | "light" | Automatic configuration level controlling search depth. |
| num_threads | Optional[int] | None | Number of parallel threads for optimization. |
| max_errors | Optional[int] | None | Maximum errors tolerated before stopping. |
| seed | int | 9 | Random seed for reproducibility. |
| verbose | bool | False | Enable verbose logging during optimization. |
| track_stats | bool | True | Track and report optimization statistics. |
| log_dir | Optional[str] | None | Directory for saving optimization logs. |
| metric_threshold | Optional[float] | None | Minimum acceptable metric value (must be between 0 and 1). |
| cache | Optional[CacheInterface] | None | Cache backend for storing optimization results. |

Parameter Validation

The _validate_parameters method (lines 89-115) enforces:

  • num_candidates must be positive.
  • max_bootstrapped_demos must be non-negative.
  • max_labeled_demos must be non-negative.
  • init_temperature must be positive.
  • auto must be one of "light", "medium", "heavy", or None.
  • num_threads must be positive if specified.
  • max_errors must be non-negative if specified.
  • metric_threshold must be between 0 and 1 if specified.
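
These checks are straightforward guard clauses; a standalone sketch of the same rules (`validate_optimizer_params` is illustrative, not the actual method):

```python
from typing import Optional

def validate_optimizer_params(
    num_candidates: int = 10,
    max_bootstrapped_demos: int = 5,
    max_labeled_demos: int = 5,
    init_temperature: float = 1.0,
    auto: Optional[str] = "light",
    num_threads: Optional[int] = None,
    max_errors: Optional[int] = None,
    metric_threshold: Optional[float] = None,
) -> None:
    """Raise ValueError on any parameter outside its documented range."""
    if num_candidates <= 0:
        raise ValueError("num_candidates must be positive")
    if max_bootstrapped_demos < 0:
        raise ValueError("max_bootstrapped_demos must be non-negative")
    if max_labeled_demos < 0:
        raise ValueError("max_labeled_demos must be non-negative")
    if init_temperature <= 0:
        raise ValueError("init_temperature must be positive")
    if auto not in ("light", "medium", "heavy", None):
        raise ValueError("auto must be 'light', 'medium', 'heavy', or None")
    if num_threads is not None and num_threads <= 0:
        raise ValueError("num_threads must be positive if specified")
    if max_errors is not None and max_errors < 0:
        raise ValueError("max_errors must be non-negative if specified")
    if metric_threshold is not None and not 0.0 <= metric_threshold <= 1.0:
        raise ValueError("metric_threshold must be between 0 and 1")
```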

optimize() Method

Signature

def optimize(
    self,
    dataset: SingleMetricAnnotation,
    loss: Loss,
    config: Dict[Any, Any],
    run_config: Optional[RunConfig] = None,
    batch_size: Optional[int] = None,
    callbacks: Optional[Callbacks] = None,
    with_debugging_logs: bool = False,
    raise_exceptions: bool = True,
) -> Dict[str, str]

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| dataset | SingleMetricAnnotation | required | Annotated dataset with ground truth scores. |
| loss | Loss | required | Loss function to optimize against. |
| config | Dict[Any, Any] | required | Additional configuration parameters. |
| run_config | Optional[RunConfig] | None | Runtime configuration. |
| batch_size | Optional[int] | None | Batch size for evaluation. |
| callbacks | Optional[Callbacks] | None | LangChain callbacks for tracking. |
| with_debugging_logs | bool | False | Enable debug logging. |
| raise_exceptions | bool | True | Whether to raise exceptions during optimization. |

Return Value

Returns Dict[str, str] mapping each prompt name to its optimized instruction string.

Internal Pipeline

The optimize method executes these steps for each prompt in the metric:

  1. Cache check -- If a cache backend is configured, checks for a prior result (lines 173-179).
  2. Import DSPy adapter utilities -- Lazy import of ragas.optimizers.dspy_adapter functions (lines 183-188).
  3. Setup DSPy LLM -- Configures DSPy's global LLM setting from the Ragas LLM (line 190).
  4. Convert prompt to DSPy Signature -- Translates the PydanticPrompt schema into a DSPy Signature (line 198).
  5. Create DSPy module -- Instantiates dspy.Predict(signature) (line 199).
  6. Convert dataset to DSPy examples -- Transforms SingleMetricAnnotation into DSPy Example objects (line 200).
  7. Configure MIPROv2 -- Creates the teleprompter with the optimizer's parameters (lines 202-215).
  8. Create DSPy metric -- Wraps the Ragas loss function as a DSPy-compatible metric (line 217).
  9. Compile -- Runs teleprompter.compile(module, trainset=examples, metric=metric_fn) (lines 219-223).
  10. Extract instruction -- Retrieves the optimized instruction from the compiled module (line 225).
  11. Cache store -- If caching is enabled, stores the result (lines 232-235).
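
The control flow around the compile step can be sketched generically. Here `compile_fn` stands in for steps 2-10 and `cache` is a plain dict standing in for the cache backend (both names are illustrative, not the real internals):

```python
from typing import Callable, Dict, Optional

def optimize_one_prompt(
    compile_fn: Callable[[], str],
    cache: Optional[Dict[str, str]] = None,
    cache_key: str = "",
) -> str:
    # Step 1: return a previously optimized instruction if cached.
    if cache is not None and cache_key in cache:
        return cache[cache_key]
    # Steps 2-10: signature conversion, MIPROv2 configuration, compile,
    # and instruction extraction collapse into this single call.
    instruction = compile_fn()
    # Step 11: persist the result for future runs.
    if cache is not None:
        cache[cache_key] = instruction
    return instruction
```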

Helper Methods

_extract_instruction (Lines 239-263)

def _extract_instruction(self, optimized_module: Any) -> str

Extracts the optimized instruction string from the DSPy compiled module by checking for signature.instructions, signature.__doc__, or extended_signature.
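
The attribute probing described above can be sketched as follows (a simplified stand-in for the real method, using only `getattr` fallbacks):

```python
def extract_instruction(optimized_module) -> str:
    """Probe a compiled DSPy module for its instruction text, in priority order."""
    # First preference: the regular signature's explicit instructions,
    # then its docstring.
    sig = getattr(optimized_module, "signature", None)
    if sig is not None:
        text = getattr(sig, "instructions", None) or getattr(sig, "__doc__", None)
        if text:
            return text
    # Fall back to the extended signature some DSPy modules carry.
    ext = getattr(optimized_module, "extended_signature", None)
    if ext is not None:
        text = getattr(ext, "instructions", None)
        if text:
            return text
    return ""
```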

_generate_cache_key (Lines 265-314)

def _generate_cache_key(
    self,
    dataset: SingleMetricAnnotation,
    loss: Loss,
    config: Dict[Any, Any],
) -> str

Generates a SHA256 hash from the metric name, dataset hash, loss class name, and all optimizer parameters to create a unique cache key.
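
A deterministic key of that shape can be built by hashing a canonical JSON payload; a minimal sketch (`make_cache_key` is illustrative, not the actual method):

```python
import hashlib
import json
from typing import Any, Dict

def make_cache_key(
    metric_name: str,
    dataset_hash: str,
    loss_name: str,
    params: Dict[str, Any],
) -> str:
    """SHA256 over a canonical serialization of everything that affects the result."""
    # sort_keys makes serialization order-independent, so logically equal
    # inputs always hash to the same key.
    payload = json.dumps(
        {
            "metric": metric_name,
            "dataset": dataset_hash,
            "loss": loss_name,
            "params": params,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```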

Usage Example

from ragas.optimizers import DSPyOptimizer
from ragas.losses import BinaryMetricLoss
from ragas.dataset_schema import SingleMetricAnnotation

# Load annotated data
annotations = SingleMetricAnnotation.from_json("annotations.json")

# Create optimizer (requires dspy to be installed)
optimizer = DSPyOptimizer(
    metric=my_metric,
    llm=my_llm,
    num_candidates=10,
    auto="light",
)

# Run optimization
best_prompts = optimizer.optimize(
    dataset=annotations,
    loss=BinaryMetricLoss(metric="accuracy"),
    config={},
)

# Apply optimized prompts
prompts = my_metric.get_prompts()
for name, instruction in best_prompts.items():
    prompts[name].instruction = instruction
my_metric.set_prompts(**prompts)
