Implementation:Explodinggradients Ragas DSPyOptimizer Class

DSPyOptimizer Class

DSPyOptimizer is the concrete implementation of DSPy Prompt Optimization in the Ragas evaluation toolkit. It wraps DSPy's MIPROv2 teleprompter to optimize evaluation metric prompts using instruction and demonstration search.

Source Location

  • File: src/ragas/optimizers/dspy_optimizer.py
  • Class definition: Lines 18-314
  • optimize method: Lines 117-237

Import

from ragas.optimizers import DSPyOptimizer

Or directly:

from ragas.optimizers.dspy_optimizer import DSPyOptimizer

Note: The import from ragas.optimizers is wrapped in a try/except block. If dspy is not installed, DSPyOptimizer will not be available.
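
Downstream code can mirror this pattern with a guarded import of its own; a minimal sketch (the `None` fallback is a convention for this example, not part of the ragas API):

```python
# Guarded import: DSPyOptimizer is only importable when the optional
# dspy dependency (and ragas itself) is installed.
try:
    from ragas.optimizers import DSPyOptimizer
except ImportError:
    DSPyOptimizer = None  # optimizer unavailable in this environment
```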

External Dependency

This class requires the dspy package. Install it with:

uv add 'ragas[dspy]'
# or
pip install 'ragas[dspy]'

The import is validated in __post_init__ (lines 76-85); an ImportError with installation instructions is raised if dspy is not found.
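
The fail-fast check amounts to probing for the package and raising with an install hint; a generic sketch (`ensure_installed` is a hypothetical helper, not the actual method):

```python
import importlib.util

def ensure_installed(package: str, hint: str) -> None:
    """Raise ImportError with installation instructions if `package` is absent."""
    if importlib.util.find_spec(package) is None:
        raise ImportError(
            f"{package} is required but not installed. Install it with: {hint}"
        )

# __post_init__ performs an equivalent check, roughly:
# ensure_installed("dspy", "pip install 'ragas[dspy]'")
```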

Class Hierarchy

Optimizer (ABC, dataclass)
  └── DSPyOptimizer (dataclass)

Constructor

optimizer = DSPyOptimizer(
    metric=my_metric,                # Optional[MetricWithLLM], inherited from Optimizer
    llm=my_llm,                      # Optional[BaseRagasLLM], inherited from Optimizer
    num_candidates=10,               # int
    max_bootstrapped_demos=5,        # int
    max_labeled_demos=5,             # int
    init_temperature=1.0,            # float
    auto="light",                    # Optional[Literal["light", "medium", "heavy"]]
    num_threads=None,                # Optional[int]
    max_errors=None,                 # Optional[int]
    seed=9,                          # int
    verbose=False,                   # bool
    track_stats=True,                # bool
    log_dir=None,                    # Optional[str]
    metric_threshold=None,           # Optional[float]
    cache=None,                      # Optional[CacheInterface]
)

Constructor Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| metric | Optional[MetricWithLLM] | None | The metric whose prompts will be optimized (inherited). |
| llm | Optional[BaseRagasLLM] | None | The language model used for optimization (inherited). |
| num_candidates | int | 10 | Number of prompt variants to try during optimization. |
| max_bootstrapped_demos | int | 5 | Maximum auto-generated examples to use as demonstrations. |
| max_labeled_demos | int | 5 | Maximum human-annotated examples to use as demonstrations. |
| init_temperature | float | 1.0 | Exploration temperature for optimization. |
| auto | Optional[Literal["light", "medium", "heavy"]] | "light" | Automatic configuration level controlling search depth. |
| num_threads | Optional[int] | None | Number of parallel threads for optimization. |
| max_errors | Optional[int] | None | Maximum errors tolerated before stopping. |
| seed | int | 9 | Random seed for reproducibility. |
| verbose | bool | False | Enable verbose logging during optimization. |
| track_stats | bool | True | Track and report optimization statistics. |
| log_dir | Optional[str] | None | Directory for saving optimization logs. |
| metric_threshold | Optional[float] | None | Minimum acceptable metric value (must be between 0 and 1). |
| cache | Optional[CacheInterface] | None | Cache backend for storing optimization results. |

Parameter Validation

The _validate_parameters method (lines 89-115) enforces:

  • num_candidates must be positive.
  • max_bootstrapped_demos must be non-negative.
  • max_labeled_demos must be non-negative.
  • init_temperature must be positive.
  • auto must be one of "light", "medium", "heavy", or None.
  • num_threads must be positive if specified.
  • max_errors must be non-negative if specified.
  • metric_threshold must be between 0 and 1 if specified.
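
These checks are straightforward guard clauses; a standalone sketch of the same rules (`validate_optimizer_params` is illustrative, not the actual method):

```python
from typing import Optional

def validate_optimizer_params(
    num_candidates: int = 10,
    max_bootstrapped_demos: int = 5,
    max_labeled_demos: int = 5,
    init_temperature: float = 1.0,
    auto: Optional[str] = "light",
    num_threads: Optional[int] = None,
    max_errors: Optional[int] = None,
    metric_threshold: Optional[float] = None,
) -> None:
    """Raise ValueError on any parameter outside its documented range."""
    if num_candidates <= 0:
        raise ValueError("num_candidates must be positive")
    if max_bootstrapped_demos < 0:
        raise ValueError("max_bootstrapped_demos must be non-negative")
    if max_labeled_demos < 0:
        raise ValueError("max_labeled_demos must be non-negative")
    if init_temperature <= 0:
        raise ValueError("init_temperature must be positive")
    if auto not in ("light", "medium", "heavy", None):
        raise ValueError("auto must be 'light', 'medium', 'heavy', or None")
    if num_threads is not None and num_threads <= 0:
        raise ValueError("num_threads must be positive if specified")
    if max_errors is not None and max_errors < 0:
        raise ValueError("max_errors must be non-negative if specified")
    if metric_threshold is not None and not 0.0 <= metric_threshold <= 1.0:
        raise ValueError("metric_threshold must be between 0 and 1")
```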

optimize() Method

Signature

def optimize(
    self,
    dataset: SingleMetricAnnotation,
    loss: Loss,
    config: Dict[Any, Any],
    run_config: Optional[RunConfig] = None,
    batch_size: Optional[int] = None,
    callbacks: Optional[Callbacks] = None,
    with_debugging_logs: bool = False,
    raise_exceptions: bool = True,
) -> Dict[str, str]

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| dataset | SingleMetricAnnotation | required | Annotated dataset with ground truth scores. |
| loss | Loss | required | Loss function to optimize against. |
| config | Dict[Any, Any] | required | Additional configuration parameters. |
| run_config | Optional[RunConfig] | None | Runtime configuration. |
| batch_size | Optional[int] | None | Batch size for evaluation. |
| callbacks | Optional[Callbacks] | None | LangChain callbacks for tracking. |
| with_debugging_logs | bool | False | Enable debug logging. |
| raise_exceptions | bool | True | Whether to raise exceptions during optimization. |

Return Value

Returns Dict[str, str] mapping each prompt name to its optimized instruction string.

Internal Pipeline

The optimize method executes these steps for each prompt in the metric:

  1. Cache check -- If a cache backend is configured, checks for a prior result (lines 173-179).
  2. Import DSPy adapter utilities -- Lazy import of ragas.optimizers.dspy_adapter functions (lines 183-188).
  3. Setup DSPy LLM -- Configures DSPy's global LLM setting from the Ragas LLM (line 190).
  4. Convert prompt to DSPy Signature -- Translates the PydanticPrompt schema into a DSPy Signature (line 198).
  5. Create DSPy module -- Instantiates dspy.Predict(signature) (line 199).
  6. Convert dataset to DSPy examples -- Transforms SingleMetricAnnotation into DSPy Example objects (line 200).
  7. Configure MIPROv2 -- Creates the teleprompter with the optimizer's parameters (lines 202-215).
  8. Create DSPy metric -- Wraps the Ragas loss function as a DSPy-compatible metric (line 217).
  9. Compile -- Runs teleprompter.compile(module, trainset=examples, metric=metric_fn) (lines 219-223).
  10. Extract instruction -- Retrieves the optimized instruction from the compiled module (line 225).
  11. Cache store -- If caching is enabled, stores the result (lines 232-235).
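
The control flow around the compile step can be sketched generically. Here `compile_fn` stands in for steps 2-10 and `cache` is a plain dict standing in for the cache backend (both names are illustrative, not the real internals):

```python
from typing import Callable, Dict, Optional

def optimize_one_prompt(
    compile_fn: Callable[[], str],
    cache: Optional[Dict[str, str]] = None,
    cache_key: str = "",
) -> str:
    # Step 1: return a previously optimized instruction if cached.
    if cache is not None and cache_key in cache:
        return cache[cache_key]
    # Steps 2-10: signature conversion, MIPROv2 configuration, compile,
    # and instruction extraction collapse into this single call.
    instruction = compile_fn()
    # Step 11: persist the result for future runs.
    if cache is not None:
        cache[cache_key] = instruction
    return instruction
```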

Helper Methods

_extract_instruction (Lines 239-263)

def _extract_instruction(self, optimized_module: Any) -> str

Extracts the optimized instruction string from the DSPy compiled module by checking for signature.instructions, signature.__doc__, or extended_signature.
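
The attribute probing described above can be sketched as follows (a simplified stand-in for the real method, using only `getattr` fallbacks):

```python
def extract_instruction(optimized_module) -> str:
    """Probe a compiled DSPy module for its instruction text, in priority order."""
    # First preference: the regular signature's explicit instructions,
    # then its docstring.
    sig = getattr(optimized_module, "signature", None)
    if sig is not None:
        text = getattr(sig, "instructions", None) or getattr(sig, "__doc__", None)
        if text:
            return text
    # Fall back to the extended signature some DSPy modules carry.
    ext = getattr(optimized_module, "extended_signature", None)
    if ext is not None:
        text = getattr(ext, "instructions", None)
        if text:
            return text
    return ""
```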

_generate_cache_key (Lines 265-314)

def _generate_cache_key(
    self,
    dataset: SingleMetricAnnotation,
    loss: Loss,
    config: Dict[Any, Any],
) -> str

Generates a SHA256 hash from the metric name, dataset hash, loss class name, and all optimizer parameters to create a unique cache key.
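
A deterministic key of that shape can be built by hashing a canonical JSON payload; a minimal sketch (`make_cache_key` is illustrative, not the actual method):

```python
import hashlib
import json
from typing import Any, Dict

def make_cache_key(
    metric_name: str,
    dataset_hash: str,
    loss_name: str,
    params: Dict[str, Any],
) -> str:
    """SHA256 over a canonical serialization of everything that affects the result."""
    # sort_keys makes serialization order-independent, so logically equal
    # inputs always hash to the same key.
    payload = json.dumps(
        {
            "metric": metric_name,
            "dataset": dataset_hash,
            "loss": loss_name,
            "params": params,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```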

Usage Example

from ragas.optimizers import DSPyOptimizer
from ragas.losses import BinaryMetricLoss
from ragas.dataset_schema import SingleMetricAnnotation

# Load annotated data
annotations = SingleMetricAnnotation.from_json("annotations.json")

# Create optimizer (requires dspy to be installed)
optimizer = DSPyOptimizer(
    metric=my_metric,
    llm=my_llm,
    num_candidates=10,
    auto="light",
)

# Run optimization
best_prompts = optimizer.optimize(
    dataset=annotations,
    loss=BinaryMetricLoss(metric="accuracy"),
    config={},
)

# Apply optimized prompts
prompts = my_metric.get_prompts()
for name, instruction in best_prompts.items():
    prompts[name].instruction = instruction
my_metric.set_prompts(**prompts)
