Implementation:Explodinggradients_Ragas_DSPyOptimizer_Class
DSPyOptimizer Class
DSPyOptimizer is the concrete implementation of DSPy Prompt Optimization in the Ragas evaluation toolkit. It wraps DSPy's MIPROv2 teleprompter to optimize evaluation metric prompts using instruction and demonstration search.
Source Location
- File: `src/ragas/optimizers/dspy_optimizer.py`
- Class definition: Lines 18-314
- `optimize` method: Lines 117-237
Import
from ragas.optimizers import DSPyOptimizer
Or directly:
from ragas.optimizers.dspy_optimizer import DSPyOptimizer
Note: The import from ragas.optimizers is wrapped in a try/except block. If dspy is not installed, DSPyOptimizer will not be available.
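The guard likely follows the standard optional-dependency pattern; the snippet below is a minimal sketch of what the package-level export could look like, not the verbatim contents of `ragas/optimizers/__init__.py`.
```python
# Sketch of a guarded optional export (assumed, not the verbatim ragas source).
try:
    from ragas.optimizers.dspy_optimizer import DSPyOptimizer  # noqa: F401
except ImportError:
    # dspy is not installed; DSPyOptimizer is simply not made available.
    pass
```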
External Dependency
This class requires the dspy package. Install it with:
uv add 'ragas[dspy]'
# or
pip install 'ragas[dspy]'
The import is validated in `__post_init__` (lines 76-85); an `ImportError` with installation instructions is raised if `dspy` is not found.
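A minimal sketch of how such a check can be written; the error message wording is an assumption, not the library's actual text.
```python
def __post_init__(self):
    # Fail fast if the optional dspy dependency is missing (illustrative wording).
    try:
        import dspy  # noqa: F401
    except ImportError as exc:
        raise ImportError(
            "DSPyOptimizer requires the 'dspy' package. "
            "Install it with: pip install 'ragas[dspy]'"
        ) from exc
```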
Class Hierarchy
Optimizer (ABC, dataclass)
└── DSPyOptimizer (dataclass)
Constructor
optimizer = DSPyOptimizer(
metric=my_metric, # Optional[MetricWithLLM], inherited from Optimizer
llm=my_llm, # Optional[BaseRagasLLM], inherited from Optimizer
num_candidates=10, # int
max_bootstrapped_demos=5, # int
max_labeled_demos=5, # int
init_temperature=1.0, # float
auto="light", # Optional[Literal["light", "medium", "heavy"]]
num_threads=None, # Optional[int]
max_errors=None, # Optional[int]
seed=9, # int
verbose=False, # bool
track_stats=True, # bool
log_dir=None, # Optional[str]
metric_threshold=None, # Optional[float]
cache=None, # Optional[CacheInterface]
)
Constructor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `metric` | `Optional[MetricWithLLM]` | `None` | The metric whose prompts will be optimized (inherited). |
| `llm` | `Optional[BaseRagasLLM]` | `None` | The language model used for optimization (inherited). |
| `num_candidates` | `int` | `10` | Number of prompt variants to try during optimization. |
| `max_bootstrapped_demos` | `int` | `5` | Maximum auto-generated examples to use as demonstrations. |
| `max_labeled_demos` | `int` | `5` | Maximum human-annotated examples to use as demonstrations. |
| `init_temperature` | `float` | `1.0` | Exploration temperature for optimization. |
| `auto` | `Optional[Literal["light", "medium", "heavy"]]` | `"light"` | Automatic configuration level controlling search depth. |
| `num_threads` | `Optional[int]` | `None` | Number of parallel threads for optimization. |
| `max_errors` | `Optional[int]` | `None` | Maximum errors tolerated before stopping. |
| `seed` | `int` | `9` | Random seed for reproducibility. |
| `verbose` | `bool` | `False` | Enable verbose logging during optimization. |
| `track_stats` | `bool` | `True` | Track and report optimization statistics. |
| `log_dir` | `Optional[str]` | `None` | Directory for saving optimization logs. |
| `metric_threshold` | `Optional[float]` | `None` | Minimum acceptable metric value (must be between 0 and 1). |
| `cache` | `Optional[CacheInterface]` | `None` | Cache backend for storing optimization results. |
Parameter Validation
The `_validate_parameters` method (lines 89-115) enforces the following constraints (a sketch follows the list):
- `num_candidates` must be positive.
- `max_bootstrapped_demos` must be non-negative.
- `max_labeled_demos` must be non-negative.
- `init_temperature` must be positive.
- `auto` must be one of `"light"`, `"medium"`, `"heavy"`, or `None`.
- `num_threads` must be positive if specified.
- `max_errors` must be non-negative if specified.
- `metric_threshold` must be between 0 and 1 if specified.
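A hedged sketch of equivalent checks; the error message wording is illustrative, not the library's actual text.
```python
def _validate_parameters(self) -> None:
    # Each check mirrors a constraint from the list above.
    if self.num_candidates <= 0:
        raise ValueError("num_candidates must be positive")
    if self.max_bootstrapped_demos < 0:
        raise ValueError("max_bootstrapped_demos must be non-negative")
    if self.max_labeled_demos < 0:
        raise ValueError("max_labeled_demos must be non-negative")
    if self.init_temperature <= 0:
        raise ValueError("init_temperature must be positive")
    if self.auto not in ("light", "medium", "heavy", None):
        raise ValueError("auto must be 'light', 'medium', 'heavy', or None")
    if self.num_threads is not None and self.num_threads <= 0:
        raise ValueError("num_threads must be positive when specified")
    if self.max_errors is not None and self.max_errors < 0:
        raise ValueError("max_errors must be non-negative when specified")
    if self.metric_threshold is not None and not 0 <= self.metric_threshold <= 1:
        raise ValueError("metric_threshold must be between 0 and 1")
```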
optimize() Method
Signature
def optimize(
self,
dataset: SingleMetricAnnotation,
loss: Loss,
config: Dict[Any, Any],
run_config: Optional[RunConfig] = None,
batch_size: Optional[int] = None,
callbacks: Optional[Callbacks] = None,
with_debugging_logs: bool = False,
raise_exceptions: bool = True,
) -> Dict[str, str]
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `dataset` | `SingleMetricAnnotation` | required | Annotated dataset with ground truth scores. |
| `loss` | `Loss` | required | Loss function to optimize against. |
| `config` | `Dict[Any, Any]` | required | Additional configuration parameters. |
| `run_config` | `Optional[RunConfig]` | `None` | Runtime configuration. |
| `batch_size` | `Optional[int]` | `None` | Batch size for evaluation. |
| `callbacks` | `Optional[Callbacks]` | `None` | LangChain callbacks for tracking. |
| `with_debugging_logs` | `bool` | `False` | Enable debug logging. |
| `raise_exceptions` | `bool` | `True` | Whether to raise exceptions during optimization. |
Return Value
Returns Dict[str, str] mapping each prompt name to its optimized instruction string.
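For a metric with a single prompt the result might look like this; the prompt name and instruction text are hypothetical.
```python
# Hypothetical shape of the return value for a metric with one prompt.
best_prompts = {
    "single_turn_prompt": (
        "Given the question and the retrieved context, judge whether the "
        "response is supported by the context and return a verdict with a reason."
    )
}
```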
Internal Pipeline
The optimize method executes these steps for each prompt in the metric (a condensed sketch follows the list):
- Cache check -- If a cache backend is configured, checks for a prior result (lines 173-179).
- Import DSPy adapter utilities -- Lazy import of `ragas.optimizers.dspy_adapter` functions (lines 183-188).
- Setup DSPy LLM -- Configures DSPy's global LLM setting from the Ragas LLM (line 190).
- Convert prompt to DSPy Signature -- Translates the `PydanticPrompt` schema into a DSPy `Signature` (line 198).
- Create DSPy module -- Instantiates `dspy.Predict(signature)` (line 199).
- Convert dataset to DSPy examples -- Transforms `SingleMetricAnnotation` into DSPy `Example` objects (line 200).
- Configure MIPROv2 -- Creates the teleprompter with the optimizer's parameters (lines 202-215).
- Create DSPy metric -- Wraps the Ragas loss function as a DSPy-compatible metric (line 217).
- Compile -- Runs `teleprompter.compile(module, trainset=examples, metric=metric_fn)` (lines 219-223).
- Extract instruction -- Retrieves the optimized instruction from the compiled module (line 225).
- Cache store -- If caching is enabled, stores the result (lines 232-235).
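A condensed, hedged sketch of that per-prompt flow using public DSPy APIs. The helpers `prompt_to_signature`, `annotations_to_examples`, and `make_dspy_metric` are placeholders rather than the actual `ragas.optimizers.dspy_adapter` functions, and the model name is illustrative.
```python
import dspy
from dspy.teleprompt import MIPROv2

def optimize_single_prompt(optimizer, prompt, annotations, loss):
    # Point DSPy's global LM at the model backing the Ragas LLM (illustrative).
    dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

    # Build a Signature from the PydanticPrompt schema and wrap it in Predict.
    signature = prompt_to_signature(prompt)          # placeholder helper
    module = dspy.Predict(signature)

    # Turn the annotated dataset into dspy.Example objects.
    examples = annotations_to_examples(annotations)  # placeholder helper

    # Instruction + demonstration search via MIPROv2; the real code forwards the
    # remaining optimizer parameters (num_candidates, num_threads, ...) as well.
    teleprompter = MIPROv2(
        metric=make_dspy_metric(loss),               # placeholder wrapper
        max_bootstrapped_demos=optimizer.max_bootstrapped_demos,
        max_labeled_demos=optimizer.max_labeled_demos,
        init_temperature=optimizer.init_temperature,
        auto=optimizer.auto,                         # "light" / "medium" / "heavy"
        seed=optimizer.seed,
        verbose=optimizer.verbose,
    )
    compiled = teleprompter.compile(module, trainset=examples)

    # Pull the winning instruction text out of the compiled module.
    return optimizer._extract_instruction(compiled)
```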
Helper Methods
_extract_instruction (Lines 239-263)
def _extract_instruction(self, optimized_module: Any) -> str
Extracts the optimized instruction string from the DSPy compiled module by checking for `signature.instructions`, `signature.__doc__`, or `extended_signature`.
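A hedged sketch of that attribute-fallback chain; the behaviour when nothing is found is an assumption.
```python
def _extract_instruction(self, optimized_module) -> str:
    # Prefer explicit instructions on the compiled module's signature.
    signature = getattr(optimized_module, "signature", None)
    if signature is not None:
        if getattr(signature, "instructions", None):
            return signature.instructions
        if getattr(signature, "__doc__", None):
            return signature.__doc__
    # Fall back to the extended signature some DSPy modules expose.
    extended = getattr(optimized_module, "extended_signature", None)
    if extended is not None and getattr(extended, "instructions", None):
        return extended.instructions
    # What the real method does on failure is not documented here; raising is assumed.
    raise ValueError("Could not extract an instruction from the compiled module")
```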
_generate_cache_key (Lines 265-314)
def _generate_cache_key(
self,
dataset: SingleMetricAnnotation,
loss: Loss,
config: Dict[Any, Any],
) -> str
Generates a SHA256 hash from the metric name, dataset hash, loss class name, and all optimizer parameters to create a unique cache key.
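A hedged sketch of how such a key can be assembled; the field selection, dataset hashing call, and serialization details are assumptions.
```python
import hashlib
import json

def _generate_cache_key(self, dataset, loss, config) -> str:
    # Gather everything that can change the optimization outcome into one payload.
    payload = {
        "metric": self.metric.name if self.metric else None,
        "dataset": str(hash(dataset)),  # assumed dataset hashing; real code may differ
        "loss": type(loss).__name__,
        "config": config,
        "params": {
            "num_candidates": self.num_candidates,
            "max_bootstrapped_demos": self.max_bootstrapped_demos,
            "max_labeled_demos": self.max_labeled_demos,
            "init_temperature": self.init_temperature,
            "auto": self.auto,
            "seed": self.seed,
            "metric_threshold": self.metric_threshold,
        },
    }
    serialized = json.dumps(payload, sort_keys=True, default=str)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()
```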
Usage Example
from ragas.optimizers import DSPyOptimizer
from ragas.losses import BinaryMetricLoss
from ragas.dataset_schema import SingleMetricAnnotation
# Load annotated data
annotations = SingleMetricAnnotation.from_json("annotations.json")
# Create optimizer (requires dspy to be installed)
optimizer = DSPyOptimizer(
metric=my_metric,
llm=my_llm,
num_candidates=10,
auto="light",
)
# Run optimization
best_prompts = optimizer.optimize(
dataset=annotations,
loss=BinaryMetricLoss(metric="accuracy"),
config={},
)
# Apply optimized prompts
prompts = my_metric.get_prompts()
for name, instruction in best_prompts.items():
prompts[name].instruction = instruction
my_metric.set_prompts(**prompts)
Implements
- Principle:Explodinggradients_Ragas_DSPy_Prompt_Optimization
See Also
- GeneticOptimizer Class -- Alternative evolutionary optimizer.
- Loss Classes -- Fitness/loss functions used during optimization.
- MetricAnnotation Class -- Annotation data format.
- PromptMixin Save/Load -- Persisting optimized prompts.
- Environment:Explodinggradients_Ragas_LLM_Provider_Environment
- Environment:Explodinggradients_Ragas_Optional_Metrics_Environment