Environment:Evidentlyai Evidently LLM Evaluation Environment
| Knowledge Sources | |
|---|---|
| Domains | LLMs, ML_Monitoring |
| Last Updated | 2026-02-14 10:00 GMT |
Overview
Optional environment extending core Evidently with OpenAI, LiteLLM, Transformers, and sentence-transformers for LLM-based evaluation descriptors.
Description
This environment adds LLM provider integrations to Evidently, enabling LLM-as-a-judge descriptors, semantic similarity computation, and HuggingFace model-based text evaluation features. It supports multiple LLM providers (OpenAI, Anthropic, Gemini, Mistral, DeepSeek, Ollama, and many more via LiteLLM) with built-in rate limiting per provider.
Usage
Use this environment to run LLM-based evaluation descriptors such as `LLMJudge`, `BinaryClassificationLLMJudge`, and `ContextRelevance`, or to use `BERTScore`, `SemanticSimilarity`, or HuggingFace model features. It is required for the `TextEvals` preset with LLM descriptors and for the `Evidentlyai_Evidently_LLM_Evaluation_Monitoring` workflow.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Any (Linux, macOS, Windows) | Same as core environment |
| Python | >= 3.10 | Same as core environment |
| Hardware | CPU (GPU optional) | GPU accelerates local transformer models but is not required for API-based LLM calls |
| Network | Internet access | Required for OpenAI, Anthropic, and other cloud LLM API calls |
Dependencies
Python Packages
- `openai` >= 1.16.2
- `evaluate` >= 0.4.1
- `transformers[torch]` >= 4.39.3
- `sentence-transformers` >= 2.7.0
- `sqlvalidator` >= 0.0.20
- `litellm` >= 1.74.3
- `llama-index` >= 0.10
- `faiss-cpu` >= 1.8.0
Credentials
The following credentials are required depending on the LLM provider used:
- OpenAI: Pass `api_key` via `OpenAIKey` options or set the standard `OPENAI_API_KEY` environment variable.
- Anthropic: Pass `api_key` via `AnthropicOptions`.
- Vertex AI: Pass `api_key` (JSON credentials string) via `VertexAIOptions`.
- Ollama: No API key needed, but `api_url` is required.
- Other providers: Pass `api_key` via the corresponding `{Provider}Options` class.
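Before running LLM descriptors, it can help to confirm that the OpenAI key is visible to the process. The sketch below uses only the standard library and the `OPENAI_API_KEY` variable named above. The helper `openai_key_status` is a hypothetical name for illustration and is not part of Evidently.

```python
import os
from typing import Mapping, Optional


def openai_key_status(env: Optional[Mapping[str, str]] = None) -> str:
    """Report whether OPENAI_API_KEY is set, without revealing its value.

    Hypothetical helper for illustration; not an Evidently API.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        return "missing"
    # Only the length is reported so the secret never reaches logs.
    return f"set ({len(key)} characters)"


if __name__ == "__main__":
    print("OPENAI_API_KEY:", openai_key_status())
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in tests.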
Quick Install
# Install Evidently with LLM support
pip install "evidently[llm]"
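After installing, one way to confirm that the optional packages are importable is to probe them with `importlib.util.find_spec`, the same standard-library call the fallback logic in Code Evidence uses. This is a minimal sketch: the import names in `LLM_EXTRA_MODULES` are assumed from the package list under Dependencies (for example `sentence-transformers` imported as `sentence_transformers`, `faiss-cpu` as `faiss`), and `check_modules` is a hypothetical helper, not an Evidently function.

```python
from importlib.util import find_spec
from typing import Dict, Iterable

# Assumed import names for the packages listed under Dependencies.
LLM_EXTRA_MODULES = (
    "openai",
    "evaluate",
    "transformers",
    "sentence_transformers",
    "sqlvalidator",
    "litellm",
    "llama_index",
    "faiss",
)


def check_modules(modules: Iterable[str] = LLM_EXTRA_MODULES) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be located.

    find_spec locates the module without importing it, so the check is cheap.
    """
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, available in check_modules().items():
        print(f"{name:24s} {'ok' if available else 'MISSING'}")
```

A module reported as missing usually means the extra was not installed into the active interpreter's environment.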
Code Evidence
LiteLLM fallback from `src/evidently/llm/utils/wrapper.py:438-442`:
if find_spec("litellm") is not None:
    litellm_wrapper = get_litellm_wrapper(provider, model, options)
    if litellm_wrapper is not None:
        return litellm_wrapper
raise ValueError(f"LLM wrapper for provider {provider} model {model} not found. Try installing litellm")
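The lookup order in that excerpt can be sketched in isolation: try a native wrapper first, fall back to an optional package only if it is installed, and otherwise raise. Everything below is a simplified stand-in under stated assumptions. `NATIVE_WRAPPERS`, `resolve_wrapper`, and the string return values are hypothetical and do not reproduce Evidently's actual wrapper classes.

```python
from importlib.util import find_spec
from typing import Callable, Dict

# Hypothetical registry of natively implemented providers.
NATIVE_WRAPPERS: Dict[str, Callable[[str], str]] = {
    "openai": lambda model: f"native-openai:{model}",
}


def resolve_wrapper(provider: str, model: str, optional_module: str = "litellm") -> str:
    """Return a wrapper description, preferring native support.

    Strings stand in for wrapper objects to keep the sketch self-contained.
    """
    native = NATIVE_WRAPPERS.get(provider)
    if native is not None:
        return native(model)
    # Fall back to the optional package only when it is actually installed.
    if find_spec(optional_module) is not None:
        return f"{optional_module}:{provider}/{model}"
    raise ValueError(
        f"LLM wrapper for provider {provider} model {model} not found. "
        f"Try installing {optional_module}"
    )
```

Probing with `find_spec` instead of a `try`/`except ImportError` keeps the optional dependency from being imported until it is needed.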
OpenAI rate limit defaults from `src/evidently/llm/utils/wrapper.py:500`:
class OpenAIKey(LLMOptions):
    __provider_name__: ClassVar[str] = "openai"
    limits: RateLimits = RateLimits(rpm=500)
Anthropic rate limit defaults from `src/evidently/llm/utils/wrapper.py:621-623`:
class AnthropicOptions(LLMOptions):
    __provider_name__: ClassVar = "anthropic"
    limits: RateLimits = RateLimits(
        rpm=50 // 12, itpm=40000 // 12, otpm=8000 // 12, interval=datetime.timedelta(seconds=5)
    )
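The integer divisions in the Anthropic defaults are easier to read once evaluated. One plausible reading, which the source does not state, is that the divisor 12 is the number of 5-second intervals in a minute, so each per-minute budget is spread evenly across intervals. The sketch below works through that arithmetic with the standard library only. The variable names are illustrative.

```python
import datetime

# Values taken from the AnthropicOptions defaults quoted above.
interval = datetime.timedelta(seconds=5)
per_minute = {"rpm": 50, "itpm": 40000, "otpm": 8000}

# Assumption: 12 corresponds to the number of 5-second intervals per minute.
intervals_per_minute = int(datetime.timedelta(minutes=1) / interval)

# Floor division, as in the source, gives the budget for one interval.
per_interval = {name: value // intervals_per_minute for name, value in per_minute.items()}

# Flooring loses a little capacity once scaled back to a full minute.
effective_per_minute = {name: value * intervals_per_minute for name, value in per_interval.items()}

print(intervals_per_minute)   # 12
print(per_interval)           # {'rpm': 4, 'itpm': 3333, 'otpm': 666}
print(effective_per_minute)   # {'rpm': 48, 'itpm': 39996, 'otpm': 7992}
```

Under this reading, the defaults allow at most 4 requests in any 5-second window, slightly below the nominal 50 requests per minute.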
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `LLM wrapper for provider {X} model {Y} not found. Try installing litellm` | Provider is not natively supported and either litellm is not installed or litellm has no wrapper for that provider | `pip install litellm`, check the provider name, or use the natively implemented `openai` provider |
| `ImportError` on `openai` | openai package not installed | `pip install evidently[llm]` |
| `LLMRateLimitError` | Exceeded API rate limits | Configure `RateLimits` in provider options to match your API tier |
Compatibility Notes
- Provider support: OpenAI is the only natively implemented provider. All other providers (Anthropic, Gemini, Mistral, etc.) are routed through LiteLLM, requiring the `litellm` package.
- Excluded LiteLLM providers: The following LiteLLM providers are explicitly excluded: `openai_like`, `custom_openai`, `text-completion-openai`, `anthropic_text`, `huggingface`, `vertex_ai_beta`, `azure_text`, `sagemaker_chat`, `ollama_chat`, `text-completion-codestral`, `watsonx_text`, `custom`, `aiohttp_openai`.
- Async-first: The LLM wrapper uses async/await internally. Synchronous wrappers (`complete_batch_sync`, `run_sync`, `run_batch_sync`) are provided for non-async contexts.
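The relationship between an async core and its synchronous wrappers can be illustrated with plain `asyncio`. This is a generic sketch of the pattern, not Evidently's implementation. The `fake_*` functions are hypothetical stand-ins, and the "completion" simply upper-cases the prompt so that no network access is needed.

```python
import asyncio
from typing import List


async def fake_complete(prompt: str) -> str:
    """Stand-in for an async LLM call; yields to the event loop once."""
    await asyncio.sleep(0)
    return prompt.upper()


async def fake_complete_batch(prompts: List[str]) -> List[str]:
    """Run all prompts concurrently; gather preserves input order."""
    return list(await asyncio.gather(*(fake_complete(p) for p in prompts)))


def fake_complete_batch_sync(prompts: List[str]) -> List[str]:
    """Synchronous entry point that drives the async batch to completion."""
    return asyncio.run(fake_complete_batch(prompts))


if __name__ == "__main__":
    print(fake_complete_batch_sync(["hello", "world"]))
```

Note that `asyncio.run` raises `RuntimeError` when called from a thread that already has a running event loop, as in a Jupyter notebook, so in such contexts the async functions are awaited directly instead.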