Implementation:BerriAI Litellm PagerDuty Alerting
| Attribute | Value |
|---|---|
| Sources | enterprise/litellm_enterprise/enterprise_callbacks/pagerduty/pagerduty.py |
| Domains | Alerting, Monitoring, Enterprise Callbacks |
| Last Updated | 2026-02-15 16:00 GMT |
Overview
PagerDutyAlerting is an enterprise callback integration that sends critical alerts to PagerDuty when LLM API failure rates or hanging request counts exceed configurable thresholds within sliding time windows.
Description
The PagerDutyAlerting class extends SlackAlerting and provides two distinct alert types:
- High LLM API Failure Rate -- Triggers a PagerDuty alert when the number of failed LLM API responses exceeds a configurable threshold within a time window (default: 60 failures in 60 seconds).
- High Number of Hanging LLM Requests -- Triggers a PagerDuty alert when the number of requests that do not complete within a configurable timeout exceeds a threshold within a time window (defaults: a request is treated as hanging after 60 seconds, and hanging requests are counted over a 600-second window).
The class maintains separate in-memory event lists for failures and hanging requests. Events are pruned based on time windows before threshold evaluation. When a threshold is crossed, a critical-severity alert is dispatched to the PagerDuty Events API v2 and the event list is cleared to avoid repeated alerts.
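The prune, evaluate, and clear cycle described above can be sketched as follows. This is a minimal illustration, not the LiteLLM implementation: `SlidingWindowCounter` and its `record` method are hypothetical names, and treating "exceeds" as `>=` is an assumption.

```python
import time
from typing import List, Optional


class SlidingWindowCounter:
    """Hypothetical helper (not part of LiteLLM) for counting events in a time window."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.events: List[float] = []  # timestamps of recorded events

    def record(self, now: Optional[float] = None) -> bool:
        """Record one event and return True if an alert should fire."""
        now = time.time() if now is None else now
        self.events.append(now)

        # Prune events that have fallen out of the sliding window.
        cutoff = now - self.window_seconds
        self.events = [t for t in self.events if t >= cutoff]

        # Assumption: the alert fires once the count reaches the threshold.
        if len(self.events) >= self.threshold:
            # Clear the list so the same burst does not alert repeatedly.
            self.events.clear()
            return True
        return False
```

Passing `now` explicitly keeps the sketch deterministic; a real callback would likely rely on the wall clock instead.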
Requires the PAGERDUTY_API_KEY environment variable to be set.
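For orientation, a critical-severity trigger event for the PagerDuty Events API v2 might be assembled as in the sketch below. The field names (`routing_key`, `event_action`, `payload.summary`, `payload.severity`, `payload.source`, `payload.custom_details`) follow the Events API v2 format; the function name `build_pagerduty_event` and the `source` default are hypothetical and may differ from what PagerDutyAlerting actually sends.

```python
import os
from typing import Any, Dict, Optional

# Endpoint named in this page's Outputs section.
PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def build_pagerduty_event(
    alert_message: str,
    custom_details: Dict[str, Any],
    routing_key: Optional[str] = None,
    source: str = "litellm-proxy",  # hypothetical value, chosen for illustration
) -> Dict[str, Any]:
    """Build the JSON body for a critical 'trigger' event (illustrative sketch)."""
    # Fall back to the environment variable when no key is passed in.
    routing_key = routing_key or os.environ.get("PAGERDUTY_API_KEY")
    if not routing_key:
        raise ValueError("PAGERDUTY_API_KEY is not set")

    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": alert_message,
            "severity": "critical",
            "source": source,
            "custom_details": custom_details,
        },
    }
```

The resulting dictionary would then be sent as the JSON body of a POST request to `PAGERDUTY_EVENTS_URL`; the HTTP call is omitted so the sketch runs without network access.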
Usage
Import and instantiate PagerDutyAlerting when you need to monitor LLM proxy health and receive PagerDuty incident notifications for API failures or hanging requests. It is registered as a custom callback in the LiteLLM proxy configuration.
Code Reference
Source Location
enterprise/litellm_enterprise/enterprise_callbacks/pagerduty/pagerduty.py
Signature
class PagerDutyAlerting(SlackAlerting):
    def __init__(
        self, alerting_args: Optional[Union[AlertingConfig, dict]] = None, **kwargs
    ): ...
    async def async_log_failure_event(self, kwargs, response_obj, start_time, end_time): ...
    async def async_pre_call_hook(
        self, user_api_key_dict: UserAPIKeyAuth, cache: DualCache, data: dict, call_type: CallTypesLiteral
    ) -> Optional[Union[Exception, str, dict]]: ...
    async def hanging_response_handler(self, request_data: Optional[dict], user_api_key_dict: UserAPIKeyAuth): ...
    async def send_alert_to_pagerduty(self, alert_message: str, custom_details: dict): ...
Import
from litellm_enterprise.enterprise_callbacks.pagerduty.pagerduty import PagerDutyAlerting
I/O Contract
Inputs
| Parameter | Type | Description |
|---|---|---|
| alerting_args | Optional[Union[AlertingConfig, dict]] | Configuration for failure/hanging thresholds and time windows. |
| PAGERDUTY_API_KEY (env) | str | PagerDuty Events API v2 routing key (environment variable). |
AlertingConfig fields:
| Field | Type | Default | Description |
|---|---|---|---|
| failure_threshold | int | 60 | Number of failures to trigger alert. |
| failure_threshold_window_seconds | int | 60 | Time window for counting failures. |
| hanging_threshold_seconds | int | 60 | Seconds before a request is considered hanging. |
| hanging_threshold_window_seconds | int | 600 | Time window for counting hanging requests. |
Outputs
| Output | Type | Description |
|---|---|---|
| PagerDuty API response | httpx.Response | HTTP response from https://events.pagerduty.com/v2/enqueue. |
Usage Examples
# In LiteLLM proxy config YAML
litellm_settings:
  callbacks:
    - pagerduty
  pagerduty_alerting_args:
    failure_threshold: 100
    failure_threshold_window_seconds: 120
    hanging_threshold_seconds: 30
    hanging_threshold_window_seconds: 300
# Programmatic usage
from litellm_enterprise.enterprise_callbacks.pagerduty.pagerduty import PagerDutyAlerting

alerter = PagerDutyAlerting(
    alerting_args={
        "failure_threshold": 50,
        "failure_threshold_window_seconds": 60,
        "hanging_threshold_seconds": 45,
        "hanging_threshold_window_seconds": 300,
    }
)
Related Pages
- BerriAI_Litellm_Callback_Controls -- Dynamic callback enable/disable controls
- BerriAI_Litellm_Base_Email_Alerting -- Email-based alerting for budget events