Implementation:BerriAI Litellm PagerDuty Alerting

Attribute	Value
Sources	enterprise/litellm_enterprise/enterprise_callbacks/pagerduty/pagerduty.py
Domains	Alerting, Monitoring, Enterprise Callbacks
Last Updated	2026-02-15 16:00 GMT

Overview

PagerDutyAlerting is an enterprise callback integration that sends critical alerts to PagerDuty when LLM API failure rates or hanging request counts exceed configurable thresholds within sliding time windows.

Description

The PagerDutyAlerting class extends SlackAlerting and provides two distinct alert types:

High LLM API Failure Rate -- Triggers a PagerDuty alert when the number of failed LLM API responses exceeds a configurable threshold within a time window (default: 60 failures in 60 seconds).
High Number of Hanging LLM Requests -- Triggers a PagerDuty alert when the number of requests that do not complete within a configurable timeout exceeds a threshold within a time window (defaults: a request is considered hanging after 60 seconds, counted over a 600-second window).

The class maintains separate in-memory event lists for failures and hanging requests. Events are pruned based on time windows before threshold evaluation. When a threshold is crossed, a critical-severity alert is dispatched to the PagerDuty Events API v2 and the event list is cleared to avoid repeated alerts.
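
The prune, evaluate, and clear cycle described above can be sketched as follows. This is a minimal illustration, not the class's actual code: `SlidingWindowCounter` and its `record` method are hypothetical names, and the `>=` comparison is an assumption about how "exceeds" is evaluated.

```python
import time
from typing import List, Optional


class SlidingWindowCounter:
    """Hypothetical helper: counts events inside a sliding time window."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.events: List[float] = []  # timestamps of recorded events

    def record(self, now: Optional[float] = None) -> bool:
        """Record one event; return True if an alert should fire."""
        now = time.time() if now is None else now
        self.events.append(now)
        # Prune events that fell out of the window before evaluating.
        cutoff = now - self.window_seconds
        self.events = [t for t in self.events if t >= cutoff]
        # Assumption: the alert fires once the count reaches the threshold.
        if len(self.events) >= self.threshold:
            # Clear so the same burst does not trigger repeated alerts.
            self.events.clear()
            return True
        return False
```

Under this sketch, one counter would be kept for failures and another for hanging requests, each with its own threshold and window.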

The class requires the PAGERDUTY_API_KEY environment variable to be set; its value is used as the PagerDuty Events API v2 routing key.

Usage

Use PagerDutyAlerting to monitor LLM proxy health and receive PagerDuty incident notifications for API failures or hanging requests. In the LiteLLM proxy it is enabled as a callback in the proxy configuration; it can also be imported and instantiated directly.

Code Reference

Source Location

enterprise/litellm_enterprise/enterprise_callbacks/pagerduty/pagerduty.py

Signature

class PagerDutyAlerting(SlackAlerting):
    def __init__(
        self, alerting_args: Optional[Union[AlertingConfig, dict]] = None, **kwargs
    ): ...

    async def async_log_failure_event(self, kwargs, response_obj, start_time, end_time): ...
    async def async_pre_call_hook(
        self, user_api_key_dict: UserAPIKeyAuth, cache: DualCache, data: dict, call_type: CallTypesLiteral
    ) -> Optional[Union[Exception, str, dict]]: ...
    async def hanging_response_handler(self, request_data: Optional[dict], user_api_key_dict: UserAPIKeyAuth): ...
    async def send_alert_to_pagerduty(self, alert_message: str, custom_details: dict): ...
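
The signature suggests that hang detection starts in the pre-call hook and is resolved later by `hanging_response_handler`. One plausible way such a check could work is sketched below. Everything here is an assumption for illustration: `watch_for_hang`, the `completed` dictionary, and the `on_hang` callback are hypothetical and do not reflect how the class actually tracks request status.

```python
import asyncio
from typing import Callable, Dict, List


async def watch_for_hang(
    request_id: str,
    completed: Dict[str, bool],
    hanging_threshold_seconds: float,
    on_hang: Callable[[str], None],
) -> bool:
    """Wait for the threshold, then report the request if it is still open."""
    await asyncio.sleep(hanging_threshold_seconds)
    if completed.get(request_id, False):
        return False  # finished in time, nothing to record
    on_hang(request_id)  # still pending: count it as a hanging request
    return True


async def demo() -> List[str]:
    # "req-fast" has completed; "req-slow" never does.
    completed = {"req-fast": True}
    hung: List[str] = []
    await asyncio.gather(
        watch_for_hang("req-fast", completed, 0.01, hung.append),
        watch_for_hang("req-slow", completed, 0.01, hung.append),
    )
    return hung
```

In a real proxy the watcher would likely run as a background task so that it does not delay the request itself.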

Import

from litellm_enterprise.enterprise_callbacks.pagerduty.pagerduty import PagerDutyAlerting

I/O Contract

Inputs

Parameter	Type	Description
`alerting_args`	`Optional[Union[AlertingConfig, dict]]`	Configuration for failure/hanging thresholds and time windows.
`PAGERDUTY_API_KEY` (env)	`str`	PagerDuty Events API v2 routing key (environment variable).

AlertingConfig fields:

Field	Type	Default	Description
`failure_threshold`	`int`	60	Number of failures to trigger alert.
`failure_threshold_window_seconds`	`int`	60	Time window for counting failures.
`hanging_threshold_seconds`	`int`	60	Seconds before a request is considered hanging.
`hanging_threshold_window_seconds`	`int`	600	Time window for counting hanging requests.
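
To make the defaults concrete, the sketch below overlays user-supplied values on the defaults from the table. It is only an illustration: `DEFAULT_ALERTING_ARGS` and `resolve_alerting_args` are hypothetical names, and the way the class itself merges its arguments may differ.

```python
from typing import Dict, Optional

# Defaults copied from the AlertingConfig table above.
DEFAULT_ALERTING_ARGS: Dict[str, int] = {
    "failure_threshold": 60,
    "failure_threshold_window_seconds": 60,
    "hanging_threshold_seconds": 60,
    "hanging_threshold_window_seconds": 600,
}


def resolve_alerting_args(user_args: Optional[dict] = None) -> Dict[str, int]:
    """Return the defaults overlaid with any user-provided values."""
    resolved = dict(DEFAULT_ALERTING_ARGS)  # copy so the defaults stay intact
    resolved.update(user_args or {})
    return resolved
```

For example, passing only `{"failure_threshold": 100}` would raise the failure threshold while leaving the other three fields at their defaults.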

Outputs

Output	Type	Description
PagerDuty API response	`httpx.Response`	HTTP response from `https://events.pagerduty.com/v2/enqueue`.
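
For orientation, a trigger event for the PagerDuty Events API v2 has roughly the shape built below. The field names (`routing_key`, `event_action`, and `payload` with `summary`, `source`, `severity`, `custom_details`) follow the public Events API v2 format; the helper `build_trigger_event` and the `source` value are hypothetical, and the exact payload that `send_alert_to_pagerduty` sends is not shown on this page.

```python
from typing import Any, Dict

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def build_trigger_event(
    routing_key: str,
    alert_message: str,
    custom_details: Dict[str, Any],
    source: str = "litellm-proxy",  # hypothetical source label
) -> Dict[str, Any]:
    """Build a critical-severity trigger event as a JSON-serializable dict."""
    return {
        "routing_key": routing_key,  # value of PAGERDUTY_API_KEY
        "event_action": "trigger",
        "payload": {
            "summary": alert_message,
            "source": source,
            "severity": "critical",
            "custom_details": custom_details,
        },
    }
```

The resulting dictionary would be sent as the JSON body of a POST request to `PAGERDUTY_EVENTS_URL`.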

Usage Examples

# In LiteLLM proxy config YAML
litellm_settings:
  callbacks:
    - pagerduty
  pagerduty_alerting_args:
    failure_threshold: 100
    failure_threshold_window_seconds: 120
    hanging_threshold_seconds: 30
    hanging_threshold_window_seconds: 300

# Programmatic usage (PAGERDUTY_API_KEY must be set in the environment)
from litellm_enterprise.enterprise_callbacks.pagerduty.pagerduty import PagerDutyAlerting

alerter = PagerDutyAlerting(
    alerting_args={
        "failure_threshold": 50,
        "failure_threshold_window_seconds": 60,
        "hanging_threshold_seconds": 45,
        "hanging_threshold_window_seconds": 300,
    }
)

Related Pages

BerriAI_Litellm_Callback_Controls -- Dynamic callback enable/disable controls
BerriAI_Litellm_Base_Email_Alerting -- Email-based alerting for budget events
