Heuristic:CrewAIInc CrewAI Rate Limiting Strategy

Knowledge Sources	CrewAI API rate limit handling
Domains	LLM_Integration, Resilience
Last Updated	2026-02-11 17:00 GMT

Overview

Thread-safe rate limiting via `RPMController`, with a hard 60-second block when the limit is reached, a daemon-timer counter reset, and opt-in configuration via the `max_rpm` parameter.

Description

CrewAI's `RPMController` provides thread-safe requests-per-minute (RPM) limiting for LLM API calls. When `max_rpm` is set on an Agent, the controller tracks the request count under a `threading.Lock` and resets it every 60 seconds via a daemon `threading.Timer`. When the limit is reached, the controller blocks the calling thread with a hard `time.sleep(60)`, then restarts the count at 1 (counting the request that was waiting) and continues. The sleep happens while the lock is held, so any other thread calling the same controller also waits. This is an opt-in feature: `max_rpm=None` (the default) disables rate limiting entirely.
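
The counting and blocking behavior described above can be tried without CrewAI installed. The class below is a simplified, hypothetical stand-in (`SketchRpmLimiter` is not a CrewAI name): it mirrors the check-then-block logic only, omits the background reset timer, and takes an injectable sleep function (an assumption made here so the example runs instantly).

```python
import threading
import time
from typing import Callable, List, Optional


class SketchRpmLimiter:
    """Hypothetical stand-in for the counting/blocking logic; not CrewAI's class."""

    def __init__(
        self,
        max_rpm: Optional[int] = None,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_rpm = max_rpm
        self._sleep = sleep  # injectable so the demo need not wait 60 s
        self._count = 0
        self._lock = threading.Lock()

    def check_or_wait(self) -> bool:
        if self.max_rpm is None:
            return True  # limiting disabled
        with self._lock:
            if self._count < self.max_rpm:
                self._count += 1
                return True
            # Limit reached: block for a full minute while holding the lock,
            # then count the current request as the first of the new window.
            self._sleep(60)
            self._count = 1
            return True


# Record the requested pauses instead of actually sleeping.
pauses: List[float] = []
limiter = SketchRpmLimiter(max_rpm=2, sleep=pauses.append)
results = [limiter.check_or_wait() for _ in range(3)]
print(results, pauses)
```

With `max_rpm=2`, the first two calls pass immediately and the third requests a single 60-second pause before returning.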

Usage

Apply this heuristic when configuring agents that call rate-limited LLM APIs. Set `max_rpm` on your Agent to stay within your API provider's rate limits. Be aware that the blocking behavior means the entire agent execution pauses for 60 seconds when the limit is hit. For multi-agent crews, each agent has its own RPMController, so limits are per-agent, not per-crew.
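
Because each agent counts its own requests, a crew's combined request rate can approach the sum of its agents' limits. A minimal sketch of one way to divide a shared quota, assuming all agents draw on the same provider quota; `split_rpm_budget` is a hypothetical helper, not part of CrewAI.

```python
from typing import List


def split_rpm_budget(provider_rpm: int, agent_count: int) -> List[int]:
    """Divide a shared provider RPM quota across agents (hypothetical helper).

    Any remainder goes to the first agents, so the total equals the quota.
    """
    if agent_count < 1 or provider_rpm < agent_count:
        raise ValueError("need at least one agent and at least 1 RPM per agent")
    base, extra = divmod(provider_rpm, agent_count)
    return [base + 1 if i < extra else base for i in range(agent_count)]


# Each value would be passed as the corresponding agent's max_rpm.
budgets = split_rpm_budget(provider_rpm=60, agent_count=4)
print(budgets, sum(budgets))
```

An even split is conservative: an idle agent's share goes unused, but the crew as a whole cannot exceed the provider's limit.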

The Insight (Rule of Thumb)

Action: Set `max_rpm` on Agent to match your API provider's rate limit
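Value: `max_rpm=None` by default (no limiting); otherwise set it to your provider's RPM limit (e.g., `max_rpm=60` if the provider allows 60 requests per minute)
Trade-off: Hard blocking helps avoid API lockout but always pauses execution for a full 60 seconds, however much of the current minute remains; there is no gradual backoff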
Scope: Per-agent, not per-crew; each agent independently tracks its RPM
Thread Safety: Uses `threading.Lock` for counter access and `threading.Timer` (daemon) for periodic reset
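
The trade-off can be put in numbers. The sketch below estimates the time spent blocked during a burst, assuming requests arrive faster than the periodic reset fires, so the counter is cleared only by the 60-second wait path. Under that assumption the logic shown under Code Evidence pauses on request `max_rpm + 1`, then `2 * max_rpm + 1`, and so on. `burst_pause_seconds` is a hypothetical helper, not part of CrewAI.

```python
def burst_pause_seconds(total_requests: int, max_rpm: int) -> int:
    """Estimate seconds spent blocked during a burst (hypothetical helper).

    Assumes the periodic reset never fires between requests, so the counter
    is cleared only by the 60-second wait path.
    """
    if max_rpm < 1:
        raise ValueError("max_rpm must be positive")
    if total_requests < 1:
        return 0
    # The first max_rpm requests pass; after each wait the waiting request
    # is counted as 1, so a pause occurs once per further max_rpm requests.
    pauses = (total_requests - 1) // max_rpm
    return pauses * 60


print(burst_pause_seconds(total_requests=25, max_rpm=10))
```

For 25 requests at `max_rpm=10`, requests 11 and 21 each trigger a pause, giving 120 seconds of blocked time in total.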

Reasoning

LLM API providers enforce rate limits. Exceeding them typically produces HTTP 429 (Too Many Requests) errors and can result in temporary lockout. The hard 60-second sleep is intentionally aggressive: it is simpler and more predictable than adaptive backoff, and a full minute of silence guarantees that a new rate window has started before requests resume. Marking the timer thread as a daemon ensures it does not prevent application shutdown. Using `None` as the default makes rate limiting opt-in, avoiding unnecessary overhead for users whose API plans have generous limits.
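
To make the contrast with adaptive backoff concrete, the sketch below prints the wait schedule of a fixed 60-second pause next to a generic capped exponential backoff. The backoff schedule is a textbook pattern shown only for comparison; it is not what `RPMController` does, and both function names are hypothetical.

```python
from typing import List


def fixed_wait_schedule(attempts: int, wait: float = 60.0) -> List[float]:
    """Every pause has the same length (the approach described above)."""
    return [wait] * attempts


def exponential_backoff_schedule(
    attempts: int, base: float = 1.0, cap: float = 60.0
) -> List[float]:
    """Generic capped exponential backoff, for comparison only."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


fixed = fixed_wait_schedule(4)
backoff = exponential_backoff_schedule(8)
print(fixed)
print(backoff)
```

The fixed schedule is trivially predictable, which is the property the reasoning above favors; the backoff schedule resumes sooner at first but may retry before the provider's window has reset.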

Code Evidence

RPM controller with hard blocking from `lib/crewai/src/crewai/utilities/rpm_controller.py:38-64`:

def check_or_wait(self) -> bool:
    if self.max_rpm is None:
        return True

    def _check_and_increment() -> bool:
        if self.max_rpm is not None and self._current_rpm < self.max_rpm:
            self._current_rpm += 1
            return True
        if self.max_rpm is not None:
            self.logger.log(
                "info", "Max RPM reached, waiting for next minute to start."
            )
            self._wait_for_next_minute()
            self._current_rpm = 1
            return True
        return True

    if self._lock:
        with self._lock:
            return _check_and_increment()
    else:
        return _check_and_increment()

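Hard 60-second wait and daemon timer from `lib/crewai/src/crewai/utilities/rpm_controller.py:73-89` (excerpt; the remainder of `_reset_request_count`, where the inner `_reset` is invoked, is not shown):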

def _wait_for_next_minute(self) -> None:
    time.sleep(60)
    self._current_rpm = 0

def _reset_request_count(self) -> None:
    def _reset():
        self._current_rpm = 0
        if not self._shutdown_flag:
            self._timer = threading.Timer(60.0, self._reset_request_count)
            self._timer.daemon = True
            self._timer.start()
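
The re-arming daemon timer pattern in the excerpt above can be tried in isolation. This is a minimal sketch, not CrewAI code: the class name `PeriodicReset`, the `interval` parameter, and the `stop` method are assumptions for illustration, and the interval is shortened so the example finishes quickly.

```python
import threading
import time
from typing import Optional


class PeriodicReset:
    """Hypothetical sketch of a self re-arming daemon timer; not CrewAI code."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        self.count = 0
        self.resets = 0
        self._lock = threading.Lock()
        self._stopped = False
        self._timer: Optional[threading.Timer] = None
        self._arm()

    def _arm(self) -> None:
        self._timer = threading.Timer(self.interval, self._reset)
        self._timer.daemon = True  # a pending timer will not block interpreter exit
        self._timer.start()

    def _reset(self) -> None:
        with self._lock:
            self.count = 0
            self.resets += 1
            if not self._stopped:
                self._arm()  # schedule the next reset

    def stop(self) -> None:
        with self._lock:
            self._stopped = True
            if self._timer is not None:
                self._timer.cancel()


ticker = PeriodicReset(interval=0.05)
ticker.count = 5          # pretend five requests were counted
time.sleep(0.3)           # long enough for several resets to fire
ticker.stop()
print(ticker.count, ticker.resets >= 1)
```

Each `threading.Timer` fires once, so the callback has to create and start a new timer to keep the cycle going; the stop flag is what breaks that cycle on shutdown.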
