Heuristic:Googleapis Python genai LRO Polling Backoff

Knowledge Sources	googleapis/python-genai
Domains	Reliability, Long_Running_Operations
Last Updated	2026-02-15 14:00 GMT

Overview

Long-running operations (tuning jobs, video generation) are polled with 1.5x exponential backoff starting at 1 second, capped at 20 seconds, with a 15-minute total timeout.

Description

When the SDK submits a long-running operation (e.g., model fine-tuning, video generation, image operations that return an operation ID), it enters a polling loop that repeatedly checks the operation status via GET requests. The polling interval grows by exponential backoff with a 1.5x multiplier, a slower growth rate than the 2x multiplier used for error retries. Compared with fixed-interval polling, backoff reduces unnecessary API calls while still detecting short operations promptly.

The total polling timeout is 900 seconds (15 minutes), after which the SDK raises a `RuntimeError` with the operation details.

Usage

This heuristic applies to any SDK method that triggers a long-running operation, primarily:

- Model fine-tuning via `tunings.tune()`: tuning jobs can run for hours
- Video generation via `models.generate_videos()`: video processing is asynchronous
- Image operations that return LRO responses

Be aware that the 15-minute timeout may be insufficient for long fine-tuning jobs. For those, consider polling manually using `tunings.get()` with custom intervals.
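
A minimal sketch of such a manual polling loop follows. It is not the SDK's implementation: `poll_until_done`, its parameters, and the default values are hypothetical names chosen for illustration, and the status source is passed in as a plain callable so the loop can be exercised without network access.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until_done(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    *,
    initial_delay: float = 1.0,
    max_delay: float = 60.0,
    multiplier: float = 1.5,
    timeout: float = 6 * 60 * 60,  # hypothetical default: six hours
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() with capped exponential backoff until is_done(result)."""
    elapsed = 0.0
    delay = initial_delay
    while True:
        result = fetch()
        if is_done(result):
            return result
        if elapsed >= timeout:
            raise TimeoutError(f"Gave up after {elapsed:.0f}s of polling")
        sleep(delay)
        elapsed += delay  # count the time spent waiting (request latency ignored)
        delay = min(delay * multiplier, max_delay)


# Stand-in status source; a real caller would query the service instead.
states = iter(["RUNNING", "RUNNING", "SUCCEEDED"])
final = poll_until_done(
    lambda: next(states),
    lambda state: state == "SUCCEEDED",
    sleep=lambda seconds: None,  # skip real sleeping in this demo
)
print(final)
```

With the SDK, `fetch` would presumably wrap a call such as `tunings.get()` for the job in question, and `is_done` would inspect the returned job's state; the exact argument and field names should be checked against the SDK's own types rather than taken from this sketch.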

The Insight (Rule of Thumb)

Action: The SDK automatically polls LROs with exponential backoff. The polling parameters are hardcoded in `_transformers.py`.
Values:
- Initial delay: 1.0 second
- Maximum delay: 20 seconds
- Backoff multiplier: 1.5x (grows more slowly than the 2.0x used for retries)
- Total timeout: 900 seconds (15 minutes)
Trade-off: The 1.5x multiplier polls more frequently than a 2x multiplier, providing faster completion detection at the cost of slightly more API calls. The 15-minute timeout is adequate for most image/video operations but may be too short for fine-tuning jobs.
Polling sequence: 1.0s, 1.5s, 2.25s, 3.375s, 5.06s, 7.59s, 11.39s, 17.09s, 20s, 20s, 20s...
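
The sequence above can be reproduced with a short generator. This is only a sketch of the arithmetic: `backoff_delays` is an illustrative name, not an SDK function, and its defaults simply mirror the values listed here.

```python
from itertools import islice
from typing import Iterator


def backoff_delays(
    initial: float = 1.0, multiplier: float = 1.5, cap: float = 20.0
) -> Iterator[float]:
    """Yield successive polling delays: geometric growth, clamped at cap."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * multiplier, cap)


delays = list(islice(backoff_delays(), 11))
print(delays)
print(sum(delays[:8]))  # seconds spent in the eight delays below the cap
```

Under these values the eight sub-cap delays add up to roughly 49 seconds, after which every further delay is 20 seconds.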

Reasoning

The 1.5x multiplier grows the delay more slowly than the 2.0x used for error retries, and the remaining parameters follow from the nature of LROs:

- Variable completion times: A video generation might take 30 seconds or 10 minutes. More frequent polling in the early stages catches short operations quickly.
- 20-second cap: Once the delay reaches 20 seconds, it stays there, which avoids polling too infrequently for operations that may be nearly complete.
- 15-minute timeout: Provides a safety net against infinite polling. Most image/video operations complete well within this window.
- No jitter: Unlike error retries, LRO polling does not add jitter because there is typically only one client polling one specific operation (no thundering herd risk).
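
To make the multiplier trade-off concrete, the sketch below simulates when status checks occur under a given multiplier and reports how many checks happen before an operation finishing at a chosen time is seen as done. It is a simplified model: request latency is ignored, the function names are illustrative and not part of the SDK, and the 2.0x case is assumed to share the 1-second start and 20-second cap.

```python
from typing import List, Optional, Tuple


def poll_times(
    multiplier: float,
    initial: float = 1.0,
    cap: float = 20.0,
    horizon: float = 900.0,
) -> List[float]:
    """Return the times (seconds from start) at which status checks occur."""
    times = [0.0]  # the first check happens immediately
    delay = initial
    while times[-1] + delay <= horizon:
        times.append(times[-1] + delay)
        delay = min(delay * multiplier, cap)
    return times


def detect(completion_time: float, multiplier: float) -> Optional[Tuple[int, float]]:
    """Return (checks_made, detection_time), or None if never seen in the horizon."""
    for count, t in enumerate(poll_times(multiplier), start=1):
        if t >= completion_time:
            return count, t
    return None


for finish in (20.0, 30.0):
    print(finish, detect(finish, 1.5), detect(finish, 2.0))
```

For an operation finishing at 20 seconds, the 1.5x schedule notices it at about 20.8 seconds on the 7th check, while the 2.0x schedule notices it at 31 seconds on the 6th. The advantage is not uniform: for a 30-second finish, the 2.0x schedule happens to check at 31 seconds, slightly ahead of the 1.5x check at about 32.2 seconds.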

Code Evidence

LRO polling constants from `_transformers.py:1136-1139`:

LRO_POLLING_INITIAL_DELAY_SECONDS = 1.0
LRO_POLLING_MAXIMUM_DELAY_SECONDS = 20.0
LRO_POLLING_TIMEOUT_SECONDS = 900.0
LRO_POLLING_MULTIPLIER = 1.5

Polling loop implementation from `_transformers.py:1142-1167`:

def t_resolve_operation(
    api_client: _api_client.BaseApiClient, struct: _common.StringDict
) -> Any:
    if (name := struct.get('name')) and '/operations/' in name:
        operation: _common.StringDict = struct
        total_seconds = 0.0
        delay_seconds = LRO_POLLING_INITIAL_DELAY_SECONDS
        while operation.get('done') != True:
            if total_seconds > LRO_POLLING_TIMEOUT_SECONDS:
                raise RuntimeError(f'Operation {name} timed out.\n{operation}')
            operation = api_client.request(
                http_method='GET', path=name, request_dict={}
            )
            time.sleep(delay_seconds)
            # As quoted, this adds total_seconds to itself, so it stays 0.0;
            # tracking elapsed time would need `+= delay_seconds`.
            total_seconds += total_seconds
            delay_seconds = min(
                delay_seconds * LRO_POLLING_MULTIPLIER,
                LRO_POLLING_MAXIMUM_DELAY_SECONDS,
            )

Operation error handling (after the polling loop) from `_transformers.py:1163-1167`:

    if error := operation.get('error'):
        raise RuntimeError(
            f'Operation {name} failed with error: {error}.\n{operation}'
        )
    return operation.get('response')
