Implementation:BerriAI Litellm Google GenAI

Property	Value
sources	`litellm/google_genai/main.py`
domains	Google GenAI, Gemini, Content Generation, Streaming
last_updated	2026-02-15 16:00 GMT

Overview

The Google GenAI module provides a native SDK-style interface for Google's Generative AI (Gemini) content generation, supporting both streaming and non-streaming responses with an automatic fallback to LiteLLM's completion format when provider-native support is unavailable.

Description

This module implements the Google GenAI generateContent API through four functions arranged as two sync/async pairs: generate_content/agenerate_content for non-streaming calls and generate_content_stream/agenerate_content_stream for streaming calls. All four share a helper class, GenerateContentHelper, whose setup_generate_content_call() method resolves the provider, loads the BaseGoogleGenAIGenerateContentConfig, maps optional parameters, and constructs the request body. A GenerateContentSetupResult Pydantic model carries that setup state from the helper back to the calling function. When no provider config is found (e.g., for non-Gemini providers), the module falls back to GenerateContentToCompletionHandler, which adapts the Google GenAI format to LiteLLM's standard completion format. The module also supports mock responses and accepts the generationConfig parameter for backward compatibility.
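The setup-then-dispatch flow described above can be sketched roughly as follows. This is a minimal illustration, not the module's code: `SetupResult`, `NATIVE_PROVIDERS`, `resolve_provider`, `setup_call`, and `dispatch` are hypothetical names, a dataclass stands in for the Pydantic model, and the set of providers treated as "native" is an assumption made only for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


# Hypothetical stand-in for GenerateContentSetupResult (a Pydantic model in the source).
@dataclass
class SetupResult:
    model: str
    provider: str
    request_body: Dict[str, Any]
    has_native_config: bool


# Assumption for illustration only: providers that have a native generateContent config.
NATIVE_PROVIDERS = {"google_genai"}


def resolve_provider(model: str, custom_llm_provider: Optional[str] = None) -> Tuple[str, str]:
    """Split 'provider/model' into its parts; an explicit override wins over the prefix."""
    prefix, _, name = model.partition("/")
    if not name:  # no '/' in the model string, so there is no prefix
        prefix, name = "", model
    return (custom_llm_provider or prefix or "unknown"), name


def setup_call(
    model: str,
    contents: List[Dict[str, Any]],
    config: Optional[Dict[str, Any]] = None,
    custom_llm_provider: Optional[str] = None,
) -> SetupResult:
    """Resolve the provider and build the request body once, for all four entry points."""
    provider, name = resolve_provider(model, custom_llm_provider)
    body: Dict[str, Any] = {"contents": contents}
    if config:
        body["generationConfig"] = dict(config)
    return SetupResult(name, provider, body, provider in NATIVE_PROVIDERS)


def dispatch(setup: SetupResult) -> str:
    """Pick the native path when a provider config exists, else the completion adapter."""
    return "native" if setup.has_native_config else "completion_adapter"
```

Keeping the setup in one place means the sync, async, and streaming entry points differ only in how they send the prepared request, which is presumably why the source shares a single helper method.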

Usage

Import this module when you need to interact with Google Gemini models using Google's native API format (contents, config, tools) rather than the OpenAI-compatible format. It is also used internally by LiteLLM when routing to Google GenAI providers.

Code Reference

Source Location

Property	Value
Repository	github.com/BerriAI/litellm
File	`litellm/google_genai/main.py`
Lines	543
Module	`litellm.google_genai.main`

Signature

@client
def generate_content(
    model: str,
    contents: GenerateContentContentListUnionDict,
    config: Optional[GenerateContentConfigDict] = None,
    tools: Optional[ToolConfigDict] = None,
    extra_headers: Optional[Dict[str, Any]] = None,
    extra_query: Optional[Dict[str, Any]] = None,
    extra_body: Optional[Dict[str, Any]] = None,
    timeout: Optional[Union[float, httpx.Timeout]] = None,
    custom_llm_provider: Optional[str] = None,
    **kwargs,
) -> Any

@client
async def agenerate_content(...) -> Any

@client
def generate_content_stream(...) -> Iterator[Any]

@client
async def agenerate_content_stream(...) -> Any

Import

from litellm.google_genai.main import (
    generate_content, agenerate_content,
    generate_content_stream, agenerate_content_stream,
    GenerateContentHelper, GenerateContentSetupResult,
)

I/O Contract

Inputs

Parameter	Type	Required	Description
`model`	`str`	Yes	The model identifier (e.g., "google_genai/gemini-2.0-flash")
`contents`	`GenerateContentContentListUnionDict`	Yes	The content list to generate from (Google GenAI format)
`config`	`Optional[GenerateContentConfigDict]`	No	Generation configuration (temperature, top_p, etc.)
`tools`	`Optional[ToolConfigDict]`	No	Tool definitions for function calling
`custom_llm_provider`	`Optional[str]`	No	Provider override; auto-detected from `model` when omitted
`systemInstruction`	via kwargs	No	System instruction for the model
`generationConfig`	via kwargs	No	Backward-compatible alias for `config`
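One way the backward-compatible `generationConfig` alias could be reconciled with `config` is sketched below. The helper name `resolve_config` is hypothetical, and the precedence rule (an explicit `config` wins over the legacy keyword) is an assumption; the source does not state which value takes priority when both are supplied.

```python
from typing import Any, Dict, Optional


def resolve_config(config: Optional[Dict[str, Any]] = None, **kwargs: Any) -> Dict[str, Any]:
    """Return the effective generation config.

    Assumption: an explicit `config` takes precedence over the legacy
    `generationConfig` keyword; the real module may resolve this differently.
    """
    legacy = kwargs.pop("generationConfig", None)
    if config is not None:
        return dict(config)
    if legacy is not None:
        return dict(legacy)
    return {}
```

Copying the chosen dict avoids mutating the caller's argument when parameters are later mapped into the request body.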

Outputs

Function	Return Type	Description
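`generate_content` / `agenerate_content`	`Dict[str, Any]` (GenerateContentResponse)	Contains candidates, usage metadata, text; the async variant must be awaited
`generate_content_stream`	`Iterator[Any]`	Stream of partial generation results
`agenerate_content_stream`	`Any` (consumed with `async for` after awaiting)	Async stream of partial generation results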
Mock response	`Dict[str, Any]`	Mock response with text, candidates, usageMetadata fields
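To make the mock-response row concrete, the sketch below builds a dictionary with the three fields named in the table. The helper names `build_mock_response` and `first_text` are hypothetical, and the nested layout (Gemini-style `candidates[].content.parts[].text` plus zeroed token counts) is an assumption about what such a mock might contain, not a copy of the module's output.

```python
from typing import Any, Dict


def build_mock_response(text: str) -> Dict[str, Any]:
    """Build a dict with the text, candidates, and usageMetadata fields.

    Assumption: nested keys follow the Gemini REST response shape and the
    token counts are placeholders.
    """
    return {
        "text": text,
        "candidates": [
            {
                "content": {"role": "model", "parts": [{"text": text}]},
                "finishReason": "STOP",
                "index": 0,
            }
        ],
        "usageMetadata": {
            "promptTokenCount": 0,
            "candidatesTokenCount": 0,
            "totalTokenCount": 0,
        },
    }


def first_text(response: Dict[str, Any]) -> str:
    """Read the text of the first part of the first candidate, or '' if absent."""
    candidates = response.get("candidates") or [{}]
    parts = candidates[0].get("content", {}).get("parts") or [{}]
    return parts[0].get("text", "")
```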

Usage Examples

# Example 1: non-streaming call
import litellm

response = litellm.generate_content(
    model="google_genai/gemini-2.0-flash",
    contents=[{"role": "user", "parts": [{"text": "Explain quantum computing."}]}],
    config={"temperature": 0.7, "max_output_tokens": 500},
)
print(response["text"])

# Example 2: async streaming call
import asyncio
import litellm

async def main():
    stream = await litellm.agenerate_content_stream(
        model="google_genai/gemini-2.0-flash",
        contents=[{"role": "user", "parts": [{"text": "Write a short story."}]}],
    )
    async for chunk in stream:
        print(chunk)

asyncio.run(main())
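In practice a caller usually wants the accumulated text rather than raw chunks. The sketch below shows one way to collect it, using a fake async generator in place of a live stream so it runs without network access. `fake_stream`, `chunk_text`, and `collect_text` are hypothetical helpers, and the chunk layout (Gemini-style dicts with `candidates[].content.parts[].text`) is an assumption; real chunks may be a different type and need decoding first.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List


async def fake_stream() -> AsyncIterator[Dict[str, Any]]:
    """Stand-in for a streaming response; yields Gemini-style chunk dicts (assumed shape)."""
    for piece in ["Once ", "upon ", "a time."]:
        await asyncio.sleep(0)  # yield control, as a real network stream would
        yield {"candidates": [{"content": {"parts": [{"text": piece}]}}]}


def chunk_text(chunk: Dict[str, Any]) -> str:
    """Concatenate every text part in a chunk; chunks without text yield ''."""
    pieces: List[str] = []
    for candidate in chunk.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            if "text" in part:
                pieces.append(part["text"])
    return "".join(pieces)


async def collect_text(stream: AsyncIterator[Dict[str, Any]]) -> str:
    """Drain an async stream and return the concatenated text."""
    pieces: List[str] = []
    async for chunk in stream:
        pieces.append(chunk_text(chunk))
    return "".join(pieces)
```

Under these assumptions, `asyncio.run(collect_text(fake_stream()))` returns the joined text of all chunks.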

Related Pages

BerriAI_Litellm_Responses_API -- The OpenAI-compatible Responses API that can also route to Google GenAI providers
BerriAI_Litellm_Passthrough_API -- Passthrough API for direct access to Google GenAI endpoints
