Implementation:BerriAI Litellm Google GenAI

From Leeroopedia
Revision as of 12:09, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/BerriAI_Litellm_Google_GenAI.md)
Property | Value
sources | litellm/google_genai/main.py
domains | Google GenAI, Gemini, Content Generation, Streaming
last_updated | 2026-02-15 16:00 GMT

Overview

The Google GenAI module provides a native SDK-style interface for Google's Generative AI (Gemini) content generation, supporting both streaming and non-streaming responses with an automatic fallback to LiteLLM's completion format when provider-native support is unavailable.

Description

This module implements the Google GenAI generateContent API through four functions arranged as two sync/async pairs: generate_content/agenerate_content for non-streaming calls and generate_content_stream/agenerate_content_stream for streaming calls. All four rely on a helper class, GenerateContentHelper, whose shared setup_generate_content_call() method resolves the provider, loads the BaseGoogleGenAIGenerateContentConfig, maps optional parameters, and constructs the request body. A GenerateContentSetupResult Pydantic model carries the resulting setup state from the helper back to the calling function. When no provider config is found (e.g., for non-Gemini providers), the module automatically falls back to GenerateContentToCompletionHandler, which adapts the Google GenAI format to LiteLLM's standard completion format. The module also supports mock responses and, for backward compatibility, the generationConfig parameter.
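The setup-then-dispatch flow described above can be sketched in a few lines. This is a minimal illustration, not LiteLLM's code: every name in it (SetupResult, setup_call, NATIVE_CONFIGS, the two stand-in handlers, and the provider keys) is hypothetical, and a standard-library dataclass stands in for the Pydantic model so the sketch has no dependencies.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

# All names below are hypothetical stand-ins, not LiteLLM's real classes.


@dataclass
class SetupResult:
    """Carries resolved state from the setup step back to the caller."""
    model: str
    provider: str
    request_body: Dict[str, Any]
    provider_config: Optional[Callable[[Dict[str, Any]], Dict[str, Any]]]


def native_call(body: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a provider-native generateContent call."""
    return {"path": "native", "body": body}


def completion_adapter(body: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for the adapter that routes through the completion format."""
    return {"path": "completion_fallback", "body": body}


# Hypothetical registry: only providers listed here have a native config.
NATIVE_CONFIGS = {"gemini": native_call}


def setup_call(model: str, contents: list, config: Optional[dict] = None) -> SetupResult:
    """Resolve the provider from a 'provider/model' string and build the body."""
    provider, _, model_name = model.partition("/")
    body: Dict[str, Any] = {"contents": contents}
    if config:
        body["generationConfig"] = dict(config)
    return SetupResult(model_name, provider, body, NATIVE_CONFIGS.get(provider))


def generate(model: str, contents: list, config: Optional[dict] = None) -> Dict[str, Any]:
    """Run setup once, then dispatch to the native path or the fallback."""
    setup = setup_call(model, contents, config)
    if setup.provider_config is None:
        # No native config for this provider: use the completion adapter.
        return completion_adapter(setup.request_body)
    return setup.provider_config(setup.request_body)
```

Keeping the setup in one function means the streaming and non-streaming entry points can differ only in how they consume the result, which is presumably why the module shares a single setup method.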

Usage

Import this module when you need to interact with Google Gemini models using Google's native API format (contents, config, tools) rather than the OpenAI-compatible format. It is also used internally by LiteLLM when routing to Google GenAI providers.
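To make the difference between the two request formats concrete, the sketch below converts Google-style contents into OpenAI-style chat messages. It is an illustration only: contents_to_messages is a hypothetical helper, it handles text parts only, and it assumes the Google role "model" maps to the OpenAI role "assistant". LiteLLM's own adapter may behave differently.

```python
from typing import Any, Dict, List


def contents_to_messages(contents: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Flatten Google-style contents into OpenAI-style chat messages.

    Assumption: only text parts are handled, and the Google role "model"
    corresponds to the OpenAI role "assistant".
    """
    role_map = {"user": "user", "model": "assistant"}
    messages = []
    for item in contents:
        # Join every text part of the turn into one content string.
        text = "".join(part.get("text", "") for part in item.get("parts", []))
        role = role_map.get(item.get("role", "user"), "user")
        messages.append({"role": role, "content": text})
    return messages


# Google native format: each turn has a role and a list of parts.
native = [
    {"role": "user", "parts": [{"text": "Explain quantum computing."}]},
    {"role": "model", "parts": [{"text": "Sure. "}, {"text": "Here is a summary."}]},
]
print(contents_to_messages(native))
```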

Code Reference

Source Location

Property | Value
Repository | github.com/BerriAI/litellm
File | litellm/google_genai/main.py
Lines | 543
Module | litellm.google_genai.main

Signature

@client
def generate_content(
    model: str,
    contents: GenerateContentContentListUnionDict,
    config: Optional[GenerateContentConfigDict] = None,
    tools: Optional[ToolConfigDict] = None,
    extra_headers: Optional[Dict[str, Any]] = None,
    extra_query: Optional[Dict[str, Any]] = None,
    extra_body: Optional[Dict[str, Any]] = None,
    timeout: Optional[Union[float, httpx.Timeout]] = None,
    custom_llm_provider: Optional[str] = None,
    **kwargs,
) -> Any

@client
async def agenerate_content(...) -> Any

@client
def generate_content_stream(...) -> Iterator[Any]

@client
async def agenerate_content_stream(...) -> Any

Import

from litellm.google_genai.main import (
    generate_content, agenerate_content,
    generate_content_stream, agenerate_content_stream,
    GenerateContentHelper, GenerateContentSetupResult,
)

I/O Contract

Inputs

Parameter | Type | Required | Description
model | str | Yes | The model identifier (e.g., "google_genai/gemini-2.0-flash")
contents | GenerateContentContentListUnionDict | Yes | The content list to generate from (Google GenAI format)
config | Optional[GenerateContentConfigDict] | No | Generation configuration (temperature, top_p, etc.)
tools | Optional[ToolConfigDict] | No | Tool definitions for function calling
custom_llm_provider | Optional[str] | No | Provider override; auto-detected from model
systemInstruction | via kwargs | No | System instruction for the model
generationConfig | via kwargs | No | Backward-compatible alias for config
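One way the generationConfig alias could be reconciled with config is sketched below. The function resolve_config is hypothetical, and the precedence it applies (an explicit config wins over the legacy keyword) is an assumption; the source does not state which value the module prefers when both are given.

```python
from typing import Any, Dict, Optional


def resolve_config(
    config: Optional[Dict[str, Any]] = None, **kwargs: Any
) -> Optional[Dict[str, Any]]:
    """Return the effective generation config.

    Assumption: an explicit `config` wins over the legacy `generationConfig`
    keyword; the real module's precedence may differ.
    """
    legacy = kwargs.pop("generationConfig", None)
    return config if config is not None else legacy


# New-style and legacy-style callers end up with the same effective config.
print(resolve_config({"temperature": 0.2}))
print(resolve_config(generationConfig={"temperature": 0.2}))
```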

Outputs

Function | Return Type | Description
generate_content | Dict[str, Any] (GenerateContentResponse) | Contains candidates, usage metadata, text
generate_content_stream | Iterator[Any] | Stream of partial generation results
Mock response | Dict[str, Any] | Mock response with text, candidates, usageMetadata fields
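The response shape can be illustrated with a small builder and a reader. Both functions are hypothetical. Only the three top-level keys (text, candidates, usageMetadata) come from the table above; the nested field names are assumed to follow the public Gemini REST schema, and the token counts are placeholder zeros.

```python
from typing import Any, Dict


def build_mock_response(text: str) -> Dict[str, Any]:
    """Build a dict with the three top-level fields named in the table above.

    Assumption: nested field names follow the public Gemini REST schema;
    the token counts are placeholder zeros, not real usage figures.
    """
    return {
        "text": text,
        "candidates": [
            {
                "content": {"role": "model", "parts": [{"text": text}]},
                "finishReason": "STOP",
                "index": 0,
            }
        ],
        "usageMetadata": {
            "promptTokenCount": 0,
            "candidatesTokenCount": 0,
            "totalTokenCount": 0,
        },
    }


def first_text(response: Dict[str, Any]) -> str:
    """Prefer the top-level text; otherwise join the first candidate's text parts."""
    if response.get("text"):
        return response["text"]
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)
```

Reading through a helper such as first_text keeps calling code working whether or not the convenience text field is populated.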

Usage Examples

Non-streaming call:

from litellm.google_genai.main import generate_content

# Send a single user turn in Google's native contents format.
response = generate_content(
    model="google_genai/gemini-2.0-flash",
    contents=[{"role": "user", "parts": [{"text": "Explain quantum computing."}]}],
    config={"temperature": 0.7, "max_output_tokens": 500},
)
print(response["text"])

Async streaming call:

import asyncio

from litellm.google_genai.main import agenerate_content_stream

async def main():
    # Awaiting the call returns the stream; iterate it to receive chunks.
    stream = await agenerate_content_stream(
        model="google_genai/gemini-2.0-flash",
        contents=[{"role": "user", "parts": [{"text": "Write a short story."}]}],
    )
    async for chunk in stream:
        print(chunk)

asyncio.run(main())
