Workflow:Googleapis Python genai Text Content Generation

Knowledge Sources	Google GenAI Python SDK Gemini API Docs Vertex AI Docs
Domains	LLMs, Text_Generation, Generative_AI
Last Updated	2026-02-15 14:00 GMT

Overview

End-to-end process for generating text content using the Google GenAI SDK with Gemini models, supporting both synchronous and asynchronous execution with optional streaming.

Description

This workflow covers the primary use case of the Google GenAI Python SDK: generating text from text prompts using Gemini models. The process involves initializing a client with either the Gemini Developer API (API key) or Vertex AI (Google Cloud credentials), configuring generation parameters (temperature, top_p, top_k, max tokens), providing text content, and receiving generated text responses. The workflow supports four execution modes: synchronous non-streaming, synchronous streaming, asynchronous non-streaming, and asynchronous streaming.

Usage

Execute this workflow when you need to generate text content from a Gemini model using a text prompt. This is the fundamental building block for any application using the Google GenAI SDK, whether for question answering, summarization, creative writing, code generation, or any other text-to-text task. Choose the appropriate execution mode based on your application requirements: streaming for real-time UIs, async for high-throughput servers.

Execution Steps

Step 1: Client Initialization

Create a GenAI client configured for either the Gemini Developer API or Vertex AI. For the Gemini Developer API, provide an API key directly or via the GEMINI_API_KEY environment variable. For Vertex AI, provide project ID and location, or set GOOGLE_GENAI_USE_VERTEXAI, GOOGLE_CLOUD_PROJECT, and GOOGLE_CLOUD_LOCATION environment variables. The client composes all API modules (models, chats, files, etc.) and sets up the underlying HTTP transport.

Key considerations:

Choose between Gemini Developer API (simpler, API key) and Vertex AI (enterprise, Google Cloud credentials)
Use context managers to ensure proper resource cleanup
Optionally configure HTTP options for API version, base URL, proxy, or custom client settings

Step 2: Content Preparation

Format the input content for the generate_content call. The SDK accepts multiple input formats: a plain string, a list of strings, a Part object, a Content object, or a list of Content objects. The SDK automatically converts simpler inputs into the canonical list[Content] format, assigning appropriate roles (user for text, model for function calls).

Key considerations:

Plain strings are wrapped as user content with a single text part
Multiple strings become multiple parts within a single user content
System instructions are passed separately via the config parameter, not in contents

Step 3: Generation Configuration

Configure the generation parameters to control model output behavior. Key parameters include temperature (randomness), top_p (nucleus sampling), top_k (token selection), max_output_tokens (response length limit), system_instruction (behavioral guidance), safety_settings (content filtering thresholds), and response_mime_type for structured output (JSON, enum).

Key considerations:

Lower temperature values produce more deterministic outputs
System instructions influence model behavior without consuming input token budget in the same way
Safety settings can be configured per harm category with different threshold levels
For structured output, set response_mime_type to application/json and provide a response_schema

Step 4: Content Generation

Invoke the generate_content method (or generate_content_stream for streaming) on the client.models module. For synchronous non-streaming, the method returns a complete GenerateContentResponse. For streaming, it returns an iterator of response chunks. Async variants are available via client.aio.models.

Key considerations:

Streaming returns partial responses as they are generated, enabling real-time display
The response object provides convenience properties: .text for the text content, .parts for all parts, .function_calls for function call parts
Error handling should catch errors.APIError for model service errors

Step 5: Response Processing

Extract and process the generated content from the response. Access the text via response.text for simple text responses. For structured JSON responses, use response.parsed to get the deserialized object. For streaming, iterate over chunks and concatenate chunk.text values. Check response.candidates for detailed information including finish reason and safety ratings.

Key considerations:

response.text returns the concatenated text from all parts of the first candidate
Streaming chunks may contain partial text that should be concatenated
Check finish_reason to understand why generation stopped (STOP, MAX_TOKENS, SAFETY, etc.)
Token usage information is available in response.usage_metadata

Execution Diagram

GitHub URL

Workflow Repository