Workflow:Googleapis Python genai Text Content Generation
| Knowledge Sources | |
|---|---|
| Domains | LLMs, Text_Generation, Generative_AI |
| Last Updated | 2026-02-15 14:00 GMT |
Overview
End-to-end process for generating text content using the Google GenAI SDK with Gemini models, supporting both synchronous and asynchronous execution with optional streaming.
Description
This workflow covers the primary use case of the Google GenAI Python SDK: generating text from text prompts using Gemini models. The process involves initializing a client with either the Gemini Developer API (API key) or Vertex AI (Google Cloud credentials), configuring generation parameters (temperature, top_p, top_k, max tokens), providing text content, and receiving generated text responses. The workflow supports four execution modes: synchronous non-streaming, synchronous streaming, asynchronous non-streaming, and asynchronous streaming.
Usage
Execute this workflow when you need to generate text content from a Gemini model using a text prompt. This is the fundamental building block for any application using the Google GenAI SDK, whether for question answering, summarization, creative writing, code generation, or any other text-to-text task. Choose the appropriate execution mode based on your application requirements: streaming for real-time UIs, async for high-throughput servers.
Execution Steps
Step 1: Client Initialization
Create a GenAI client configured for either the Gemini Developer API or Vertex AI. For the Gemini Developer API, provide an API key directly or via the GEMINI_API_KEY environment variable. For Vertex AI, provide project ID and location, or set GOOGLE_GENAI_USE_VERTEXAI, GOOGLE_CLOUD_PROJECT, and GOOGLE_CLOUD_LOCATION environment variables. The client composes all API modules (models, chats, files, etc.) and sets up the underlying HTTP transport.
Key considerations:
- Choose between Gemini Developer API (simpler, API key) and Vertex AI (enterprise, Google Cloud credentials)
- Use context managers to ensure proper resource cleanup
- Optionally configure HTTP options for API version, base URL, proxy, or custom client settings
Step 2: Content Preparation
Format the input content for the generate_content call. The SDK accepts multiple input formats: a plain string, a list of strings, a Part object, a Content object, or a list of Content objects. The SDK automatically converts simpler inputs into the canonical list[Content] format, assigning appropriate roles (user for text, model for function calls).
Key considerations:
- Plain strings are wrapped as user content with a single text part
- Multiple strings become multiple parts within a single user content
- System instructions are passed separately via the config parameter, not in contents
Step 3: Generation Configuration
Configure the generation parameters to control model output behavior. Key parameters include temperature (randomness), top_p (nucleus sampling), top_k (token selection), max_output_tokens (response length limit), system_instruction (behavioral guidance), safety_settings (content filtering thresholds), and response_mime_type for structured output (JSON, enum).
Key considerations:
- Lower temperature values produce more deterministic outputs
- System instructions influence model behavior without consuming input token budget in the same way
- Safety settings can be configured per harm category with different threshold levels
- For structured output, set response_mime_type to application/json and provide a response_schema
Step 4: Content Generation
Invoke the generate_content method (or generate_content_stream for streaming) on the client.models module. For synchronous non-streaming, the method returns a complete GenerateContentResponse. For streaming, it returns an iterator of response chunks. Async variants are available via client.aio.models.
Key considerations:
- Streaming returns partial responses as they are generated, enabling real-time display
- The response object provides convenience properties: .text for the text content, .parts for all parts, .function_calls for function call parts
- Error handling should catch errors.APIError for model service errors
Step 5: Response Processing
Extract and process the generated content from the response. Access the text via response.text for simple text responses. For structured JSON responses, use response.parsed to get the deserialized object. For streaming, iterate over chunks and concatenate chunk.text values. Check response.candidates for detailed information including finish reason and safety ratings.
Key considerations:
- response.text returns the concatenated text from all parts of the first candidate
- Streaming chunks may contain partial text that should be concatenated
- Check finish_reason to understand why generation stopped (STOP, MAX_TOKENS, SAFETY, etc.)
- Token usage information is available in response.usage_metadata