Workflow: Google GenAI Python SDK Image Generation Pipeline
| Knowledge Sources | |
|---|---|
| Domains | Image_Generation, Generative_AI, Computer_Vision |
| Last Updated | 2026-02-15 14:00 GMT |
Overview
End-to-end process for generating, upscaling, and editing images using the Google GenAI SDK with Imagen and Gemini models.
Description
This workflow covers the complete image generation pipeline using the Google GenAI SDK. It supports three primary operations: generating images from text prompts using Imagen 4.0, upscaling generated images to higher resolutions using the Imagen upscale model, and editing existing images using inpainting, outpainting, and other edit modes via the Imagen capability model. Additionally, Gemini models with image generation capabilities can produce images inline as part of content generation.
Usage
Execute this workflow when your application needs to create images from text descriptions, enhance image resolution, or modify existing images. Typical use cases include marketing content creation, product visualization, creative design, background replacement, and image enhancement. Note that upscaling and editing are only available on Vertex AI.
Execution Steps
Step 1: Client Initialization
Create a GenAI client. Image generation (generate_images) works on both Gemini Developer API and Vertex AI. Image upscaling and editing require Vertex AI. Choose the appropriate backend based on the operations needed.
Key considerations:
- generate_images works on both backends
- upscale_image and edit_image are Vertex AI only
- Gemini model image generation (via generate_content) works on both backends
Step 2: Image Generation
Generate images from text prompts using client.models.generate_images() with the Imagen model (e.g., imagen-4.0-generate-001). Configure the number of images, output MIME type, and whether to include RAI (Responsible AI) reasons for any filtered results. The response contains a list of GeneratedImage objects, each with an Image that can be displayed or saved.
Key considerations:
- Specify number_of_images for multiple variants
- Set output_mime_type to image/jpeg or image/png
- include_rai_reason provides transparency on content filtering
- Generated images are returned as base64-encoded data within Image objects
Step 3: Image Upscaling (Vertex AI Only)
Upscale a generated or existing image to higher resolution using client.models.upscale_image() with the Imagen upscale model (e.g., imagen-4.0-upscale-preview). Provide the source image and an upscale_factor string ("x2" or "x4"). The result is a higher-resolution version of the input image.
Key considerations:
- Vertex AI only
- Input can be a previously generated Image object or loaded from file
- Upscale factors control the resolution multiplier
- Output MIME type can be configured
Step 4: Image Editing (Vertex AI Only)
Edit images using client.models.edit_image() with the Imagen capability model (e.g., imagen-3.0-capability-001). Provide reference images (raw source image and mask), an edit prompt, and an edit mode (inpaint insertion, inpaint removal, outpainting, etc.). The model applies the edit according to the mask and prompt.
Key considerations:
- Vertex AI only
- Requires RawReferenceImage (source) and MaskReferenceImage (edit region)
- Mask modes include MASK_MODE_BACKGROUND, MASK_MODE_FOREGROUND, and semantic masks
- Edit modes include EDIT_MODE_INPAINT_INSERTION, EDIT_MODE_INPAINT_REMOVAL, EDIT_MODE_OUTPAINT
- mask_dilation controls the boundary expansion of the mask
Step 5: Result Processing
Process the generated, upscaled, or edited images. Each result contains GeneratedImage objects with an Image property. Images can be displayed using .show(), saved to disk, or converted for further processing. Check for RAI filtering if include_rai_reason was enabled.
Key considerations:
- Use .show() for display in notebooks or supported environments
- Images can be chained: generate then upscale then edit
- Check generated_images list length as some results may be filtered by safety