Workflow:Googleapis Python genai Model Fine Tuning
| Knowledge Sources | |
|---|---|
| Domains | LLMs, Fine_Tuning, Model_Training |
| Last Updated | 2026-02-15 14:00 GMT |
Overview
End-to-end process for supervised fine-tuning of Gemini models using the Google GenAI SDK's tunings module, from dataset preparation through training to inference with the tuned model.
Description
This workflow covers the supervised fine-tuning of Gemini models to adapt them for specific tasks or domains. The process involves preparing a training dataset in JSONL format, uploading it to Google Cloud Storage (or using a Vertex AI Multimodal Dataset), configuring the tuning job parameters (epoch count, learning rate, adapter size), launching the job, monitoring its progress, and finally using the resulting tuned model for inference. Tuning is supported on Vertex AI.
Usage
Execute this workflow when you need a Gemini model specialized for your specific task, domain, or style that goes beyond what prompt engineering and few-shot examples can achieve. Typical use cases include domain-specific Q&A, custom instruction following, specialized content generation, and task-specific classification or extraction.
Execution Steps
Step 1: Client Initialization
Create a GenAI client configured for Vertex AI with the appropriate project and location. Tuning jobs require Vertex AI; the Gemini Developer API does not support fine-tuning.
Key considerations:
- Tuning is only available on Vertex AI (vertexai=True)
- Ensure the project has the necessary Vertex AI APIs enabled
- The location affects where training resources are provisioned
Step 2: Training Dataset Preparation
Prepare the training data as a JSONL file with instruction-tuning format and upload it to Google Cloud Storage. Each line should contain a training example in the expected schema. Alternatively, reference a Vertex AI Multimodal Dataset resource. The dataset is specified using a TuningDataset object with a gcs_uri pointing to the data location.
Key considerations:
- Data must be in JSONL format following the Gemini tuning data schema
- Upload data to a GCS bucket accessible by the Vertex AI service account
- Validation data can optionally be provided for evaluation during training
- Dataset quality and diversity directly impact tuned model performance
Step 3: Tuning Job Configuration
Configure the tuning job parameters using CreateTuningJobConfig. Key parameters include the base model to tune (e.g., gemini-2.5-flash), epoch_count (number of training passes), learning_rate_multiplier, adapter_size (for LoRA-based tuning), and tuned_model_display_name. These parameters control the trade-off between training cost, time, and model quality.
Key considerations:
- Fewer epochs reduce cost but may underfit; more epochs risk overfitting
- The adapter_size controls the capacity of the tuning adaptation
- A descriptive display name helps identify the tuned model later
- Hyperparameter tuning may require multiple experimental runs
Step 4: Tuning Job Launch
Launch the tuning job using client.tunings.tune() with the base model, training dataset, and configuration. This returns a TuningJob object containing the job name, state, and metadata. The job runs asynchronously on Vertex AI infrastructure.
Key considerations:
- The tune() method returns immediately with a job reference
- The job name is used for subsequent status checks
- Tuning jobs consume Vertex AI compute resources and incur costs
Step 5: Job Monitoring
Poll the tuning job status using client.tunings.get() in a loop until the job reaches a terminal state (JOB_STATE_SUCCEEDED, JOB_STATE_FAILED, or JOB_STATE_CANCELLED). The job object contains progress information, timestamps, and any error details.
Key considerations:
- Poll at reasonable intervals (e.g., every 10-30 seconds)
- Check job.state for the current status
- Failed jobs contain error information for debugging
- Jobs can be cancelled using client.tunings.cancel()
Step 6: Tuned Model Inference
Use the tuned model for inference by referencing tuning_job.tuned_model.endpoint as the model parameter in generate_content calls. The tuned model behaves identically to the base model but with adapted weights. The tuned model metadata can be retrieved using client.models.get().
Key considerations:
- Reference the tuned model via its endpoint for generate_content calls
- The tuned model supports all the same features as the base model (streaming, tools, etc.)
- Tuned models can be listed, updated, and deleted using the models API
- Monitor tuned model performance against your evaluation criteria