Workflow:Mistralai Client python GCP Chat Completion
| Knowledge Sources | |
|---|---|
| Domains | LLMs, Chat_Completion, GCP, Vertex_AI, Cloud_Deployment |
| Last Updated | 2026-02-15 14:00 GMT |
Overview
End-to-end process for using Mistral AI models deployed on Google Cloud Vertex AI through the dedicated GCP Python SDK wrapper.
Description
This workflow covers how to interact with Mistral AI models deployed on Google Cloud's Vertex AI platform. It uses the mistralai_gcp package, which wraps the standard Mistral SDK with GCP-specific authentication (Google Application Default Credentials), automatic URL rewriting to the Vertex AI rawPredict endpoint format, and model ID translation for legacy model naming conventions. The GCP wrapper supports Chat Completion and Fill-in-the-Middle (FIM) APIs with both synchronous and asynchronous request patterns.
Usage
Execute this workflow when you have enabled the Mistral API on Google Cloud Vertex AI and need to interact with Mistral models through GCP infrastructure. This is appropriate for organizations using Google Cloud that require GCP-managed authentication, networking, and compliance controls.
Execution Steps
Step 1: Enable Mistral on GCP and Authenticate
Create a Google Cloud project, enable the Mistral API on Vertex AI, and authenticate locally using Google Application Default Credentials (ADC). The SDK automatically uses ADC to obtain access tokens.
Key considerations:
- Run gcloud auth application-default login for local development
- The SDK calls google.auth.default() to load credentials automatically
- Credentials are scoped to cloud-platform for Vertex AI access
- Project ID can be auto-detected from ADC or provided explicitly
Step 2: Install the GCP SDK Package
Install the mistralai package with GCP extras to include the Google authentication dependencies (google-auth, google-auth-oauthlib).
Key considerations:
- Install with pip install "mistralai[gcp]" or pip install mistralai_gcp directly
- Additional dependency: google-auth for Application Default Credentials
Step 3: Initialize the MistralGoogleCloud Client
Create an instance of MistralGoogleCloud, optionally providing the GCP region and project ID. The client automatically obtains and refreshes OAuth tokens from ADC, sets the Vertex AI base URL, and registers request hooks for URL path rewriting.
Key considerations:
- Region defaults to europe-west4 but can be customized
- Project ID is auto-detected from ADC or can be provided explicitly
- An explicit access_token can be provided to bypass ADC
- The client registers a BeforeRequestHook that rewrites request URLs to the Vertex AI rawPredict format
Step 4: Send Chat or FIM Completion Request
Call chat.complete() or chat.complete_async() for chat completion, or fim.complete() for code fill-in-the-middle. The request hook automatically rewrites the URL to the Vertex AI endpoint format, including the project ID, region, and model identifier.
Key considerations:
- The model parameter uses Mistral model names (e.g., mistral-large-2407)
- Legacy model IDs are automatically translated (e.g., codestral-2405 becomes codestral@2405)
- URL is rewritten to: /v1/projects/{project}/locations/{region}/publishers/mistralai/models/{model}:rawPredict
- Streaming uses streamRawPredict instead of rawPredict
- FIM API is available on GCP but not on Azure
Step 5: Process the Response
Extract the generated text from the response object. The response format is identical to the standard Mistral SDK.
Key considerations:
- Response structure matches the standard Mistral SDK
- Token usage and finish reason are available in the response
- Error handling follows the same patterns as the standard SDK