Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Workflow:Mistralai Client python GCP Chat Completion

From Leeroopedia
Knowledge Sources
Domains LLMs, Chat_Completion, GCP, Vertex_AI, Cloud_Deployment
Last Updated 2026-02-15 14:00 GMT

Overview

End-to-end process for using Mistral AI models deployed on Google Cloud Vertex AI through the dedicated GCP Python SDK wrapper.

Description

This workflow covers how to interact with Mistral AI models deployed on Google Cloud's Vertex AI platform. It uses the mistralai_gcp package, which wraps the standard Mistral SDK with GCP-specific authentication (Google Application Default Credentials), automatic URL rewriting to the Vertex AI rawPredict endpoint format, and model ID translation for legacy model naming conventions. The GCP wrapper supports Chat Completion and Fill-in-the-Middle (FIM) APIs with both synchronous and asynchronous request patterns.

Usage

Execute this workflow when you have enabled the Mistral API on Google Cloud Vertex AI and need to interact with Mistral models through GCP infrastructure. This is appropriate for organizations using Google Cloud that require GCP-managed authentication, networking, and compliance controls.

Execution Steps

Step 1: Enable Mistral on GCP and Authenticate

Create a Google Cloud project, enable the Mistral API on Vertex AI, and authenticate locally using Google Application Default Credentials (ADC). The SDK automatically uses ADC to obtain access tokens.

Key considerations:

  • Run gcloud auth application-default login for local development
  • The SDK calls google.auth.default() to load credentials automatically
  • Credentials are scoped to cloud-platform for Vertex AI access
  • Project ID can be auto-detected from ADC or provided explicitly

Step 2: Install the GCP SDK Package

Install the mistralai package with GCP extras to include the Google authentication dependencies (google-auth, google-auth-oauthlib).

Key considerations:

  • Install with pip install "mistralai[gcp]" or pip install mistralai_gcp directly
  • Additional dependency: google-auth for Application Default Credentials

Step 3: Initialize the MistralGoogleCloud Client

Create an instance of MistralGoogleCloud, optionally providing the GCP region and project ID. The client automatically obtains and refreshes OAuth tokens from ADC, sets the Vertex AI base URL, and registers request hooks for URL path rewriting.

Key considerations:

  • Region defaults to europe-west4 but can be customized
  • Project ID is auto-detected from ADC or can be provided explicitly
  • An explicit access_token can be provided to bypass ADC
  • The client registers a BeforeRequestHook that rewrites request URLs to the Vertex AI rawPredict format

Step 4: Send Chat or FIM Completion Request

Call chat.complete() or chat.complete_async() for chat completion, or fim.complete() for code fill-in-the-middle. The request hook automatically rewrites the URL to the Vertex AI endpoint format, including the project ID, region, and model identifier.

Key considerations:

  • The model parameter uses Mistral model names (e.g., mistral-large-2407)
  • Legacy model IDs are automatically translated (e.g., codestral-2405 becomes codestral@2405)
  • URL is rewritten to: /v1/projects/{project}/locations/{region}/publishers/mistralai/models/{model}:rawPredict
  • Streaming uses streamRawPredict instead of rawPredict
  • FIM API is available on GCP but not on Azure

Step 5: Process the Response

Extract the generated text from the response object. The response format is identical to the standard Mistral SDK.

Key considerations:

  • Response structure matches the standard Mistral SDK
  • Token usage and finish reason are available in the response
  • Error handling follows the same patterns as the standard SDK

Execution Diagram

GitHub URL

Workflow Repository