Workflow:Mistralai Client python Chat Completion

Knowledge Sources	Mistral AI Python Client Mistral AI Docs
Domains	LLMs, Chat_Completion, Python_SDK
Last Updated	2026-02-15 14:00 GMT

Overview

End-to-end process for sending chat completion requests to the Mistral AI API and receiving model-generated responses using the Python SDK.

Description

This workflow covers the standard procedure for interacting with Mistral AI models through the chat completion API. It demonstrates how to initialize the SDK client with authentication credentials, construct message payloads with system and user roles, send synchronous or asynchronous requests, and process the model's response. The workflow supports both single-turn and multi-turn conversations, with options for controlling output format (text, JSON, structured), temperature, and other generation parameters.

Usage

Execute this workflow when you need to generate text responses from a Mistral AI model given user prompts. Typical use cases include building chatbots, generating content, answering questions, and any scenario requiring natural language generation from Mistral's hosted models via the standard API (not Azure or GCP deployments).

Execution Steps

Step 1: Install SDK and Configure Authentication

Install the mistralai Python package and configure API key authentication. The API key should be obtained from the Mistral AI console and stored as an environment variable for security.

Key considerations:

Use environment variables (MISTRAL_API_KEY) rather than hardcoding credentials
The SDK supports uv, pip, and poetry for installation
Python 3.10 or higher is required

Step 2: Initialize the Mistral Client

Create an instance of the Mistral client class, passing the API key. The client supports both context manager usage (recommended for resource cleanup) and direct instantiation. An optional retry configuration and custom HTTP client can be provided.

Key considerations:

Use the context manager pattern (with statement) for proper resource management
The client manages both sync and async HTTP connections internally
Server URL defaults to the EU production endpoint

Step 3: Construct the Message Payload

Build the list of messages representing the conversation. Each message has a role (system, user, or assistant) and content. System messages set the model's behavior, user messages contain the prompt, and assistant messages provide conversation history for multi-turn interactions.

Key considerations:

Messages can be passed as dicts with role/content keys or as typed model objects (UserMessage, SystemMessage, AssistantMessage)
For multi-turn conversations, include the full message history
Content can include text, images, or document references depending on the model

Step 4: Send the Chat Completion Request

Call the chat.complete() method (sync) or chat.complete_async() method (async) with the model identifier, message list, and optional parameters such as temperature, max_tokens, response_format, and tool definitions.

Key considerations:

Model ID must be a valid Mistral model (e.g., mistral-large-latest, mistral-tiny)
Set stream=False for non-streaming completion (full response returned at once)
Response format can be text, json_object, or json_schema for structured outputs
Temperature and top_p control randomness of output

Step 5: Process the Response

Extract the generated text from the response object. The response contains a list of choices, each with a message containing the model's content, role, and optional tool calls. Usage information (token counts) is also available.

Key considerations:

Access the primary response via response.choices[0].message.content
Check finish_reason to determine if the response was complete or truncated
Token usage is available in response.usage (prompt_tokens, completion_tokens, total_tokens)
Handle errors using try/except with MistralError as the base exception class

Execution Diagram

GitHub URL

Workflow Repository