Implementation:Langchain ai Langchain ChatGroq
| Knowledge Sources | |
|---|---|
| Domains | LLM Framework, Groq, Chat Models, Tool Calling |
| Last Updated | 2026-02-11 00:00 GMT |
Overview
The ChatGroq class provides a LangChain chat model integration with the Groq API, supporting high-speed inference with tool calling, structured output, streaming, reasoning output, and vision capabilities.
Description
This module implements ChatGroq as a subclass of LangChain's BaseChatModel in the langchain-groq partner package. It wraps the Groq Python client library to provide an OpenAI-compatible chat completions interface optimized for low-latency inference. The class supports tool binding, structured output (function calling, JSON schema, and JSON mode), streaming, reasoning format control (parsed/raw/hidden), reasoning effort configuration, service tier selection (on_demand/flex/auto), and vision inputs. The module includes comprehensive helper functions for message conversion, token usage metadata extraction, and Groq-specific handling of null tool arguments.
Usage
Import ChatGroq from langchain_groq and instantiate with a model name. Requires the GROQ_API_KEY environment variable or explicit api_key parameter.
Code Reference
Source Location
- Repository: Langchain_ai_Langchain
- File:
libs/partners/groq/langchain_groq/chat_models.py - Lines: 1-1593
Signature
class ChatGroq(BaseChatModel):
client: Any = Field(default=None, exclude=True)
async_client: Any = Field(default=None, exclude=True)
model_name: str = Field(alias="model")
temperature: float = 0.7
stop: list[str] | str | None = Field(default=None, alias="stop_sequences")
reasoning_format: Literal["parsed", "raw", "hidden"] | None = Field(default=None)
reasoning_effort: str | None = Field(default=None)
model_kwargs: dict[str, Any] = Field(default_factory=dict)
groq_api_key: SecretStr | None = Field(alias="api_key", ...)
groq_api_base: str | None = Field(alias="base_url", ...)
request_timeout: float | tuple[float, float] | Any | None = Field(default=None, alias="timeout")
max_retries: int = 2
streaming: bool = False
n: int = 1
max_tokens: int | None = None
service_tier: Literal["on_demand", "flex", "auto"] = Field(default="on_demand")
default_headers: Mapping[str, str] | None = None
http_client: Any | None = None
http_async_client: Any | None = None
Import
from langchain_groq import ChatGroq
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model_name / model | str | Yes | Model name (e.g., "llama-3.1-8b-instant", "openai/gpt-oss-120b")
|
| temperature | float | No | Sampling temperature (default: 0.7, auto-adjusted to 1e-8 if set to 0) |
| stop | list[str] or str or None | No | Default stop sequences. Alias: stop_sequences
|
| reasoning_format | "parsed" / "raw" / "hidden" or None |
No | Controls reasoning output format for supported models |
| reasoning_effort | str or None | No | Controls how much effort the model puts into reasoning |
| max_tokens | int or None | No | Maximum number of tokens to generate |
| groq_api_key | SecretStr or None | No | API key. Alias: api_key. Env: GROQ_API_KEY
|
| groq_api_base | str or None | No | Base URL. Alias: base_url. Env: GROQ_API_BASE
|
| request_timeout | float or tuple or None | No | Request timeout. Alias: timeout
|
| max_retries | int | No | Maximum retries (default: 2) |
| streaming | bool | No | Whether to stream results (default: False) |
| n | int | No | Number of completions per prompt (default: 1) |
| service_tier | "on_demand" / "flex" / "auto" |
No | Service tier for requests (default: "on_demand")
|
| model_kwargs | dict | No | Additional parameters passed to the API |
Outputs
| Name | Type | Description |
|---|---|---|
| ChatResult | ChatResult |
Contains ChatGeneration objects with AIMessage content, tool calls, reasoning content, and usage metadata
|
| ChatGenerationChunk | ChatGenerationChunk |
Streaming chunks containing AIMessageChunk with incremental content
|
Key Mechanisms
Reasoning Output
The reasoning_format parameter controls how reasoning is returned:
parsed: Reasoning is separated intoadditional_kwargs["reasoning_content"]raw: Reasoning appears within<think>tags in the contenthidden: Only the final answer is returned; reasoning is suppressed
Service Tier
The service_tier parameter selects the processing tier:
on_demand: Default guaranteed processingflex: Best-effort processing with rapid timeoutsauto: Falls back from on-demand to flex when rate limits are exceeded
Vision Support
Vision-capable models (e.g., meta-llama/llama-4-scout-17b-16e-instruct) accept image content blocks. The _format_message_content helper converts LangChain data content blocks to Groq's image_url format using convert_to_openai_data_block.
Null Tool Arguments Handling
Groq sends JSON null for tools with no arguments, but LangChain expects "{}". Both _convert_dict_to_message and _convert_chunk_to_message_chunk normalize this automatically.
Usage Metadata
The _create_usage_metadata function creates UsageMetadata with support for both Responses API format (input_tokens) and Chat Completions format (prompt_tokens), including token detail breakdowns for cached/reasoning tokens.
Streaming Guard
The _should_stream override disables streaming when response_format is set to json_schema or json_object, as Groq does not support streaming in these modes.
Usage Examples
Basic Usage
from langchain_groq import ChatGroq
model = ChatGroq(
model="llama-3.1-8b-instant",
temperature=0.0,
max_retries=2,
)
messages = [
("system", "You are a helpful translator. Translate to French."),
("human", "I love programming."),
]
response = model.invoke(messages)
print(response.content)
Tool Calling
from pydantic import BaseModel, Field
from langchain_groq import ChatGroq
class GetWeather(BaseModel):
"""Get the current weather in a given location."""
location: str = Field(description="City and state, e.g. San Francisco, CA")
model = ChatGroq(model="llama-3.1-8b-instant")
model_with_tools = model.bind_tools([GetWeather])
ai_msg = model_with_tools.invoke("What's the weather in NYC?")
print(ai_msg.tool_calls)
Structured Output
from pydantic import BaseModel, Field
from langchain_groq import ChatGroq
class Joke(BaseModel):
"""Joke to tell user."""
setup: str = Field(description="The setup of the joke")
punchline: str = Field(description="The punchline to the joke")
model = ChatGroq(model="openai/gpt-oss-120b", temperature=0)
structured_model = model.with_structured_output(Joke)
result = structured_model.invoke("Tell me a joke about cats")
print(result)
Reasoning Output
from langchain_groq import ChatGroq
model = ChatGroq(
model="deepseek-r1-distill-llama-70b",
reasoning_format="parsed",
)
response = model.invoke("What is 25 * 37?")
print(response.additional_kwargs.get("reasoning_content"))
print(response.content)