Implementation:Langchain ai Langchain ChatGroq

Knowledge Sources	Langchain_ai_Langchain
Domains	LLM Framework, Groq, Chat Models, Tool Calling
Last Updated	2026-02-11 00:00 GMT

Overview

The ChatGroq class provides a LangChain chat model integration with the Groq API, supporting high-speed inference with tool calling, structured output, streaming, reasoning output, and vision capabilities.

Description

This module implements ChatGroq as a subclass of LangChain's BaseChatModel in the langchain-groq partner package. It wraps the Groq Python client library to provide an OpenAI-compatible chat completions interface optimized for low-latency inference. The class supports tool binding, structured output (function calling, JSON schema, and JSON mode), streaming, reasoning format control (parsed/raw/hidden), reasoning effort configuration, service tier selection (on_demand/flex/auto), and vision inputs. The module includes comprehensive helper functions for message conversion, token usage metadata extraction, and Groq-specific handling of null tool arguments.

Usage

Import ChatGroq from langchain_groq and instantiate with a model name. Requires the GROQ_API_KEY environment variable or explicit api_key parameter.

Code Reference

Source Location

Repository: Langchain_ai_Langchain
File: libs/partners/groq/langchain_groq/chat_models.py
Lines: 1-1593

Signature

class ChatGroq(BaseChatModel):
    client: Any = Field(default=None, exclude=True)
    async_client: Any = Field(default=None, exclude=True)
    model_name: str = Field(alias="model")
    temperature: float = 0.7
    stop: list[str] | str | None = Field(default=None, alias="stop_sequences")
    reasoning_format: Literal["parsed", "raw", "hidden"] | None = Field(default=None)
    reasoning_effort: str | None = Field(default=None)
    model_kwargs: dict[str, Any] = Field(default_factory=dict)
    groq_api_key: SecretStr | None = Field(alias="api_key", ...)
    groq_api_base: str | None = Field(alias="base_url", ...)
    request_timeout: float | tuple[float, float] | Any | None = Field(default=None, alias="timeout")
    max_retries: int = 2
    streaming: bool = False
    n: int = 1
    max_tokens: int | None = None
    service_tier: Literal["on_demand", "flex", "auto"] = Field(default="on_demand")
    default_headers: Mapping[str, str] | None = None
    http_client: Any | None = None
    http_async_client: Any | None = None

Import

from langchain_groq import ChatGroq

I/O Contract

Inputs

Name	Type	Required	Description
model_name / model	str	Yes	Model name (e.g., `"llama-3.1-8b-instant"`, `"openai/gpt-oss-120b"`)
temperature	float	No	Sampling temperature (default: 0.7, auto-adjusted to 1e-8 if set to 0)
stop	list[str] or str or None	No	Default stop sequences. Alias: `stop_sequences`
reasoning_format	`"parsed"` / `"raw"` / `"hidden"` or None	No	Controls reasoning output format for supported models
reasoning_effort	str or None	No	Controls how much effort the model puts into reasoning
max_tokens	int or None	No	Maximum number of tokens to generate
groq_api_key	SecretStr or None	No	API key. Alias: `api_key`. Env: `GROQ_API_KEY`
groq_api_base	str or None	No	Base URL. Alias: `base_url`. Env: `GROQ_API_BASE`
request_timeout	float or tuple or None	No	Request timeout. Alias: `timeout`
max_retries	int	No	Maximum retries (default: 2)
streaming	bool	No	Whether to stream results (default: False)
n	int	No	Number of completions per prompt (default: 1)
service_tier	`"on_demand"` / `"flex"` / `"auto"`	No	Service tier for requests (default: `"on_demand"`)
model_kwargs	dict	No	Additional parameters passed to the API

Outputs

Name	Type	Description
ChatResult	`ChatResult`	Contains `ChatGeneration` objects with `AIMessage` content, tool calls, reasoning content, and usage metadata
ChatGenerationChunk	`ChatGenerationChunk`	Streaming chunks containing `AIMessageChunk` with incremental content

Key Mechanisms

Reasoning Output

The reasoning_format parameter controls how reasoning is returned:

parsed: Reasoning is separated into additional_kwargs["reasoning_content"]
raw: Reasoning appears within <think> tags in the content
hidden: Only the final answer is returned; reasoning is suppressed

Service Tier

The service_tier parameter selects the processing tier:

on_demand: Default guaranteed processing
flex: Best-effort processing with rapid timeouts
auto: Falls back from on-demand to flex when rate limits are exceeded

Vision Support

Vision-capable models (e.g., meta-llama/llama-4-scout-17b-16e-instruct) accept image content blocks. The _format_message_content helper converts LangChain data content blocks to Groq's image_url format using convert_to_openai_data_block.

Null Tool Arguments Handling

Groq sends JSON null for tools with no arguments, but LangChain expects "{}". Both _convert_dict_to_message and _convert_chunk_to_message_chunk normalize this automatically.

Usage Metadata

The _create_usage_metadata function creates UsageMetadata with support for both Responses API format (input_tokens) and Chat Completions format (prompt_tokens), including token detail breakdowns for cached/reasoning tokens.

Streaming Guard

The _should_stream override disables streaming when response_format is set to json_schema or json_object, as Groq does not support streaming in these modes.

Usage Examples

Basic Usage

from langchain_groq import ChatGroq

model = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0.0,
    max_retries=2,
)

messages = [
    ("system", "You are a helpful translator. Translate to French."),
    ("human", "I love programming."),
]
response = model.invoke(messages)
print(response.content)

Tool Calling

from pydantic import BaseModel, Field
from langchain_groq import ChatGroq

class GetWeather(BaseModel):
    """Get the current weather in a given location."""
    location: str = Field(description="City and state, e.g. San Francisco, CA")

model = ChatGroq(model="llama-3.1-8b-instant")
model_with_tools = model.bind_tools([GetWeather])
ai_msg = model_with_tools.invoke("What's the weather in NYC?")
print(ai_msg.tool_calls)

Structured Output

from pydantic import BaseModel, Field
from langchain_groq import ChatGroq

class Joke(BaseModel):
    """Joke to tell user."""
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")

model = ChatGroq(model="openai/gpt-oss-120b", temperature=0)
structured_model = model.with_structured_output(Joke)
result = structured_model.invoke("Tell me a joke about cats")
print(result)

Reasoning Output

from langchain_groq import ChatGroq

model = ChatGroq(
    model="deepseek-r1-distill-llama-70b",
    reasoning_format="parsed",
)
response = model.invoke("What is 25 * 37?")
print(response.additional_kwargs.get("reasoning_content"))
print(response.content)

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment