Implementation:Langchain ai Langchain ChatGroq

From Leeroopedia
Knowledge Sources
Domains: LLM Framework, Groq, Chat Models, Tool Calling
Last Updated: 2026-02-11 00:00 GMT

Overview

The ChatGroq class provides a LangChain chat model integration with the Groq API, supporting high-speed inference with tool calling, structured output, streaming, reasoning output, and vision capabilities.

Description

This module implements ChatGroq as a subclass of LangChain's BaseChatModel in the langchain-groq partner package. It wraps the Groq Python client library to provide an OpenAI-compatible chat completions interface optimized for low-latency inference. The class supports tool binding, structured output (via function calling, JSON schema, or JSON mode), streaming, reasoning format control (parsed, raw, or hidden), reasoning effort configuration, service tier selection (on_demand, flex, or auto), and vision inputs. The module also includes helper functions for message conversion, token usage metadata extraction, and Groq-specific handling of null tool arguments.

Usage

Import ChatGroq from langchain_groq and instantiate it with a model name. Authentication requires either the GROQ_API_KEY environment variable or an explicit api_key argument.
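The key lookup can be pictured with a short, library-free sketch. The function name resolve_api_key is hypothetical and not part of langchain_groq. The sketch assumes only that an explicit api_key takes precedence over the GROQ_API_KEY environment variable, and the error raised when neither is set is an illustrative choice.

```python
import os


def resolve_api_key(api_key=None):
    """Hypothetical helper: prefer an explicit key, else fall back to the environment."""
    key = api_key or os.environ.get("GROQ_API_KEY")
    if not key:
        # Assumption for illustration: fail early when no key is available.
        raise ValueError("Set GROQ_API_KEY or pass api_key explicitly.")
    return key


# Simulate an exported key for the demonstration.
os.environ["GROQ_API_KEY"] = "gsk-from-environment"
print(resolve_api_key())                # falls back to the environment variable
print(resolve_api_key("gsk-explicit"))  # the explicit argument wins
```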

Code Reference

Signature

class ChatGroq(BaseChatModel):
    client: Any = Field(default=None, exclude=True)
    async_client: Any = Field(default=None, exclude=True)
    model_name: str = Field(alias="model")
    temperature: float = 0.7
    stop: list[str] | str | None = Field(default=None, alias="stop_sequences")
    reasoning_format: Literal["parsed", "raw", "hidden"] | None = Field(default=None)
    reasoning_effort: str | None = Field(default=None)
    model_kwargs: dict[str, Any] = Field(default_factory=dict)
    groq_api_key: SecretStr | None = Field(alias="api_key", ...)
    groq_api_base: str | None = Field(alias="base_url", ...)
    request_timeout: float | tuple[float, float] | Any | None = Field(default=None, alias="timeout")
    max_retries: int = 2
    streaming: bool = False
    n: int = 1
    max_tokens: int | None = None
    service_tier: Literal["on_demand", "flex", "auto"] = Field(default="on_demand")
    default_headers: Mapping[str, str] | None = None
    http_client: Any | None = None
    http_async_client: Any | None = None

Import

from langchain_groq import ChatGroq

I/O Contract

Inputs

Name | Type | Required | Description
model_name / model | str | Yes | Model name (e.g., "llama-3.1-8b-instant", "openai/gpt-oss-120b")
temperature | float | No | Sampling temperature (default: 0.7, auto-adjusted to 1e-8 if set to 0)
stop | list[str] or str or None | No | Default stop sequences. Alias: stop_sequences
reasoning_format | "parsed" / "raw" / "hidden" or None | No | Controls reasoning output format for supported models
reasoning_effort | str or None | No | Controls how much effort the model puts into reasoning
max_tokens | int or None | No | Maximum number of tokens to generate
groq_api_key | SecretStr or None | No | API key. Alias: api_key. Env: GROQ_API_KEY
groq_api_base | str or None | No | Base URL. Alias: base_url. Env: GROQ_API_BASE
request_timeout | float or tuple or None | No | Request timeout. Alias: timeout
max_retries | int | No | Maximum retries (default: 2)
streaming | bool | No | Whether to stream results (default: False)
n | int | No | Number of completions per prompt (default: 1)
service_tier | "on_demand" / "flex" / "auto" | No | Service tier for requests (default: "on_demand")
model_kwargs | dict | No | Additional parameters passed to the API

Outputs

Name | Type | Description
ChatResult | ChatResult | Contains ChatGeneration objects with AIMessage content, tool calls, reasoning content, and usage metadata
ChatGenerationChunk | ChatGenerationChunk | Streaming chunks containing AIMessageChunk with incremental content

Key Mechanisms

Reasoning Output

The reasoning_format parameter controls how reasoning is returned:

  • parsed: Reasoning is separated into additional_kwargs["reasoning_content"]
  • raw: Reasoning appears within <think> tags in the content
  • hidden: Only the final answer is returned; reasoning is suppressed
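When reasoning_format is "raw", the reasoning and the answer share one string, so a caller may want to separate them. The following is a minimal sketch of one way to do that. The helper split_reasoning is hypothetical and not part of langchain_groq, and it assumes a single <think>...</think> block in the content.

```python
import re

# Matches the first <think>...</think> block, including newlines inside it.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(content):
    """Hypothetical helper: return (reasoning, answer) from raw-format content."""
    match = THINK_PATTERN.search(content)
    if match is None:
        # No reasoning block present: everything is the answer.
        return "", content.strip()
    reasoning = match.group(1).strip()
    answer = (content[: match.start()] + content[match.end():]).strip()
    return reasoning, answer


raw_content = "<think>25 * 37 = 25 * 40 - 25 * 3 = 925</think>\nThe answer is 925."
reasoning, answer = split_reasoning(raw_content)
print(reasoning)
print(answer)
```

With reasoning_format="parsed" this step is unnecessary, because the reasoning already arrives separately in additional_kwargs["reasoning_content"].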

Service Tier

The service_tier parameter selects the processing tier:

  • on_demand: The default tier, with guaranteed processing
  • flex: Best-effort processing when capacity is available, with rapid timeouts if resources are constrained
  • auto: Uses on-demand rate limits and falls back to flex when those limits are exceeded

Vision Support

Vision-capable models (e.g., meta-llama/llama-4-scout-17b-16e-instruct) accept image content blocks. The _format_message_content helper converts LangChain data content blocks to Groq's image_url format using convert_to_openai_data_block.
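As an illustration of the target shape, the sketch below builds a message content list with an OpenAI-style image_url block from raw bytes. The helper image_block_from_bytes is hypothetical. It is not the library's _format_message_content, and the placeholder bytes stand in for a real image.

```python
import base64


def image_block_from_bytes(image_bytes, mime_type="image/png"):
    """Hypothetical helper: wrap raw image bytes in an OpenAI-style image_url block."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


# Placeholder bytes stand in for a real image file read from disk.
content = [
    {"type": "text", "text": "Describe this image."},
    image_block_from_bytes(b"placeholder-image-bytes"),
]
print(content[1]["image_url"]["url"][:22])
```

A list like this can be used as the content of a HumanMessage sent to a vision-capable model.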

Null Tool Arguments Handling

Groq sends JSON null as the arguments of a tool call that takes no arguments, whereas LangChain expects an empty JSON object ("{}"). Both _convert_dict_to_message and _convert_chunk_to_message_chunk normalize this automatically.
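The normalization can be sketched in a few lines. This is an illustrative stand-in, not the code inside _convert_dict_to_message. The helper name normalize_tool_arguments and the sample tool call are assumptions made for the example.

```python
import json


def normalize_tool_arguments(raw_arguments):
    """Hypothetical helper: map a missing or JSON-null payload to an empty object."""
    if raw_arguments is None or raw_arguments == "null":
        return "{}"
    return raw_arguments


# A made-up tool call for a tool that takes no arguments.
raw_tool_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_current_time", "arguments": "null"},
}

arguments = normalize_tool_arguments(raw_tool_call["function"]["arguments"])
parsed = json.loads(arguments)  # parses to an empty dict instead of None
print(parsed)
```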

Usage Metadata

The _create_usage_metadata function builds a UsageMetadata object from either the Responses API key names (input_tokens) or the Chat Completions key names (prompt_tokens), and includes token detail breakdowns for cached and reasoning tokens.
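The dual-format handling can be approximated as follows. This is a simplified sketch and not the actual _create_usage_metadata. The helper normalize_usage is hypothetical, returns a plain dict rather than UsageMetadata, and omits the cached and reasoning token details.

```python
def normalize_usage(token_usage):
    """Hypothetical helper: map either key naming scheme onto one shape."""
    input_tokens = (
        token_usage.get("input_tokens") or token_usage.get("prompt_tokens") or 0
    )
    output_tokens = (
        token_usage.get("output_tokens") or token_usage.get("completion_tokens") or 0
    )
    # Fall back to the sum when the payload does not report a total.
    total_tokens = token_usage.get("total_tokens") or input_tokens + output_tokens
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": total_tokens,
    }


chat_style = normalize_usage(
    {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}
)
responses_style = normalize_usage({"input_tokens": 5, "output_tokens": 7})
print(chat_style)
print(responses_style)
```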

Streaming Guard

The _should_stream override disables streaming when response_format is set to json_schema or json_object, as Groq does not support streaming in these modes.
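The rule itself is small enough to express as a standalone function. The sketch below shows the decision only, under the assumption that response_format is a dict with a "type" key. The function should_stream is hypothetical and is not the signature of the _should_stream override.

```python
# Response format types for which streaming is switched off.
NON_STREAMING_FORMATS = {"json_schema", "json_object"}


def should_stream(streaming_requested, response_format=None):
    """Hypothetical helper: veto streaming for JSON-mode and JSON-schema requests."""
    if (
        isinstance(response_format, dict)
        and response_format.get("type") in NON_STREAMING_FORMATS
    ):
        return False
    return streaming_requested


print(should_stream(True))
print(should_stream(True, {"type": "json_object"}))
print(should_stream(True, {"type": "text"}))
```

In this sketch the guard only ever turns streaming off, so a request that did not ask for streaming is unaffected.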

Usage Examples

Basic Usage

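from langchain_groq import ChatGroq

# The API key is read from the GROQ_API_KEY environment variable.
model = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0.0,  # a temperature of 0 is auto-adjusted to 1e-8
    max_retries=2,
)

# Messages can be given as (role, content) tuples.
messages = [
    ("system", "You are a helpful translator. Translate to French."),
    ("human", "I love programming."),
]
response = model.invoke(messages)  # returns an AIMessage
print(response.content)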

Tool Calling

from pydantic import BaseModel, Field
from langchain_groq import ChatGroq

# The class docstring and field descriptions describe the tool to the model.
class GetWeather(BaseModel):
    """Get the current weather in a given location."""
    location: str = Field(description="City and state, e.g. San Francisco, CA")

model = ChatGroq(model="llama-3.1-8b-instant")
model_with_tools = model.bind_tools([GetWeather])
ai_msg = model_with_tools.invoke("What's the weather in NYC?")
# Each entry in tool_calls has "name", "args", and "id" keys.
print(ai_msg.tool_calls)

Structured Output

from pydantic import BaseModel, Field
from langchain_groq import ChatGroq

class Joke(BaseModel):
    """Joke to tell user."""
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")

model = ChatGroq(model="openai/gpt-oss-120b", temperature=0)
structured_model = model.with_structured_output(Joke)
# The result is a Joke instance rather than an AIMessage.
result = structured_model.invoke("Tell me a joke about cats")
print(result)

Reasoning Output

from langchain_groq import ChatGroq

model = ChatGroq(
    model="deepseek-r1-distill-llama-70b",
    reasoning_format="parsed",  # reasoning is kept out of response.content
)
response = model.invoke("What is 25 * 37?")
# The reasoning trace, if the model produced one.
print(response.additional_kwargs.get("reasoning_content"))
# The final answer only.
print(response.content)
