Implementation:Langchain ai Langchain FireworksLLM

Knowledge Sources	Langchain_ai_Langchain
Domains	LLM, Fireworks AI, Text Completion
Last Updated	2026-02-11 00:00 GMT

Overview

Fireworks is a LangChain LLM wrapper around Fireworks AI's text completion API, supporting both synchronous and asynchronous text generation.

Description

The Fireworks class extends LLM from langchain_core to provide text completion capabilities using the Fireworks AI platform. It communicates directly with the Fireworks REST API at https://api.fireworks.ai/inference/v1/completions using HTTP requests (synchronous via requests, asynchronous via aiohttp). The class supports standard LLM parameters including temperature, top_p, top_k, max_tokens, repetition_penalty, and logprobs. It handles API error responses with appropriate error types for server errors (5xx), client errors (4xx), and unexpected status codes.
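
The three response categories named above can be sketched as a small check. This is an illustration only, not the library's code: the helper name `check_status` and the exception types are assumptions, and only the split into 5xx, 4xx, and other non-200 codes comes from the description.

```python
def check_status(status_code: int, body: str = "") -> None:
    """Raise for any non-200 response, using the three categories above.

    Hypothetical helper: the name and the exception types are illustrative.
    """
    # Server-side failures are checked first so they are not reported as 4xx.
    if status_code >= 500:
        raise RuntimeError(f"Fireworks server error: {status_code}")
    # Client errors usually mean the request payload was rejected.
    if status_code >= 400:
        raise ValueError(f"Fireworks rejected the request ({status_code}): {body}")
    # Anything else that is not a plain 200 is treated as unexpected.
    if status_code != 200:
        raise RuntimeError(f"Unexpected status {status_code}: {body}")
```

Checking the 5xx range before the 4xx range keeps each status code in exactly one category.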

Usage

Import this class when you need a text completion LLM backed by the Fireworks AI API for use in LangChain chains, agents, or direct text generation.

Code Reference

Source Location

Repository: Langchain_ai_Langchain
File: libs/partners/fireworks/langchain_fireworks/llms.py
Lines: 1-234

Signature

class Fireworks(LLM):
    base_url: str = "https://api.fireworks.ai/inference/v1/completions"
    fireworks_api_key: SecretStr = Field(alias="api_key", ...)
    model: str
    temperature: float | None = None
    top_p: float | None = None
    model_kwargs: dict[str, Any] = Field(default_factory=dict)
    top_k: int | None = None
    max_tokens: int | None = None
    repetition_penalty: float | None = None
    logprobs: int | None = None
    timeout: int | None = 30

    def _call(
        self,
        prompt: str,
        stop: list[str] | None = None,
        run_manager: CallbackManagerForLLMRun | None = None,
        **kwargs: Any,
    ) -> str: ...

    async def _acall(
        self,
        prompt: str,
        stop: list[str] | None = None,
        run_manager: AsyncCallbackManagerForLLMRun | None = None,
        **kwargs: Any,
    ) -> str: ...

Import

from langchain_fireworks import Fireworks

I/O Contract

Inputs

Name	Type	Required	Description
model	`str`	Yes	Model name to use for completion.
fireworks_api_key	`SecretStr`	Yes	Fireworks API key. Read from `FIREWORKS_API_KEY` env var if not provided.
temperature	`float | None`	No	Model temperature for generation.
top_p	`float | None`	No	Nucleus sampling parameter.
top_k	`int | None`	No	Limits the number of token choices at each step.
max_tokens	`int | None`	No	Maximum number of tokens to generate.
repetition_penalty	`float | None`	No	Controls diversity by reducing likelihood of repeated sequences.
logprobs	`int | None`	No	Number of top token log probabilities to include per step.
timeout	`int | None`	No	Timeout in seconds for API requests. Default: 30.
prompt	`str`	Yes (for `_call`/`_acall`)	The prompt text to complete.
stop	`list[str] | None`	No	Stop sequences for generation.
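
One plausible way the inputs above become a request body is to merge the instance defaults with the per-call prompt, stop sequences, and extra keyword arguments, then drop anything left unset. The sketch below assumes that behavior; `build_payload` is a hypothetical helper, not a function exported by langchain_fireworks.

```python
from typing import Any, Optional


def build_payload(
    defaults: dict[str, Any],
    prompt: str,
    stop: Optional[list[str]] = None,
    **kwargs: Any,
) -> dict[str, Any]:
    """Merge defaults, prompt, stop, and per-call overrides; drop unset values.

    Hypothetical helper for illustration only.
    """
    # Later keys win, so per-call kwargs override the instance defaults.
    merged = {**defaults, "prompt": prompt, "stop": stop, **kwargs}
    # Parameters left as None are omitted rather than sent as null.
    return {key: value for key, value in merged.items() if value is not None}


defaults = {
    "model": "accounts/fireworks/models/llama-v3p1-8b-instruct",
    "temperature": 0.7,
    "top_p": None,
    "max_tokens": 256,
}
payload = build_payload(defaults, "Tell me a joke.", max_tokens=64)
```

Here `top_p` and `stop` are absent from `payload` because they were never set, and the per-call `max_tokens=64` replaces the default of 256.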

Outputs

Name	Type	Description
_call return	`str`	The generated text from the completion API.
_acall return	`str`	Async variant returning the generated text.
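
Both methods return a plain string rather than the full API response. Assuming the completions endpoint replies with an OpenAI-style body containing a `choices` list, the text could be pulled out as below; the sample body and the helper name `extract_text` are made up for illustration.

```python
import json

# Hypothetical response body; the field layout is an assumption.
raw = '{"choices": [{"index": 0, "text": "Here is a joke.", "finish_reason": "stop"}]}'


def extract_text(response_json: dict) -> str:
    """Return the text of the first completion choice."""
    return response_json["choices"][0]["text"]


text = extract_text(json.loads(raw))
```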

Properties

Property	Return Type	Description
`_llm_type`	`str`	Returns `"fireworks"`.
`default_params`	`dict[str, Any]`	Returns the default parameter dictionary including model, temperature, top_p, top_k, max_tokens, and repetition_penalty.

Usage Examples

Basic Usage

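from langchain_fireworks import Fireworks

# The API key is read from the FIREWORKS_API_KEY environment variable
# unless api_key is passed explicitly.
llm = Fireworks(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",
    temperature=0.7,
    max_tokens=256,
)

# Synchronous generation
response = llm.invoke("Tell me a joke.")
print(response)

# Batch generation: generate() returns an LLMResult with one list of
# generations per prompt.
responses = llm.generate(["Tell me a joke.", "Write a haiku."])
for generations in responses.generations:
    print(generations[0].text)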
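
The class also supports asynchronous generation. A minimal sketch of running several prompts concurrently follows; it relies only on the `ainvoke` method that LangChain runnables expose. `complete_all` and the `EchoLLM` stand-in are hypothetical names used so the snippet runs without an API key; in practice a `Fireworks` instance would be passed instead.

```python
import asyncio
from typing import Any


async def complete_all(llm: Any, prompts: list[str]) -> list[str]:
    """Issue one ainvoke call per prompt concurrently, preserving input order."""
    return list(await asyncio.gather(*(llm.ainvoke(p) for p in prompts)))


class EchoLLM:
    """Stand-in for a Fireworks instance; it only echoes the prompt."""

    async def ainvoke(self, prompt: str) -> str:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return f"echo: {prompt}"


results = asyncio.run(complete_all(EchoLLM(), ["Tell me a joke.", "Write a haiku."]))
```

`asyncio.gather` returns results in the order the awaitables were passed, so `results[i]` corresponds to `prompts[i]`.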

Related Pages

Requires langchain-fireworks, requests, and aiohttp packages
