Overview
Fireworks is a LangChain LLM wrapper around Fireworks AI's text completion API, supporting both synchronous and asynchronous text generation.
Description
The Fireworks class extends LLM from langchain_core to provide text completion on the Fireworks AI platform. It calls the Fireworks REST API at https://api.fireworks.ai/inference/v1/completions directly over HTTP, using requests for synchronous calls and aiohttp for asynchronous calls. The class supports the standard LLM parameters temperature, top_p, top_k, max_tokens, repetition_penalty, and logprobs. On a failed request it raises distinct errors for server errors (5xx), client errors (4xx), and any other unexpected status code.
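The status-code handling described above can be sketched roughly as follows. This is a minimal illustration, not the library's own code: the helper name `check_status` and the exception classes used here (`RuntimeError`, `ValueError`) are assumptions. The description only states that 5xx, 4xx, and other unexpected codes are treated separately.

```python
def check_status(status_code: int, body: str = "") -> None:
    """Raise a descriptive error unless the response status is 200.

    Hypothetical helper; the exception types are illustrative assumptions.
    """
    if status_code >= 500:
        # Server-side failure: the request may succeed if retried later.
        raise RuntimeError(f"Fireworks server error: {status_code}")
    if status_code >= 400:
        # Client-side failure: bad payload, bad key, unknown model, etc.
        raise ValueError(f"Fireworks rejected the request ({status_code}): {body}")
    if status_code != 200:
        # Anything else that is not a plain success is treated as unexpected.
        raise RuntimeError(f"Unexpected status code {status_code}: {body}")
```

Checking the 5xx range first keeps the branches mutually exclusive, so each failure class maps to exactly one error.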
Usage
Import this class when you need a text completion LLM backed by the Fireworks AI API for use in LangChain chains, agents, or direct text generation.
Code Reference
Source Location
Signature
class Fireworks(LLM):
    base_url: str = "https://api.fireworks.ai/inference/v1/completions"
    fireworks_api_key: SecretStr = Field(alias="api_key", ...)
    model: str
    temperature: float | None = None
    top_p: float | None = None
    model_kwargs: dict[str, Any] = Field(default_factory=dict)
    top_k: int | None = None
    max_tokens: int | None = None
    repetition_penalty: float | None = None
    logprobs: int | None = None
    timeout: int | None = 30

    def _call(
        self,
        prompt: str,
        stop: list[str] | None = None,
        run_manager: CallbackManagerForLLMRun | None = None,
        **kwargs: Any,
    ) -> str: ...

    async def _acall(
        self,
        prompt: str,
        stop: list[str] | None = None,
        run_manager: AsyncCallbackManagerForLLMRun | None = None,
        **kwargs: Any,
    ) -> str: ...
Import
from langchain_fireworks import Fireworks
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | str | Yes | Model name to use for completion. |
| fireworks_api_key | SecretStr | Yes | Fireworks API key. Read from the FIREWORKS_API_KEY environment variable if not provided. |
| temperature | float or None | No | Model temperature for generation. |
| top_p | float or None | No | Nucleus sampling parameter. |
| top_k | int or None | No | Limits the number of token choices at each step. |
| max_tokens | int or None | No | Maximum number of tokens to generate. |
| repetition_penalty | float or None | No | Controls diversity by reducing the likelihood of repeated sequences. |
| logprobs | int or None | No | Number of top token log probabilities to include per step. |
| timeout | int or None | No | Timeout in seconds for API requests. Default: 30. |
| prompt | str | Yes (for _call/_acall) | The prompt text to complete. |
| stop | list[str] or None | No | Stop sequences for generation. |
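To make the relationship between the constructor fields and the per-call arguments concrete, the following sketch assembles a JSON request body from them. It is a hypothetical illustration: `build_payload` is not part of the library, and both the rule that unset (None) fields are omitted and the rule that per-call keyword arguments override instance-level defaults are assumptions rather than documented behaviour.

```python
from __future__ import annotations

from typing import Any


def build_payload(
    defaults: dict[str, Any],
    prompt: str,
    stop: list[str] | None = None,
    **kwargs: Any,
) -> dict[str, Any]:
    """Merge instance defaults, per-call overrides, prompt, and stop (hypothetical)."""
    merged = {**defaults, **kwargs, "prompt": prompt, "stop": stop}
    # Drop unset parameters so server-side defaults apply (assumed behaviour).
    return {key: value for key, value in merged.items() if value is not None}


payload = build_payload(
    {
        "model": "accounts/fireworks/models/llama-v3p1-8b-instruct",
        "temperature": 0.7,
        "top_p": None,
        "max_tokens": 256,
    },
    "Tell me a joke.",
    max_tokens=64,  # per-call override of the instance default
)
print(payload)
```

In this sketch the later entries in the dict merge win, which is what lets a single call tighten `max_tokens` without touching the configured instance.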
Outputs
| Name | Type | Description |
|---|---|---|
| _call return | str | The generated text from the completion API. |
| _acall return | str | Async variant returning the generated text. |
Properties
| Property | Return Type | Description |
|---|---|---|
| _llm_type | str | Returns "fireworks". |
| default_params | dict[str, Any] | Returns the default parameter dictionary including model, temperature, top_p, top_k, max_tokens, and repetition_penalty. |
Usage Examples
Basic Usage
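from langchain_fireworks import Fireworks

# The API key is read from the FIREWORKS_API_KEY environment variable
# when it is not passed explicitly.
llm = Fireworks(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",
    temperature=0.7,
    max_tokens=256,
)

# Synchronous generation: returns the completion as a string
response = llm.invoke("Tell me a joke.")
print(response)

# Batch generation: one result per prompt
responses = llm.generate(["Tell me a joke.", "Write a haiku."])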
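Because the class also implements `_acall`, asynchronous generation should be reachable through LangChain's standard `ainvoke` method. The sketch below assumes the `langchain-fireworks` package is installed and that `FIREWORKS_API_KEY` is set; it only issues a request when the key is present. The helper name `tell_joke` is illustrative.

```python
import asyncio
import os


async def tell_joke() -> str:
    # Imported here so this sketch can be loaded where the package is absent.
    from langchain_fireworks import Fireworks

    llm = Fireworks(
        model="accounts/fireworks/models/llama-v3p1-8b-instruct",
        max_tokens=256,
    )
    # ainvoke is the async counterpart of invoke on LangChain runnables.
    return await llm.ainvoke("Tell me a joke.")


# Only call the API when a key is configured.
if os.environ.get("FIREWORKS_API_KEY"):
    print(asyncio.run(tell_joke()))
```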
Related Pages
- Requires: langchain-fireworks, requests, and aiohttp packages