Implementation:Predibase Lorax Parameters Request Types
| Knowledge Sources | |
|---|---|
| Domains | API_Design, Text_Generation |
| Last Updated | 2026-02-08 02:00 GMT |
Overview
A concrete tool, provided by the LoRAX Python client's types module, for constructing validated inference requests.
Description
The Parameters and Request Pydantic models define the complete inference request schema. Parameters validates adapter selection (adapter_id and merged_adapters are mutually exclusive), sampling parameters (temperature, top_k, top_p), and output constraints such as max_new_tokens, stop, and response_format. Request wraps the prompt text together with an optional Parameters object and a streaming flag.
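The mutual-exclusion rule described above can be illustrated with a small stand-alone sketch. This is not the library's code: `AdapterSelection` and `check_adapter_selection` are hypothetical names, and a plain dataclass stands in for the Pydantic model so that the example runs with the standard library alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdapterSelection:
    """Hypothetical stand-in for the adapter-related fields of Parameters."""

    adapter_id: Optional[str] = None
    # The real field is a MergedAdapters model; a dict is used here for brevity.
    merged_adapters: Optional[dict] = None


def check_adapter_selection(sel: AdapterSelection) -> AdapterSelection:
    """Reject a selection that sets both a single adapter and a merge config."""
    if sel.adapter_id is not None and sel.merged_adapters is not None:
        raise ValueError("adapter_id and merged_adapters are mutually exclusive")
    return sel


# Setting only one of the two fields passes the check, as does setting neither.
check_adapter_selection(AdapterSelection(adapter_id="my-org/my-lora-adapter"))
check_adapter_selection(AdapterSelection())
```

In the real models this kind of check would live in a Pydantic validator, so an invalid combination fails when the object is constructed rather than when the request is sent.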
Usage
These models are used implicitly when calling Client.generate() or Client.generate_stream(). They can also be constructed manually for batch requests.
Code Reference
Source Location
- Repository: LoRAX
- File: clients/python/lorax/types.py
- Lines: 74-236
Signature
class Parameters(BaseModel):
    adapter_id: Optional[str] = None
    adapter_source: Optional[str] = None
    merged_adapters: Optional[MergedAdapters] = None
    api_token: Optional[str] = None
    do_sample: bool = False
    max_new_tokens: Optional[int] = None
    ignore_eos_token: bool = False
    repetition_penalty: Optional[float] = None
    return_full_text: bool = False
    stop: List[str] = []
    seed: Optional[int] = None
    temperature: Optional[float] = None
    top_k: Optional[int] = None
    top_p: Optional[float] = None
    truncate: Optional[int] = None
    typical_p: Optional[float] = None
    best_of: Optional[int] = None
    watermark: bool = False
    details: bool = False
    decoder_input_details: bool = False
    return_k_alternatives: Optional[int] = None
    response_format: Optional[ResponseFormat] = None

class Request(BaseModel):
    inputs: str
    parameters: Optional[Parameters] = None
    stream: bool = False
Import
from lorax.types import Parameters, Request
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| inputs | str | Yes | Prompt text (cannot be empty) |
| adapter_id | Optional[str] | No | LoRA adapter HuggingFace ID |
| adapter_source | Optional[str] | No | Source: "hub", "local", "s3", "pbase" |
| merged_adapters | Optional[MergedAdapters] | No | Multi-adapter merge config |
| do_sample | bool | No | Enable sampling (default False) |
| max_new_tokens | Optional[int] | No | Max generated tokens |
| temperature | Optional[float] | No | Sampling temperature (>= 0) |
| top_k | Optional[int] | No | Top-k filtering (> 0) |
| top_p | Optional[float] | No | Nucleus sampling (0 < p < 1) |
| response_format | Optional[ResponseFormat] | No | JSON schema constraint |
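The numeric ranges listed in the table can be expressed as simple checks. The sketch below mirrors only the constraints stated above. The function name `check_sampling` is hypothetical, and the real models enforce their rules through Pydantic validators whose error types and messages may differ.

```python
from typing import Optional


def check_sampling(
    temperature: Optional[float] = None,
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
) -> None:
    """Raise ValueError if a supplied value is outside its documented range.

    A value of None means "not set" and is always accepted.
    """
    if temperature is not None and temperature < 0:
        raise ValueError("temperature must be >= 0")
    if top_k is not None and top_k <= 0:
        raise ValueError("top_k must be > 0")
    if top_p is not None and not (0 < top_p < 1):
        raise ValueError("top_p must satisfy 0 < p < 1")


# Values inside the documented ranges pass silently.
check_sampling(temperature=0.7, top_k=50, top_p=0.9)
```

Note that the bounds on top_p are exclusive, so a value of exactly 1.0 is rejected under this reading of the table.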
Outputs
| Name | Type | Description |
|---|---|---|
| request | Request | Validated request object for HTTP POST |
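To make the output more concrete, the sketch below builds a JSON body with the same three top-level fields as Request (inputs, parameters, stream). It does not call the LoRAX client. The helper name `build_request_body` is hypothetical, and dropping unset parameters is an assumption made for readability; the real model's serialization may differ.

```python
import json
from typing import Any, Dict


def build_request_body(inputs: str, stream: bool = False, **parameters: Any) -> Dict[str, Any]:
    """Assemble a dict shaped like Request: inputs, parameters, stream."""
    if not inputs:
        # The Inputs table states that the prompt cannot be empty.
        raise ValueError("inputs cannot be empty")
    # Assumption: parameters left as None are omitted from the body.
    set_params = {name: value for name, value in parameters.items() if value is not None}
    return {"inputs": inputs, "parameters": set_params or None, "stream": stream}


body = build_request_body(
    "What is machine learning?",
    adapter_id="my-org/my-lora-adapter",
    max_new_tokens=100,
    temperature=None,  # unset, so it is dropped from the body
)
print(json.dumps(body, indent=2))
```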
Usage Examples
Basic Request
from lorax import Client

client = Client("http://localhost:3000")

# Simple generation with adapter
response = client.generate(
    "What is machine learning?",
    adapter_id="my-org/my-lora-adapter",
    max_new_tokens=100,
)
print(response.generated_text)
Sampling with Parameters
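# Assumes `client` is a lorax Client, e.g. Client("http://localhost:3000")
response = client.generate(
    "Write a poem about AI:",
    adapter_id="my-org/creative-writing-lora",
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    max_new_tokens=200,
    details=True,  # request token-level details in the response
)
print(f"Tokens: {response.details.generated_tokens}")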