Implementation:Predibase Lorax Adapter Scheduler Next Batch
Appearance
| Knowledge Sources | |
|---|---|
| Domains | Inference_Optimization, Scheduling |
| Last Updated | 2026-02-08 02:00 GMT |
Overview
Concrete tool for adapter-aware continuous batch scheduling provided by the LoRAX AdapterScheduler in the Rust router.
Description
The AdapterScheduler manages the request queue and batch formation in the LoRAX router. Its process() method enqueues incoming requests, and next_batch() forms optimal batches considering token budgets and active adapter sets. The scheduling logic in AdapterSchedulerState::next_batch() groups requests by adapter and respects prefill/decode token limits.
Usage
Used internally by the router's batching task. Not called directly by end users.
Code Reference
Source Location
- Repository: LoRAX
- File: router/src/scheduler.rs
- Lines: 26-546
Signature
pub(crate) struct AdapterScheduler {
sender: mpsc::UnboundedSender<SchedulerCommand>,
}
impl AdapterScheduler {
pub(crate) fn new(/* ... */) -> Self;
pub(crate) fn process(
&self,
adapter: Adapter,
entry: Entry,
);
pub(crate) async fn next_batch(
&self,
adapters_in_use: HashSet<Adapter>,
min_size: Option<usize>,
prefill_token_budget: u32,
token_budget: u32,
) -> Option<NextBatch>;
}
Import
use crate::scheduler::AdapterScheduler;
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| adapter | Adapter | Yes | Adapter identifier for the request |
| entry | Entry | Yes | Request entry with prompt and parameters |
| adapters_in_use | HashSet[Adapter] | Yes | Currently active adapters in batch |
| prefill_token_budget | u32 | Yes | Max tokens for prefill phase |
| token_budget | u32 | Yes | Max total tokens in batch |
Outputs
| Name | Type | Description |
|---|---|---|
| next_batch | Option[NextBatch] | Batch of entries to process, or None if queue empty |
Usage Examples
Router Batching Task
// In router/src/infer.rs - batching_task
loop {
// Form next batch from queued requests
let batch = scheduler.next_batch(
adapters_in_use,
min_size,
prefill_token_budget,
token_budget,
).await;
if let Some(next_batch) = batch {
// Send batch to shards for prefill
let result = client.prefill(next_batch.batch).await;
// Process results...
}
}
Related Pages
Implements Principle
Requires Environment
Uses Heuristic
Page Connections
Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment