

Implementation:Predibase Lorax Adapter Scheduler Next Batch

From Leeroopedia


Knowledge Sources
Domains: Inference_Optimization, Scheduling
Last Updated: 2026-02-08 02:00 GMT

Overview

A concrete tool for adapter-aware continuous batch scheduling, provided by the AdapterScheduler in the LoRAX Rust router.

Description

The AdapterScheduler manages the request queue and batch formation in the LoRAX router. Its process() method enqueues an incoming request together with its adapter, and next_batch() forms the next batch subject to the prefill and total token budgets and the set of adapters currently in use. The scheduling logic lives in AdapterSchedulerState::next_batch(), which groups requests by adapter and respects the prefill and decode token limits.
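The grouping and budget checks described above can be illustrated with a minimal sketch. It is not the LoRAX implementation: QueuedRequest, form_batch, the u32 adapter keys, and the rule of visiting adapters already in use first are all assumptions made for illustration, and min_size is left out.

```rust
use std::collections::{BTreeMap, HashSet, VecDeque};

/// Hypothetical stand-in for a queued request; not the real LoRAX `Entry`.
#[derive(Debug, Clone, PartialEq)]
struct QueuedRequest {
    id: u64,
    input_tokens: u32,   // prompt length, counted against the prefill budget
    max_new_tokens: u32, // requested decode length
}

/// Drains per-adapter FIFO queues into one batch while both budgets hold.
/// Assumption: adapters already in use are visited first, then the rest in key order.
fn form_batch(
    queues: &mut BTreeMap<u32, VecDeque<QueuedRequest>>,
    adapters_in_use: &HashSet<u32>,
    prefill_token_budget: u32,
    token_budget: u32,
) -> Option<Vec<(u32, QueuedRequest)>> {
    let mut order: Vec<u32> = queues.keys().copied().collect();
    // Stable sort on a bool key: `false` (in use) sorts before `true` (not in use).
    order.sort_by_key(|adapter| !adapters_in_use.contains(adapter));

    let mut batch = Vec::new();
    let mut prefill_tokens = 0u32;
    let mut total_tokens = 0u32;

    'adapters: for adapter in order {
        let queue = queues.get_mut(&adapter).expect("key was taken from the map");
        loop {
            // Peek at the head of this adapter's queue without removing it.
            let (input, max_new) = match queue.front() {
                Some(req) => (req.input_tokens, req.max_new_tokens),
                None => break, // this adapter has nothing left; try the next one
            };
            let new_prefill = prefill_tokens.saturating_add(input);
            let new_total = total_tokens.saturating_add(input).saturating_add(max_new);
            if new_prefill > prefill_token_budget || new_total > token_budget {
                break 'adapters; // budget exhausted: stop forming the batch
            }
            prefill_tokens = new_prefill;
            total_tokens = new_total;
            let req = queue.pop_front().expect("front() returned Some");
            batch.push((adapter, req));
        }
    }

    if batch.is_empty() { None } else { Some(batch) }
}
```

Peeking before popping keeps a request that does not fit in its queue, so it remains at the head for a later call.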

Usage

The scheduler is used internally by the router's batching task and is not called directly by end users.

Code Reference

Source Location

  • Repository: LoRAX
  • File: router/src/scheduler.rs
  • Lines: 26-546

Signature

pub(crate) struct AdapterScheduler {
    sender: mpsc::UnboundedSender<SchedulerCommand>,
}

impl AdapterScheduler {
    pub(crate) fn new(/* ... */) -> Self;

    pub(crate) fn process(
        &self,
        adapter: Adapter,
        entry: Entry,
    );

    pub(crate) async fn next_batch(
        &self,
        adapters_in_use: HashSet<Adapter>,
        min_size: Option<usize>,
        prefill_token_budget: u32,
        token_budget: u32,
    ) -> Option<NextBatch>;
}
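The signature shows that AdapterScheduler holds only the sending half of a command channel. One common way to build such a handle is to let a background task own the queue and answer commands; the sketch below assumes that design. It substitutes blocking std channels and a thread for the async channel the signature suggests, and Command, SchedulerHandle, and max_size are hypothetical names, not the real SchedulerCommand variants or parameters.

```rust
use std::collections::VecDeque;
use std::sync::mpsc;
use std::thread;

/// Hypothetical command set; the real `SchedulerCommand` variants are not shown here.
enum Command {
    Append(String),
    NextBatch {
        max_size: usize,
        reply: mpsc::Sender<Option<Vec<String>>>,
    },
}

/// Cheap handle: all queue state lives in the background thread.
struct SchedulerHandle {
    sender: mpsc::Sender<Command>,
}

impl SchedulerHandle {
    fn new() -> Self {
        let (sender, receiver) = mpsc::channel::<Command>();
        thread::spawn(move || {
            let mut queue: VecDeque<String> = VecDeque::new();
            // The loop ends once every sender has been dropped.
            for command in receiver {
                match command {
                    Command::Append(item) => queue.push_back(item),
                    Command::NextBatch { max_size, reply } => {
                        let n = max_size.min(queue.len());
                        let batch: Vec<String> = queue.drain(..n).collect();
                        // Ignore the error if the caller stopped waiting.
                        let _ = reply.send(if batch.is_empty() { None } else { Some(batch) });
                    }
                }
            }
        });
        Self { sender }
    }

    /// Fire-and-forget enqueue, mirroring the shape of `process`.
    fn process(&self, item: String) {
        self.sender
            .send(Command::Append(item))
            .expect("scheduler thread stopped");
    }

    /// Request/response over a one-off reply channel, mirroring `next_batch`.
    fn next_batch(&self, max_size: usize) -> Option<Vec<String>> {
        let (reply, response) = mpsc::channel();
        self.sender
            .send(Command::NextBatch { max_size, reply })
            .expect("scheduler thread stopped");
        response.recv().expect("scheduler thread dropped the reply channel")
    }
}
```

Because one thread owns the queue, no lock is needed, and commands sent from a single caller are handled in the order they were sent.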

Import

use crate::scheduler::AdapterScheduler;

I/O Contract

Inputs

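Name | Type | Required | Description
adapter | Adapter | Yes | Adapter identifier for the request (process)
entry | Entry | Yes | Request entry with prompt and parameters (process)
adapters_in_use | HashSet<Adapter> | Yes | Adapters currently active in the batch (next_batch)
min_size | Option<usize> | No | Optional minimum batch size (next_batch)
prefill_token_budget | u32 | Yes | Maximum tokens for the prefill phase (next_batch)
token_budget | u32 | Yes | Maximum total tokens in the batch (next_batch)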

Outputs

Name | Type | Description
next_batch | Option<NextBatch> | Batch of entries to process, or None if the queue is empty

Usage Examples

Router Batching Task

// In router/src/infer.rs - batching_task (simplified excerpt)
loop {
    // Form the next batch from queued requests; None means nothing is ready
    let batch = scheduler.next_batch(
        adapters_in_use.clone(), // cloned so the set can be reused on the next iteration
        min_size,
        prefill_token_budget,
        token_budget,
    ).await;

    if let Some(next_batch) = batch {
        // Send the batch to the shards for prefill
        let result = client.prefill(next_batch.batch).await;
        // Process results...
    }
}
