Jump to content

Connect Leeroopedia MCP: Equip your AI agents to search best practices, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

Principle:Bigscience workshop Petals Block Selection

From Leeroopedia


Knowledge Sources
Domains Distributed_Computing, Load_Balancing, Optimization
Last Updated 2026-02-09 14:00 GMT

Overview

A greedy algorithm for selecting which contiguous range of transformer blocks a server should host, optimizing to fill gaps in the network's block coverage and maximize overall swarm throughput.

Description

Block Selection solves the decentralized load balancing problem in Petals: given a swarm of volunteer servers each hosting different block ranges, which blocks should a new server host to maximize the network's end-to-end throughput?

Algorithm:

  1. Query DHT: Retrieve RemoteModuleInfo for all blocks, listing which servers host each block and their throughput
  2. Compute per-block throughput: Sum the throughput of all servers covering each block
  3. Find the bottleneck: Identify the contiguous range of num_blocks blocks with the lowest aggregate throughput
  4. Select that range: The new server fills the gap where it is most needed

Rebalancing: Existing servers periodically check if the swarm would benefit from them switching blocks via should_choose_other_blocks(), which compares the actual throughput distribution to the optimal (uniform) distribution using a balance_quality threshold.

Usage

This principle is used automatically when a server starts and during periodic rebalancing checks. Server operators can override it by specifying --block_indices explicitly.

Theoretical Basis

Optimal throughput is bottleneck-limited:

End-to-end throughput of the distributed model is limited by the block with the least total serving capacity:

Tendtoend=minisservers(i)throughput(s)

Greedy allocation:

# Abstract block selection algorithm
def choose_best_blocks(num_blocks, module_infos):
    throughputs = compute_per_block_throughputs(module_infos)
    # Find contiguous range of num_blocks with minimum total throughput
    best_start = argmin(
        sum(throughputs[i:i+num_blocks]) for i in range(total_blocks - num_blocks + 1)
    )
    return list(range(best_start, best_start + num_blocks))

Balance quality metric: quality=minithroughput(i)meanithroughput(i)

If quality < balance_quality threshold (default 0.75), servers should consider rebalancing.

Related Pages

Implemented By

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment