Implementation:Bigscience workshop Petals Choose Best Blocks

From Leeroopedia


Knowledge Sources
Domains Distributed_Computing, Load_Balancing, Optimization
Last Updated 2026-02-09 14:00 GMT

Overview

A concrete tool, provided by Petals' block selection module, that selects which transformer blocks a server should host based on the current swarm state.

Description

The block selection module provides two key functions:

choose_best_blocks(): Selects the optimal contiguous block range for a new server by:

  1. Computing per-block throughput via compute_throughputs()
  2. Using _choose_best_start() to find the contiguous range with minimum aggregate throughput
  3. Returning a list of block indices
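The selection steps above can be sketched in a few lines. This is a minimal illustration, not the module's code: it takes the per-block throughputs as a plain list, and it assumes windows are ranked by their sorted per-block throughputs (weakest block first), which is one reading of "minimum aggregate throughput"; a plain sum over the window would be another. The names choose_best_start and choose_blocks are hypothetical.

```python
from typing import List


def choose_best_start(throughputs: List[float], num_blocks: int) -> int:
    """Return the start index of the least-served window of `num_blocks` blocks.

    Assumption: windows are compared by their sorted per-block throughputs, so the
    window containing the weakest blocks wins and ties go to the lowest start index.
    """
    if not 0 < num_blocks <= len(throughputs):
        raise ValueError("num_blocks must be between 1 and the number of blocks")
    candidates = (
        (sorted(throughputs[start:start + num_blocks]), start)
        for start in range(len(throughputs) - num_blocks + 1)
    )
    return min(candidates)[1]


def choose_blocks(throughputs: List[float], num_blocks: int) -> List[int]:
    """Expand the chosen start into a list of consecutive block indices."""
    start = choose_best_start(throughputs, num_blocks)
    return list(range(start, start + num_blocks))


# Hypothetical swarm of 8 blocks in which blocks 4 and 5 are served least.
throughputs = [3.0, 3.0, 2.0, 2.0, 0.5, 0.5, 2.0, 2.0]
selected = choose_blocks(throughputs, num_blocks=3)
print(selected)  # [3, 4, 5]
```

Windows starting at 3 and at 4 both contain the two weakest blocks, so the tie resolves to the lower start index.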

should_choose_other_blocks(): Evaluates whether an existing server should rebalance by:

  1. Computing current per-block throughputs
  2. Simulating removal of the local server's span
  3. Checking if the new best position differs from the current position
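  4. Comparing the balance quality ratio (the swarm's current bottleneck throughput divided by the bottleneck throughput after the simulated move) against the balance_quality threshold, and rebalancing only if the ratio falls below it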
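The four rebalancing steps can be illustrated with a simplified, single-move sketch. Everything here is an assumption made for illustration: Span is a stand-in for the swarm's span records, peers are identified by strings, the window ranking follows the sorted-throughput reading, and the ratio is taken as the minimum per-block throughput before the move divided by the minimum after it. The actual module may simulate more than this one move.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Span:  # hypothetical stand-in for one server's served range [start, end)
    start: int
    end: int
    throughput: float


def per_block_throughput(spans: Dict[str, Span], total_blocks: int) -> List[float]:
    totals = [0.0] * total_blocks
    for span in spans.values():
        for block in range(span.start, span.end):
            totals[block] += span.throughput
    return totals


def best_start(throughputs: List[float], length: int) -> int:
    # Rank windows by sorted per-block throughput; the weakest window wins.
    windows = ((sorted(throughputs[i:i + length]), i) for i in range(len(throughputs) - length + 1))
    return min(windows)[1]


def should_move(local_id: str, spans: Dict[str, Span], total_blocks: int, balance_quality: float) -> bool:
    # Step 1: current per-block throughputs and the swarm's bottleneck.
    initial_min = min(per_block_throughput(spans, total_blocks))

    # Step 2: simulate removing the local server's span.
    local = spans[local_id]
    length = local.end - local.start
    others = {peer: span for peer, span in spans.items() if peer != local_id}
    without_local = per_block_throughput(others, total_blocks)

    # Step 3: if the best position is the current one, stay put.
    new_start = best_start(without_local, length)
    if new_start == local.start:
        return False

    # Step 4: compare the bottleneck before and after the simulated move.
    moved = list(without_local)
    for block in range(new_start, new_start + length):
        moved[block] += local.throughput
    new_min = min(moved)
    if new_min <= 0 or new_min < initial_min:
        return False
    return initial_min / new_min < balance_quality


# Hypothetical swarm: blocks 0-2 are double-served, blocks 3-5 are barely served.
spans = {
    "local": Span(0, 3, 1.0),
    "peer-a": Span(0, 3, 1.0),
    "peer-c": Span(3, 6, 0.2),
}
print(should_move("local", spans, total_blocks=6, balance_quality=0.75))  # True
```

In this example the bottleneck rises from 0.2 to 1.0 after the move, giving a ratio of 0.2, well below the 0.75 threshold.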

Usage

Called automatically during Server.__init__ (block selection) and Server.run (periodic rebalancing checks). Not typically called directly by users.

Code Reference

Source Location

  • Repository: petals
  • File: src/petals/server/block_selection.py (L12-95)

Signature

def compute_throughputs(
    spans: Dict[PeerID, RemoteSpanInfo],
    *,
    total_blocks: int,
) -> np.ndarray:
    """
    Compute per-block throughput by summing contributions from all serving peers.

    Returns:
        np.ndarray of shape [total_blocks] with aggregate throughput per block
    """

def choose_best_blocks(
    num_blocks: int,
    module_infos: List[RemoteModuleInfo],
) -> List[int]:
    """
    Select the contiguous range of blocks with lowest aggregate throughput.

    Args:
        num_blocks: Number of blocks this server can host
        module_infos: Current swarm state from DHT
    Returns:
        List of block indices to serve (e.g. [0, 1, 2, ..., 17])
    """

def should_choose_other_blocks(
    local_peer_id: PeerID,
    module_infos: List[RemoteModuleInfo],
    balance_quality: float,
) -> bool:
    """
    Check if this server should rebalance to different blocks.

    Args:
        local_peer_id: This server's peer ID
        module_infos: Current swarm state from DHT
        balance_quality: Threshold ratio (no default in this signature; 0.75 is a typical value)
    Returns:
        True if rebalancing would improve swarm throughput
    """

Import

from petals.server.block_selection import choose_best_blocks, should_choose_other_blocks

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| num_blocks | int | Yes | Number of contiguous blocks this server can host |
| module_infos | List[RemoteModuleInfo] | Yes | Current swarm state listing all servers and their block spans |
| local_peer_id | PeerID | Yes (rebalance) | This server's peer ID for the rebalancing check |
| balance_quality | float | Yes (rebalance) | Threshold for rebalancing (typically 0.75; the signature defines no default) |

Outputs

| Name | Type | Description |
|---|---|---|
| choose_best_blocks returns | List[int] | Optimal contiguous block indices (e.g. [0, 1, 2, ..., 17]) |
| should_choose_other_blocks returns | bool | True if server should rebalance to different blocks |

Usage Examples

Block Selection During Server Init

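from petals.server.block_selection import choose_best_blocks
from petals.utils.dht import get_remote_module_infos

# Assumes `dht`, `dht_prefix`, and `block_config` were created during server setup.
# get_remote_module_infos() looks up a list of block UIDs; its exact signature
# may differ between Petals versions.
module_uids = [f"{dht_prefix}.{i}" for i in range(block_config.num_hidden_layers)]
module_infos = get_remote_module_infos(dht, module_uids)

# Select the least-served contiguous range of 8 blocks
block_indices = choose_best_blocks(num_blocks=8, module_infos=module_infos)
print(f"Selected blocks: {block_indices}")  # e.g. [24, 25, 26, 27, 28, 29, 30, 31]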

Periodic Rebalancing Check

from petals.server.block_selection import choose_best_blocks, should_choose_other_blocks

# Inside the Server.run() main loop. `server`, `local_peer_id`, `num_blocks`, and a
# freshly fetched `module_infos` are assumed to be in scope.
if should_choose_other_blocks(local_peer_id, module_infos, balance_quality=0.75):
    # Stop serving the current blocks, then pick a new range
    server.shutdown()
    new_blocks = choose_best_blocks(num_blocks, module_infos)
    # Restart the server with new_blocks

Related Pages

Implements Principle

Requires Environment
