Implementation:Bigscience workshop Petals Server Run
| Knowledge Sources | |
|---|---|
| Domains | Distributed_Computing, Infrastructure, Monitoring |
| Last Updated | 2026-02-09 14:00 GMT |
Overview
Server.run() is the concrete entry point for the Petals server main loop. It combines health monitoring with automatic rebalancing and is provided by the Petals server module.
Description
Server.run() is the main event loop that:
- Creates and starts the initial ModuleContainer via ModuleContainer.create()
- Enters an infinite loop checking container health and swarm balance
- On rebalancing trigger: shuts down the current container, selects new blocks, creates a new container
- On KeyboardInterrupt: calls Server.shutdown() for graceful cleanup
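The steps listed above can be sketched with a toy loop. This is a minimal illustration under stated assumptions, not the Petals implementation: `StubContainer`, `run_loop`, and the scripted `events` list are hypothetical names invented for this example, and the real ModuleContainer and block selection are far more involved.

```python
import random
import time


class StubContainer:
    """Hypothetical stand-in for ModuleContainer (not the Petals class)."""

    def __init__(self, blocks):
        self.blocks = blocks
        self.healthy = True
        self.running = True

    def is_healthy(self):
        return self.healthy

    def shutdown(self):
        self.running = False


def run_loop(events, mean_check_period=0.001, blocks_per_server=2):
    """Drive the loop with scripted events instead of a real swarm.

    Each event is "ok", "crash", or "imbalanced". When the script runs out,
    the sketch simulates Ctrl+C. Returns a log of the actions taken.
    """
    log = []
    events = iter(events)
    next_start = 0
    container = None
    try:
        while True:
            # Select blocks and start a fresh container.
            blocks = list(range(next_start, next_start + blocks_per_server))
            next_start += blocks_per_server
            container = StubContainer(blocks)
            log.append(("start", blocks))
            while True:
                # Randomized pause between checks (mean = mean_check_period).
                time.sleep(random.random() * 2 * mean_check_period)
                event = next(events, "interrupt")
                if event == "interrupt":
                    raise KeyboardInterrupt
                if event == "crash":
                    container.healthy = False
                if not container.is_healthy():
                    log.append(("restart", "unhealthy"))
                    break
                if event == "imbalanced":
                    log.append(("restart", "imbalanced"))
                    break
            # Stop serving the current block set before choosing a new one.
            container.shutdown()
    except KeyboardInterrupt:
        # Graceful cleanup path.
        if container is not None:
            container.shutdown()
        log.append(("shutdown", None))
    return log
```

Calling `run_loop(["ok", "imbalanced", "crash"])` walks through one rebalance, one crash-driven restart, and a final simulated interrupt. Scripting the events keeps the sketch deterministic, which a real swarm is not.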
The loop also calls Server._should_choose_other_blocks(), which wraps should_choose_other_blocks() with a randomized check interval and delay.
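One way to picture a randomized check interval is a small wrapper that skips the underlying check until a randomly drawn deadline has passed. The sketch below is an assumption-laden illustration: `RandomizedPeriodicCheck`, `mean_period`, and the injectable `clock` and `rng` parameters are hypothetical and do not come from the Petals source.

```python
import random
import time


class RandomizedPeriodicCheck:
    """Run `check` at most once per randomized interval (hypothetical helper)."""

    def __init__(self, check, mean_period, clock=time.monotonic, rng=random.random):
        self.check = check
        self.mean_period = mean_period
        self.clock = clock  # injectable so the behavior can be tested
        self.rng = rng      # must return a float in [0, 1)
        self.next_check_at = self.clock() + self._draw_interval()

    def _draw_interval(self):
        # Uniform on [0, 2 * mean_period), so the expected wait is mean_period.
        return self.rng() * 2 * self.mean_period

    def __call__(self):
        now = self.clock()
        if now < self.next_check_at:
            return False  # too early: skip the potentially expensive check
        self.next_check_at = now + self._draw_interval()
        return bool(self.check())
```

A plausible motivation for the randomization is that servers started at the same moment do not all re-evaluate the swarm, and move to new blocks, at the same instant.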
Usage
Called by main() in the CLI after Server.__init__ completes. This method blocks until the server is shut down.
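Because the call blocks, a caller that needs to stay responsive has to run it elsewhere. The following sketch shows only the blocking contract, using a hypothetical `BlockingServerStub` in place of the real Server. Whether the real Server tolerates running off the main thread is not covered by this page, so treat the threading pattern as an assumption.

```python
import threading


class BlockingServerStub:
    """Hypothetical stand-in that mimics only the blocking behavior of run()."""

    def __init__(self):
        self._stop = threading.Event()

    def run(self):
        # Blocks the calling thread until shutdown() is called.
        self._stop.wait()

    def shutdown(self):
        self._stop.set()


def serve_in_background(server):
    """Start server.run() on a worker thread so the caller stays responsive."""
    thread = threading.Thread(target=server.run, daemon=True)
    thread.start()
    return thread
```

Here `serve_in_background(stub)` returns immediately, and `stub.shutdown()` followed by `thread.join()` ends the worker.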
Code Reference
Source Location
- Repository: petals
- File: src/petals/server/server.py (L328-384, Server.run)
- File: src/petals/server/server.py (L413-418, Server._should_choose_other_blocks)
- File: src/petals/server/server.py (L420-428, Server.shutdown)
Signature
from typing import Optional

class Server:
    def run(self) -> None:
        """
        Main server loop: start serving, monitor health, rebalance as needed.
        Creates ModuleContainer, then loops:
        1. Check container health via is_healthy()
        2. Periodically check swarm balance via _should_choose_other_blocks()
        3. If unhealthy or imbalanced: shutdown, re-select blocks, restart
        Blocks until KeyboardInterrupt triggers graceful shutdown.
        """

    def _should_choose_other_blocks(self) -> bool:
        """
        Check if rebalancing is needed (randomized interval).
        Wraps should_choose_other_blocks with timing logic.
        """

    def shutdown(self, timeout: Optional[float] = 5) -> None:
        """
        Graceful server shutdown.
        Stops ModuleContainer, de-registers blocks from DHT,
        waits for in-flight requests to complete.
        """
Import
from petals.server.server import Server
server = Server(...)
server.run() # Blocks until shutdown
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| self | Server | Yes | Fully configured Server instance from __init__ |
Outputs
| Name | Type | Description |
|---|---|---|
| (blocking) | None | Method runs until KeyboardInterrupt or fatal error |
| DHT state | dict | Blocks de-registered (OFFLINE) on shutdown |
Usage Examples
Full Server Lifecycle
from petals.server.server import Server
from petals.constants import PUBLIC_INITIAL_PEERS

server = Server(
    initial_peers=PUBLIC_INITIAL_PEERS,
    dht_prefix=None,
    converted_model_name_or_path="petals-team/StableBeluga2",
    throughput="auto",
)
try:
    # server.run() blocks. While it is running, the server is:
    # 1. Serving transformer blocks via RPC
    # 2. Announcing ONLINE status in DHT every update_period
    # 3. Checking swarm balance every ~120s
    # 4. Rebalancing if needed
    server.run()
except KeyboardInterrupt:
    pass  # server.run() handles graceful shutdown internally
Related Pages
Implements Principle
Requires Environment
Uses Heuristic