Workflow:Vespa engine Vespa Config subscription lifecycle
| Knowledge Sources | |
|---|---|
| Domains | Configuration, Distributed_Systems, Infrastructure |
| Last Updated | 2026-02-09 12:00 GMT |
Overview
End-to-end process for subscribing to, fetching, and dynamically updating configuration in Vespa processes using the C++ config subscription library with FRT transport.
Description
This workflow describes how Vespa processes obtain and maintain their configuration through the config subscription system. The system uses a subscribe-poll-update model where processes subscribe to typed configuration via ConfigSubscriber, receive updates through the FRT (Fast RPC Transport) protocol from config servers, and synchronize multiple subscriptions to a consistent generation. The architecture supports failover across multiple config servers, exponential backoff on failures, and runtime reconfiguration without process restart.
Usage
Follow this workflow when implementing, or trying to understand, how a Vespa process (content node, container, config proxy) obtains its runtime configuration. This is the mechanism by which Vespa services adapt to application deployments, cluster topology changes, and operational parameter updates. It applies whenever you need to subscribe to one or more config types and react to configuration changes at runtime.
Execution Steps
Step 1: Subscription registration
Create a ConfigSubscriber instance for the config source (the config server specification), then register a subscription for each required config type. Each call to subscribe returns a typed ConfigHandle that later provides access to the deserialized config object. Internally, each subscription creates a ConfigSubscription holding a ConfigKey (which identifies the config type and config ID) and an IConfigHolder (a thread-safe container that receives updates).
Key considerations:
- Multiple config types can be subscribed through a single ConfigSubscriber
- Each subscription gets its own ConfigHandle for type-safe access
- The subscriber tracks all subscriptions in a ConfigSubscriptionSet
- All subscribe calls must be made before the first call to nextConfig, while the subscriber is still in the OPEN state
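The registration pattern above can be sketched in a few lines. This is a minimal, self-contained illustration and does not use the Vespa headers: `SketchSubscriber`, `SketchHandle`, `SketchSubscription`, `SketchKey`, `FooConfig`, `BarConfig`, and the `freeze()` method are hypothetical stand-ins for the roles described in this step, and the exception thrown on late subscription is an assumption of the sketch.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are NOT the Vespa classes.
struct SketchKey {               // plays the role of ConfigKey
    std::string defName;         // config type name
    std::string configId;        // config ID
};

struct SketchSubscription {      // plays the role of ConfigSubscription
    SketchKey key;
};

template <typename ConfigType>
class SketchHandle {             // plays the role of ConfigHandle<ConfigType>
public:
    explicit SketchHandle(std::shared_ptr<SketchSubscription> sub) : _sub(std::move(sub)) {}
    const SketchKey& key() const { return _sub->key; }
private:
    std::shared_ptr<SketchSubscription> _sub;
};

class SketchSubscriber {         // plays the role of ConfigSubscriber
public:
    // Registers one subscription and returns a handle typed on the config class.
    template <typename ConfigType>
    std::unique_ptr<SketchHandle<ConfigType>> subscribe(const std::string& configId) {
        if (_frozen) {
            // Assumption of this sketch: late subscriptions are rejected with an exception.
            throw std::logic_error("subscribe called after the first poll");
        }
        auto sub = std::make_shared<SketchSubscription>(
            SketchSubscription{SketchKey{ConfigType::defName(), configId}});
        _subscriptions.push_back(sub);
        return std::make_unique<SketchHandle<ConfigType>>(sub);
    }

    // Stand-in for the state change caused by the first nextConfig call.
    void freeze() { _frozen = true; }

    std::size_t subscriptionCount() const { return _subscriptions.size(); }

private:
    bool _frozen = false;
    std::vector<std::shared_ptr<SketchSubscription>> _subscriptions;
};

// Two hypothetical config types, each exposing its type name.
struct FooConfig { static std::string defName() { return "foo"; } };
struct BarConfig { static std::string defName() { return "bar"; } };
```

A caller would create one `SketchSubscriber`, call `subscribe<FooConfig>(id)` and `subscribe<BarConfig>(id)`, and keep both handles for later reads. Sharing the subscription through a `shared_ptr` is a design choice of the sketch so that the subscriber and the handle can both reach the same update slot.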
Step 2: Initial configuration fetch
Call nextConfig with a timeout to block until the first complete set of configurations arrives from the config server. This transitions the subscriber from OPEN to FROZEN state. The method polls all subscriptions, waiting until every subscription has received at least one configuration response at the same generation number. The FRT transport layer handles the RPC communication with config servers.
Key considerations:
- All subscriptions must reach the same generation before returning
- The timeout applies to the entire synchronization process
- Polling uses 20ms sleep intervals between checks
- State transitions: OPEN to FROZEN to CONFIGURED on success
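The synchronization loop described in this step can be approximated as follows. This is a sketch under stated assumptions, not the library's implementation: `SketchPoller`, `SketchSub`, and `SketchState` are hypothetical names, a generation of -1 is used here to mean "no response yet", and the sketch only models the initial fetch (it does not check whether a generation is newer than the last one returned). The 20 ms interval and the single deadline for the whole wait come from the considerations above.

```cpp
#include <algorithm>
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are NOT the Vespa classes.
enum class SketchState { OPEN, FROZEN, CONFIGURED, CLOSED };

struct SketchSub {
    // In a real system a transport thread would write this; -1 means "no response yet".
    std::atomic<std::int64_t> generation{-1};
};

class SketchPoller {
public:
    explicit SketchPoller(std::vector<SketchSub*> subs) : _subs(std::move(subs)) {}

    // Blocks until every subscription reports the same generation, or the timeout expires.
    bool nextConfig(std::chrono::milliseconds timeout) {
        using clock = std::chrono::steady_clock;
        using std::chrono::milliseconds;
        if (_state == SketchState::OPEN) _state = SketchState::FROZEN;
        const auto deadline = clock::now() + timeout;  // one deadline for the whole wait
        for (;;) {
            if (inSync()) {
                _state = SketchState::CONFIGURED;
                return true;
            }
            const auto now = clock::now();
            if (now >= deadline) return false;
            const auto left = std::chrono::duration_cast<milliseconds>(deadline - now);
            // Sleep 20 ms between checks, but never past the deadline.
            std::this_thread::sleep_for(std::min(milliseconds(20), left));
        }
    }

    SketchState state() const { return _state; }

private:
    // True when all subscriptions have a response and agree on the generation.
    bool inSync() const {
        if (_subs.empty()) return false;
        const std::int64_t first = _subs.front()->generation.load();
        if (first < 0) return false;
        return std::all_of(_subs.begin(), _subs.end(),
                           [first](const SketchSub* s) { return s->generation.load() == first; });
    }

    std::vector<SketchSub*> _subs;
    SketchState _state = SketchState::OPEN;
};
```

If one subscription is at generation 5 and another at generation 4, the call keeps polling and returns false at the deadline, leaving the state at FROZEN. Only when both report the same generation does it return true and move to CONFIGURED.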
Step 3: Configuration application
Once nextConfig returns true, read the current configuration from each ConfigHandle. Call isChanged on each handle to determine which configs were modified in this generation. Call getConfig to obtain the deserialized config object of the appropriate type. The application then applies the configuration to its runtime state.
Key considerations:
- getConfig returns a unique_ptr to the typed config object
- isChanged indicates whether this specific config changed since the last generation
- Config objects are immutable once retrieved
- The application thread is the only thread that should call getConfig
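The read-and-apply pattern from this step might look like the sketch below. `ChangeTrackingHandle`, `ListenerConfig`, `Listener`, `deliver`, and `applyIfChanged` are hypothetical names introduced for illustration. Only the `isChanged` and `getConfig` method names, and the fact that `getConfig` returns a `unique_ptr` to the typed config, are taken from the text above.

```cpp
#include <memory>
#include <string>

// Hypothetical config type standing in for a generated config class.
struct ListenerConfig {
    int port;
    std::string host;
};

// Hypothetical stand-in for a typed config handle; NOT the Vespa ConfigHandle.
template <typename T>
class ChangeTrackingHandle {
public:
    // Test hook of this sketch: simulates the arrival of a new generation.
    void deliver(const T& value, bool changed) {
        _current = value;
        _changed = changed;
    }
    bool isChanged() const { return _changed; }
    // Hands the caller its own copy, so later updates cannot alter it.
    std::unique_ptr<T> getConfig() const { return std::make_unique<T>(_current); }
private:
    T _current{};
    bool _changed = false;
};

// Hypothetical piece of runtime state that depends on the config.
struct Listener {
    int port = 0;
    int rebinds = 0;
};

// Applies the config only when this specific config changed in the current generation.
void applyIfChanged(const ChangeTrackingHandle<ListenerConfig>& handle, Listener& listener) {
    if (!handle.isChanged()) return;
    std::unique_ptr<ListenerConfig> cfg = handle.getConfig();
    listener.port = cfg->port;
    ++listener.rebinds;
}
```

Checking `isChanged` first lets the application skip expensive reconfiguration (here, a simulated rebind) for configs whose content is identical to the previous generation.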
Step 4: Change polling loop
Enter a loop that repeatedly calls nextGeneration (or nextConfig) with a finite timeout to poll for configuration changes. When a new application is deployed to the config server, every subscription receives a response at the new generation number, whether or not its own config content changed. The method returns true when a new generation is available with all subscriptions synchronized.
Key considerations:
- nextGeneration returns true even if no individual configs changed (generation still advances)
- nextConfig returns true only if at least one config actually changed
- The FRT source schedules periodic requests to the config server
- Background FRT threads deliver responses asynchronously to the IConfigHolder
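The difference between the two polling calls can be illustrated with a small non-blocking model. `GenerationTracker`, `PendingUpdate`, and `offer` are hypothetical names, and timeouts and threading are left out on purpose. The sketch only encodes the return-value rules listed above.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical model of one synchronized update across all subscriptions.
struct PendingUpdate {
    std::int64_t generation;
    std::vector<bool> changed;   // one flag per subscription
};

// Hypothetical stand-in; NOT a Vespa class.
class GenerationTracker {
public:
    // Simulates the transport delivering a complete, synchronized update.
    void offer(PendingUpdate update) {
        _pending = std::move(update);
        _hasPending = true;
    }

    // True whenever the generation advanced, even if no config content changed.
    bool nextGeneration() { return advance(false); }

    // True only if the generation advanced AND at least one config changed.
    bool nextConfig() { return advance(true); }

    std::int64_t generation() const { return _generation; }

private:
    bool advance(bool requireChange) {
        if (!_hasPending || _pending.generation <= _generation) return false;
        _generation = _pending.generation;   // the generation advances in both cases
        _hasPending = false;
        const bool anyChanged = std::any_of(_pending.changed.begin(), _pending.changed.end(),
                                            [](bool c) { return c; });
        return anyChanged || !requireChange;
    }

    PendingUpdate _pending{0, {}};
    std::int64_t _generation = 0;
    bool _hasPending = false;
};
```

In this model, an update at generation 2 with no changed flags makes `nextConfig` return false while the tracked generation still moves to 2. The same kind of update makes `nextGeneration` return true. A caller that must track the generation itself would therefore poll with `nextGeneration`, while one that only cares about content would use `nextConfig`.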
Step 5: FRT transport and failover
The FRT transport layer manages RPC connections to config servers through a connection pool. Requests are distributed across available servers using round-robin or hash-based selection. When a server fails, it is moved to a suspended list and requests route to healthy servers. Failed connections trigger exponential backoff with increasing wait times before retry.
Key considerations:
- FRTConnectionPool separates healthy (error-free) and suspended (errored) connections
- When all healthy servers are exhausted, suspended servers are retried
- The FRTConfigAgent tracks consecutive failures and adjusts wait times
- Each FRTSource manages its own request lifecycle with inflight tracking
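The failover and backoff behavior can be sketched as below. This is an illustration under assumptions rather than the FRTConnectionPool or FRTConfigAgent code: `SketchConnectionPool`, `SketchConnection`, and `backoffDelay` are hypothetical, only round-robin selection is shown, and the 500 ms base delay, doubling factor, and 30 s cap are made-up values chosen for the example. The library's actual timing values may differ.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are NOT the Vespa classes.
struct SketchConnection {
    std::string address;
    bool suspended;
};

class SketchConnectionPool {
public:
    explicit SketchConnectionPool(std::vector<std::string> addresses) {
        for (auto& a : addresses) {
            _connections.push_back(SketchConnection{std::move(a), false});
        }
    }

    // Round-robin over healthy connections; if none is healthy, retry a suspended one.
    SketchConnection* next() {
        if (_connections.empty()) return nullptr;
        const std::size_t n = _connections.size();
        SketchConnection* fallback = nullptr;
        for (std::size_t i = 0; i < n; ++i) {
            SketchConnection& c = _connections[(_cursor + i) % n];
            if (!c.suspended) {
                _cursor = (_cursor + i + 1) % n;
                return &c;
            }
            if (fallback == nullptr) fallback = &c;
        }
        _cursor = (_cursor + 1) % n;   // rotate among suspended connections as well
        return fallback;
    }

    void reportError(SketchConnection& c) { c.suspended = true; }
    void reportSuccess(SketchConnection& c) { c.suspended = false; }

private:
    std::vector<SketchConnection> _connections;
    std::size_t _cursor = 0;
};

// Wait time after a number of consecutive failures: doubles each time, up to a cap.
// Base, factor, and cap are assumptions of this sketch.
std::chrono::milliseconds backoffDelay(unsigned consecutiveFailures) {
    using std::chrono::milliseconds;
    const milliseconds base{500};
    const milliseconds cap{30000};
    if (consecutiveFailures == 0) return milliseconds{0};
    const unsigned shift = std::min(consecutiveFailures - 1, 16u);  // bound the shift
    const auto factor = static_cast<milliseconds::rep>(1) << shift;
    return std::min(cap, milliseconds{base.count() * factor});
}
```

A request loop built on this would call `next()`, send the request, and then call `reportSuccess` or `reportError` depending on the outcome, sleeping for `backoffDelay(failures)` before the next attempt after an error.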
Step 6: Subscriber shutdown
Close the ConfigSubscriber to release all resources. This transitions the state to CLOSED, cancels any pending FRT requests, and releases the memory-mapped control structures. After closing, no further config reads or polls are possible.
Key considerations:
- Close is idempotent and safe to call multiple times
- Pending RPC requests are cancelled during shutdown
- The FRTSource transitions through CLOSING to CLOSED, waiting for inflight requests
- ConfigRetriever (advanced API) manages separate bootstrap and component subscriptions
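The shutdown sequence can be modeled as a small state machine. `SketchSource` and `SourceState` are hypothetical names. Unlike the blocking wait described above, this sketch is single-threaded: completion of an in-flight request is signalled through `requestDone()` rather than waited for, which is a simplification made to keep the example short.

```cpp
#include <cassert>

// Hypothetical stand-ins; these are NOT the Vespa classes.
enum class SourceState { OPEN, CLOSING, CLOSED };

class SketchSource {
public:
    // Starts a request; refused once shutdown has begun.
    bool sendRequest() {
        if (_state != SourceState::OPEN) return false;
        ++_inflight;
        return true;
    }

    // Called when a request finishes, whether it succeeded or was cancelled.
    void requestDone() {
        assert(_inflight > 0);
        --_inflight;
        settle();
    }

    // Idempotent: calls made after shutdown has started have no effect.
    void close() {
        if (_state != SourceState::OPEN) return;
        _state = SourceState::CLOSING;
        settle();
    }

    SourceState state() const { return _state; }

private:
    // CLOSING becomes CLOSED only when no request is in flight any more.
    void settle() {
        if (_state == SourceState::CLOSING && _inflight == 0) {
            _state = SourceState::CLOSED;
        }
    }

    SourceState _state = SourceState::OPEN;
    int _inflight = 0;
};
```

With one request in flight, `close()` leaves the source in CLOSING. The state becomes CLOSED only after `requestDone()` is called, and any further `close()` call is a no-op.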