Implementation:Google deepmind Dm control Variation Broadcaster
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Domain_Randomization |
| Last Updated | 2026-02-15 04:00 GMT |
Overview
A broadcaster mechanism that enables a single variation value to be shared consistently across multiple consumers within the same evaluation round, ensuring correlated randomization.
Description
The VariationBroadcaster class solves the problem of correlating randomized values across multiple attributes or entities. It wraps a Variation object and generates proxy Variation instances via get_proxy(). Each proxy can be used in place of the wrapped variation wherever a Variation object is expected.
The broadcaster operates in rounds. A round begins when any proxy requests a value and finds its own deque empty: the broadcaster evaluates the wrapped variation once and appends the result to the deque of every proxy. Each proxy, when called, pops the cached value from its own deque, so all proxies receive the same sampled value within a round. A round is complete once every proxy has been called exactly once. If a proxy is called again before the others have consumed their value, its deque is empty again, so a new value is sampled and appended to every deque.
Proxies are tracked using a WeakKeyDictionary, allowing them to be garbage collected naturally when no longer referenced. The _BroadcastedValueProxy class is a private Variation subclass that delegates value retrieval to its parent broadcaster.
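The round mechanism described above can be sketched with the standard library alone. This is a simplified illustration, not the dm_control source: SketchBroadcaster, _SketchProxy and sample_fn are hypothetical names, a plain zero-argument callable stands in for the wrapped Variation, and consuming values oldest-first is an assumption.

```python
import collections
import random
import weakref


class SketchBroadcaster:
    """Simplified stand-in for VariationBroadcaster (hypothetical, not dm_control)."""

    def __init__(self, sample_fn):
        # sample_fn is a hypothetical zero-argument callable standing in for
        # the wrapped Variation.
        self._sample_fn = sample_fn
        # Weak keys: an entry disappears once its proxy is garbage collected.
        self._cached_values = weakref.WeakKeyDictionary()

    def get_proxy(self):
        proxy = _SketchProxy(self)
        self._cached_values[proxy] = collections.deque()
        return proxy

    def _get_value(self, proxy):
        cache = self._cached_values[proxy]
        if not cache:
            # Start of a round: sample once and queue the value for every proxy.
            value = self._sample_fn()
            for proxy_cache in self._cached_values.values():
                proxy_cache.append(value)
        # Assumption: queued values are consumed oldest-first.
        return cache.popleft()


class _SketchProxy:
    """Delegates value retrieval to the broadcaster that created it."""

    def __init__(self, broadcaster):
        self._broadcaster = broadcaster

    def __call__(self):
        return self._broadcaster._get_value(self)


rng = random.Random(0)
broadcaster = SketchBroadcaster(lambda: rng.uniform(-1.0, 1.0))
first, second = broadcaster.get_proxy(), broadcaster.get_proxy()

round_1 = (first(), second())  # one sample, shared by both proxies
round_2 = (first(), second())  # a fresh sample, again shared
```

In this sketch the wrapped callable runs only twice for the four proxy calls, once per round.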
Usage
Use VariationBroadcaster when multiple attributes or entities must receive the same random value. For example, it can ensure that a target position and a gripper start position are drawn from the same random sample, or that multiple identical objects receive the same color randomization. Create the broadcaster with the shared variation, then call get_proxy() once for each consumer.
Code Reference
Source Location
- Repository: Google_deepmind_Dm_control
- File: dm_control/composer/variation/variation_broadcaster.py
- Lines: 1-66
Signature
class VariationBroadcaster:
    def __init__(self, wrapped_variation: variation.Variation):
        ...
    def get_proxy(self) -> variation.Variation:
        ...

class _BroadcastedValueProxy(variation.Variation):
    def __init__(self, broadcaster):
        ...
    def __call__(self, initial_value=None, current_value=None, random_state=None):
        ...
Import
from dm_control.composer.variation.variation_broadcaster import VariationBroadcaster
I/O Contract
Inputs (VariationBroadcaster)
| Name | Type | Required | Description |
|---|---|---|---|
| wrapped_variation | variation.Variation | Yes | The variation whose values will be broadcast to all proxies |
Inputs (get_proxy)
| Name | Type | Required | Description |
|---|---|---|---|
| (no args) | -- | -- | Creates and returns a new proxy variation |
Outputs
| Name | Type | Description |
|---|---|---|
| get_proxy return | variation.Variation | A proxy that returns the same value as all other proxies from this broadcaster per round |
| proxy __call__ return | any | The cached value from the broadcaster, same as the wrapped variation would produce |
Internal Mechanism
The broadcaster uses the following data structures:
- _cached_values: A WeakKeyDictionary mapping each _BroadcastedValueProxy to a collections.deque of cached values.
- When a proxy calls _get_value and its deque is empty, the broadcaster evaluates the wrapped variation and appends the new value to all proxy deques.
- Each proxy then pops from its own deque, ensuring all proxies receive the same value per round.
- The use of weak references allows proxies to be garbage collected when no longer in use.
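The weak-reference behaviour in the last point can be seen in isolation with a short standard-library sketch. Token is a hypothetical placeholder for a proxy object. The example only shows how a WeakKeyDictionary of deques behaves and makes no claim about dm_control internals beyond what is listed above.

```python
import collections
import gc
import weakref


class Token:
    """Hypothetical placeholder for a proxy; any weak-referenceable object works."""


cache = weakref.WeakKeyDictionary()
kept, dropped = Token(), Token()
cache[kept] = collections.deque()
cache[dropped] = collections.deque()

# Broadcast one value to every registered deque.
for queue in cache.values():
    queue.append(0.5)
entries_before = len(cache)

# Drop the only strong reference to one key; its entry is removed from the
# dictionary once the object has been collected.
del dropped
gc.collect()  # Makes collection explicit on interpreters without refcounting.
entries_after = len(cache)
```

Because the dictionary holds its keys weakly, a consumer that discards its proxy does not leave behind a deque that keeps filling with unread values.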
Usage Examples
from dm_control.composer.variation.variation_broadcaster import VariationBroadcaster
from dm_control.composer.variation import distributions
import numpy as np
rng = np.random.RandomState(42)
# Create a broadcaster for a shared position variation
position_variation = distributions.Uniform(low=-1.0, high=1.0)
broadcaster = VariationBroadcaster(position_variation)
# Create two proxies for different consumers
proxy_for_target = broadcaster.get_proxy()
proxy_for_gripper = broadcaster.get_proxy()
# Both proxies return the same sampled value
target_pos = proxy_for_target(random_state=rng)
gripper_pos = proxy_for_gripper(random_state=rng)
assert target_pos == gripper_pos # Same value from the same round
# Next round: a new value is sampled
target_pos_2 = proxy_for_target(random_state=rng)
gripper_pos_2 = proxy_for_gripper(random_state=rng)
assert target_pos_2 == gripper_pos_2 # Same value within round 2
assert target_pos_2 != target_pos # A fresh sample, so it differs from round 1