Implementation:Google deepmind Dm control Variation Broadcaster

Knowledge Sources	Google_deepmind_Dm_control
Domains	Reinforcement_Learning, Domain_Randomization
Last Updated	2026-02-15 04:00 GMT

Overview

A broadcaster mechanism that enables a single variation value to be shared consistently across multiple consumers within the same evaluation round, ensuring correlated randomization.

Description

The VariationBroadcaster class solves the problem of correlating randomized values across multiple attributes or entities. It wraps a Variation object and generates proxy Variation instances via get_proxy(). Each proxy can be used in place of the wrapped variation wherever a Variation object is expected.

The broadcaster operates in rounds. A round begins when a proxy requests a value and finds its own cache empty: the broadcaster evaluates the wrapped variation once and appends the result to the deque of every registered proxy. Each proxy call then pops the cached value from that proxy's deque, so all proxies receive the same sampled value within a round. The round ends once every proxy has been called exactly once. The scheme therefore assumes one call per proxy per round; a proxy that is called again before the others have consumed their value triggers a new evaluation.
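The round logic described above can be illustrated with a small standalone model. This is a simplified sketch, not the dm_control source: ToyBroadcaster, sample_fn, and num_evaluations are hypothetical names, a plain dict stands in for the weak-key dictionary, and consuming values in first-in-first-out order is a choice made for the sketch.

```python
import collections
import random


class ToyBroadcaster:
    """Simplified stand-in for the round logic; not the dm_control class."""

    def __init__(self, sample_fn):
        self._sample_fn = sample_fn   # hypothetical zero-argument sampler
        self._queues = {}             # proxy id -> deque of pending values
        self.num_evaluations = 0      # bookkeeping for this demo only

    def get_proxy(self):
        proxy_id = len(self._queues)
        self._queues[proxy_id] = collections.deque()
        # The proxy is a closure that asks the broadcaster for its next value.
        return lambda: self._get_value(proxy_id)

    def _get_value(self, proxy_id):
        queue = self._queues[proxy_id]
        if not queue:
            # Empty queue: start a new round by sampling once for every proxy.
            value = self._sample_fn()
            self.num_evaluations += 1
            for pending in self._queues.values():
                pending.append(value)
        return queue.popleft()


rng = random.Random(0)
broadcaster = ToyBroadcaster(lambda: rng.uniform(-1.0, 1.0))
proxy_a = broadcaster.get_proxy()
proxy_b = broadcaster.get_proxy()

round_1 = (proxy_a(), proxy_b())  # one evaluation, shared by both proxies
round_2 = (proxy_b(), proxy_a())  # call order within a round does not matter
```

In this model the sampler runs once per round regardless of which proxy happens to be called first, which is the property the broadcaster is meant to provide.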

Proxies are tracked using a WeakKeyDictionary, allowing them to be garbage collected naturally when no longer referenced. The _BroadcastedValueProxy class is a private Variation subclass that delegates value retrieval to its parent broadcaster.
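The effect of weak-key tracking can be seen with the standard library alone. The sketch below uses a hypothetical Proxy placeholder class rather than the real _BroadcastedValueProxy, and only demonstrates how an entry disappears once its key is no longer referenced elsewhere.

```python
import collections
import gc
import weakref


class Proxy:
    """Hypothetical placeholder for a proxy object (weak-referenceable)."""


cached_values = weakref.WeakKeyDictionary()

proxy_a = Proxy()
proxy_b = Proxy()
cached_values[proxy_a] = collections.deque()
cached_values[proxy_b] = collections.deque()
count_before = len(cached_values)

# Dropping the last strong reference lets the entry vanish without any
# explicit unregister call on the dictionary.
del proxy_b
gc.collect()  # CPython frees the object right away; this covers other runtimes
count_after = len(cached_values)
```

Because the dictionary holds its keys weakly, a consumer that discards its proxy does not need to notify the broadcaster, and no stale deque keeps receiving values.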

Usage

Use VariationBroadcaster when multiple attributes or entities must receive the same random value. For example, it can ensure that a target position and a gripper start position are drawn from the same random sample, or that multiple identical objects receive the same color randomization. Create the broadcaster with the shared variation, then call get_proxy() once for each consumer and use the returned proxy wherever that consumer expects a Variation.
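To see why sharing a single variation object between consumers is not enough, consider the following standalone sketch. It does not use dm_control: uniform_variation is a hypothetical stand-in for any variation that draws a fresh sample on every evaluation.

```python
import random


def uniform_variation(rng):
    """Hypothetical stand-in: each evaluation draws a fresh sample."""
    return rng.uniform(-1.0, 1.0)


rng = random.Random(0)

# Two consumers evaluating the same variation still sample independently.
target_x = uniform_variation(rng)
gripper_x = uniform_variation(rng)
independent_match = target_x == gripper_x

# The broadcasting idea: evaluate once, then hand that value to both consumers.
shared_value = uniform_variation(rng)
target_x_shared = shared_value
gripper_x_shared = shared_value
```

The broadcaster automates the second pattern, so each consumer can keep calling its own Variation-like object without coordinating with the others.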

Code Reference

Source Location

Repository: Google_deepmind_Dm_control
File: dm_control/composer/variation/variation_broadcaster.py
Lines: 1-66

Signature

class VariationBroadcaster:
    def __init__(self, wrapped_variation: variation.Variation):
        ...
    def get_proxy(self) -> variation.Variation:
        ...

class _BroadcastedValueProxy(variation.Variation):
    def __init__(self, broadcaster):
        ...
    def __call__(self, initial_value=None, current_value=None, random_state=None):
        ...

Import

from dm_control.composer.variation.variation_broadcaster import VariationBroadcaster

I/O Contract

Inputs (VariationBroadcaster)

Name	Type	Required	Description
wrapped_variation	`variation.Variation`	Yes	The variation whose values will be broadcast to all proxies

Inputs (get_proxy)

Name	Type	Required	Description
(no args)	--	--	Creates and returns a new proxy variation

Outputs

Name	Type	Description
get_proxy return	`variation.Variation`	A proxy that returns the same value as all other proxies from this broadcaster per round
proxy __call__ return	any	The value cached for the current round, i.e. the result of evaluating the wrapped variation once for that round

Internal Mechanism

The broadcaster keeps a single data structure and applies the following rules:

_cached_values: A WeakKeyDictionary mapping each _BroadcastedValueProxy to a collections.deque of cached values.
When a proxy is called, it invokes the broadcaster's _get_value. If that proxy's deque is empty, the broadcaster evaluates the wrapped variation and appends the new value to every proxy's deque.
Each proxy then pops from its own deque, ensuring all proxies receive the same value per round.
The use of weak references allows proxies to be garbage collected when no longer in use.

Usage Examples

from dm_control.composer.variation.variation_broadcaster import VariationBroadcaster
from dm_control.composer.variation import distributions
import numpy as np

rng = np.random.RandomState(42)

# Create a broadcaster for a shared position variation
position_variation = distributions.Uniform(low=-1.0, high=1.0)
broadcaster = VariationBroadcaster(position_variation)

# Create two proxies for different consumers
proxy_for_target = broadcaster.get_proxy()
proxy_for_gripper = broadcaster.get_proxy()

# Both proxies return the same sampled value
target_pos = proxy_for_target(random_state=rng)
gripper_pos = proxy_for_gripper(random_state=rng)
assert target_pos == gripper_pos  # Same value from the same round

# Next round: a new value is sampled
target_pos_2 = proxy_for_target(random_state=rng)
gripper_pos_2 = proxy_for_gripper(random_state=rng)
assert target_pos_2 == gripper_pos_2  # Equal within the round; freshly sampled, so generally different from round 1
