Implementation:Isaac sim IsaacGymEnvs PBT Mutation

From Leeroopedia
Knowledge Sources
Domains: Hyperparameter_Optimization, Evolutionary_Computing
Last Updated: 2026-02-15 11:00 GMT

Overview

PBT_Mutation provides a collection of hyperparameter mutation operators used in Population-Based Training (PBT) to perturb and evolve training parameters during the search process.

Description

This module implements several specialized mutation functions for different types of hyperparameters, following the PBT paradigm where underperforming agents inherit parameters from better-performing ones and then mutate those parameters to explore the hyperparameter space.

The core mutation function is mutate_float(x, change_min, change_max), which randomly either multiplies or divides the input value by a perturbation factor sampled uniformly from [change_min, change_max]. Because the change is a ratio rather than an offset, the noise scales with the magnitude of the parameter, and an upward step by a given factor is exactly undone by a downward step of the same factor.

Four specialized operators cover parameters with tighter constraints. mutate_float_min_1() applies the same mutation and then clamps the result to a minimum of 1.0, for parameters that must remain at least 1. mutate_eps_clip() clamps the mutated value to the range [0.01, 0.3], appropriate for PPO's epsilon clipping parameter. mutate_mini_epochs() uses an additive integer perturbation (plus or minus 1) clamped to [1, 8], designed for the number of PPO mini-epochs. mutate_discount() mutates the complement 1 - gamma rather than gamma itself, using a narrower perturbation range of [1.1, 1.2], so that the discount factor changes only slightly and stays below 1.
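The operators described above can be sketched in a few lines. This is a minimal re-implementation written from the description on this page, not a copy of the module's source; in particular, the equal probability of scaling up or down is an assumption.

```python
import random


def mutate_float(x, change_min=1.1, change_max=1.5):
    """Multiply or divide x by a factor drawn uniformly from [change_min, change_max]."""
    factor = random.uniform(change_min, change_max)
    # Assumption: both directions are equally likely.
    return x * factor if random.random() < 0.5 else x / factor


def mutate_float_min_1(x, **kwargs):
    # Same multiplicative mutation, then clamp to at least 1.0.
    return max(1.0, mutate_float(x, **kwargs))


def mutate_eps_clip(x, **kwargs):
    # Keep the PPO clipping parameter inside [0.01, 0.3].
    return min(0.3, max(0.01, mutate_float(x, **kwargs)))


def mutate_mini_epochs(x, **kwargs):
    # Additive step of +1 or -1; the multiplicative bounds in kwargs are unused.
    step = 1 if random.random() < 0.5 else -1
    return min(8, max(1, x + step))


def mutate_discount(x, **kwargs):
    # Mutate the distance to 1 (1 - gamma) with a fixed, narrow range.
    inv = 1.0 - x
    return 1.0 - mutate_float(inv, change_min=1.1, change_max=1.2)


if __name__ == "__main__":
    random.seed(0)
    print("learning rate:", mutate_float(3e-4))
    print("e_clip:", mutate_eps_clip(0.2))
    print("mini_epochs:", mutate_mini_epochs(4))
    print("gamma:", mutate_discount(0.99))
```

Under these assumptions, a discount factor of 0.99 moves to roughly [0.988, 0.989] on a downward step or [0.9909, 0.9917] on an upward step, which illustrates how small the change in gamma stays even though 1 - gamma changes by 10 to 20 percent.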

The get_mutation_func(mutation_func_name) utility resolves a mutation function by name using eval(), so each name must match a function that is in scope in the module, and the names should come from trusted configuration only. The top-level mutate(params, mutations, mutation_rate, pbt_change_min, pbt_change_max) function applies mutations to an entire parameter dictionary. Each parameter is mutated independently with probability mutation_rate: when it is selected, mutate() looks up the corresponding function name in the mutations mapping and calls that function with pbt_change_min and pbt_change_max as the change bounds.
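The per-parameter loop can be sketched as follows. This is an illustrative variant rather than the module's code: it swaps the eval() lookup for an explicit dictionary (the MUTATION_FUNCS name is hypothetical), which is one way to restrict the names that a configuration file can resolve. The mutate_float stand-in assumes both directions are equally likely.

```python
import copy
import random


def mutate_float(x, change_min=1.1, change_max=1.5):
    # Stand-in operator: scale x up or down by a uniformly sampled factor.
    factor = random.uniform(change_min, change_max)
    return x * factor if random.random() < 0.5 else x / factor


# Hypothetical registry: an explicit name-to-function table in place of eval().
MUTATION_FUNCS = {"mutate_float": mutate_float}


def get_mutation_func(name):
    try:
        return MUTATION_FUNCS[name]
    except KeyError:
        raise ValueError(f"Unknown mutation function: {name!r}") from None


def mutate(params, mutations, mutation_rate, pbt_change_min, pbt_change_max):
    mutated = copy.deepcopy(params)  # leave the caller's dict untouched
    for name, value in params.items():
        # Skip this parameter with probability (1 - mutation_rate).
        if random.random() >= mutation_rate:
            continue
        func = get_mutation_func(mutations[name])
        mutated[name] = func(
            value, change_min=pbt_change_min, change_max=pbt_change_max
        )
    return mutated


if __name__ == "__main__":
    params = {"learning_rate": 3e-4}
    mutations = {"learning_rate": "mutate_float"}
    print(mutate(params, mutations, 1.0, 1.1, 1.5))
```

With a mutation_rate of 0.0 the sketch returns an unchanged copy, and with 1.0 every parameter is mutated, which makes the two extremes easy to check in a test.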

Usage

Use these mutation functions when implementing PBT for reinforcement learning training. The mutate() function is called during the exploit-and-explore phase of PBT, after an underperforming agent has copied weights from a better agent. The mutation operators ensure that each agent explores slightly different hyperparameter configurations, enabling the population to discover effective parameter settings over time.
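To show where mutation sits in the exploit-and-explore phase, here is a minimal sketch under stated assumptions. The population layout (dicts with score, weights and params keys), the exploit_and_explore name and the perturb stand-in for mutate() are all hypothetical and chosen only for illustration; real PBT implementations typically compare agents through checkpoints and more careful ranking rules.

```python
import copy
import random


def perturb(params, rate=0.8, lo=1.1, hi=1.5):
    """Stand-in for mutate(): scale each value up or down with probability `rate`."""
    out = copy.deepcopy(params)
    for key, value in params.items():
        if random.random() < rate:
            factor = random.uniform(lo, hi)
            out[key] = value * factor if random.random() < 0.5 else value / factor
    return out


def exploit_and_explore(population):
    """Overwrite the worst agent with a mutated copy of the best agent's state."""
    ranked = sorted(population, key=lambda agent: agent["score"])
    worst, best = ranked[0], ranked[-1]
    worst["weights"] = copy.deepcopy(best["weights"])  # exploit: copy weights
    worst["params"] = perturb(best["params"])          # explore: mutate params
    return worst


if __name__ == "__main__":
    random.seed(1)
    population = [
        {"score": 1.0, "weights": [0.0], "params": {"learning_rate": 1e-3}},
        {"score": 5.0, "weights": [1.0], "params": {"learning_rate": 3e-4}},
    ]
    print(exploit_and_explore(population))
```

The best agent's own parameters are left unchanged because perturb works on a deep copy, mirroring the copy-then-mutate behaviour described for mutate().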

Code Reference

Source Location

Signature

def mutate_float(x, change_min=1.1, change_max=1.5):
def mutate_float_min_1(x, **kwargs):
def mutate_eps_clip(x, **kwargs):
def mutate_mini_epochs(x, **kwargs):
def mutate_discount(x, **kwargs):
def get_mutation_func(mutation_func_name):
def mutate(params, mutations, mutation_rate, pbt_change_min, pbt_change_max):

Import

from isaacgymenvs.pbt.mutation import mutate, mutate_float

I/O Contract

Inputs

Name | Type | Required | Description
params | dict[str, float] | Yes | Dictionary of hyperparameter names to their current values
mutations | dict[str, str] | Yes | Dictionary mapping parameter names to mutation function names (e.g., "mutate_float")
mutation_rate | float | Yes | Probability of mutating each individual parameter (0.0 to 1.0)
pbt_change_min | float | Yes | Minimum perturbation factor passed to mutation functions as change_min
pbt_change_max | float | Yes | Maximum perturbation factor passed to mutation functions as change_max

Outputs

Name | Type | Description
mutated_params | dict[str, float] | Deep copy of the input params with selected values mutated according to their assigned mutation functions

Usage Examples

from isaacgymenvs.pbt.mutation import mutate, mutate_float, mutate_discount

# Define current hyperparameters
params = {
    'learning_rate': 3e-4,
    'gamma': 0.99,
    'e_clip': 0.2,
    'mini_epochs': 4,
    'tau': 0.95,
}

# Define which mutation function to use for each parameter
mutations = {
    'learning_rate': 'mutate_float',
    'gamma': 'mutate_discount',
    'e_clip': 'mutate_eps_clip',
    'mini_epochs': 'mutate_mini_epochs',
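    # tau must stay below 1, so mutate it conservatively in the same way as gamma;
    # mutate_float could scale 0.95 past 1.0
    'tau': 'mutate_discount',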
}

# Apply mutations with 80% mutation rate
mutated_params = mutate(
    params=params,
    mutations=mutations,
    mutation_rate=0.8,
    pbt_change_min=1.1,
    pbt_change_max=1.5,
)

print(f"Original LR: {params['learning_rate']}")
print(f"Mutated LR:  {mutated_params['learning_rate']}")

# Use individual mutation functions directly
new_lr = mutate_float(3e-4, change_min=1.1, change_max=1.5)
new_gamma = mutate_discount(0.99)

Related Pages
