Implementation:Isaac sim IsaacGymEnvs PBT Mutation

Knowledge Sources	IsaacGymEnvs
Domains	Hyperparameter_Optimization, Evolutionary_Computing
Last Updated	2026-02-15 11:00 GMT

Overview

PBT_Mutation provides a collection of hyperparameter mutation operators used in Population-Based Training (PBT) to perturb and evolve training parameters during the search process.

Description

This module implements several specialized mutation functions for different types of hyperparameters, following the PBT paradigm where underperforming agents inherit parameters from better-performing ones and then mutate those parameters to explore the hyperparameter space.

The core mutation function is mutate_float(x, change_min, change_max), which randomly multiplies or divides the input value by a perturbation factor sampled uniformly from [change_min, change_max]. This gives symmetric multiplicative noise that respects the scale of the parameter. The remaining operators specialize this behavior:

mutate_float_min_1() - applies mutate_float() and clamps the result to a minimum of 1.0, for parameters that must stay at or above 1.
mutate_eps_clip() - clamps the mutated value to the range [0.01, 0.3], appropriate for PPO's epsilon clipping parameter.
mutate_mini_epochs() - uses additive integer perturbation (plus or minus 1) clamped to [1, 8], for the number of PPO mini-epochs.
mutate_discount() - mutates the inverted value (1 - gamma) rather than gamma itself, with a narrower perturbation range of [1.1, 1.2], so that the discount factor changes only slightly.
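The operators described above can be sketched in a few lines. This is a minimal reconstruction from the prose, not a copy of mutation.py: the equal probability of multiplying versus dividing, and the way mutate_discount() ignores the change bounds passed by the caller, are assumptions.

```python
import random

# Sketch reconstructed from the description; the real module may differ in detail.

def mutate_float(x, change_min=1.1, change_max=1.5):
    # Sample a perturbation factor, then multiply or divide by it.
    factor = random.uniform(change_min, change_max)
    # Assumption: both directions are equally likely.
    return x * factor if random.random() < 0.5 else x / factor

def mutate_float_min_1(x, **kwargs):
    # Same mutation, but never below 1.0.
    return max(1.0, mutate_float(x, **kwargs))

def mutate_eps_clip(x, **kwargs):
    # Same mutation, clamped to [0.01, 0.3].
    return min(0.3, max(0.01, mutate_float(x, **kwargs)))

def mutate_mini_epochs(x, **kwargs):
    # Additive step of +1 or -1, clamped to [1, 8]; kwargs are unused here.
    step = 1 if random.random() < 0.5 else -1
    return min(8, max(1, x + step))

def mutate_discount(x, **kwargs):
    # Mutate 1 - x with the fixed narrow range [1.1, 1.2].
    # Assumption: any change bounds in kwargs are ignored.
    inv = 1.0 - x
    return 1.0 - mutate_float(inv, change_min=1.1, change_max=1.2)
```

Under these assumptions, gamma = 0.99 gives 1 - gamma = 0.01, which is scaled to somewhere between about 0.0083 and 0.012, so the mutated gamma stays between about 0.988 and 0.9917. Applying mutate_float() to gamma directly with the default range could instead produce values above 1.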

The get_mutation_func(mutation_func_name) utility resolves a mutation function by name using eval(). Because eval() evaluates arbitrary Python expressions, the names in the mutations mapping should come only from trusted configuration. The top-level mutate(params, mutations, mutation_rate, pbt_change_min, pbt_change_max) function applies mutations to an entire parameter dictionary: each parameter is mutated with probability mutation_rate, using the function that the mutations mapping names for it, called with pbt_change_min and pbt_change_max as its change bounds.
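The name lookup and the per-parameter loop can be sketched as follows. This is an illustrative sketch under stated assumptions, not the module's source: the stand-in mutate_float() exists only so the block runs on its own, and raising ValueError for an unknown name is an assumed error-handling choice.

```python
import copy
import random

def mutate_float(x, change_min=1.1, change_max=1.5):
    # Stand-in operator so the sketch is self-contained.
    factor = random.uniform(change_min, change_max)
    return x * factor if random.random() < 0.5 else x / factor

def get_mutation_func(mutation_func_name):
    # Resolve the name in this module's namespace via eval(), as described.
    try:
        return eval(mutation_func_name)
    except Exception as exc:
        # Assumption: unknown names surface as ValueError.
        raise ValueError(f"Could not find mutation func {mutation_func_name!r}") from exc

def mutate(params, mutations, mutation_rate, pbt_change_min, pbt_change_max):
    # Work on a copy so the caller's dictionary is left untouched.
    mutated = copy.deepcopy(params)
    for name, value in params.items():
        # Leave the parameter unchanged with probability 1 - mutation_rate.
        if random.random() > mutation_rate:
            continue
        func = get_mutation_func(mutations[name])
        mutated[name] = func(value, change_min=pbt_change_min, change_max=pbt_change_max)
    return mutated
```

Passing the bounds as keyword arguments is what lets operators with a `**kwargs` signature accept them uniformly, whether or not they use them.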

Usage

Use these mutation functions when implementing PBT for reinforcement learning training. The mutate() function is called during the exploit-and-explore phase of PBT, after an underperforming agent has copied weights from a better agent. The mutation operators ensure that each agent explores slightly different hyperparameter configurations, enabling the population to discover effective parameter settings over time.
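As a generic illustration of where mutation sits in the exploit-and-explore phase, the sketch below has the worst agent inherit the best agent's hyperparameters and then perturb them. Everything here is hypothetical: the population layout (dicts with 'score' and 'params'), the exploit_and_explore() and perturb() names, and the selection rule are assumptions made for the example, not the logic of the IsaacGymEnvs launcher.

```python
import copy
import random

def perturb(value, change_min=1.1, change_max=1.5):
    # Hypothetical stand-in for a multiplicative mutation operator.
    factor = random.uniform(change_min, change_max)
    return value * factor if random.random() < 0.5 else value / factor

def exploit_and_explore(population, mutation_rate=0.8):
    """population: list of dicts with 'score' and 'params' (hypothetical layout)."""
    ranked = sorted(population, key=lambda agent: agent["score"], reverse=True)
    best, worst = ranked[0], ranked[-1]
    # Exploit: the worst agent inherits the best agent's hyperparameters.
    worst["params"] = copy.deepcopy(best["params"])
    # Explore: perturb each inherited value with probability mutation_rate.
    for name, value in list(worst["params"].items()):
        if random.random() <= mutation_rate:
            worst["params"][name] = perturb(value)
    return worst
```

In a real PBT run the exploit step would also copy network weights, which this sketch leaves out.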

Code Reference

Source Location

Repository: IsaacGymEnvs
File: isaacgymenvs/pbt/mutation.py
Lines: 1-97

Signature

def mutate_float(x, change_min=1.1, change_max=1.5):
def mutate_float_min_1(x, **kwargs):
def mutate_eps_clip(x, **kwargs):
def mutate_mini_epochs(x, **kwargs):
def mutate_discount(x, **kwargs):
def get_mutation_func(mutation_func_name):
def mutate(params, mutations, mutation_rate, pbt_change_min, pbt_change_max):

Import

from isaacgymenvs.pbt.mutation import mutate, mutate_float

I/O Contract

Inputs

Name	Type	Required	Description
params	dict[str, float]	Yes	Dictionary of hyperparameter names to their current values
mutations	dict[str, str]	Yes	Dictionary mapping parameter names to mutation function names (e.g., `"mutate_float"`)
mutation_rate	float	Yes	Probability of mutating each individual parameter (0.0 to 1.0)
pbt_change_min	float	Yes	Minimum perturbation factor passed to mutation functions as `change_min`
pbt_change_max	float	Yes	Maximum perturbation factor passed to mutation functions as `change_max`

Outputs

Name	Type	Description
mutated_params	dict[str, float]	Deep copy of the input params with selected values mutated according to their assigned mutation functions

Usage Examples

from isaacgymenvs.pbt.mutation import mutate, mutate_float, mutate_discount

# Define current hyperparameters
params = {
    'learning_rate': 3e-4,
    'gamma': 0.99,
    'e_clip': 0.2,
    'mini_epochs': 4,
    'tau': 0.95,
}

# Define which mutation function to use for each parameter
mutations = {
    'learning_rate': 'mutate_float',
    'gamma': 'mutate_discount',
    'e_clip': 'mutate_eps_clip',
    'mini_epochs': 'mutate_mini_epochs',
    'tau': 'mutate_discount',  # mutate_float could push tau above 1.0
}

# Apply mutations with 80% mutation rate
mutated_params = mutate(
    params=params,
    mutations=mutations,
    mutation_rate=0.8,
    pbt_change_min=1.1,
    pbt_change_max=1.5,
)

# With mutation_rate=0.8, any given parameter is left unchanged about 20% of the time
print(f"Original LR: {params['learning_rate']}")
print(f"Mutated LR:  {mutated_params['learning_rate']}")

# Use individual mutation functions directly
new_lr = mutate_float(3e-4, change_min=1.1, change_max=1.5)
new_gamma = mutate_discount(0.99)

Related Pages

Isaac_sim_IsaacGymEnvs_PBT_Launcher - The launcher that orchestrates PBT experiments using these mutation operators
Isaac_sim_IsaacGymEnvs_RunDescription - Defines experiment configurations whose hyperparameters are mutated
