Implementation:Isaac sim IsaacGymEnvs ReplayBuffer
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Data_Management |
| Last Updated | 2026-02-15 11:00 GMT |
Overview
ReplayBuffer is a GPU-resident circular replay buffer designed for efficient storage and sampling of observation data during AMP (Adversarial Motion Priors) training.
Description
The ReplayBuffer class provides a fixed-size circular buffer that stores tensors directly on the GPU device, avoiding costly CPU-GPU data transfers during training. It is primarily used in AMP training to maintain a history of agent-generated AMP observations, which are later replayed through the discriminator alongside current observations and demonstration data.
The buffer uses a head pointer (_head) that advances as new data is stored and wraps around when it reaches the end of the buffer. The store(data_dict) method accepts a dictionary of tensors and writes them into the buffer, splitting the write in two when a batch spans the end and the beginning of the circular buffer. Internal storage is allocated lazily on the first store() call: the per-entry shape of each tensor is taken from the incoming data, and the storage is placed on the device passed to the constructor.
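The wrap-around write described above can be sketched with plain Python lists standing in for GPU tensors. This is a minimal illustration, not the IsaacGymEnvs code: `circular_store` is a hypothetical helper, and the requirement that a batch fits in the buffer is an assumption of the sketch.

```python
def circular_store(buf, head, batch):
    """Write `batch` into the fixed-size list `buf` starting at `head`.

    Hypothetical helper for illustration; returns the new head position.
    """
    size = len(buf)
    n = len(batch)
    # Assumption of this sketch: a single batch never exceeds the capacity.
    assert n <= size, "batch must fit in the buffer"

    # First chunk: fill from the head up to the end of the buffer.
    first = min(n, size - head)
    buf[head:head + first] = batch[:first]

    # Second chunk: anything left over wraps to the start of the buffer.
    remainder = n - first
    if remainder > 0:
        buf[0:remainder] = batch[first:]

    # Advance the head, wrapping at the end of the buffer.
    return (head + n) % size


buf = [None] * 5
head = 0
head = circular_store(buf, head, ["a", "b", "c"])  # fits without wrapping
head = circular_store(buf, head, ["d", "e", "f"])  # "f" wraps and overwrites "a"
print(buf, head)
```

Splitting the write into at most two contiguous slices keeps each copy a simple slice assignment, which is the same shape of operation a tensor-backed buffer would use.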
Sampling is performed via sample(n), which reads the next n entries of a pre-shuffled index permutation (_sample_idx), so indices are drawn without replacement within each full pass through the permutation. Once the permutation has been exhausted, it is re-shuffled. If the buffer is not yet full (i.e., _total_count < _buffer_size), the sampled indices are remapped into the range that has actually been written, so the same entry may appear more than once in a batch. The reset() method resets the buffer state and re-shuffles the sampling permutation.
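The permutation-based sampling scheme can be sketched in pure Python as follows. `PermutationSampler` is a hypothetical class written for illustration, not the IsaacGymEnvs implementation; in particular, the use of a modulo to remap indices while the buffer is only partially filled is an assumption of this sketch.

```python
import random


class PermutationSampler:
    """Hypothetical helper illustrating shuffled-index sampling."""

    def __init__(self, buffer_size, seed=0):
        self._buffer_size = buffer_size
        self._rng = random.Random(seed)
        self._sample_idx = list(range(buffer_size))
        self._reset_sample_idx()

    def _reset_sample_idx(self):
        # Re-shuffle the permutation and restart from its beginning.
        self._rng.shuffle(self._sample_idx)
        self._sample_head = 0

    def sample(self, n, valid_count):
        """Return n buffer indices; `valid_count` is the number of filled slots."""
        # Read the next n entries of the permutation, wrapping at its end.
        picks = [
            self._sample_idx[(self._sample_head + i) % self._buffer_size]
            for i in range(n)
        ]
        if valid_count < self._buffer_size:
            # Assumption: fold indices into the filled range with a modulo.
            picks = [p % valid_count for p in picks]

        self._sample_head += n
        if self._sample_head >= self._buffer_size:
            # Every index has been visited, so start a fresh pass.
            self._reset_sample_idx()
        return picks


sampler = PermutationSampler(buffer_size=8)
first = sampler.sample(4, valid_count=8)
second = sampler.sample(4, valid_count=8)
print(sorted(first + second))  # each index appears once per full pass
```

With a full buffer, two consecutive draws of four indices cover all eight slots exactly once. With a partially filled buffer, the remapping keeps indices valid but can produce duplicates.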
Usage
Use ReplayBuffer when implementing AMP or similar algorithms that require replaying past agent experiences through a discriminator or other evaluation network. It is instantiated by the AMP agent during initialization with a specified buffer size and GPU device, and data is stored at each training step.
Code Reference
Source Location
- Repository: IsaacGymEnvs
- File: isaacgymenvs/learning/replay_buffer.py
- Lines: 32-113
Signature
class ReplayBuffer:
    def __init__(self, buffer_size, device):
    def reset(self):
    def get_buffer_size(self):
    def get_total_count(self):
    def store(self, data_dict):
    def sample(self, n):
    def _reset_sample_idx(self):
    def _init_data_buf(self, data_dict):
Import
from isaacgymenvs.learning.replay_buffer import ReplayBuffer
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| buffer_size | int | Yes | Maximum number of entries the buffer can hold |
| device | torch.device | Yes | The GPU device on which buffer tensors are allocated |
| data_dict | dict[str, torch.Tensor] | Yes | Dictionary of named tensors to store (passed to store()); all tensors must have the same batch dimension |
| n | int | Yes | Number of samples to draw (passed to sample()) |
Outputs
| Name | Type | Description |
|---|---|---|
| samples | dict[str, torch.Tensor] | Dictionary of sampled tensors, matching the keys of the stored data, returned by sample() |
| buffer_size | int | The maximum capacity of the buffer, returned by get_buffer_size() |
| total_count | int | The total number of entries that have been stored (may exceed buffer_size due to overwrites), returned by get_total_count() |
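Because total_count keeps growing after the buffer wraps, the number of entries actually available for sampling is the smaller of the two values returned by the getters. The arithmetic can be sketched as below; `valid_entries` is a hypothetical helper, not a method of ReplayBuffer, and the batch size of 256 is only an example.

```python
def valid_entries(total_count, buffer_size):
    """Number of slots that currently hold data (hypothetical helper)."""
    return min(total_count, buffer_size)


buffer_size = 100_000
batch = 256  # example number of entries written per store() call

# After 10 stores the buffer is still filling up.
filling = valid_entries(10 * batch, buffer_size)

# After 500 stores, more entries have been written than the buffer can hold,
# so the oldest ones have been overwritten.
total = 500 * batch
overwritten = total - valid_entries(total, buffer_size)
print(filling, total, overwritten)
```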
Usage Examples
import torch
from isaacgymenvs.learning.replay_buffer import ReplayBuffer
# Create a replay buffer on GPU with capacity for 100,000 entries
device = torch.device('cuda:0')
buffer = ReplayBuffer(buffer_size=100000, device=device)
# Store AMP observations collected during rollout
amp_obs = torch.randn(256, 64, device=device) # batch of 256, obs dim 64
buffer.store({'amp_obs': amp_obs})
# Sample a mini-batch of 128 observations for discriminator training
samples = buffer.sample(128)
amp_obs_replay = samples['amp_obs'] # shape: (128, 64)
# Check buffer state
print(f"Buffer size: {buffer.get_buffer_size()}")
print(f"Total stored: {buffer.get_total_count()}")
# Reset the buffer (e.g., at the start of a new training run)
buffer.reset()
Related Pages
- Isaac_sim_IsaacGymEnvs_ModelAMPContinuous - Uses replay buffer samples during forward pass for discriminator evaluation
- Isaac_sim_IsaacGymEnvs_AMPBuilder - Builds the discriminator network that processes replayed observations