Implementation:Isaac sim IsaacGymEnvs ReplayBuffer

Knowledge Sources	IsaacGymEnvs
Domains	Reinforcement_Learning, Data_Management
Last Updated	2026-02-15 11:00 GMT

Overview

ReplayBuffer is a GPU-resident circular replay buffer designed for efficient storage and sampling of observation data during AMP (Adversarial Motion Priors) training.

Description

The ReplayBuffer class provides a fixed-size circular buffer that stores tensors directly on the GPU device, avoiding costly CPU-GPU data transfers during training. It is primarily used in AMP training to maintain a history of agent-generated AMP observations, which are later replayed through the discriminator alongside current observations and demonstration data.

The buffer uses a head pointer (_head) that advances as new data is stored and wraps around when it reaches the end of the buffer. The store(data_dict) method accepts a dictionary of tensors and writes them into the buffer; when a batch spans the end of the buffer, the part that does not fit is written to the beginning. Internal storage tensors are allocated lazily on the first store() call: each takes its per-entry shape from the incoming data and is placed on the device passed to the constructor.
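The wrap-around write can be sketched in a few lines. This is a minimal illustration rather than the class's actual code: it uses a plain Python list in place of a GPU tensor so it runs anywhere, and the function name ring_store is hypothetical. It also assumes a single batch never exceeds the buffer capacity.

```python
def ring_store(buf, head, batch):
    """Write `batch` into the fixed-size list `buf`, starting at `head`.

    Returns the new head position. Hypothetical helper for illustration;
    assumes len(batch) <= len(buf).
    """
    size = len(buf)
    n = len(batch)
    assert n <= size, "a single batch must fit in the buffer"

    # First chunk: as many entries as fit between head and the end.
    first = min(n, size - head)
    buf[head:head + first] = batch[:first]

    # Second chunk: whatever is left wraps to the start of the buffer.
    remainder = n - first
    if remainder > 0:
        buf[0:remainder] = batch[first:]

    return (head + n) % size


buf = [None] * 5
head = 0
head = ring_store(buf, head, ["a", "b", "c"])       # fills slots 0..2, head -> 3
head = ring_store(buf, head, ["d", "e", "f", "g"])  # d, e -> slots 3..4; f, g -> slots 0..1
# buf is now ["f", "g", "c", "d", "e"] and head is 2
```

The oldest entries ("a" and "b") are the ones overwritten, which is the behavior a circular buffer is meant to provide.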

Sampling is performed via sample(n), which walks a pre-shuffled index permutation (_sample_idx) so that, once the buffer is full, n samples are drawn without replacement within each full pass through the buffer. When all indices have been exhausted, the permutation is re-shuffled. If the buffer is not yet full (i.e., _total_count < _buffer_size), the sampled indices are wrapped into the filled range by taking them modulo the current head position, so the same entry can appear more than once in a batch. The reset() method sets the head pointer and total count back to zero and re-shuffles the sampling permutation.
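The index bookkeeping behind this kind of sampling can be sketched as follows. The class PermutationSampler and its method next_indices are hypothetical names used only for illustration, and the sketch uses the standard random module rather than GPU tensors. Restricting indices to the filled part of the buffer with a modulo is an assumption of this sketch, chosen as one simple way to keep them in the valid range.

```python
import random


class PermutationSampler:
    """Hypothetical sketch: draws slot indices from a shuffled permutation."""

    def __init__(self, buffer_size, seed=None):
        self._buffer_size = buffer_size
        self._rng = random.Random(seed)
        self._sample_idx = list(range(buffer_size))
        self._rng.shuffle(self._sample_idx)
        self._sample_head = 0

    def next_indices(self, n, valid_count):
        """Return n slot indices; valid_count is how many slots hold data."""
        assert valid_count > 0, "cannot sample from an empty buffer"

        # Walk the permutation from the current position, wrapping at the end.
        positions = [(self._sample_head + i) % self._buffer_size for i in range(n)]
        idx = [self._sample_idx[p] for p in positions]

        # Assumption: while the buffer is only partly filled, fold indices
        # into the filled range with a modulo (this can produce duplicates).
        if valid_count < self._buffer_size:
            idx = [i % valid_count for i in idx]

        # Re-shuffle once a full pass over the permutation is complete.
        self._sample_head += n
        if self._sample_head >= self._buffer_size:
            self._rng.shuffle(self._sample_idx)
            self._sample_head = 0
        return idx


sampler = PermutationSampler(buffer_size=8, seed=0)
first = sampler.next_indices(4, valid_count=8)
second = sampler.next_indices(4, valid_count=8)
# Together, `first` and `second` cover every slot 0..7 exactly once.
```

With a full buffer, consecutive calls within one pass never repeat a slot; with a partly filled buffer, the modulo step trades that guarantee for always returning valid entries.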

Usage

Use ReplayBuffer when implementing AMP or similar algorithms that require replaying past agent experiences through a discriminator or other evaluation network. It is instantiated by the AMP agent during initialization with a specified buffer size and GPU device, and data is stored at each training step.

Code Reference

Source Location

Repository: IsaacGymEnvs
File: isaacgymenvs/learning/replay_buffer.py
Lines: 32-113

Signature

class ReplayBuffer:
    def __init__(self, buffer_size, device): ...
    def reset(self): ...
    def get_buffer_size(self): ...
    def get_total_count(self): ...
    def store(self, data_dict): ...
    def sample(self, n): ...
    def _reset_sample_idx(self): ...
    def _init_data_buf(self, data_dict): ...

Import

from isaacgymenvs.learning.replay_buffer import ReplayBuffer

I/O Contract

Inputs

Name	Type	Required	Description
buffer_size	int	Yes	Maximum number of entries the buffer can hold
device	torch.device	Yes	The GPU device on which buffer tensors are allocated
data_dict	dict[str, torch.Tensor]	Yes	Dictionary of named tensors to store (passed to `store()`); all tensors must have the same batch dimension
n	int	Yes	Number of samples to draw (passed to `sample()`)

Outputs

Name	Type	Description
samples	dict[str, torch.Tensor]	Dictionary of sampled tensors, matching the keys of the stored data, returned by `sample()`
buffer_size	int	The maximum capacity of the buffer, returned by `get_buffer_size()`
total_count	int	The total number of entries that have been stored (may exceed buffer_size due to overwrites), returned by `get_total_count()`
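The difference between total_count and the number of entries actually held can be shown with plain integers. This is a small sketch of the counter arithmetic only; the variable names head, total_count, and filled are illustrative and the batch sizes are made up.

```python
buffer_size = 5
head = 0          # next slot to write
total_count = 0   # entries ever stored, including overwritten ones

# Store three batches of 3, 4, and 2 entries.
for batch_len in (3, 4, 2):
    head = (head + batch_len) % buffer_size
    total_count += batch_len

# The buffer can never hold more than buffer_size entries at once.
filled = min(total_count, buffer_size)
# head == 4, total_count == 9, filled == 5
```

Here nine entries were stored in total, but only the five most recent survive, so total_count exceeds buffer_size.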

Usage Examples

import torch
from isaacgymenvs.learning.replay_buffer import ReplayBuffer

# Create a replay buffer on GPU with capacity for 100,000 entries
device = torch.device('cuda:0')
buffer = ReplayBuffer(buffer_size=100000, device=device)

# Store AMP observations collected during rollout
amp_obs = torch.randn(256, 64, device=device)  # batch of 256, obs dim 64
buffer.store({'amp_obs': amp_obs})

# Sample a mini-batch of 128 observations for discriminator training
samples = buffer.sample(128)
amp_obs_replay = samples['amp_obs']  # shape: (128, 64)

# Check buffer state
print(f"Buffer size: {buffer.get_buffer_size()}")
print(f"Total stored: {buffer.get_total_count()}")

# Reset the buffer (e.g., at the start of a new training run)
buffer.reset()

Related Pages

Isaac_sim_IsaacGymEnvs_ModelAMPContinuous - Uses replay buffer samples during forward pass for discriminator evaluation
Isaac_sim_IsaacGymEnvs_AMPBuilder - Builds the discriminator network that processes replayed observations
