Implementation:Hpcaitech ColossalAI ExperienceBuffer Base
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement Learning, RLHF, Experience Replay |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Abstract base class defining the interface for experience buffers in the ColossalChat RLHF pipeline.
Description
This module defines the ExperienceBuffer abstract base class, the contract that every experience buffer implementation in ColossalChat must satisfy. It declares abstract methods for appending experiences, clearing the buffer, sampling a batch, reporting the buffer length, accessing items by index, and collating a batch of items into an Experience. The constructor takes a sample batch size and an optional limit on the number of stored samples; a limit of zero or less means unlimited storage.
Usage
Use this base class when implementing custom experience buffer strategies for PPO or other RLHF algorithms. Concrete implementations such as NaiveExperienceBuffer inherit from this class to provide specific storage and sampling behavior.
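As a rough illustration of what a custom implementation involves, the sketch below subclasses a stand-in copy of the interface so that it runs without coati or PyTorch installed. The `Experience` dataclass with a single `rewards` field and the `ListExperienceBuffer` class are hypothetical names invented for this example; the real Experience holds tensors, and the real concrete buffers may store and sample data differently.

```python
import random
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Experience:
    """Hypothetical stand-in for coati's Experience (the real class holds tensors)."""
    rewards: List[float]


class ExperienceBuffer(ABC):
    """Stand-in that mirrors the signature documented on this page."""

    def __init__(self, sample_batch_size: int, limit: int = 0) -> None:
        self.sample_batch_size = sample_batch_size
        self.limit = limit  # <= 0 means unlimited

    @abstractmethod
    def append(self, experience: Experience) -> None: ...

    @abstractmethod
    def clear(self) -> None: ...

    @abstractmethod
    def sample(self) -> Experience: ...

    @abstractmethod
    def __len__(self) -> int: ...

    @abstractmethod
    def __getitem__(self, idx: int) -> Any: ...

    @abstractmethod
    def collate_fn(self, batch: Any) -> Experience: ...


class ListExperienceBuffer(ExperienceBuffer):
    """Hypothetical buffer that keeps one reward per item in a Python list."""

    def __init__(self, sample_batch_size: int, limit: int = 0) -> None:
        super().__init__(sample_batch_size, limit)
        self.items: List[float] = []

    def append(self, experience: Experience) -> None:
        # Split the incoming batch into per-sample items.
        # Enforcing self.limit is left out of this sketch for brevity.
        self.items.extend(experience.rewards)

    def clear(self) -> None:
        self.items.clear()

    def sample(self) -> Experience:
        # Uniform sampling without replacement, capped at the buffer size.
        k = min(self.sample_batch_size, len(self.items))
        return self.collate_fn(random.sample(self.items, k))

    def __len__(self) -> int:
        return len(self.items)

    def __getitem__(self, idx: int) -> Any:
        return self.items[idx]

    def collate_fn(self, batch: Any) -> Experience:
        # Merge per-sample items back into one batched Experience.
        return Experience(rewards=list(batch))


buffer = ListExperienceBuffer(sample_batch_size=2)
buffer.append(Experience(rewards=[0.1, 0.5, 0.9]))
sampled = buffer.sample()
```

Because every method is marked abstract, Python refuses to instantiate a subclass that leaves any of them out, so a missing method surfaces at construction time instead of in the middle of training.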
Code Reference
Source Location
- Repository: Hpcaitech_ColossalAI
- File: applications/ColossalChat/coati/experience_buffer/base.py
- Lines: 1-43
Signature
class ExperienceBuffer(ABC):
    def __init__(self, sample_batch_size: int, limit: int = 0) -> None:

    @abstractmethod
    def append(self, experience: Experience) -> None:

    @abstractmethod
    def clear(self) -> None:

    @abstractmethod
    def sample(self) -> Experience:

    @abstractmethod
    def __len__(self) -> int:

    @abstractmethod
    def __getitem__(self, idx: int) -> Any:

    @abstractmethod
    def collate_fn(self, batch: Any) -> Experience:
Import
from coati.experience_buffer.base import ExperienceBuffer
I/O Contract
Inputs (__init__)
| Name | Type | Required | Description |
|---|---|---|---|
| sample_batch_size | int | Yes | Batch size when sampling from the buffer |
| limit | int | No | Maximum number of stored experience samples; <= 0 means unlimited, defaults to 0 |
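The limit semantics in the table can be sketched as a small helper. The function name `apply_limit` and the choice to drop the oldest items first are assumptions made for this illustration; the base class only stores the limit and leaves the eviction policy to each concrete buffer.

```python
from typing import List, TypeVar

T = TypeVar("T")


def apply_limit(items: List[T], limit: int) -> List[T]:
    """Return items trimmed to at most `limit` entries, dropping the oldest first.

    Hypothetical helper: a limit of zero or less means unlimited storage.
    """
    if limit <= 0:
        return items
    overflow = len(items) - limit
    return items[overflow:] if overflow > 0 else items


stored = [1, 2, 3, 4, 5]
trimmed = apply_limit(stored, 3)      # keeps the three newest items
unlimited = apply_limit(stored, 0)    # limit of 0 keeps everything
negative = apply_limit(stored, -1)    # negative limit also keeps everything
```

A concrete buffer could call such a helper at the end of `append` so that the stored item count never exceeds the configured limit.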
Outputs (sample)
| Name | Type | Description |
|---|---|---|
| return | Experience | A batch of sampled experience data |
Usage Examples
from coati.experience_buffer.base import ExperienceBuffer
from coati.experience_maker.base import Experience
# ExperienceBuffer is abstract; use a concrete implementation
from coati.experience_buffer.naive import NaiveExperienceBuffer
buffer = NaiveExperienceBuffer(sample_batch_size=8, limit=1000)
# `experience` is assumed to be an Experience instance created earlier in the pipeline
buffer.append(experience)
sampled = buffer.sample()