
Implementation:Hpcaitech ColossalAI ExperienceBuffer Base

From Leeroopedia


Knowledge Sources
Domains: Reinforcement Learning, RLHF, Experience Replay
Last Updated: 2026-02-09 00:00 GMT

Overview

Abstract base class defining the interface for experience buffers in the ColossalChat RLHF pipeline.

Description

This module defines the ExperienceBuffer abstract base class, which establishes the contract for all experience buffer implementations in ColossalChat. It declares abstract methods for appending experiences, clearing the buffer, sampling a batch, reporting the buffer length, accessing items by index, and collating a batch of items into an Experience. The constructor takes a sample batch size and an optional limit on the number of stored samples; a limit of zero or less means unlimited storage.

Usage

Use this base class when implementing custom experience buffer strategies for PPO or other RLHF algorithms. Concrete implementations such as NaiveExperienceBuffer inherit from this class to provide specific storage and sampling behavior.
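The subclassing pattern described above can be sketched roughly as follows. So that the snippet runs without ColossalChat installed, it defines a stand-in base class that mirrors the signature documented on this page; in real code you would inherit from coati's ExperienceBuffer instead. The names ExperienceBufferSketch and ListExperienceBuffer, the use of plain dicts as experiences, the attribute names, and the drop-oldest eviction policy are all assumptions made for illustration, not coati behavior.

```python
import random
from abc import ABC, abstractmethod
from typing import Any, List


class ExperienceBufferSketch(ABC):
    """Stand-in mirroring the documented interface (not the coati class)."""

    def __init__(self, sample_batch_size: int, limit: int = 0) -> None:
        # Attribute names are assumed for this sketch.
        self.sample_batch_size = sample_batch_size
        self.limit = limit  # <= 0 means unlimited

    @abstractmethod
    def append(self, experience: Any) -> None: ...

    @abstractmethod
    def clear(self) -> None: ...

    @abstractmethod
    def sample(self) -> Any: ...

    @abstractmethod
    def __len__(self) -> int: ...

    @abstractmethod
    def __getitem__(self, idx: int) -> Any: ...

    @abstractmethod
    def collate_fn(self, batch: Any) -> Any: ...


class ListExperienceBuffer(ExperienceBufferSketch):
    """Hypothetical buffer that stores experiences in a Python list."""

    def __init__(self, sample_batch_size: int, limit: int = 0) -> None:
        super().__init__(sample_batch_size, limit)
        self.items: List[Any] = []

    def append(self, experience: Any) -> None:
        self.items.append(experience)
        # Assumed policy: drop the oldest entries once the limit is exceeded.
        if self.limit > 0 and len(self.items) > self.limit:
            self.items = self.items[-self.limit:]

    def clear(self) -> None:
        self.items.clear()

    def sample(self) -> Any:
        # Uniform sampling without replacement, then collation into one batch.
        batch = random.sample(self.items, self.sample_batch_size)
        return self.collate_fn(batch)

    def __len__(self) -> int:
        return len(self.items)

    def __getitem__(self, idx: int) -> Any:
        return self.items[idx]

    def collate_fn(self, batch: Any) -> Any:
        # Placeholder collation: return the sampled items as a list.
        return list(batch)


buffer = ListExperienceBuffer(sample_batch_size=2, limit=3)
for step in range(5):
    buffer.append({"step": step})
sampled = buffer.sample()
```

With a limit of 3 and five appends, this sketch keeps only the three most recent items; a different subclass could just as reasonably evict at random or refuse new items once full.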

Code Reference

Source Location

Signature

class ExperienceBuffer(ABC):
    def __init__(self, sample_batch_size: int, limit: int = 0) -> None: ...

    @abstractmethod
    def append(self, experience: Experience) -> None: ...

    @abstractmethod
    def clear(self) -> None: ...

    @abstractmethod
    def sample(self) -> Experience: ...

    @abstractmethod
    def __len__(self) -> int: ...

    @abstractmethod
    def __getitem__(self, idx: int) -> Any: ...

    @abstractmethod
    def collate_fn(self, batch: Any) -> Experience: ...
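
Because the methods in the signature are marked with @abstractmethod on an ABC subclass, Python raises TypeError when the class, or a subclass that leaves any abstract method unimplemented, is instantiated. The sketch below shows this standard-library behavior on a reduced, hypothetical two-method interface; BufferInterface, PartialBuffer, CompleteBuffer, and can_instantiate are illustrative names, not part of coati.

```python
from abc import ABC, abstractmethod


class BufferInterface(ABC):
    """Hypothetical two-method stand-in for an abstract buffer interface."""

    @abstractmethod
    def append(self, experience: object) -> None: ...

    @abstractmethod
    def clear(self) -> None: ...


class PartialBuffer(BufferInterface):
    """Overrides only one abstract method, so it is still abstract."""

    def append(self, experience: object) -> None:
        pass


class CompleteBuffer(BufferInterface):
    """Overrides every abstract method, so it can be instantiated."""

    def append(self, experience: object) -> None:
        pass

    def clear(self) -> None:
        pass


def can_instantiate(cls: type) -> bool:
    try:
        cls()
        return True
    except TypeError:
        # Raised by Python when abstract methods remain unimplemented.
        return False


base_ok = can_instantiate(BufferInterface)
partial_ok = can_instantiate(PartialBuffer)
complete_ok = can_instantiate(CompleteBuffer)
```

In practice this means a custom buffer has to override all six abstract methods listed in the signature before it can be constructed.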

Import

from coati.experience_buffer.base import ExperienceBuffer

I/O Contract

Inputs (__init__)

Name | Type | Required | Description
sample_batch_size | int | Yes | Batch size when sampling from the buffer
limit | int | No | Maximum number of stored experience samples; <= 0 means unlimited; defaults to 0

Outputs (sample)

Name | Type | Description
return | Experience | A batch of sampled experience data
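
One way to picture how sample and collate_fn might fit together is sketched below: a fixed number of stored items is drawn, then each field is gathered across the drawn items to form a single batch object. This page only states that sample returns a batch of experience data, so everything else here is an assumption: ToyItem and ToyBatch are hypothetical stand-ins for per-sample and batched records, their fields are invented, and sampling without replacement is one possible choice among several.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class ToyItem:
    """Hypothetical single-sample record (not coati's data structure)."""
    tokens: List[int]
    reward: float


@dataclass
class ToyBatch:
    """Hypothetical batched record: one list entry per sampled item."""
    tokens: List[List[int]]
    reward: List[float]


def collate(items: List[ToyItem]) -> ToyBatch:
    # Gather each field across items so the batch has one row per sample.
    return ToyBatch(
        tokens=[item.tokens for item in items],
        reward=[item.reward for item in items],
    )


def sample(items: List[ToyItem], sample_batch_size: int, rng: random.Random) -> ToyBatch:
    # Draw sample_batch_size distinct items, then collate them into one batch.
    chosen = rng.sample(items, sample_batch_size)
    return collate(chosen)


stored = [ToyItem(tokens=[i, i + 1], reward=float(i)) for i in range(10)]
batch = sample(stored, sample_batch_size=4, rng=random.Random(0))
```

Each field of the resulting batch has sample_batch_size entries, and row k of every field comes from the same drawn item. A tensor-based implementation would typically stack rows instead of building lists.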

Usage Examples

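from coati.experience_buffer.base import ExperienceBuffer
from coati.experience_maker.base import Experience

# ExperienceBuffer is abstract and cannot be instantiated directly; use a concrete implementation
from coati.experience_buffer.naive import NaiveExperienceBuffer

buffer = NaiveExperienceBuffer(sample_batch_size=8, limit=1000)

# `experience` is assumed to be an Experience created earlier; it is not defined in this snippet
buffer.append(experience)

# Returns a batch of sampled experience data as an Experience
sampled = buffer.sample()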
