Implementation:Hpcaitech ColossalAI AccumulativeMeanMeter

Knowledge Sources	Hpcaitech_ColossalAI
Domains	Training Utilities, Metrics, RLHF
Last Updated	2026-02-09 00:00 GMT

Overview

Accumulative mean meter utility for tracking running averages of named metrics during training.

Description

This module provides two classes for computing running averages. AccumulativeMeanVariable is a single-variable tracker that maintains a cumulative sum and count, computing the mean on demand. AccumulativeMeanMeter wraps a dictionary of AccumulativeMeanVariable instances, allowing you to track multiple named metrics simultaneously. Both classes support adding values with custom count increments and resetting all tracked statistics.
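The sum-and-count design described above can be sketched as follows. This is a minimal illustration, not the library source: the class names `MeanVariableSketch` and `MeanMeterSketch`, the private attribute names, and the weighting of `value` by `count_update` are all assumptions made for the example.

```python
class MeanVariableSketch:
    """Hypothetical stand-in for AccumulativeMeanVariable (not the library source)."""

    def __init__(self):
        self._sum = 0.0
        self._count = 0

    def add(self, value, count_update=1):
        # Assumption: `value` is a mean over `count_update` samples,
        # so it is weighted by that count before being accumulated.
        self._sum += value * count_update
        self._count += count_update

    def get(self):
        # Return 0 instead of dividing by zero when nothing has been added.
        return self._sum / self._count if self._count > 0 else 0

    def reset(self):
        self._sum = 0.0
        self._count = 0


class MeanMeterSketch:
    """Hypothetical stand-in for AccumulativeMeanMeter: one tracker per metric name."""

    def __init__(self):
        self._variables = {}

    def add(self, name, value, count_update=1):
        # Create the tracker lazily the first time a name is seen.
        if name not in self._variables:
            self._variables[name] = MeanVariableSketch()
        self._variables[name].add(value, count_update)

    def get(self, name):
        return self._variables[name].get()

    def reset(self):
        # Clear the statistics of every tracked metric.
        for variable in self._variables.values():
            variable.reset()
```

Because only a sum and a count are stored per metric, memory use stays constant no matter how many values are added.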

Usage

Use this utility in ColossalChat training loops to track scalar metrics such as loss, reward, and KL divergence across batches and epochs. Call add once per step, read the running average with get, and call reset when a new averaging window (for example, an epoch) begins.

Code Reference

Source Location

Repository: Hpcaitech_ColossalAI
File: applications/ColossalChat/coati/utils/accumulative_meter.py
Lines: 1-69

Signature

class AccumulativeMeanVariable:
    def __init__(self):
    def add(self, value, count_update=1):
    def get(self):
    def reset(self):

class AccumulativeMeanMeter:
    def __init__(self):
    def add(self, name, value, count_update=1):
    def get(self, name):
    def reset(self):

Import

from coati.utils.accumulative_meter import AccumulativeMeanMeter, AccumulativeMeanVariable

I/O Contract

Inputs (AccumulativeMeanMeter.add)

Name	Type	Required	Description
name	str	Yes	Name of the metric to track
value	float	Yes	Value to add to the accumulator
count_update	int	No	Number of samples this value represents, defaults to 1

Outputs (AccumulativeMeanMeter.get)

Name	Type	Description
return	float	The current accumulative mean of the named metric; 0 if count is 0
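The count_update parameter matters most when batches differ in size. The sketch below works through the arithmetic, assuming each value is a batch mean that gets weighted by the number of samples it represents; that weighting is an assumption about the implementation rather than something this page confirms, and `weighted_mean` is a hypothetical helper written only for this example.

```python
def weighted_mean(batches):
    """Combine (batch_mean, batch_size) pairs into a single per-sample mean."""
    total = 0.0
    count = 0
    for batch_mean, batch_size in batches:
        total += batch_mean * batch_size  # weight each mean by its sample count
        count += batch_size
    # Mirror the documented contract: 0 when the count is 0.
    return total / count if count > 0 else 0


# Three batches of unequal size, given as (mean loss, number of samples).
batches = [(0.5, 8), (1.0, 8), (2.0, 4)]

per_sample = weighted_mean(batches)                          # (4 + 8 + 8) / 20
per_batch = sum(mean for mean, _ in batches) / len(batches)  # 3.5 / 3

print(per_sample)  # 1.0
print(per_batch > per_sample)  # True: the small final batch is over-weighted
```

Under that assumption, passing the batch size as count_update would yield the per-sample figure, while leaving it at the default of 1 would yield the per-batch figure.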

Usage Examples

from coati.utils.accumulative_meter import AccumulativeMeanMeter

meter = AccumulativeMeanMeter()

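# Track loss and reward across batches.
# dataloader and train_step are placeholders for your own training code;
# train_step is assumed to return scalar tensors for loss and reward.
for batch in dataloader:
    loss, reward = train_step(batch)
    meter.add("loss", loss.item())
    meter.add("reward", reward.item())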

# Retrieve running averages
avg_loss = meter.get("loss")
avg_reward = meter.get("reward")
print(f"Average loss: {avg_loss}, Average reward: {avg_reward}")

# Reset at epoch end
meter.reset()

Related Pages

Environment:Hpcaitech_ColossalAI_CUDA_GPU_Environment
