Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Hpcaitech ColossalAI Chat Conversation

From Leeroopedia


Knowledge Sources
Domains Natural Language Processing, Chat Template, RLHF
Last Updated 2026-02-09 00:00 GMT

Overview

Conversation template management for ColossalChat that handles chat formatting and message history.

Description

This module provides the Conversation dataclass and the setup_conversation_template factory function used throughout the ColossalChat RLHF training pipeline. The Conversation class wraps a tokenizer and its chat template configuration, managing system messages, message history, and prompt generation via the HuggingFace apply_chat_template method. The setup_conversation_template function handles loading chat templates from configuration dictionaries, tokenizer defaults, or external model paths, with optional saving of the resolved configuration to disk.

Usage

Use this module when setting up conversation templates for ColossalChat training or inference pipelines. It is essential for ensuring consistent prompt formatting across SFT, reward model training, and PPO/RLHF stages.

Code Reference

Source Location

Signature

@dataclasses.dataclass
class Conversation:
    tokenizer: PreTrainedTokenizer
    system_message: str
    chat_template: str
    stop_ids: List[int]
    end_of_assistant: str
    roles = ["user", "assistant"]

    @classmethod
    def from_config(cls, tokenizer: PreTrainedTokenizer, config: Dict):
    def clear(self):
    def get_prompt(self, length: int = None, add_generation_prompt=False) -> Any:
    def append_message(self, role: str, message: str):
    def copy(self):

def setup_conversation_template(
    tokenizer: PreTrainedTokenizer, chat_template_config: Dict = None, save_path: str = None
) -> Conversation:

Import

from coati.dataset.conversation import Conversation, setup_conversation_template

I/O Contract

Inputs (setup_conversation_template)

Name Type Required Description
tokenizer PreTrainedTokenizer Yes The tokenizer to use for chat template application
chat_template_config Dict No Configuration dict with keys: system_message, chat_template, stop_ids, end_of_assistant
save_path str No Optional path to save the resolved conversation template config

Outputs

Name Type Description
return Conversation A configured Conversation instance ready for prompt generation

Usage Examples

from coati.dataset.conversation import Conversation, setup_conversation_template
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# Setup from config
config = {
    "system_message": "You are a helpful assistant.",
    "chat_template": tokenizer.chat_template,
    "stop_ids": [2],
    "end_of_assistant": "</s>",
}
conv = setup_conversation_template(tokenizer, chat_template_config=config)

# Build a conversation
conv.append_message("user", "What is RLHF?")
conv.append_message("assistant", "RLHF stands for Reinforcement Learning from Human Feedback.")
prompt = conv.get_prompt(add_generation_prompt=True)

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment