Implementation:NVIDIA NeMo Aligner Generate SL CAI Dataset

From Leeroopedia


Knowledge Sources
Domains: Constitutional AI, Supervised Learning, Dataset Generation
Last Updated: 2026-02-08 00:00 GMT

Overview

A script that generates supervised learning (SL) Constitutional AI (CAI) datasets through a critique-revision pipeline. The revised responses are blended with helpfulness data to form a dataset for supervised fine-tuning (SFT).

Description

generate_sl_cai_dataset.py implements the Supervised Learning variant of Constitutional AI dataset generation. The pipeline performs the following steps for each batch of red-teaming prompts:

  1. Initial response generation: Given a red-teaming prompt and few-shot examples, the model generates an initial (potentially harmful) response.
  2. Critique generation: A randomly sampled constitutional principle is used to generate a critique of the initial response, identifying specific issues.
  3. Revision generation: Using the critique as context, the model generates a revised response that addresses the identified issues.
  4. Chat prompt formatting: The (prompt, revision) pairs are converted to the NeMo chat prompt template format, with system, conversations, mask, and dataset fields.
  5. Dataset blending: The CAI-revised samples are blended with a helpfulness dataset (e.g., Anthropic helpful-only) and randomly permuted.
  6. Long dialog removal (optional): Dialogs exceeding a maximum sequence length can be filtered out using tokenization.
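
Steps 1 to 3 can be sketched as follows. This is a minimal illustration, not the script's code: `fake_generate` is a hypothetical stand-in for the call to the inference service, and the prompt wording and the returned dictionary keys are assumptions made for the example.

```python
def fake_generate(prompt: str) -> str:
    """Hypothetical stand-in for a request to the inference service."""
    return f"[output for a {len(prompt)}-character prompt]"


def critique_revision_step(
    prompt: str,
    few_shot_text: str,
    critique_request: str,
    revision_request: str,
    generate=fake_generate,
) -> dict:
    """Run initial response -> critique -> revision for a single prompt."""
    # Step 1: initial response, conditioned on the few-shot examples.
    context = f"{few_shot_text}\n\nUser: {prompt}\n\nAssistant:"
    initial_response = generate(context)

    # Step 2: critique of the initial response under one principle.
    context += f" {initial_response}\n\nCritique request: {critique_request}\n\nCritique:"
    critique = generate(context)

    # Step 3: revision, with the critique kept in the context.
    context += f" {critique}\n\nRevision request: {revision_request}\n\nRevision:"
    revision = generate(context)

    return {
        "prompt": prompt,
        "initial_response": initial_response,
        "critique": critique,
        "revision": revision,
    }
```

Each stage extends the same context, so the revision request sees the prompt, the initial response, and the critique together.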

The critique-revision instructions are loaded from a JSON file containing pairs of critique prompts and revision requests for each constitutional principle.
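
The sketch below shows one way such a file could be split into the `critique_list` and `revision_list` arguments that `generate_cai_batch_sample` accepts, and how a matching pair might be sampled. The JSON layout (a list of objects with `critique` and `revision` keys) and both helper names are assumptions for illustration; the real file may be organized differently.

```python
import json
import random

# Assumed layout (hypothetical): one object per constitutional principle.
EXAMPLE_INSTRUCTIONS = """[
  {"critique": "Identify ways the response is harmful.",
   "revision": "Rewrite the response to remove harmful content."},
  {"critique": "Point out any unethical advice in the response.",
   "revision": "Rewrite the response so that it gives ethical advice."}
]"""


def split_instructions(raw_json: str):
    """Return parallel lists of critique prompts and revision requests."""
    pairs = json.loads(raw_json)
    critique_list = [pair["critique"] for pair in pairs]
    revision_list = [pair["revision"] for pair in pairs]
    return critique_list, revision_list


def sample_pair(critique_list, revision_list, rng: random.Random):
    """Pick one index so the critique and revision belong to the same principle."""
    idx = rng.randrange(len(critique_list))
    return critique_list[idx], revision_list[idx]
```

Sampling a single index, rather than sampling each list independently, keeps the revision request aligned with the critique it follows.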

Usage

Use this script when:

  • You need to generate SFT training data using the Constitutional AI critique-revision approach
  • You want to produce harmlessness training data to blend with helpfulness data
  • You have a local NeMo inference service for generating responses, critiques, and revisions

Code Reference

Source Location

Signature

generate_cai_dataset:

def generate_cai_dataset(
    red_teaming_dataset_path: str,
    few_shot_samples_filepath: str,
    critique_revision_instructions_filepath: str,
    num_examples: int,
    batch_size: int,
    save_to_file_interval: int,
    save_file_path: str,
    inference_config: dict,
    prompt_template_config: dict,
    apply_chat_template: bool,
):

generate_cai_batch_sample:

def generate_cai_batch_sample(
    prompt_list: list,
    few_shot_dataset_path: str,
    critique_list,
    revision_list,
    inference_config: dict,
    prompt_template_config: dict,
    apply_chat_template: bool,
):

blend_cai_sft_dataset:

def blend_cai_sft_dataset(
    helpfulness_dataset_path,
    cai_samples_filepath,
    blended_sl_cai_filename,
    summary_filename,
):

main:

def main():

Import

from generate_sl_cai_dataset import generate_cai_dataset, generate_cai_batch_sample, blend_cai_sft_dataset

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| --red-teaming-prompts-dataset-path | str | Yes | Path to Anthropic red-teaming prompt dataset (JSONL) |
| --few-shot-prompts-dataset-path | str | Yes | Path to JSON file with few-shot prompt examples |
| --critique-revision-instructions-path | str | Yes | Path to JSON file with constitution critique/revision instruction pairs |
| --helpfulness-dataset-path | str | Yes | Path to helpfulness dataset for blending (e.g., Anthropic helpful-only) |
| --batch_size | int | No | Inference batch size (default: 32) |
| --num-examples | int | No | Number of samples to generate; -1 for all (default: -1) |
| --output-filepath | str | No | Output file path (default: "cai_revisions_aligner_chat_template.jsonl") |
| --seed | int | No | Random seed (default: 1234) |
| --save-to-file-interval | int | No | Save progress every N batches (default: 1) |
| --max-seq-length | int | No | Maximum sequence length for filtering (optional) |
| --tokenizer-model | str | No | Path to tokenizer model (required if max-seq-length is set) |
| --tokenizer-library | str | No | Tokenizer library name (required if max-seq-length is set) |
| --apply_chat_template | str | No | Whether to apply chat template during inference (default: "False") |
| --host | str | No | Inference service hostname (default: "localhost") |
| --port | int | No | Inference service port (default: 5656) |
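
The optional length filter driven by --max-seq-length can be pictured with the sketch below. It is an illustration under stated assumptions: `whitespace_token_count` is a hypothetical stand-in for the tokenizer loaded from --tokenizer-model, and the inner record keys (`conversations` entries with a `value` field) are assumed rather than taken from the script.

```python
def whitespace_token_count(text: str) -> int:
    """Hypothetical tokenizer stand-in: counts whitespace-separated tokens."""
    return len(text.split())


def remove_long_dialogs(records, max_seq_length, count_tokens=whitespace_token_count):
    """Keep only records whose full dialog fits within max_seq_length tokens."""
    kept = []
    for record in records:
        turns = " ".join(turn["value"] for turn in record["conversations"])
        text = f"{record.get('system', '')} {turns}"
        if count_tokens(text) <= max_seq_length:
            kept.append(record)
    return kept
```

A real tokenizer usually yields more tokens than a whitespace split, so the counts here only illustrate the filtering logic.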

Outputs

| Name | Type | Description |
|---|---|---|
| output-filepath | JSONL file | Blended SFT dataset with CAI revisions and helpfulness data in chat prompt format |
| cai_output/cai_critique_revision_samples.json | JSON file | Raw critique/revision samples with full pipeline metadata |
| cai_output/cai_samples.jsonl | JSONL file | CAI samples in chat prompt format before blending |
| *.summary.json | JSON file | Summary of the blending operation (input files, output file, template) |
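
As a rough picture of what the chat-format records and the blending step involve, consider the sketch below. The top-level fields (system, conversations, mask, dataset) come from the description above; the inner keys `from` and `value`, the field values, and both function names are assumptions for illustration and may not match the script.

```python
import random


def to_chat_record(prompt: str, revision: str, dataset_name: str = "cai") -> dict:
    """Build one chat-format record from a (prompt, revision) pair."""
    return {
        "system": "",
        "conversations": [
            {"from": "User", "value": prompt},        # assumed inner keys
            {"from": "Assistant", "value": revision},
        ],
        "mask": "User",          # assumed value: the role excluded from the loss
        "dataset": dataset_name,  # assumed value: a tag naming the record's origin
    }


def blend(cai_records, helpfulness_records, seed: int = 1234) -> list:
    """Concatenate both datasets and shuffle them with a fixed seed."""
    blended = list(cai_records) + list(helpfulness_records)
    random.Random(seed).shuffle(blended)
    return blended
```

Using a dedicated `random.Random(seed)` instance keeps the permutation reproducible without touching the global random state.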

Usage Examples

# Command-line usage with extra_id chat template:
python generate_sl_cai_dataset.py \
    --red-teaming-prompts-dataset-path /data/red_team_attempts.jsonl \
    --few-shot-prompts-dataset-path /data/few_shot_samples.json \
    --critique-revision-instructions-path /data/constitution_instructions.json \
    --helpfulness-dataset-path /data/anthropic_helpful_only.jsonl \
    --batch_size 32 \
    --num-examples -1 \
    --output-filepath /output/sl_cai/cai_revisions.jsonl \
    --apply_chat_template True \
    --user_format "<extra_id_1>User\n{MESSAGE}\n<extra_id_1>Assistant\n" \
    --assistant_format "{MESSAGE}\n" \
    --system_format "<extra_id_0>System\n{MESSAGE}\n" \
    --system_default_message "" \
    --eos_token "<extra_id_1>" \
    --response_extract_pattern "<extra_id_1>Assistant\n"

# Command-line usage with Mistral-Instruct template:
python generate_sl_cai_dataset.py \
    --red-teaming-prompts-dataset-path /data/red_team_attempts.jsonl \
    --few-shot-prompts-dataset-path /data/few_shot_samples.json \
    --critique-revision-instructions-path /data/constitution_instructions.json \
    --helpfulness-dataset-path /data/anthropic_helpful_only.jsonl \
    --apply_chat_template False \
    --response_extract_pattern "[/INST]"
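
One plausible reading of --response_extract_pattern is that it marks where the model's reply starts in the raw generated text. The sketch below illustrates that idea only; `extract_response` is a hypothetical helper, and the script's actual extraction logic may differ.

```python
def extract_response(generated_text: str, pattern: str) -> str:
    """Return the text after the last occurrence of pattern, if it occurs."""
    _, found, tail = generated_text.rpartition(pattern)
    # rpartition returns an empty separator when the pattern is absent,
    # in which case the whole text is returned.
    return tail.strip() if found else generated_text.strip()
```

For the two templates above, the pattern would be the text that immediately precedes the assistant turn: `<extra_id_1>Assistant\n` or `[/INST]`.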
