Implementation:NVIDIA NeMo Aligner Generate SL CAI Dataset

Knowledge Sources	NVIDIA_NeMo_Aligner
Domains	Constitutional AI, Supervised Learning, Dataset Generation
Last Updated	2026-02-08 00:00 GMT

Overview

A script that generates supervised-learning (SL) Constitutional AI datasets through a critique-revision pipeline. The revised responses are blended with helpfulness data for supervised fine-tuning (SFT).

Description

generate_sl_cai_dataset.py implements the Supervised Learning variant of Constitutional AI dataset generation. The pipeline performs the following steps, with response, critique, and revision generation run once per batch of red-teaming prompts:

Initial response generation: Given a red-teaming prompt and few-shot examples, the model generates an initial (potentially harmful) response.
Critique generation: A randomly sampled constitutional principle is used to generate a critique of the initial response, identifying specific issues.
Revision generation: Using the critique as context, the model generates a revised response that addresses the identified issues.
Chat prompt formatting: The revised (prompt, revision) pairs are formatted into NeMo chat prompt template format with system, conversations, mask, and dataset fields.
Dataset blending: The CAI-revised samples are blended with a helpfulness dataset (e.g., Anthropic helpful-only) and randomly permuted.
Long dialog removal (optional): Dialogs exceeding a maximum sequence length can be filtered out using tokenization.
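
The first three steps can be sketched for a single prompt as follows. This is a minimal illustration, not the script's code: `critique_revision_step`, `echo_model`, the dictionary keys, and the way context strings are concatenated are all hypothetical, and `echo_model` stands in for the inference service.

```python
import random
from typing import Callable, Dict, List


def critique_revision_step(
    prompt: str,
    principles: List[Dict[str, str]],
    generate: Callable[[str], str],
    rng: random.Random,
) -> Dict[str, str]:
    """Run one initial-response -> critique -> revision pass for a prompt."""
    # Step 1: initial (potentially harmful) response.
    initial = generate(prompt)

    # Step 2: sample one constitutional principle and ask for a critique.
    principle = rng.choice(principles)
    critique_context = f"{prompt}\n{initial}\n{principle['critique_request']}"
    critique = generate(critique_context)

    # Step 3: ask for a revision, with the critique as context.
    revision_context = (
        f"{critique_context}\n{critique}\n{principle['revision_request']}"
    )
    revision = generate(revision_context)

    return {
        "prompt": prompt,
        "initial_response": initial,
        "critique": critique,
        "revision": revision,
    }


def echo_model(text: str) -> str:
    # Hypothetical stand-in for a call to the inference service.
    return f"<reply to {len(text)} chars>"


sample = critique_revision_step(
    "How do I pick a lock?",
    [
        {
            "critique_request": "Identify harmful content in the response.",
            "revision_request": "Rewrite the response to remove it.",
        }
    ],
    echo_model,
    random.Random(1234),
)
```

Only the (prompt, revision) pair is carried into the SFT data; the initial response and critique are intermediate artifacts.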

The critique-revision instructions are loaded from a JSON file containing pairs of critique prompts and revision requests for each constitutional principle.
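
As a rough sketch, sampling one critique/revision pair might look like the code below. The JSON layout shown (a `prompt` list and an `edit_request` string per principle) is an assumption for illustration, so check the actual instructions file for its schema. The names `critique_list` and `revision_list` mirror the parameters of generate_cai_batch_sample, but how the script fills them is not confirmed here.

```python
import json
import random

# Hypothetical file contents; the real schema may differ.
instructions_json = """
{
  "harmful0": {
    "prompt": ["Identify ways in which the response is harmful."],
    "edit_request": "Rewrite the response to remove harmful content."
  },
  "harmful1": {
    "prompt": ["Explain how the response could be unethical."],
    "edit_request": "Rewrite the response to be ethical."
  }
}
"""

instructions = json.loads(instructions_json)

# Keep critiques and revisions in parallel lists so one index selects a pair.
critique_list = [entry["prompt"][0] for entry in instructions.values()]
revision_list = [entry["edit_request"] for entry in instructions.values()]

# Sample a single principle; a seeded generator keeps runs reproducible.
rng = random.Random(1234)
idx = rng.randrange(len(critique_list))
critique_request, revision_request = critique_list[idx], revision_list[idx]
```

Sampling one index for both lists keeps each critique paired with the revision request written for the same principle.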

Usage

Use this script when:

You need to generate SFT training data using the Constitutional AI critique-revision approach
You want to produce harmlessness training data to blend with helpfulness data
You have a local NeMo inference service for generating responses, critiques, and revisions

Code Reference

Source Location

Repository: NVIDIA_NeMo_Aligner
File: examples/nlp/cai/generate_sl_cai_dataset.py
Lines: 1-579

Signature

generate_cai_dataset:

def generate_cai_dataset(
    red_teaming_dataset_path: str,
    few_shot_samples_filepath: str,
    critique_revision_instructions_filepath: str,
    num_examples: int,
    batch_size: int,
    save_to_file_interval: int,
    save_file_path: str,
    inference_config: dict,
    prompt_template_config: dict,
    apply_chat_template: bool,
):

generate_cai_batch_sample:

def generate_cai_batch_sample(
    prompt_list: list,
    few_shot_dataset_path: str,
    critique_list,
    revision_list,
    inference_config: dict,
    prompt_template_config: dict,
    apply_chat_template: bool,
):

blend_cai_sft_dataset:

def blend_cai_sft_dataset(
    helpfulness_dataset_path,
    cai_samples_filepath,
    blended_sl_cai_filename,
    summary_filename,
):

main:

def main():

Import

from generate_sl_cai_dataset import generate_cai_dataset, generate_cai_batch_sample, blend_cai_sft_dataset

I/O Contract

Inputs

Name	Type	Required	Description
--red-teaming-prompts-dataset-path	`str`	Yes	Path to Anthropic red-teaming prompt dataset (JSONL)
--few-shot-prompts-dataset-path	`str`	Yes	Path to JSON file with few-shot prompt examples
--critique-revision-instructions-path	`str`	Yes	Path to JSON file with constitution critique/revision instruction pairs
--helpfulness-dataset-path	`str`	Yes	Path to helpfulness dataset for blending (e.g., Anthropic helpful-only)
--batch_size	`int`	No	Inference batch size (default: 32)
--num-examples	`int`	No	Number of samples to generate; -1 for all (default: -1)
--output-filepath	`str`	No	Output file path (default: "cai_revisions_aligner_chat_template.jsonl")
--seed	`int`	No	Random seed (default: 1234)
--save-to-file-interval	`int`	No	Save progress every N batches (default: 1)
--max-seq-length	`int`	No	Maximum sequence length in tokens; dialogs exceeding it are filtered out
--tokenizer-model	`str`	No	Path to tokenizer model (required if max-seq-length is set)
--tokenizer-library	`str`	No	Tokenizer library name (required if max-seq-length is set)
--apply_chat_template	`str`	No	Whether to apply chat template during inference (default: "False")
--host	`str`	No	Inference service hostname (default: "localhost")
--port	`int`	No	Inference service port (default: 5656)
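
The optional long-dialog filter controlled by --max-seq-length can be pictured as below. This is a sketch under assumptions: `dialog_length` and `remove_long_dialogs` are hypothetical helpers, `str.split` stands in for the tokenizer selected by --tokenizer-model and --tokenizer-library, and whether the script counts template tokens is not stated in this page.

```python
from typing import Callable, Dict, List


def dialog_length(record: Dict, tokenize: Callable[[str], List[str]]) -> int:
    """Count tokens over the system message and every turn of a dialog."""
    text = record.get("system", "") + "".join(
        turn["value"] for turn in record["conversations"]
    )
    return len(tokenize(text))


def remove_long_dialogs(
    records: List[Dict],
    tokenize: Callable[[str], List[str]],
    max_seq_length: int,
) -> List[Dict]:
    """Keep only dialogs whose token count fits within max_seq_length."""
    return [r for r in records if dialog_length(r, tokenize) <= max_seq_length]


records = [
    {
        "system": "",
        "conversations": [
            {"from": "User", "value": "hi there"},
            {"from": "Assistant", "value": " hello"},
        ],
    },
    {
        "system": "",
        "conversations": [
            {"from": "User", "value": "one two three four"},
            {"from": "Assistant", "value": " five six seven"},
        ],
    },
]

# Whitespace splitting is only a placeholder for a real tokenizer.
kept = remove_long_dialogs(records, str.split, max_seq_length=4)
```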

Outputs

Name	Type	Description
output-filepath	JSONL file	Blended SFT dataset with CAI revisions and helpfulness data in chat prompt format
cai_output/cai_critique_revision_samples.json	JSON file	Raw critique/revision samples with full pipeline metadata
cai_output/cai_samples.jsonl	JSONL file	CAI samples in chat prompt format before blending
*.summary.json	JSON file	Summary of the blending operation (input files, output file, template)
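
To make the blended output concrete, the sketch below builds a few chat-format records, blends and shuffles them, and writes a JSONL file plus a summary. The top-level fields (system, conversations, mask, dataset) come from the description above; the per-turn keys (`from`, `value`, `label`), the summary keys, and the helper names `to_chat_record` and `blend` are assumptions for illustration only.

```python
import json
import random
import tempfile
from pathlib import Path
from typing import Dict, List


def to_chat_record(prompt: str, response: str, dataset: str) -> Dict:
    """Wrap one (prompt, response) pair in a chat-style record."""
    return {
        "system": "",
        "conversations": [
            {"from": "User", "value": prompt, "label": None},
            {"from": "Assistant", "value": response, "label": None},
        ],
        "mask": "User",
        "dataset": dataset,
    }


def blend(cai: List[Dict], helpful: List[Dict], seed: int) -> List[Dict]:
    """Concatenate both datasets and apply a seeded random permutation."""
    blended = list(cai) + list(helpful)
    random.Random(seed).shuffle(blended)
    return blended


cai = [to_chat_record(f"red-team prompt {i}", f"revision {i}", "cai") for i in range(3)]
helpful = [to_chat_record(f"question {i}", f"answer {i}", "helpful") for i in range(2)]
blended = blend(cai, helpful, seed=1234)

# Write one JSON object per line, then a small summary next to it.
out_dir = Path(tempfile.mkdtemp())
out_file = out_dir / "blended.jsonl"
with out_file.open("w", encoding="utf-8") as f:
    for record in blended:
        f.write(json.dumps(record) + "\n")

summary = {
    "num_cai": len(cai),
    "num_helpful": len(helpful),
    "output_file": str(out_file),
}
(out_dir / "blended.summary.json").write_text(
    json.dumps(summary, indent=2), encoding="utf-8"
)
```

Each line of the resulting JSONL file can be parsed independently with `json.loads`, which is convenient for spot-checking the ratio of CAI to helpfulness samples.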

Usage Examples

# Command-line usage with extra_id chat template:
python generate_sl_cai_dataset.py \
    --red-teaming-prompts-dataset-path /data/red_team_attempts.jsonl \
    --few-shot-prompts-dataset-path /data/few_shot_samples.json \
    --critique-revision-instructions-path /data/constitution_instructions.json \
    --helpfulness-dataset-path /data/anthropic_helpful_only.jsonl \
    --batch_size 32 \
    --num-examples -1 \
    --output-filepath /output/sl_cai/cai_revisions.jsonl \
    --apply_chat_template True \
    --user_format "<extra_id_1>User\n{MESSAGE}\n<extra_id_1>Assistant\n" \
    --assistant_format "{MESSAGE}\n" \
    --system_format "<extra_id_0>System\n{MESSAGE}\n" \
    --system_default_message "" \
    --eos_token "<extra_id_1>" \
    --response_extract_pattern "<extra_id_1>Assistant\n"

# Command-line usage with Mistral-Instruct template:
python generate_sl_cai_dataset.py \
    --red-teaming-prompts-dataset-path /data/red_team_attempts.jsonl \
    --few-shot-prompts-dataset-path /data/few_shot_samples.json \
    --critique-revision-instructions-path /data/constitution_instructions.json \
    --helpfulness-dataset-path /data/anthropic_helpful_only.jsonl \
    --apply_chat_template False \
    --response_extract_pattern "[/INST]"
