Implementation:NVIDIA NeMo Aligner Generate SL CAI Dataset
| Knowledge Sources | |
|---|---|
| Domains | Constitutional AI, Supervised Learning, Dataset Generation |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
A script that generates Constitutional AI datasets for the Supervised Learning (SL) stage through a critique-revision pipeline. The revised responses are blended with helpfulness data for supervised fine-tuning.
Description
generate_sl_cai_dataset.py implements the Supervised Learning variant of Constitutional AI dataset generation. The pipeline performs the following steps for each batch of red-teaming prompts:
- Initial response generation: Given a red-teaming prompt and few-shot examples, the model generates an initial (potentially harmful) response.
- Critique generation: A randomly sampled constitutional principle is used to generate a critique of the initial response, identifying specific issues.
- Revision generation: Using the critique as context, the model generates a revised response that addresses the identified issues.
- Chat prompt formatting: The revised (prompt, revision) pairs are formatted into NeMo chat prompt template format with system, conversations, mask, and dataset fields.
- Dataset blending: The CAI-revised samples are blended with a helpfulness dataset (e.g., Anthropic helpful-only) and randomly permuted.
- Long dialog removal (optional): Dialogs exceeding a maximum sequence length can be filtered out using tokenization.
The critique-revision instructions are loaded from a JSON file containing pairs of critique prompts and revision requests for each constitutional principle.
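The first three steps can be sketched as follows. This is a minimal illustration, not the script's actual code: the `generate` callable stands in for a request to the inference service, the prompt strings are simplified, and the `critique_request` / `revision_request` keys are hypothetical names for the fields of the instructions JSON, whose real schema is not shown here.

```python
import random
from typing import Callable, Dict, List


def critique_revision_step(
    prompt: str,
    few_shot_prefix: str,
    instructions: List[Dict[str, str]],
    generate: Callable[[str], str],
    rng: random.Random,
) -> Dict[str, str]:
    """Run one prompt through initial response -> critique -> revision.

    `generate` is a hypothetical stand-in for a call to the inference service.
    The instruction keys used below are assumed, not taken from the script.
    """
    # Step 1: initial (potentially harmful) response, conditioned on few-shot examples.
    initial = generate(f"{few_shot_prefix}{prompt}")

    # Step 2: sample one constitutional principle and ask for a critique.
    principle = rng.choice(instructions)
    critique = generate(f"{prompt}\n{initial}\n{principle['critique_request']}")

    # Step 3: ask for a revision, with the critique included as context.
    revision = generate(
        f"{prompt}\n{initial}\n{principle['critique_request']}\n{critique}\n"
        f"{principle['revision_request']}"
    )
    return {
        "prompt": prompt,
        "initial_response": initial,
        "critique": critique,
        "revision": revision,
    }
```

Passing a seeded `random.Random` keeps the principle sampling reproducible, which mirrors the role of the `--seed` option.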
Usage
Use this script when:
- You need to generate SFT training data using the Constitutional AI critique-revision approach
- You want to produce harmlessness training data to blend with helpfulness data
- You have a local NeMo inference service for generating responses, critiques, and revisions
Code Reference
Source Location
- Repository: NVIDIA_NeMo_Aligner
- File: examples/nlp/cai/generate_sl_cai_dataset.py
- Lines: 1-579
Signature
generate_cai_dataset:
def generate_cai_dataset(
    red_teaming_dataset_path: str,
    few_shot_samples_filepath: str,
    critique_revision_instructions_filepath: str,
    num_examples: int,
    batch_size: int,
    save_to_file_interval: int,
    save_file_path: str,
    inference_config: dict,
    prompt_template_config: dict,
    apply_chat_template: bool,
):
generate_cai_batch_sample:
def generate_cai_batch_sample(
    prompt_list: list,
    few_shot_dataset_path: str,
    critique_list,
    revision_list,
    inference_config: dict,
    prompt_template_config: dict,
    apply_chat_template: bool,
):
blend_cai_sft_dataset:
def blend_cai_sft_dataset(
    helpfulness_dataset_path,
    cai_samples_filepath,
    blended_sl_cai_filename,
    summary_filename,
):
main:
def main():
Import
from generate_sl_cai_dataset import generate_cai_dataset, generate_cai_batch_sample, blend_cai_sft_dataset
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| --red-teaming-prompts-dataset-path | str | Yes | Path to Anthropic red-teaming prompt dataset (JSONL) |
| --few-shot-prompts-dataset-path | str | Yes | Path to JSON file with few-shot prompt examples |
| --critique-revision-instructions-path | str | Yes | Path to JSON file with constitution critique/revision instruction pairs |
| --helpfulness-dataset-path | str | Yes | Path to helpfulness dataset for blending (e.g., Anthropic helpful-only) |
| --batch_size | int | No | Inference batch size (default: 32) |
| --num-examples | int | No | Number of samples to generate; -1 for all (default: -1) |
| --output-filepath | str | No | Output file path (default: "cai_revisions_aligner_chat_template.jsonl") |
| --seed | int | No | Random seed (default: 1234) |
| --save-to-file-interval | int | No | Save progress every N batches (default: 1) |
| --max-seq-length | int | No | Maximum sequence length for filtering (optional) |
| --tokenizer-model | str | No | Path to tokenizer model (required if max-seq-length is set) |
| --tokenizer-library | str | No | Tokenizer library name (required if max-seq-length is set) |
| --apply_chat_template | str | No | Whether to apply chat template during inference (default: "False") |
| --host | str | No | Inference service hostname (default: "localhost") |
| --port | int | No | Inference service port (default: 5656) |
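The optional long-dialog removal tied to --max-seq-length can be pictured roughly as below. This is a sketch under stated assumptions: `tokenize` is a hypothetical callable standing in for whatever tokenizer --tokenizer-model and --tokenizer-library select, the `value` key inside each conversation turn is assumed, and the way the script actually assembles the text it measures may differ.

```python
from typing import Callable, Dict, List, Sequence


def remove_long_dialogs(
    records: List[Dict],
    tokenize: Callable[[str], Sequence],
    max_seq_length: int,
) -> List[Dict]:
    """Keep only records whose combined text fits within max_seq_length tokens."""
    kept = []
    for record in records:
        # Join the system message and each turn's text into one string to measure.
        parts = [record.get("system", "")]
        parts.extend(turn["value"] for turn in record["conversations"])
        text = " ".join(parts)
        if len(tokenize(text)) <= max_seq_length:
            kept.append(record)
    return kept


def whitespace_tokenize(text: str) -> List[str]:
    """Whitespace splitting stands in for a real tokenizer in this sketch."""
    return text.split()
```

A real tokenizer generally produces more tokens than whitespace splitting, so a threshold tuned with this stand-in would not transfer directly.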
Outputs
| Name | Type | Description |
|---|---|---|
| output-filepath | JSONL file | Blended SFT dataset with CAI revisions and helpfulness data in chat prompt format |
| cai_output/cai_critique_revision_samples.json | JSON file | Raw critique/revision samples with full pipeline metadata |
| cai_output/cai_samples.jsonl | JSONL file | CAI samples in chat prompt format before blending |
| *.summary.json | JSON file | Summary of the blending operation (input files, output file, template) |
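To make the chat-format and blended outputs more concrete, here is a hedged sketch of building one record and blending two lists. The four top-level fields come from the description above; the `from` / `value` keys inside each turn, the "User" mask value, and both function names are assumptions for illustration. The shuffle uses Python's `random` module, which may not match the permutation routine the script uses.

```python
import random
from typing import Dict, List


def to_chat_record(prompt: str, revision: str, system: str = "") -> Dict:
    """Wrap one (prompt, revision) pair in a chat-style record (assumed layout)."""
    return {
        "system": system,
        "conversations": [
            {"from": "User", "value": prompt},
            {"from": "Assistant", "value": revision},
        ],
        "mask": "User",  # assumed: the role whose turns are masked out of the loss
        "dataset": "",
    }


def blend_and_shuffle(cai: List[Dict], helpful: List[Dict], seed: int = 1234) -> List[Dict]:
    """Concatenate CAI and helpfulness records, then permute them with a fixed seed."""
    blended = list(cai) + list(helpful)
    random.Random(seed).shuffle(blended)
    return blended
```

Using a dedicated `random.Random(seed)` instance rather than the global generator keeps the permutation reproducible regardless of other random calls in the process.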
Usage Examples
# Command-line usage with extra_id chat template:
python generate_sl_cai_dataset.py \
--red-teaming-prompts-dataset-path /data/red_team_attempts.jsonl \
--few-shot-prompts-dataset-path /data/few_shot_samples.json \
--critique-revision-instructions-path /data/constitution_instructions.json \
--helpfulness-dataset-path /data/anthropic_helpful_only.jsonl \
--batch_size 32 \
--num-examples -1 \
--output-filepath /output/sl_cai/cai_revisions.jsonl \
--apply_chat_template True \
--user_format "<extra_id_1>User\n{MESSAGE}\n<extra_id_1>Assistant\n" \
--assistant_format "{MESSAGE}\n" \
--system_format "<extra_id_0>System\n{MESSAGE}\n" \
--system_default_message "" \
--eos_token "<extra_id_1>" \
--response_extract_pattern "<extra_id_1>Assistant\n"
# Command-line usage with Mistral-Instruct template:
python generate_sl_cai_dataset.py \
--red-teaming-prompts-dataset-path /data/red_team_attempts.jsonl \
--few-shot-prompts-dataset-path /data/few_shot_samples.json \
--critique-revision-instructions-path /data/constitution_instructions.json \
--helpfulness-dataset-path /data/anthropic_helpful_only.jsonl \
--apply_chat_template False \
--response_extract_pattern "[/INST]"
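One plausible reading of --response_extract_pattern is that it marks where the model's reply begins in the raw generated text. The sketch below illustrates that reading only; the function name is hypothetical and the script's actual extraction logic is not shown in this page.

```python
def extract_response(generated: str, pattern: str) -> str:
    """Return the text after the last occurrence of `pattern`.

    Assumed behavior for illustration: if the pattern is absent,
    the whole generated string is returned (stripped).
    """
    _, separator, tail = generated.rpartition(pattern)
    return tail.strip() if separator else generated.strip()


# Example with the Mistral-Instruct style marker used in the command above.
reply = extract_response("[INST] How are you? [/INST] I am fine.", "[/INST]")
```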