Implementation:NVIDIA NeMo Aligner Attribute Annotate
| Knowledge Sources | |
|---|---|
| Domains | SteerLM, Reward Modeling, Data Annotation |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
A script that annotates conversational datasets with SteerLM attribute scores by sending requests to a running regression reward model server via the PyTriton client.
Description
attribute_annotate.py automates the process of annotating conversational data with multi-dimensional attribute scores for SteerLM training. The script operates as follows:
- Input loading: Reads a JSONL file containing conversations in the NeMo chat format (with system, conversations, and mask fields). Skips samples that have already been annotated, so an interrupted run can be resumed.
- Text formatting: For each conversation, formats the text using the NeMo <extra_id_*> template with System, User, and Assistant turn markers. At each assistant turn that has a label field, appends the <extra_id_2> label prefix.
- Reward model inference: Sends the formatted text to a regression reward model server (accessed via PyTriton FuturesModelClient). The model returns reward scores for all 9 SteerLM attributes (quality, toxicity, humor, creativity, helpfulness, correctness, coherence, complexity, verbosity).
- Score processing: Clamps predicted scores to the [0.0, 4.0] range, rounds to integers, and formats as a comma-separated string (e.g., quality:3,toxicity:0,humor:1,...).
- Output writing: Writes annotated samples incrementally to the output file in JSONL format, enabling resume on interruption.
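The text formatting and score processing steps above can be sketched roughly as follows. This is an illustration, not the script's actual code: the template strings, the "from" and "value" key names, and the helper names build_prompts and scores_to_label are assumptions introduced here. Only the attribute list, the <extra_id_2> label prefix, the [0.0, 4.0] clamp, and the output string format come from the description.

```python
from typing import Dict, List, Sequence

# Attribute order as listed in the description above.
ATTRIBUTES = [
    "quality", "toxicity", "humor", "creativity", "helpfulness",
    "correctness", "coherence", "complexity", "verbosity",
]

# Assumed template strings; the constants used by the real script may differ.
SYSTEM_TEMPLATE = "<extra_id_0>System\n{value}\n"
TURN_TEMPLATE = "<extra_id_1>{speaker}\n{value}\n"
LABEL_PREFIX = "<extra_id_2>"


def build_prompts(sample: Dict) -> List[str]:
    """Return one prompt per assistant turn that carries a label field."""
    text = SYSTEM_TEMPLATE.format(value=sample["system"])
    prompts = []
    for turn in sample["conversations"]:
        # "from" and "value" are assumed key names for speaker and turn text.
        text += TURN_TEMPLATE.format(speaker=turn["from"], value=turn["value"])
        if turn["from"] == "Assistant" and "label" in turn:
            prompts.append(text + LABEL_PREFIX)
    return prompts


def scores_to_label(scores: Sequence[float]) -> str:
    """Clamp each score to [0.0, 4.0], round it, and join as name:score pairs."""
    if len(scores) != len(ATTRIBUTES):
        raise ValueError(f"expected {len(ATTRIBUTES)} scores, got {len(scores)}")
    clamped = [min(4.0, max(0.0, float(s))) for s in scores]
    # Python's round() sends exact .5 ties to the nearest even integer.
    return ",".join(f"{name}:{round(s)}" for name, s in zip(ATTRIBUTES, clamped))
```

In this sketch, each prompt ends with the label prefix so that the reward model scores the conversation up to and including that assistant turn. The out-of-range values in a raw prediction (for example -0.2 or 4.7) are pulled back to 0 and 4 before rounding.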
Usage
Use this script when:
- You have a trained regression reward model deployed as a Triton server
- You need to annotate conversations with SteerLM attribute scores
- You are preparing training data for attribute-conditioned SFT
Code Reference
Source Location
- Repository: NVIDIA_NeMo_Aligner
- File: examples/nlp/data/steerlm/attribute_annotate.py
- Lines: 1-150
Signature
get_reward:
def get_reward(
    sentences: List[str],
    host="localhost",
    port=5555,
    model_name="reward_model",
):
main:
def main(args):
prepare_args:
def prepare_args():
Import
from attribute_annotate import get_reward, main
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| --input-file | str | Yes | Path to input JSONL file with conversations in NeMo chat format |
| --output-file | str | Yes | Path to output JSONL file for annotated conversations |
| --port | int | No | Port of the reward model Triton server (default: 5555) |
| --host | str | No | Hostname of the reward model Triton server (default: "localhost") |
| --model_name | str | No | Name of the reward model in Triton (default: "reward_model") |
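The flags in the table map naturally onto an argparse parser. The following is a minimal sketch of what prepare_args might look like, assuming argparse is used. The optional argv parameter and the help strings are additions made here for illustration and are not part of the documented signature.

```python
import argparse
from typing import List, Optional


def prepare_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the command-line flags listed in the Inputs table."""
    parser = argparse.ArgumentParser(
        description="Annotate conversations with SteerLM attribute scores"
    )
    parser.add_argument("--input-file", type=str, required=True,
                        help="Input JSONL file in NeMo chat format")
    parser.add_argument("--output-file", type=str, required=True,
                        help="Output JSONL file for annotated conversations")
    parser.add_argument("--port", type=int, default=5555,
                        help="Port of the reward model Triton server")
    parser.add_argument("--host", type=str, default="localhost",
                        help="Hostname of the reward model Triton server")
    parser.add_argument("--model_name", type=str, default="reward_model",
                        help="Name of the reward model in Triton")
    # argv=None makes argparse read sys.argv; a list is handy for testing.
    return parser.parse_args(argv)
```

Note that argparse converts the hyphens in --input-file and --output-file to underscores, so the parsed values are read as args.input_file and args.output_file.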
Outputs
| Name | Type | Description |
|---|---|---|
| output-file | JSONL file | Annotated conversations where each assistant turn's label field contains comma-separated attribute:score pairs (e.g., quality:3,toxicity:0,humor:1,creativity:2,helpfulness:3,correctness:4,coherence:3,complexity:2,verbosity:1) |
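Downstream code often needs the label string as structured data. A small sketch of parsing it back into a dictionary is shown here. The helper name parse_label is hypothetical and is not part of the script.

```python
from typing import Dict


def parse_label(label: str) -> Dict[str, int]:
    """Turn 'quality:3,toxicity:0,...' into {'quality': 3, 'toxicity': 0, ...}."""
    result = {}
    for pair in label.split(","):
        name, value = pair.split(":")
        result[name.strip()] = int(value)
    return result
```

This makes it straightforward to filter annotated samples, for instance keeping only assistant turns whose parsed quality score meets some threshold.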
Usage Examples
# Command-line usage:
python attribute_annotate.py \
--input-file /data/conversations.jsonl \
--output-file /data/annotated_conversations.jsonl \
--host localhost \
--port 5555 \
--model_name reward_model
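Because already-annotated samples are skipped and output is written incrementally, re-running the same command after an interruption should pick up where the previous run stopped. The resume check could look something like the sketch below. How the real script identifies an already-annotated sample is not stated here, so the sample_key function (a concatenation of turn texts) and the "value" key name are assumptions, as is the helper name pending_samples.

```python
import json
from typing import Dict, Iterable, Iterator


def sample_key(sample: Dict) -> str:
    """Hypothetical identity for a sample: all turn texts joined together."""
    return "".join(turn["value"] for turn in sample["conversations"])


def pending_samples(input_lines: Iterable[str],
                    done_lines: Iterable[str]) -> Iterator[Dict]:
    """Yield input samples whose key is absent from the existing output."""
    done = {sample_key(json.loads(line)) for line in done_lines if line.strip()}
    for line in input_lines:
        if not line.strip():
            continue  # tolerate blank lines in the JSONL input
        sample = json.loads(line)
        if sample_key(sample) not in done:
            yield sample
```

In practice the output file would be opened in append mode and each annotated sample written as one JSON line as soon as it is ready, so that a crash loses at most the batch in flight.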