
Heuristic:OpenBMB UltraFeedback Principle Distribution Tuning

From Leeroopedia
Revision as of 10:43, 16 February 2026 by Admin (talk | contribs) (Auto-imported from heuristics/OpenBMB_UltraFeedback_Principle_Distribution_Tuning.md)




Knowledge Sources
Domains LLMs, Data_Quality, Annotation
Last Updated 2026-02-08 06:00 GMT

Overview

Subset-conditional principle distribution strategy using weighted random sampling with a 10% honesty-to-verbalized-calibration switch.

Description

The UltraFeedback dataset uses a different principle distribution for each instruction subset, reflecting that subset's characteristics. Helpfulness dominates most subsets (60-100%); the exception is TruthfulQA/FalseQA, which sample only honesty and truthfulness. Truthfulness, honesty, and verbalized calibration are mixed in at lower rates depending on the source dataset. A notable implementation detail is a secondary stochastic switch: after "honesty" is selected as the initial principle, it is kept when `np.random.rand() < 0.9` and otherwise replaced with "verbalized_calibration", which gives a 10% chance of replacement. This injects calibration-style responses in a controlled way without adding a separate entry to those subsets' sampling lists.

Usage

Use this heuristic when designing diverse preference datasets that need to balance multiple behavioral dimensions. The principle distribution should match the natural characteristics of each instruction source (e.g., factual QA datasets get more truthfulness principles; creative tasks get more helpfulness). The stochastic switch pattern is useful for injecting rare but important behaviors without consuming a dedicated slot in the sampling distribution.
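When adapting the pattern to a new dataset, one option is to move the weights and the switch into configuration. The following is a minimal sketch under assumptions, not the UltraFeedback implementation: the source names, the weights, and the names `PRINCIPLE_WEIGHTS`, `RARE_SWITCH`, and `sample_principle` are all hypothetical placeholders.

```python
import random

# Hypothetical per-source weights; names and numbers are placeholders.
PRINCIPLE_WEIGHTS = {
    "factual_qa": {"truthfulness": 1, "honesty": 1},
    "creative": {"helpfulness": 1},
    "chat": {"helpfulness": 3, "truthfulness": 1, "honesty": 1},
}

# Hypothetical switch table: base principle -> (rare principle, switch probability).
RARE_SWITCH = {"honesty": ("verbalized_calibration", 0.1)}


def sample_principle(source, rng):
    """Draw one principle for an instruction that comes from `source`."""
    # Unknown sources fall back to helpfulness only.
    weights = PRINCIPLE_WEIGHTS.get(source, {"helpfulness": 1})
    names = list(weights)
    principle = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
    # Second stage: occasionally replace the base principle with a rare one.
    if principle in RARE_SWITCH:
        rare, prob = RARE_SWITCH[principle]
        if rng.random() < prob:
            principle = rare
    return principle


rng = random.Random(0)  # seeded generator so runs are repeatable
print([sample_principle("chat", rng) for _ in range(5)])
```

Passing explicit weights to `random.choices` is equivalent to repeating entries in a list given to `random.choice`, but keeps the ratios readable when they grow beyond small integers.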

The Insight (Rule of Thumb)

  • Action: Configure per-subset principle distributions by passing lists with repeated entries to `random.choice()`, so that repetition acts as the weight. Add a secondary stochastic switch for rare principles.
  • Value: Distribution ratios by subset:
    • ShareGPT/UltraChat: 3:1:1 (helpfulness:truthfulness:honesty), then 10% of honesty becomes verbalized_calibration = 60% helpful, 20% truthful, 18% honest, 2% calibrated
    • FLAN: 4:1 (helpfulness:verbalized_calibration) = 80% helpful, 20% calibrated
    • Evol-Instruct: 100% helpfulness
    • TruthfulQA/FalseQA: 1:1 (honesty:truthfulness), then 10% honesty switch = 45% honest, 50% truthful, 5% calibrated
  • Trade-off: Non-uniform distributions mean some principle-model combinations have fewer examples. Verbalized calibration reaches only 2% in ShareGPT/UltraChat and 5% in TruthfulQA/FalseQA, and never appears in Evol-Instruct, so it may be underrepresented for training.
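The percentages listed above follow from the list ratios and the 10% switch, and can be checked with exact arithmetic. This is a small sketch; `SUBSET_CHOICES`, `SWITCH_PROB`, and `effective_distribution` are names introduced here for illustration, and the lists mirror the ratios stated in this section.

```python
from collections import Counter
from fractions import Fraction

# Weighted lists matching the ratios above (repeated entries act as weights).
SUBSET_CHOICES = {
    "sharegpt": ["helpfulness"] * 3 + ["truthfulness", "honesty"],
    "flan": ["helpfulness"] * 4 + ["verbalized_calibration"],
    "evol_instruct": ["helpfulness"],
    "truthful_qa": ["honesty", "truthfulness"],
}
SWITCH_PROB = Fraction(1, 10)  # share of "honesty" draws that become calibration


def effective_distribution(choices):
    """Return exact principle probabilities after the honesty switch."""
    counts = Counter(choices)
    total = len(choices)
    dist = {name: Fraction(n, total) for name, n in counts.items()}
    honesty = dist.pop("honesty", Fraction(0))
    if honesty:
        # Split the honesty mass: 90% stays, 10% moves to calibration.
        dist["honesty"] = honesty * (1 - SWITCH_PROB)
        dist["verbalized_calibration"] = (
            dist.get("verbalized_calibration", Fraction(0)) + honesty * SWITCH_PROB
        )
    return dist


for subset, choices in SUBSET_CHOICES.items():
    dist = effective_distribution(choices)
    print(subset, {name: f"{float(p):.0%}" for name, p in dist.items()})
```

Multiplying a fraction by the number of samples drawn for a subset gives the expected count for that principle, which is a quick way to judge whether a rare principle will have enough examples.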

Reasoning

Different instruction sources have different natural affinities for behavioral principles. Evol-Instruct focuses on complex task completion, where helpfulness is the primary concern. TruthfulQA/FalseQA are specifically designed to test factual accuracy, making truthfulness and honesty the relevant principles. The 10% stochastic switch from honesty to verbalized_calibration introduces numeric confidence expressions (e.g., "Confidence: 80%") without adding a separate entry to the sampling list, so these calibration examples appear only in subsets that already sample honesty. FLAN is the exception: it lists verbalized_calibration directly as one of its five entries.

Code Evidence

Subset-conditional distribution from `main.py:163-175`:

if subset in ["sharegpt"]:
    principle = random.choice(["helpfulness", "helpfulness", "helpfulness", "truthfulness", "honesty"])
elif subset in ["ultrachat"]:
    principle = random.choice(["helpfulness", "helpfulness", "helpfulness", "truthfulness", "honesty"])
elif subset in ["flan"]:
    principle = random.choice(["helpfulness", "helpfulness", "helpfulness", "helpfulness", "verbalized_calibration"])
elif subset in ["evol_instruct"]:
    principle = "helpfulness"
elif subset in ["truthful_qa", "false_qa"]:
    principle = random.choice(["honesty", "truthfulness"])
else:
    print(subset)
    principle = "helpfulness"

10% honesty-to-calibration switch from `main.py:177-178`:

if principle == "honesty":
    principle = "honesty" if np.random.rand() < 0.9 else "verbalized_calibration"
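The two excerpts above can be combined into a single function and checked empirically. The sketch below is a condensed restatement, not the original file: the wrapper name `sample_principle` and the `rng` parameter are hypothetical, the `print(subset)` in the fallback branch is omitted, and `np.random.rand()` is swapped for the standard library's `random.Random.random()` so the example needs no third-party packages.

```python
import random
from collections import Counter


def sample_principle(subset, rng):
    """Two-stage principle sampling: weighted list, then the honesty switch."""
    if subset in ["sharegpt", "ultrachat"]:
        principle = rng.choice(["helpfulness"] * 3 + ["truthfulness", "honesty"])
    elif subset in ["flan"]:
        principle = rng.choice(["helpfulness"] * 4 + ["verbalized_calibration"])
    elif subset in ["truthful_qa", "false_qa"]:
        principle = rng.choice(["honesty", "truthfulness"])
    else:
        # evol_instruct and unrecognized subsets both resolve to helpfulness.
        principle = "helpfulness"
    # Keep honesty 90% of the time, otherwise switch to verbalized calibration.
    if principle == "honesty":
        principle = "honesty" if rng.random() < 0.9 else "verbalized_calibration"
    return principle


rng = random.Random(0)  # seeded so the check is repeatable
n = 100_000
counts = Counter(sample_principle("sharegpt", rng) for _ in range(n))
for name, count in counts.most_common():
    print(f"{name}: {count / n:.3f}")
```

With this many draws, the printed frequencies should land close to 0.60, 0.20, 0.18, and 0.02, matching the ShareGPT ratios stated earlier.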

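README documentation of intended distributions. Note that the FLAN row does not match the code above, which samples helpfulness and verbalized_calibration at 4:1 (80% and 20%) and never selects truthfulness for FLAN: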

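| Subset     | Principle distribution                                            |
|------------|-------------------------------------------------------------------|
| ShareGPT   | 60% Helpful, 20% Truthful, 18% Honesty, 2% Verbalized Calibration |
| UltraChat  | 60% Helpful, 20% Truthful, 18% Honesty, 2% Verbalized Calibration |
| FLAN       | 60% Helpful, 20% Truthful, 20% Verbalized Calibration             |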
