Principle:OpenBMB UltraFeedback Result Collection

From Leeroopedia
Revision as of 17:12, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/OpenBMB_UltraFeedback_Result_Collection.md)


Knowledge Sources
Domains NLP, Data_Construction
Last Updated 2023-10-02 00:00 GMT

Overview

A completion aggregation and persistence strategy that collects model-generated responses and stores them in a structured JSON format for downstream annotation.

Description

Result Collection is the final step of the completion generation phase. After each model generates a response to an instruction, the result is appended to the instruction's completions array as a structured dictionary containing the model identifier, the principle category, the system prompt used, and the generated text.
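As a minimal sketch of the entry described above, the following Python builds one completion dictionary and appends it to an instruction record. The four field names match the pseudo-code later on this page; the helper names `make_completion` and `add_completion`, the `"instruction"` key, and the sample values are assumptions made for illustration, not names confirmed by the pipeline.

```python
from typing import Any, Dict


def make_completion(model: str, principle: str,
                    system_prompt: str, response: str) -> Dict[str, Any]:
    """Build one completion entry with its provenance fields (hypothetical helper)."""
    return {
        "model": model,                         # which model generated the text
        "principle": principle,                 # principle category that guided it
        "custom_system_prompt": system_prompt,  # exact system prompt used
        "response": response,                   # the generated text
    }


def add_completion(instruction: Dict[str, Any], completion: Dict[str, Any]) -> None:
    """Append a completion, creating the completions array if it is missing."""
    instruction.setdefault("completions", []).append(completion)


# Hypothetical record; the "instruction" key and all values are illustrative.
example = {"instruction": "Explain what a hash map is.", "completions": []}
add_completion(
    example,
    make_completion(
        "model-a",
        "helpfulness",
        "You are a helpful assistant.",
        "A hash map stores key-value pairs.",
    ),
)
```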

The pipeline writes results back to the same JSON file it read from (in-place update), allowing incremental accumulation of completions across multiple generation passes with different models. Each completion entry preserves full provenance: which model generated it, which principle guided it, and the exact system prompt used.
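The read, append, write-back cycle can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline's actual code: the function name `collect_pass`, the file name `dataset.json`, the model names, and the sample responses are all hypothetical, and the file is assumed to hold a JSON list of instruction records.

```python
import json
import os
import tempfile


def collect_pass(path, model, principle, system_prompt, responses):
    """Run one generation pass: load the file, append one completion per record,
    and write the result back to the same path (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as f:
        dataset = json.load(f)

    for record, text in zip(dataset, responses):
        record.setdefault("completions", []).append({
            "model": model,
            "principle": principle,
            "custom_system_prompt": system_prompt,
            "response": text,
        })

    # In-place update: overwrite the file that was just read.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(dataset, f, indent=4)
    return dataset


# Demo in a temporary directory with one illustrative instruction.
path = os.path.join(tempfile.mkdtemp(), "dataset.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump([{"instruction": "Say hi.", "completions": []}], f, indent=4)

# Two separate passes with different (hypothetical) models accumulate in one file.
collect_pass(path, "model-a", "helpfulness", "Be helpful.", ["Hi!"])
final = collect_pass(path, "model-b", "honesty", "Be honest.", ["Hello there."])
models = [c["model"] for c in final[0]["completions"]]
```

Because each pass reloads the file before appending, completions from earlier passes are preserved. The sketch does not guard against an interrupted write; writing to a temporary file and renaming it is one common way to add that, if the pipeline needs it.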

Usage

Use this principle when building data generation pipelines that accumulate completions from multiple sources. The in-place JSON update pattern allows running the pipeline separately for each model while building up a complete dataset.

Theoretical Basis

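The storage schema follows a nested document design: each instruction is a top-level record, and its completions are stored in an array nested inside that record. This fits the data better than a flat table because the number of completions per instruction varies, and a flat layout with one row per completion would have to repeat the instruction (or a key referencing it) on every row.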
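To make the comparison concrete, the sketch below holds a small nested dataset and flattens it into one row per completion. The records, model names, and the `flatten` helper are illustrative assumptions; only the four completion field names come from this page.

```python
# Nested layout: one record per instruction, with a variable-length completions array.
dataset = [
    {
        "instruction": "Q1",
        "completions": [
            {"model": "model-a", "principle": "helpfulness",
             "custom_system_prompt": "Be helpful.", "response": "A1 from model-a"},
            {"model": "model-b", "principle": "honesty",
             "custom_system_prompt": "Be honest.", "response": "A1 from model-b"},
        ],
    },
    {
        "instruction": "Q2",
        "completions": [
            {"model": "model-a", "principle": "helpfulness",
             "custom_system_prompt": "Be helpful.", "response": "A2 from model-a"},
        ],
    },
]


def flatten(records):
    """Convert the nested layout to flat rows, one per completion (hypothetical helper)."""
    rows = []
    for record in records:
        for completion in record["completions"]:
            # The instruction text is copied onto every row in the flat layout.
            rows.append({"instruction": record["instruction"], **completion})
    return rows


rows = flatten(dataset)
completions_per_instruction = [len(r["completions"]) for r in dataset]
```

In the nested form the two instructions carry two and one completions respectively without any padding, while the flat form needs three rows and repeats "Q1" twice.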

Pseudo-code Logic:

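# Abstract algorithm: append one provenance-tagged entry per generated response
for each (instruction, model_type, principle_category, principle_prompt_text, generated_text):
    instruction["completions"].append({
        "model": model_type,                            # model that produced the response
        "principle": principle_category,                # principle category guiding the response
        "custom_system_prompt": principle_prompt_text,  # exact system prompt used
        "response": generated_text                      # generated text
    })

# Persist to disk (file is the same JSON file the dataset was read from)
json.dump(dataset, file, indent=4)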

Related Pages

Implemented By
