Principle:Norrrrrrr lyn WAInjectBench Ensemble Aggregation Text

From Leeroopedia
Knowledge Sources
Domains Ensemble_Learning, NLP, Security
Last Updated 2026-02-14 16:00 GMT

Overview

A union-based ensemble strategy that combines text detection results from multiple detectors by merging their flagged IDs to maximize recall.

Description

Ensemble Aggregation in the text detection pipeline takes the results from all individual text detectors and combines them using set union: if any detector flags a sample, the ensemble marks it as detected. This approach maximizes the True Positive Rate (TPR), but the False Positive Rate (FPR) can only stay the same or rise, because every false alarm raised by any single detector is carried into the ensemble. That is a reasonable trade-off in security applications, where missing a genuine attack is more costly than a false alarm.
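The trade-off can be illustrated with a small Python sketch. The detector names, sample IDs, and the `rate` helper below are hypothetical and chosen only for illustration; they are not taken from the benchmark's code.

```python
def rate(flagged, population):
    """Fraction of `population` whose IDs appear in `flagged`."""
    return len(flagged & population) / len(population)


# Hypothetical toy data: IDs 0-4 are attacks, IDs 5-9 are benign samples.
malicious = {0, 1, 2, 3, 4}
benign = {5, 6, 7, 8, 9}

# Hypothetical flagged IDs for three detectors.
detectors = {
    "detector_a": {0, 1, 5},
    "detector_b": {1, 2, 3},
    "detector_c": {3, 6},
}

# Union rule: a sample is detected if any detector flagged it.
ensemble_ids = set().union(*detectors.values())

# (TPR, FPR) for each detector, then for the ensemble.
per_detector = {
    name: (rate(ids, malicious), rate(ids, benign))
    for name, ids in detectors.items()
}
ensemble_tpr = rate(ensemble_ids, malicious)
ensemble_fpr = rate(ensemble_ids, benign)

print(per_detector)
print(ensemble_tpr, ensemble_fpr)
```

In this toy setup the best single detector catches three of the five attacks, while the union catches four; the union also inherits both false alarms, one from `detector_a` and one from `detector_c`.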

The ensemble reads all per-detector JSONL result files from the result directory, groups results by dataset name, unions the detect_ids sets, and recomputes TPR/FPR for the combined detection.
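The steps above could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the project's implementation: it assumes each JSONL line is an object with `data_name`, `detect_ids`, and `total_num` fields (names borrowed from the pseudocode on this page), that result files end in `.jsonl`, and that `total_num` is the same for a given dataset across detectors. The function name `aggregate_union` and the demo file names are hypothetical.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def aggregate_union(result_dir):
    """Union detect_ids across per-detector JSONL files, grouped by dataset."""
    detect_ids = defaultdict(set)
    total_num = {}
    for path in sorted(Path(result_dir).glob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue  # tolerate blank lines
                entry = json.loads(line)
                name = entry["data_name"]  # assumed field name
                detect_ids[name] |= set(entry["detect_ids"])
                total_num[name] = entry["total_num"]  # assumed constant per dataset
    return {
        name: {"detect_ids": sorted(ids), "rate": len(ids) / total_num[name]}
        for name, ids in detect_ids.items()
    }


# Demo with two hypothetical detector result files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    rows_by_file = {
        "detector_a.jsonl": [
            {"data_name": "malicious_demo", "detect_ids": [0, 1], "total_num": 4}
        ],
        "detector_b.jsonl": [
            {"data_name": "malicious_demo", "detect_ids": [1, 2], "total_num": 4}
        ],
    }
    for filename, rows in rows_by_file.items():
        text = "\n".join(json.dumps(row) for row in rows) + "\n"
        Path(tmp, filename).write_text(text, encoding="utf-8")
    summary = aggregate_union(tmp)

print(summary)
```

Whether the resulting `rate` is read as TPR or FPR depends on whether the dataset holds malicious or benign samples; the sketch leaves that interpretation to the caller.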

Usage

Use this as the final aggregation step after all individual text detectors have been run. The ensemble requires that individual detector results already exist as JSONL files in the result directory.

Theoretical Basis

Union ensemble rule:

$$D_{\text{ensemble}} = D_1 \cup D_2 \cup \cdots \cup D_n$$

Where $D_i$ is the set of IDs flagged by detector $i$.

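# Union ensemble algorithm (pseudocode)
for each detector_result_file:
    for each entry:
        # group by dataset name and merge the flagged IDs
        ensemble[entry.data_name].detect_ids |= entry.detect_ids
# recompute the detection rate (TPR or FPR) per dataset
for each data_name in ensemble:
    rate = len(ensemble[data_name].detect_ids) / total_num[data_name]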

This guarantees that $\mathrm{TPR}_{\text{ensemble}} \ge \max_i \mathrm{TPR}_i$: the ensemble recall is at least as good as that of the best individual detector.
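A short derivation makes the guarantee explicit. It assumes that all detectors are evaluated on the same samples; the symbols $P$ (the set of malicious sample IDs) and $N$ (the set of benign sample IDs) are introduced here for illustration only.

```latex
\begin{align*}
\mathrm{TPR}_i &= \frac{|D_i \cap P|}{|P|}, \qquad
\mathrm{FPR}_i = \frac{|D_i \cap N|}{|N|} \\[4pt]
D_i \subseteq D_{\text{ensemble}}
  &\;\Longrightarrow\; |D_i \cap P| \le |D_{\text{ensemble}} \cap P|
  \quad \text{for every } i \\[4pt]
  &\;\Longrightarrow\; \mathrm{TPR}_{\text{ensemble}} \ge \max_i \mathrm{TPR}_i \\[4pt]
\max_i \mathrm{FPR}_i \;\le\; \mathrm{FPR}_{\text{ensemble}}
  &\;\le\; \min\Bigl(1,\ \sum_{i=1}^{n} \mathrm{FPR}_i\Bigr)
\end{align*}
```

The same subset argument that lower-bounds the recall also lower-bounds the false positive rate, and the upper bound follows from the union bound, so the recall gain is never free unless the detectors' false alarms overlap completely.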
