Principle:Snorkel team Snorkel Slice Aware Data Preparation

From Leeroopedia
Knowledge Sources
Domains: Data_Slicing, Data_Preparation, Multi_Task_Learning
Last Updated: 2026-02-14 20:00 GMT

Overview

A data preparation strategy that augments standard datasets with slice-specific indicator and prediction labels required for slice-aware multi-task training.

Description

Slice-Aware Data Preparation converts a standard dataset into one suitable for slice-aware training. For each slice, two additional label sets are created:

  • Indicator labels: Binary labels (0/1) indicating slice membership (from the slice matrix S)
  • Prediction labels: The base task's classification labels, with entries set to -1 for data points outside the slice

This labeling scheme allows the multi-task model to simultaneously learn slice membership detection and slice-specific classification, with the prediction labels only applying within each slice.
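The labeling scheme above can be sketched in plain Python. This is a minimal illustration rather than the library's implementation: the function name `make_slice_labels`, the dictionary representation of the slice matrix, and the key format for the new label sets are all assumptions made for this example.

```python
from typing import Dict, List, Sequence

MASK = -1  # sentinel for "not in this slice", as described above


def make_slice_labels(
    labels: Sequence[int],
    slice_matrix: Dict[str, Sequence[int]],
    base_task: str = "task",
) -> Dict[str, List[int]]:
    """Build indicator and prediction labels for every slice.

    `slice_matrix` maps a slice name to its 0/1 membership column.
    The key format used below is hypothetical, chosen for this sketch.
    """
    out: Dict[str, List[int]] = {base_task: list(labels)}
    for name, column in slice_matrix.items():
        if len(column) != len(labels):
            raise ValueError(
                f"slice {name!r} has {len(column)} rows, expected {len(labels)}"
            )
        # Indicator labels: the membership column itself (0 or 1).
        out[f"{base_task}:{name}:ind"] = [int(bool(m)) for m in column]
        # Prediction labels: the base label inside the slice, MASK outside.
        out[f"{base_task}:{name}:pred"] = [
            y if m else MASK for y, m in zip(labels, column)
        ]
    return out


# Illustrative data: five points, two slices.
Y = [1, 0, 1, 0, 1]
S = {"short_text": [1, 1, 0, 0, 0], "has_link": [0, 0, 0, 1, 1]}
label_dict = make_slice_labels(Y, S)
```

Note that a base label of 0 inside a slice stays 0, so it remains distinguishable from the -1 mask. Each slice contributes two entries to `label_dict`, one per task head.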

Usage

Use this principle after obtaining a slice matrix by applying slicing functions (SFs) to the dataset and before training a slice-aware model. The prepared dataloaders contain all necessary labels for the multi-task training loop.
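To make the preceding step concrete, the sketch below shows one way a slice matrix could be produced before data preparation. It stands in for the library's SF applier and is not its API: `apply_slicing_functions`, the dictionary-based data points, and the two example slices are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# In this sketch a slicing function (SF) is any predicate over a data point.
SlicingFunction = Callable[[dict], bool]


def apply_slicing_functions(
    data: Sequence[dict], sfs: Dict[str, SlicingFunction]
) -> Dict[str, List[int]]:
    """Return the slice matrix S as {slice name: 0/1 membership column}."""
    return {name: [int(sf(x)) for x in data] for name, sf in sfs.items()}


# Illustrative data points and slices (assumed for this example).
data = [
    {"text": "free prize, click now"},
    {"text": "see you at lunch"},
    {"text": "ok"},
]
sfs = {
    "short_text": lambda x: len(x["text"].split()) < 3,
    "mentions_click": lambda x: "click" in x["text"],
}
S = apply_slicing_functions(data, sfs)
```

Each column of `S` has one entry per data point, which is the shape the indicator and prediction labels are derived from.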

Theoretical Basis

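For a base task with labels $Y$ and slice $s_j$ with indicator matrix $S$:

Indicator labels: $Y_j^{\text{ind}} = S_{:,j} \in \{0,1\}^n$

Prediction labels: $Y_{j,i}^{\text{pred}} = \begin{cases} Y_i & \text{if } S_{i,j} = 1 \\ -1 & \text{if } S_{i,j} = 0 \end{cases}$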

The -1 entries are masked during loss computation so the predictor head only trains on in-slice examples.
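A minimal sketch of such masking follows, assuming a cross-entropy loss averaged over the unmasked examples. The function `masked_cross_entropy` and the choice to return 0.0 when every example is masked are assumptions for illustration, not a description of the library's loss.

```python
import math
from typing import List, Sequence

MASK = -1  # label value that marks an out-of-slice example


def masked_cross_entropy(
    probs: Sequence[Sequence[float]], labels: Sequence[int]
) -> float:
    """Mean negative log-likelihood over examples whose label is not MASK.

    `probs[i][c]` is the predicted probability of class c for example i.
    Returns 0.0 when every example is masked (an assumption of this sketch),
    so an empty slice contributes no loss.
    """
    losses: List[float] = [
        -math.log(p[y]) for p, y in zip(probs, labels) if y != MASK
    ]
    return sum(losses) / len(losses) if losses else 0.0


probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
pred_labels = [0, MASK, 1]  # the second example is outside the slice
loss = masked_cross_entropy(probs, pred_labels)
```

Because masked examples are filtered out before the loss is computed, the predictions for out-of-slice points have no effect on the predictor head's loss.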

Related Pages

Implemented By