# Principle: Alibaba ROLL Distillation Validation
| Knowledge Sources | |
|---|---|
| Domains | Knowledge_Distillation, Evaluation |
| Last Updated | 2026-02-07 20:00 GMT |
## Overview
An evaluation principle for monitoring knowledge distillation progress by computing the student's validation loss with the SFT objective only (no distillation loss term).
## Description
Distillation Validation evaluates the student model on held-out data using the SFT (cross-entropy) loss alone, without teacher logits. This measures the student's standalone language-modeling quality, independent of the teacher.
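As a rough sketch of the idea (function name, shapes, and the `ignore_index` convention are illustrative assumptions, not ROLL's actual API), the validation metric reduces to a masked mean cross-entropy over held-out tokens, computed from student logits only:

```python
import numpy as np

def sft_validation_loss(logits, labels, ignore_index=-100):
    """Mean token-level cross-entropy (SFT loss) over held-out tokens.

    logits: (num_tokens, vocab_size) student predictions
    labels: (num_tokens,) target token ids; ignore_index marks masked tokens

    No teacher logits are involved: this measures standalone student quality.
    """
    mask = labels != ignore_index
    logits = logits[mask]
    labels = labels[mask]
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Negative log-likelihood of each target token, averaged.
    nll = -log_probs[np.arange(len(labels)), labels]
    return nll.mean()
```

For uniform logits over a vocabulary of size V this returns ln(V), the usual sanity-check baseline for a validation loss.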
## Usage
Use at configured intervals during distillation training.
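A minimal training-loop sketch of where this check runs (the `eval_interval` parameter and the `train_step`/`validate` callbacks are hypothetical names for illustration, not ROLL's actual interface):

```python
def run_distillation(num_steps, eval_interval, train_step, validate):
    """Run distillation training, evaluating the student every
    `eval_interval` steps with the SFT-only validation loss."""
    history = []
    for step in range(1, num_steps + 1):
        train_step(step)  # distillation update (teacher-guided)
        if step % eval_interval == 0:
            # Validation uses SFT loss only: no teacher logits here.
            history.append((step, validate()))
    return history
```

Tracking `history` over training gives a teacher-independent curve of student quality, which is the point of this principle.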
## Theoretical Basis
Validation deliberately uses the SFT loss alone so the metric reflects the student's standalone quality rather than its agreement with the teacher: including a distillation (e.g. KL) term would conflate imitation of the teacher with genuine language-modeling ability.
## Related Pages
### Implemented By
### Related Heuristics
No specific heuristics inform this principle.