Implementation:Alibaba ROLL TeacherWorker Forward

From Leeroopedia


Knowledge Sources
Domains: Knowledge_Distillation, LLM_Inference
Last Updated: 2026-02-07 20:00 GMT

Overview

A concrete implementation of the teacher forward pass with top-k logit extraction, provided by the Alibaba ROLL library.

Description

The TeacherWorker.forward method runs inference through the teacher model and extracts the top-k probabilities, log-probabilities, and vocabulary indices, together with a mask for infinite values. The results are then cached in the student's LogitsCache via the LogitsTransferGroup.
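The top-k extraction described above can be sketched in PyTorch as follows. This is a minimal illustration, not ROLL's actual code: the helper name `extract_topk`, the `[batch, seq_len, vocab_size]` layout, and the choice to normalize over the full vocabulary before selecting the top k are all assumptions made for the example.

```python
import torch
import torch.nn.functional as F


def extract_topk(logits: torch.Tensor, k: int):
    """Hypothetical helper: top-k statistics from full-vocabulary logits.

    logits: [batch, seq_len, vocab_size]
    Returns four tensors of shape [batch, seq_len, k].
    """
    # Assumption: normalize over the full vocabulary first, so the kept
    # values are true probabilities rather than a renormalized top-k slice.
    log_probs = F.log_softmax(logits.float(), dim=-1)
    topk_log_probs, topk_indices = torch.topk(log_probs, k=k, dim=-1)
    topk_probs = topk_log_probs.exp()
    # Flag kept entries whose log-probability is infinite, for example
    # vocabulary slots that were masked out with -inf logits.
    topk_inf_mask = torch.isinf(topk_log_probs)
    return topk_probs, topk_log_probs, topk_indices, topk_inf_mask


if __name__ == "__main__":
    logits = torch.randn(2, 3, 10)
    probs, log_probs, indices, inf_mask = extract_topk(logits, k=4)
    print(probs.shape, indices.dtype, inf_mask.any().item())
```

Keeping only k entries per position shrinks the payload that has to be moved to the student from `vocab_size` values to `k` values (plus indices), which is the usual motivation for top-k caching in distillation.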

Usage

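The distillation pipeline calls TeacherWorker.forward before each student training step, so that the teacher's top-k outputs are available to the student for that step.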

Code Reference

Source Location

  • Repository: Alibaba ROLL
  • File: roll/pipeline/distill/distill_worker.py
  • Lines: L477-584

Signature

class TeacherWorker(Worker):
    @register(dispatch_mode=Dispatch.DP_MP_DISPATCH_FIRST_COLLECT_ALL, clear_cache=False)
    def forward(self, data: DataProto) -> DataProto:
        """
        Teacher forward pass with top-k logit extraction.

        Args:
            data: DataProto with input_ids, attention_mask, labels

        Returns:
            DataProto (logits cached in student via LogitsTransferGroup)
        """

    def logits_transfer(self, tensor_name_for_transfer, model_update_name,
                       broadcast_comm_plan_args, p2p_tgt_workers, p2p_entry_list, backend):
        """Transfer teacher logits to student workers."""

Import

from roll.pipeline.distill.distill_worker import TeacherWorker

I/O Contract

Inputs

| Name | Type | Required | Description |
|------|------|----------|-------------|
| data | DataProto | Yes | Batch with input_ids, attention_mask, labels |

Outputs

| Name | Type | Description |
|------|------|-------------|
| topk_probs | torch.Tensor | Top-k teacher probabilities |
| topk_log_probs | torch.Tensor | Top-k teacher log probabilities |
| topk_indices | torch.Tensor | Top-k vocabulary indices |
| topk_inf_mask | torch.Tensor | Mask for infinite values |
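To illustrate how a student might consume these four tensors, here is a hedged sketch of a forward KL divergence restricted to the teacher's top-k entries. The function `topk_forward_kl` is hypothetical and is not taken from ROLL; it assumes the tensors have shape `[batch, seq_len, k]` and that `topk_inf_mask` marks entries to exclude from the loss.

```python
import torch
import torch.nn.functional as F


def topk_forward_kl(
    student_logits: torch.Tensor,   # [batch, seq_len, vocab_size]
    topk_probs: torch.Tensor,       # [batch, seq_len, k]
    topk_log_probs: torch.Tensor,   # [batch, seq_len, k]
    topk_indices: torch.Tensor,     # [batch, seq_len, k], int64
    topk_inf_mask: torch.Tensor,    # [batch, seq_len, k], bool
) -> torch.Tensor:
    """Hypothetical loss: KL(teacher || student) over the teacher's top-k."""
    student_log_probs = F.log_softmax(student_logits.float(), dim=-1)
    # Pick the student's log-probabilities at the teacher's top-k token ids.
    student_topk = student_log_probs.gather(-1, topk_indices)
    # Per-entry contribution p * (log p - log q).
    per_entry = topk_probs * (topk_log_probs - student_topk)
    # Entries flagged as infinite would yield 0 * inf = NaN, so zero them.
    per_entry = per_entry.masked_fill(topk_inf_mask, 0.0)
    return per_entry.sum(dim=-1)  # [batch, seq_len]
```

With `k` equal to the vocabulary size this reduces to the ordinary token-level forward KL; for smaller `k` it is a truncated approximation that ignores the teacher's tail mass.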

Usage Examples

# Called by the distillation pipeline:
teacher_cluster.execute_all_sync("forward", batch)
# Transfer the teacher's top-k logits into the student's LogitsCache
logits_transfer_group.logits_transfer()

Related Pages

Implements Principle

Requires Environment

Environment Dependencies

This implementation requires the following environment constraints:

Heuristics Applied

This implementation uses the following heuristics:
