Principle:Intel Ipex llm LISA Dynamic Layer Training

From Leeroopedia


Knowledge Sources
Domains Finetuning, Memory_Efficient_Training
Last Updated 2026-02-09 04:00 GMT

Overview

Training technique that dynamically activates different subsets of model layers during training to reduce memory requirements while approximating full fine-tuning.

Description

LISA (Layerwise Importance Sampled AdamW) trains only a small subset of model layers at each training step and resamples which layers are active at a configurable interval. Because different layers are trained at different times, training covers the entire model over the course of a run without ever requiring gradients for all layers simultaneously. Peak memory usage is therefore lower than in full fine-tuning, where every parameter needs a gradient at every step.
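To make the memory claim concrete, the sketch below estimates gradient plus optimizer-state memory for full fine-tuning versus training k of L layers at a time. It is a back-of-the-envelope calculation, not a measurement. It assumes an Adam-style optimizer that keeps two state values per trainable parameter, 4 bytes per value, and equally sized layers, and it ignores weights, activations, and any components that stay trainable throughout. The function name and the model shape (32 layers of 200M parameters) are hypothetical.

```python
def trainable_state_bytes(params_per_layer: int, trainable_layers: int,
                          bytes_per_value: int = 4) -> int:
    """Rough gradient + optimizer-state memory for the trainable layers.

    Assumes one gradient value plus two Adam-style moment values per
    trainable parameter. Frozen layers contribute nothing here.
    """
    values_per_param = 1 + 2  # gradient + first moment + second moment
    return params_per_layer * trainable_layers * values_per_param * bytes_per_value


# Hypothetical model shape, chosen only for illustration.
TOTAL_LAYERS = 32
PARAMS_PER_LAYER = 200_000_000
ACTIVE_LAYERS = 2  # k

full_bytes = trainable_state_bytes(PARAMS_PER_LAYER, TOTAL_LAYERS)
lisa_bytes = trainable_state_bytes(PARAMS_PER_LAYER, ACTIVE_LAYERS)

print(f"full fine-tuning: {full_bytes / 1e9:.1f} GB")
print(f"k={ACTIVE_LAYERS} active layers: {lisa_bytes / 1e9:.1f} GB")
print(f"ratio: {lisa_bytes / full_bytes:.4f}")  # equals k / L under these assumptions
```

Under these simplifying assumptions the saving scales with k / L. Real savings depend on the model, the precision used, and the optimizer implementation.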

Usage

Use this principle as an alternative to LoRA when full-parameter quality is needed but memory is limited. LISA provides a middle ground: it updates the model's original parameters directly (unlike LoRA, which trains newly added low-rank parameters and keeps the original weights frozen), but only a few layers at a time, which keeps memory usage low.
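The difference in what gets trained can be illustrated with a parameter count for a single weight matrix. This is a rough sketch: the function names and the sizes (a 4096 x 4096 matrix, rank 8) are hypothetical and chosen only for illustration. It relies on the standard LoRA formulation, in which a (d_out x d_in) weight gains two low-rank factors of shapes (d_out x r) and (r x d_in).

```python
def lora_added_params(d_in: int, d_out: int, rank: int) -> int:
    """New parameters LoRA adds for one (d_out x d_in) weight matrix:
    a (rank x d_in) factor and a (d_out x rank) factor."""
    return rank * (d_in + d_out)


def full_matrix_params(d_in: int, d_out: int) -> int:
    """Original parameters in the same matrix, which LISA updates
    directly whenever the layer containing it is active."""
    return d_in * d_out


# Hypothetical sizes, chosen only for illustration.
D_IN = D_OUT = 4096
RANK = 8

added = lora_added_params(D_IN, D_OUT, RANK)
original = full_matrix_params(D_IN, D_OUT)
print(f"LoRA adds {added} new parameters; the original matrix has {original}")
```

LoRA keeps the trainable count small by construction. LISA instead trains full-size matrices, and limits memory by restricting how many layers are trainable at once.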

Theoretical Basis

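At each training step, only k of the L total layers have trainable parameters; the remaining layers are frozen. Every s steps, a new random subset of k layers is selected (in the pseudo-code below, s is lisa_interval_steps and k is lisa_activated_layers):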

Pseudo-code Logic:

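# Abstract LISA algorithm (pseudo-code; helper names are illustrative)
for step, batch in enumerate(training_batches):
    # Every lisa_interval_steps steps, resample which layers are trainable
    if step % lisa_interval_steps == 0:
        active_layers = random_select(all_layers, k=lisa_activated_layers)
        freeze_all_layers(model)
        unfreeze(model, active_layers)
    loss = forward(model, batch)
    loss.backward()        # Gradients are computed only for active layers
    optimizer.step()       # Only active layers are updated
    optimizer.zero_grad()  # Clear gradients before the next step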
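
The layer-selection schedule in the loop above can be exercised without a deep-learning framework. The following is a minimal sketch of the schedule only. The `Layer` class is a hypothetical stand-in whose `trainable` flag plays the role of per-parameter `requires_grad` in PyTorch, and `lisa_schedule` is an illustrative helper, not part of any library. It assumes uniform random sampling, as in the pseudo-code, and omits the forward and backward passes.

```python
import random
from dataclasses import dataclass


@dataclass
class Layer:
    """Stand-in for a model layer; `trainable` mimics a requires_grad flag."""
    index: int
    trainable: bool = False


def lisa_schedule(layers, num_steps, lisa_activated_layers, lisa_interval_steps, seed=0):
    """Yield (step, active_indices) while toggling `trainable` on the layers.

    Every `lisa_interval_steps` steps, freeze everything and unfreeze a new
    uniformly random subset of `lisa_activated_layers` layers.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    active = []
    for step in range(num_steps):
        if step % lisa_interval_steps == 0:
            active = rng.sample(range(len(layers)), lisa_activated_layers)
            for layer in layers:
                layer.trainable = False      # freeze all layers
            for i in active:
                layers[i].trainable = True   # unfreeze the sampled subset
        # A real loop would run forward/backward/optimizer.step() here.
        yield step, sorted(active)


layers = [Layer(i) for i in range(8)]
times_active = [0] * len(layers)
for step, active in lisa_schedule(layers, num_steps=400,
                                  lisa_activated_layers=2, lisa_interval_steps=20):
    # At every step exactly k layers are trainable.
    assert sum(layer.trainable for layer in layers) == 2
    for i in active:
        times_active[i] += 1

print("steps each layer was trainable:", times_active)
```

Under uniform sampling, the chance that a given layer is never selected after n resamplings is (1 - k/L)^n, so how completely training covers the model depends on running enough intervals.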
