Jump to content

Connect Leeroopedia MCP: Equip your AI agents to search best practices, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

Principle:Protectai Modelscan Middleware Pipeline

From Leeroopedia
Knowledge Sources
Domains ML_Security, Software_Architecture
Last Updated 2026-02-14 12:00 GMT

Overview

A chain-of-responsibility preprocessing pipeline that enriches model file metadata before scanners process them, enabling format detection and extensible pre-scan transformations.

Description

The Middleware Pipeline principle provides a preprocessing layer between file iteration and scanner dispatch. Before a model file reaches any scanner, it passes through a chain of middleware functions that can inspect and annotate the file with metadata. The primary use case is format detection: the middleware examines the file extension and tags the Model object with its format type (e.g., PICKLE, PYTORCH, TENSORFLOW), so scanners can quickly determine whether they should process the file.

The pipeline follows the chain-of-responsibility pattern: each middleware receives the model and a call_next function. It can modify the model's context, then call call_next to pass control to the next middleware in the chain. This design allows:

  • Sequential processing: Middleware executes in registration order
  • Short-circuiting: A middleware can choose not to call call_next
  • Composability: New middleware can be added without modifying existing ones
  • Dynamic loading: Middleware classes are loaded via importlib from settings

Usage

Apply this principle when:

  • Understanding how modelscan determines which scanner should handle which file
  • Adding a new file format that needs a format tag before scanning
  • Implementing custom preprocessing (e.g., content-based format detection, metadata extraction)
  • Configuring the middleware pipeline in settings

Theoretical Basis

The middleware pipeline implements a recursive chain-of-responsibility:

# Pseudo-code for middleware pipeline execution
def run(model, index=0):
    if index < len(middlewares):
        middlewares[index](model, lambda m: run(m, index + 1))

The FormatViaExtensionMiddleware performs extension-based format tagging:

# Pseudo-code for format detection
def __call__(self, model, call_next):
    extension = model.get_source().suffix  # e.g., ".pkl"
    formats = [fmt for fmt, exts in format_map.items() if extension in exts]
    if formats:
        model.set_context("formats", formats)
    call_next(model)

Scanners then check the format context to decide if they should process the file:

# Scanner format checking pattern
formats = model.get_context("formats") or []
if MY_FORMAT not in [f.value for f in formats]:
    return None  # Not my format

This decouples format detection from scanning logic, allowing either to be modified independently.

Related Pages

Implemented By

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment