Heuristic:Protectai Modelscan Graceful Scanner Degradation
| Knowledge Sources | |
|---|---|
| Domains | Security, Debugging, Optimization |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
ModelScan scanners degrade gracefully when dependencies are missing or errors occur — they return structured errors rather than crashing, allowing other scanners to continue.
Description
ModelScan's architecture follows a "partial success is better than total failure" philosophy. The system has three layers of graceful degradation:
1. Scanner import failures: If a scanner class cannot be imported (e.g., module not found), the error is logged and the scanner is skipped. Other scanners continue to load normally.
2. Missing optional dependencies: Scanners that require optional libraries (TensorFlow, h5py) check for the dependency before scanning. If missing, they return a `DependencyError` result — not an exception — so the scan pipeline continues.
3. Runtime scan errors: If a scanner throws an unexpected exception during scanning, the error is caught, logged, and stored. The next scanner in the list processes the same file.
This design is critical for a security tool that must scan diverse model formats: a user scanning a directory of mixed models should get results for all formats they have dependencies for, not a crash because one format's dependency is missing.
Usage
When integrating ModelScan into CI/CD pipelines, check both `modelscan.errors` and `modelscan.scanned` after a scan. If both are non-empty, the scan was a partial success: some files were scanned, but others hit problems (typically missing dependencies).
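As a rough illustration of that check, the sketch below classifies a scan from two plain lists. The function names, the outcome labels, and the exit-code policy are all hypothetical; how the error and scanned-file lists are read from a ModelScan object is deliberately left out.

```python
from typing import Sequence


def classify_scan_outcome(errors: Sequence[object], scanned: Sequence[str]) -> str:
    """Classify a scan from its error list and its scanned-file list.

    Hypothetical helper: both arguments are plain sequences supplied by
    the caller, not attributes read from any real ModelScan object.
    """
    if errors and scanned:
        return "partial"   # some files scanned, some problems reported
    if errors:
        return "failed"    # errors only, nothing was scanned
    if scanned:
        return "complete"  # files scanned and no errors reported
    return "empty"         # nothing scanned and nothing reported


def ci_exit_code(outcome: str, allow_partial: bool = False) -> int:
    """Map an outcome to a process exit code (an assumed CI policy)."""
    if outcome == "complete":
        return 0
    if outcome == "partial" and allow_partial:
        return 0
    # "failed" and "empty" both fail the build; an empty scan in CI
    # usually points at a misconfigured path.
    return 1
```

Whether a partial scan should pass the build is a policy decision; for a security gate, failing by default (as `allow_partial=False` does here) is the more cautious choice.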
The Insight (Rule of Thumb)
- Action: Design scanners to return `None` (format mismatch), `ScanResults` with errors (dependency missing), or `ScanResults` with issues (findings). Never raise exceptions for expected conditions.
- Value: Three-tier degradation: import → dependency check → runtime error handling.
- Trade-off: Scan results may be incomplete if dependencies are missing, but the user always gets results for the formats they can scan. Errors are always visible in the output.
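The three return shapes in the rule above can be sketched with a toy scanner. `SketchResults` and `scan_sketch` are hypothetical stand-ins written for this example, not ModelScan's `ScanResults` or scanner interface, and the "Lambda" byte check is only a placeholder for real detection logic.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchResults:
    """Hypothetical stand-in for a results object: findings plus non-fatal errors."""
    issues: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)


def scan_sketch(path: str, data: bytes, parser_available: bool) -> Optional[SketchResults]:
    """Return None, results with errors, or results with issues; never raise
    for an expected condition."""
    if not path.endswith(".h5"):
        return None  # format mismatch: this file belongs to another scanner
    if not parser_available:
        # Expected condition, reported as data rather than as an exception.
        return SketchResults(errors=[f"dependency missing, skipped {path}"])
    # Placeholder detection logic for the sketch.
    issues = ["Lambda layer found"] if b"Lambda" in data else []
    return SketchResults(issues=issues)
```

A caller can then tell "not my format" (`None`) apart from "my format, but I could not scan it" (non-empty `errors`) without any exception handling.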
Reasoning
Tier 1 — Scanner import failure from `modelscan.py:57-78`:
    def _load_scanners(self) -> None:
        for scanner_path, scanner_settings in self._settings["scanners"].items():
            if "enabled" in scanner_settings.keys() and self._settings["scanners"][scanner_path]["enabled"]:
                try:
                    (modulename, classname) = scanner_path.rsplit(".", 1)
                    imported_module = importlib.import_module(name=modulename, package=classname)
                    scanner_class: ScanBase = getattr(imported_module, classname)
                    self._scanners_to_run.append(scanner_class)
                except Exception as e:
                    logger.error("Error importing scanner %s", scanner_path)
                    self._init_errors.append(
                        ModelScanError(f"Error importing scanner {scanner_path}: {e}")
                    )
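The same isolation idea can be tried outside ModelScan with a small standalone loader. This is a minimal sketch: `load_classes` and its settings shape are assumptions modelled loosely on the excerpt above, and errors are collected as plain strings rather than `ModelScanError` objects.

```python
import importlib
from typing import Dict, List, Tuple


def load_classes(settings: Dict[str, Dict[str, bool]]) -> Tuple[List[type], List[str]]:
    """Import each enabled 'package.module.ClassName' entry.

    A failure on one entry is recorded and skipped, so the remaining
    entries still load.
    """
    loaded: List[type] = []
    errors: List[str] = []
    for dotted_path, options in settings.items():
        if not options.get("enabled", False):
            continue
        try:
            module_name, class_name = dotted_path.rsplit(".", 1)
            module = importlib.import_module(module_name)
            loaded.append(getattr(module, class_name))
        except Exception as exc:  # broad on purpose: any failure skips only this entry
            errors.append(f"Error importing {dotted_path}: {exc}")
    return loaded, errors
```

Calling it with one valid standard-library path and one nonexistent module returns the valid class in the first list and a single message in the second, instead of raising `ModuleNotFoundError`.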
Tier 2 — Dependency check pattern (same pattern in h5, keras, saved_model scanners):
    dep_error = self.handle_binary_dependencies()
    if dep_error:
        return ScanResults([], [DependencyError(...)], [])
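How `handle_binary_dependencies` detects a missing library is not shown in the excerpt. One common way to probe for an optional package without importing it is `importlib.util.find_spec`; the sketch below uses that approach, and both helper names are hypothetical.

```python
import importlib.util
from typing import List, Optional


def missing_dependencies(required: List[str]) -> List[str]:
    """Return the names in `required` that cannot be located.

    Assumes top-level module names such as "h5py"; dotted names would
    make find_spec import the parent package first.
    """
    return [name for name in required if importlib.util.find_spec(name) is None]


def dependency_error_message(scanner_name: str, required: List[str]) -> Optional[str]:
    """Return a message describing missing dependencies, or None if all are present."""
    missing = missing_dependencies(required)
    if not missing:
        return None
    return f"{scanner_name} skipped: missing {', '.join(missing)}"
```

A scanner following the pattern above would turn a non-`None` message into an error entry in its results and return early, rather than letting an `ImportError` surface mid-scan.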
Tier 3 — Runtime error handling from `modelscan.py:173-189`:
    try:
        scan_results = scanner.scan(model)
    except Exception as e:
        logger.error(
            "Error encountered from scanner %s with path %s: %s",
            scanner.full_name(), str(model.get_source()), e,
        )
        self._errors.append(ModelScanScannerError(scanner.full_name(), str(e), model))
        continue
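The `continue` implies an enclosing loop over scanners. A self-contained sketch of that loop, with scanners modelled as plain callables (an assumption made for brevity; real scanners are classes), might look like this:

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical scanner shape: takes raw bytes, returns a list of findings.
Scanner = Callable[[bytes], List[str]]


def run_all(scanners: Dict[str, Scanner], data: bytes) -> Tuple[List[str], List[str]]:
    """Run every scanner on the same data; one crash does not stop the rest."""
    findings: List[str] = []
    errors: List[str] = []
    for name, scan in scanners.items():
        try:
            findings.extend(scan(data))
        except Exception as exc:
            # Record the failure and move on to the next scanner.
            errors.append(f"{name}: {exc}")
            continue
    return findings, errors
```

If the first scanner raises and the second returns a finding, the caller receives both the finding and the recorded error, which is the partial-success outcome described earlier.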
Partial pickle parsing from `tools/picklescanner.py:64-68` extends this philosophy to the bytecode level — if some pickles in a multi-pickle file parse successfully before an error, the already-extracted globals are returned:
    # Given we can have multiple pickles in a file, we may have already successfully
    # extracted globals from a valid pickle.
    # Thus return the already found globals in the error & let the caller decide what to do.
    globals_opt = globals if len(globals) > 0 else None
    raise GenOpsError(str(e), globals_opt)
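The general pattern here is an exception that carries whatever was recovered before the failure. The sketch below illustrates it with a hypothetical `PartialParseError`; it is not `GenOpsError`, whose actual attributes are not shown in the excerpt, and the "empty chunk" failure is a stand-in for a real parse error.

```python
from typing import Iterable, List, Optional, Set


class PartialParseError(Exception):
    """Hypothetical error that carries results gathered before the failure."""

    def __init__(self, message: str, partial: Optional[Set[str]] = None) -> None:
        super().__init__(message)
        self.partial = partial


def collect_names(chunks: Iterable[str]) -> Set[str]:
    """Collect names chunk by chunk; an empty chunk simulates a parse failure."""
    found: Set[str] = set()
    for chunk in chunks:
        if not chunk:
            # Attach what was found so far, or None if nothing was found.
            raise PartialParseError("empty chunk", found if found else None)
        found.add(chunk)
    return found


def names_or_partial(chunks: List[str]) -> Set[str]:
    """Caller-side policy: fall back to the partial results on failure."""
    try:
        return collect_names(chunks)
    except PartialParseError as err:
        return err.partial or set()
```

Leaving the decision to the caller matters for a security scanner: names extracted before the failure can still be checked against an unsafe-operator list even though the file as a whole did not parse.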