Heuristic: InjectGuard Module-Level Initialization
| Knowledge Sources | |
|---|---|
| Domains | Python, Architecture, Optimization |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
Importing the InjectGuard module triggers GPU allocation, model download, dataset loading, and FAISS index construction as side effects, causing significant latency on first import.
Description
The InjectGuard module executes three heavy operations at module scope (outside any function or class):
- HuggingFaceEmbeddings initialization (L10-12): Downloads and loads the sentence-transformers model onto the CUDA GPU.
- CSVLoader dataset loading (L25-26): Reads the malicious prompt CSV from disk into memory.
- FAISS vector store construction (L47): Embeds all documents and builds the FAISS index.
These operations run immediately when Python executes `import injectguard.vertor_similarity_detection` (or any `from ... import ...` statement referencing that module). There is no lazy initialization or factory pattern; the module is "live" as soon as it is imported.
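The import-time cost described above is easy to demonstrate in isolation. The sketch below builds a hypothetical throwaway module (`toy_heavy_module`, a stand-in for InjectGuard, not the real package) whose module body sleeps to simulate model download and index construction, then times the import:

```python
import sys
import tempfile
import time
from pathlib import Path

# Create a throwaway module whose *body* sleeps, standing in for
# model download, dataset loading, and index construction.
tmp = Path(tempfile.mkdtemp())
(tmp / "toy_heavy_module.py").write_text(
    "import time\n"
    "time.sleep(0.5)  # heavy work at module scope\n"
    "READY = True\n"
)
sys.path.insert(0, str(tmp))

start = time.perf_counter()
import toy_heavy_module  # blocks here, not at first use
elapsed = time.perf_counter() - start

print(toy_heavy_module.READY)  # True: module fully initialized
print(elapsed >= 0.5)          # True: the cost was paid at import time
```

Subsequent imports of the same module in the same process are near-instant, because Python caches it in `sys.modules`; the heavy work runs exactly once.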
Usage
Be aware of this behavior when integrating InjectGuard into a larger application. The first import will block the calling thread for several seconds (model download on first run, plus embedding and indexing). Plan for this startup cost in application initialization. Do not import this module in hot paths or test setups where the GPU/dataset may not be available.
The Insight (Rule of Thumb)
- Action: Import the module once during application startup, not on-demand per request.
- Value: Expect 5-30 seconds of initialization time, depending on whether the model is already cached and on the dataset size.
- Trade-off: Module-level init provides a simple programming model (just call `sim_search`), but it removes control over when resources are allocated. The GPU memory and model weights remain allocated for the process lifetime.
- Workaround: To defer initialization, wrap the import in a function or call `importlib.import_module()` at the desired time.
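The deferred-import workaround can be sketched with a cached factory function. For the sketch to be runnable here, the stdlib module `json` stands in for the heavy module; in a real deployment you would substitute `injectguard.vertor_similarity_detection`:

```python
import importlib
from functools import lru_cache

@lru_cache(maxsize=None)
def get_injectguard():
    """Import the heavy module on first call only; cached afterwards.

    'json' is a lightweight stand-in so this sketch runs anywhere;
    replace it with 'injectguard.vertor_similarity_detection'.
    """
    return importlib.import_module("json")

# Nothing heavy happens until the first call...
mod = get_injectguard()
# ...and repeat calls return the same cached module object.
assert get_injectguard() is mod
```

This keeps the single-initialization semantics of module-level init while letting the application decide when (and whether) to pay the startup cost.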
Reasoning
Module-level initialization is a common Python pattern for singletons and shared resources, but it becomes a "gotcha" when those resources are expensive. In this case, three distinct I/O and compute-bound operations happen silently:
Model initialization from vertor_similarity_detection.py:10-12:

```python
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2",
                                   model_kwargs={'device': 'cuda:2'},
                                   encode_kwargs={'normalize_embeddings': True})
```
Dataset loading from vertor_similarity_detection.py:25-26:

```python
loader = CSVLoader(file_path='./dataset/malicious_data_demo.csv')
docs = loader.load()
```
Index construction from vertor_similarity_detection.py:47:

```python
vector_store = FAISS.from_documents(docs, embeddings)
```
All three execute at import time. If the CUDA device is unavailable, the CSV file is missing, or the HuggingFace model cannot be downloaded, the import itself raises the exception, not any later function call.
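Because failures surface at import time, an integrating application should guard the import, not just the calls. The sketch below uses a hypothetical throwaway module (`toy_failing_module`, not InjectGuard itself) that fails at module scope the way InjectGuard would if its CSV were missing:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A throwaway module that fails at import time, the way InjectGuard
# would if the dataset CSV were absent or CUDA were unavailable.
tmp = Path(tempfile.mkdtemp())
(tmp / "toy_failing_module.py").write_text(
    "open('./dataset/does_not_exist.csv')  # raises at module scope\n"
)
sys.path.insert(0, str(tmp))

try:
    importlib.import_module("toy_failing_module")
except FileNotFoundError as exc:
    # The *import* raised, before any function was ever called.
    startup_error = f"init failed at import time: {exc}"

print(startup_error)
```

Catching the failure at startup lets the application report a clear configuration error instead of crashing later inside an unrelated request handler.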