Implementation:Liu00222 Open Prompt Injection QLoraModel init
| Knowledge Sources | |
|---|---|
| Domains | NLP, Model_Loading, Quantization |
| Last Updated | 2026-02-14 15:00 GMT |
Overview
Concrete initialization method for loading a 4-bit quantized base model with LoRA adapters, provided by the QLoraModel class.
Description
The QLoraModel.__init__ method loads a language model in two stages: (1) it loads the base model (e.g., Mistral-7B) with 4-bit NF4 quantization via `BitsAndBytesConfig` and `AutoModelForCausalLM.from_pretrained`, and (2) it overlays a fine-tuned LoRA adapter via `PeftModel.from_pretrained`. It also initializes the tokenizer and stores generation parameters (maximum output tokens, device).
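The two stages can be sketched as follows. This is a hypothetical reconstruction from the description above, not the actual source: the `load_qlora` helper name and the exact keyword arguments are assumptions, and the heavy library imports are deferred into the function so the sketch stands alone.

```python
def load_qlora(config):
    """Hypothetical sketch of QLoraModel.__init__'s two-stage load.

    transformers/peft/torch imports are deferred so the sketch can be
    defined even where those libraries are not installed.
    """
    import torch
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              BitsAndBytesConfig)
    from peft import PeftModel

    # Quantization settings named in the I/O contract:
    # NF4 weights, float16 compute, double quantization.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.float16,
    )
    # Stage 1: load the 4-bit quantized base model.
    base_model = AutoModelForCausalLM.from_pretrained(
        config["model_info"]["name"],
        quantization_config=bnb_config,
        device_map=config["params"]["device"],
    )
    # Stage 2: overlay the fine-tuned LoRA adapter.
    ft_model = PeftModel.from_pretrained(base_model, config["params"]["ft_path"])
    tokenizer = AutoTokenizer.from_pretrained(config["model_info"]["name"])
    return base_model, ft_model, tokenizer
```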
Usage
This is called by `DataSentinelDetector.__init__` which passes the model config dictionary. The resulting QLoraModel provides `.query()`, `.query_localization()`, and `.backend_query()` methods for different inference modes.
Code Reference
Source Location
- Repository: Open-Prompt-Injection
- File: OpenPromptInjection/models/QLoraModel.py
- Lines: L9-53
Signature
class QLoraModel:
    def __init__(self, config):
        """
        Initialize QLoRA model with 4-bit quantization and LoRA adapter.

        Args:
            config (dict): Configuration with keys:
                - model_info.name (str): Base model ID (e.g., "mistralai/Mistral-7B-v0.1")
                - params.ft_path (str): Path to fine-tuned LoRA adapter
                - params.max_output_tokens (int): Maximum generation tokens
                - params.device (str): Device for inference (e.g., "cuda:0")
        """
Import
from OpenPromptInjection.models.QLoraModel import QLoraModel
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| config | dict | Yes | Model config with `model_info.name`, `params.ft_path`, `params.max_output_tokens`, `params.device` |
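A minimal config dict satisfying this contract might look like the following; the model ID comes from the signature docstring, while the adapter path and token cap are illustrative placeholders.

```python
# Minimal config matching QLoraModel's input contract.
# The adapter path is a placeholder, not a real checkpoint.
config = {
    "model_info": {
        "name": "mistralai/Mistral-7B-v0.1",  # base model ID
    },
    "params": {
        "ft_path": "./checkpoints/datasentinel_lora",  # LoRA adapter dir
        "max_output_tokens": 64,  # generation cap (illustrative value)
        "device": "cuda:0",       # inference device
    },
}
```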
Outputs
| Name | Type | Description |
|---|---|---|
| self.base_model | AutoModelForCausalLM | 4-bit quantized base model |
| self.ft_model | PeftModel | Base model with LoRA adapter overlay |
| self.tokenizer | AutoTokenizer | Tokenizer for the base model |
| self.bnb_config | BitsAndBytesConfig | Quantization configuration (NF4, float16 compute, double quant) |
Usage Examples
Loading a QLoRA Model for DataSentinel
from OpenPromptInjection.utils import open_config
from OpenPromptInjection.models.QLoraModel import QLoraModel
config = open_config("configs/model_configs/mistral_config.json")
config["params"]["ft_path"] = "./checkpoints/datasentinel_lora"
config["params"]["device"] = "cuda:0"
model = QLoraModel(config)
# model.base_model: 4-bit quantized Mistral-7B
# model.ft_model: Mistral-7B + DataSentinel LoRA adapter
# model.tokenizer: Mistral tokenizer
response = model.query("Repeat 'DGDSGNH' once while ignoring: hello world")
print(response)