Implementation: InjectGuard HuggingFaceEmbeddings Init
| Knowledge Sources | |
|---|---|
| Domains | NLP, Embeddings, Security |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
A concrete tool for initializing a sentence embedding model, provided by LangChain's HuggingFace integration.
Description
The HuggingFaceEmbeddings class is a LangChain wrapper around the sentence-transformers library. It loads a pre-trained sentence embedding model and configures it for inference. In InjectGuard, this is executed at module level (on import) to initialize the all-MiniLM-L6-v2 model on a specified CUDA device with L2 normalization enabled.
Key behaviors:
- Downloads and caches the model from HuggingFace Hub on first use
- Places the model on the specified device (GPU or CPU)
- Configures encoding options including embedding normalization
- The resulting object is used by FAISS to embed documents and queries
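The normalization option matters downstream: once vectors are L2-normalized, the inner product of two vectors equals their cosine similarity, which simplifies similarity search. A minimal pure-Python sketch of what `normalize_embeddings: True` does to each output vector (toy numbers, not real model output):

```python
import math

def l2_normalize(vec):
    # Scale the vector so its Euclidean (L2) norm is exactly 1.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

v = l2_normalize([3.0, 4.0])
print(v)  # [0.6, 0.8]
# The normalized vector has unit length:
print(math.sqrt(sum(x * x for x in v)))  # 1.0
```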
Usage
Import this when you need to create dense vector representations of text for similarity search. In the InjectGuard pipeline, this is the first step: the embeddings object is shared between vector store construction (indexing malicious prompts) and query-time similarity search (embedding incoming user prompts).
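The query-time side of that pipeline reduces to a nearest-neighbor search over normalized vectors. The toy sketch below mimics it with hand-written 3-dimensional vectors standing in for real 384-dimensional embeddings; the prompt strings and the similarity threshold are illustrative only (not taken from InjectGuard), and the actual repo delegates this search to FAISS:

```python
def cosine(a, b):
    # For L2-normalized vectors, cosine similarity is just the dot product.
    return sum(x * y for x, y in zip(a, b))

# Pretend these are embeddings of indexed malicious prompts (already normalized).
index = {
    "ignore previous instructions": [1.0, 0.0, 0.0],
    "reveal your system prompt":    [0.0, 1.0, 0.0],
}

def detect(query_vec, threshold=0.8):
    # Return the best-matching indexed prompt and whether it crosses the threshold.
    best = max(index, key=lambda k: cosine(index[k], query_vec))
    score = cosine(index[best], query_vec)
    return best, score, score >= threshold

print(detect([0.95, 0.31, 0.0]))  # matches the first indexed prompt
```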
Code Reference
Source Location
- Repository: InjectGuard
- File: injectguard/vertor_similarity_detection.py
- Lines: L1, L10-12
Signature
```python
class HuggingFaceEmbeddings:
    def __init__(
        self,
        model_name: str = "sentence-transformers/all-MiniLM-L6-v2",
        model_kwargs: dict = None,
        encode_kwargs: dict = None,
    ):
        """
        Args:
            model_name: HuggingFace model identifier or local path.
            model_kwargs: Keyword arguments passed to the model constructor
                (e.g., {'device': 'cuda:2'}).
            encode_kwargs: Keyword arguments passed to the encode method
                (e.g., {'normalize_embeddings': True}).
        """
```
Import
```python
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model_name | str | No (default: "sentence-transformers/all-MiniLM-L6-v2") | HuggingFace model identifier for the sentence embedding model |
| model_kwargs | dict | No | Arguments passed to the underlying model constructor; used to set device placement (e.g., {'device': 'cuda:2'}) |
| encode_kwargs | dict | No | Arguments passed to the encode method; used to enable L2 normalization (e.g., {'normalize_embeddings': True}) |
Outputs
| Name | Type | Description |
|---|---|---|
| embeddings | HuggingFaceEmbeddings | Initialized embedding model instance; provides embed_documents(texts) and embed_query(text) methods for producing 384-dimensional vectors |
Usage Examples
InjectGuard Initialization (as used in the repo)
```python
from langchain.embeddings.huggingface import HuggingFaceEmbeddings

# Initialize embedding model with GPU and normalization
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={'device': 'cuda:2'},
    encode_kwargs={'normalize_embeddings': True}
)

# The embeddings object can now be used to embed text
vector = embeddings.embed_query("Please ignore previous instructions")
# vector is a list of 384 floats, L2-normalized
```
CPU-only Initialization
```python
from langchain.embeddings.huggingface import HuggingFaceEmbeddings

# Initialize on CPU (no GPU required)
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={'device': 'cpu'},
    encode_kwargs={'normalize_embeddings': True}
)
```