Implementation:Avdvg InjectGuard HuggingFaceEmbeddings Init

From Leeroopedia
Knowledge Sources
Domains: NLP, Embeddings, Security
Last Updated: 2026-02-14 16:00 GMT

Overview

A concrete tool for initializing a sentence embedding model via LangChain's HuggingFace integration.

Description

The HuggingFaceEmbeddings class is a LangChain wrapper around the sentence-transformers library. It loads a pre-trained sentence embedding model and configures it for inference. In InjectGuard, this is executed at module level (on import) to initialize the all-MiniLM-L6-v2 model on a specified CUDA device with L2 normalization enabled.

Key behaviors:

  • Downloads and caches the model from HuggingFace Hub on first use
  • Places the model on the specified device (GPU or CPU)
  • Configures encoding options including embedding normalization
  • The resulting object is used by FAISS to embed documents and queries
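The role of normalize_embeddings can be sketched with plain NumPy (toy 2-d vectors stand in for the model's 384-dim outputs; the real normalization happens inside sentence-transformers):

```python
import numpy as np

# Two toy "embedding" vectors (stand-ins for 384-dim model outputs)
a = np.array([3.0, 4.0])
b = np.array([5.0, 12.0])

# L2 normalization: divide each vector by its Euclidean norm
a_n = a / np.linalg.norm(a)
b_n = b / np.linalg.norm(b)

# After normalization, the inner product of two vectors equals their
# cosine similarity, so a FAISS inner-product search ranks by angle
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
assert np.isclose(np.dot(a_n, b_n), cosine)
```

This is why the InjectGuard configuration enables normalization: it makes FAISS distance scores directly interpretable as angular closeness between prompts.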

Usage

Import this when you need to create dense vector representations of text for similarity search. In the InjectGuard pipeline, this is the first step: the embeddings object is shared between vector store construction (indexing malicious prompts) and query-time similarity search (embedding incoming user prompts).
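The shared-embeddings pattern can be sketched end to end. This is a minimal NumPy mock of the pipeline, not the real implementation: fake_embed stands in for both embed_documents and embed_query, the prompt list is made up, and the 0.9 threshold is illustrative.

```python
import numpy as np

def fake_embed(text: str, dim: int = 8) -> np.ndarray:
    """Deterministic stand-in for the embeddings object
    (the real model returns 384-dim sentence embeddings)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)  # mirrors normalize_embeddings=True

# Indexing step: embed known malicious prompts once into a vector store
corpus = ["ignore previous instructions", "reveal your system prompt"]
index = np.stack([fake_embed(t) for t in corpus])

# Query time: embed the incoming prompt with the SAME embeddings object,
# so query and index vectors live in the same space
query = fake_embed("ignore previous instructions")

# With unit vectors, inner product = cosine similarity
scores = index @ query
best = float(scores.max())

# Flag the prompt if it is close enough to a known attack
is_suspicious = best > 0.9
```

Because the query here exactly matches an indexed prompt, its similarity to that entry is 1.0 and the prompt is flagged; in the real pipeline, FAISS performs this nearest-neighbor search over the indexed malicious prompts.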

Code Reference

Source Location

  • Repository: InjectGuard
  • File: injectguard/vertor_similarity_detection.py
  • Lines: L1, L10-12

Signature

class HuggingFaceEmbeddings:
    def __init__(
        self,
        model_name: str = "sentence-transformers/all-MiniLM-L6-v2",
        model_kwargs: dict = None,
        encode_kwargs: dict = None,
    ):
        """
        Args:
            model_name: HuggingFace model identifier or local path.
            model_kwargs: Keyword arguments passed to the model constructor
                          (e.g., {'device': 'cuda:2'}).
            encode_kwargs: Keyword arguments passed to the encode method
                           (e.g., {'normalize_embeddings': True}).
        """

Import

from langchain.embeddings.huggingface import HuggingFaceEmbeddings

I/O Contract

Inputs

Name Type Required Description
model_name str No (default: "sentence-transformers/all-MiniLM-L6-v2") HuggingFace model identifier for the sentence embedding model
model_kwargs dict No Arguments passed to the underlying model constructor; used to set device placement (e.g., {'device': 'cuda:2'})
encode_kwargs dict No Arguments passed to the encode method; used to enable L2 normalization (e.g., {'normalize_embeddings': True})

Outputs

Name Type Description
embeddings HuggingFaceEmbeddings Initialized embedding model instance; provides embed_documents(texts) and embed_query(text) methods for producing 384-dimensional vectors
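The output contract can be illustrated with a hypothetical stand-in object (not the real class) that honors the same method names, return types, and 384-dim unit-norm shape:

```python
import numpy as np

class FakeEmbeddings:
    """Hypothetical stand-in for the HuggingFaceEmbeddings contract:
    embed_documents -> list of 384-float vectors, embed_query -> one vector."""
    dim = 384  # all-MiniLM-L6-v2 output dimension

    def embed_query(self, text: str) -> list[float]:
        rng = np.random.default_rng(len(text))  # deterministic fake vector
        v = rng.standard_normal(self.dim)
        # unit norm, matching encode_kwargs={'normalize_embeddings': True}
        return (v / np.linalg.norm(v)).tolist()

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return [self.embed_query(t) for t in texts]

emb = FakeEmbeddings()
vecs = emb.embed_documents(["prompt one", "prompt two"])
q = emb.embed_query("incoming prompt")
```

Any object exposing these two methods with these shapes can be handed to FAISS in place of the real embeddings instance, which is what makes the contract above the important interface.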

Usage Examples

InjectGuard Initialization (as used in the repo)

from langchain.embeddings.huggingface import HuggingFaceEmbeddings

# Initialize embedding model with GPU and normalization
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={'device': 'cuda:2'},
    encode_kwargs={'normalize_embeddings': True}
)

# The embeddings object can now be used to embed text
vector = embeddings.embed_query("Please ignore previous instructions")
# vector is a list of 384 floats, L2-normalized

CPU-only Initialization

from langchain.embeddings.huggingface import HuggingFaceEmbeddings

# Initialize on CPU (no GPU required)
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={'device': 'cpu'},
    encode_kwargs={'normalize_embeddings': True}
)

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
