Implementation: WAInjectBench SentenceTransformer Init
Knowledge Sources
| Field | Value |
|---|---|
| Domains | NLP, Representation_Learning |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
Documents how the all-MiniLM-L6-v2 sentence embedding model is loaded via the sentence-transformers library, as used in the WAInjectBench text embedding trainer.
Description
The text embedding training script initializes a SentenceTransformer instance with the model ID "sentence-transformers/all-MiniLM-L6-v2". The model weights are downloaded automatically from the Hugging Face Hub on first use, and the model produces 384-dimensional embeddings for input text.
Usage
Initialize the model once before processing multiple JSONL training files; the same instance is shared across all classifier training runs in a single invocation.
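The load-once, reuse-everywhere pattern described above can be sketched as follows. Note this is an illustrative sketch, not WAInjectBench code: the `load_texts` helper, the JSONL field name `"text"`, and the file paths are all assumptions.

```python
# Sketch: load the embedder once, then reuse it across JSONL training files.
# The "text" field name and helper functions are illustrative assumptions.
import json


def load_texts(jsonl_path):
    """Read the "text" field from each non-empty line of a JSONL file."""
    texts = []
    with open(jsonl_path) as f:
        for line in f:
            line = line.strip()
            if line:
                texts.append(json.loads(line)["text"])
    return texts


def embed_files(embedder, jsonl_paths):
    """Encode every file's texts with one shared embedder instance."""
    return {path: embedder.encode(load_texts(path)) for path in jsonl_paths}
```

In the training script, `embedder` would be the single `SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")` instance, so the model is downloaded and loaded into memory only once per invocation regardless of how many files are processed.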
Code Reference
Source Location
- Repository: WAInjectBench
- File: train/embedding-t.py (L58)
Signature
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
Import
from sentence_transformers import SentenceTransformer
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model_name | str | Yes | HuggingFace model ID (hardcoded: "sentence-transformers/all-MiniLM-L6-v2") |
Outputs
| Name | Type | Description |
|---|---|---|
| embedder | SentenceTransformer | Loaded model instance ready for encode() calls, producing 384-dim vectors |
Usage Examples
Initializing the Text Embedder
from sentence_transformers import SentenceTransformer
# Load the model (downloads on first use)
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
# Verify output dimensionality
test_emb = embedder.encode(["test sentence"])
print(f"Embedding dim: {test_emb.shape[1]}") # 384
Related Pages
Implements Principle
Requires Environment