Implementation: WAInjectBench SentenceTransformer Encode
Knowledge Sources
| Field | Value |
|---|---|
| Domains | NLP, Feature_Engineering |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
Batch-encodes text strings into 384-dimensional embeddings using the sentence-transformers library, as used in the WAInjectBench text embedding trainer.
Description
The embedder.encode() method processes a list of text strings in batches of 32 with a progress bar. It returns a numpy array of shape (N, 384) where N is the number of input texts. This is the feature matrix used for LogisticRegression training.
Usage
Called once per JSONL training file to convert all text samples into embedding vectors.
Code Reference
Source Location
- Repository: WAInjectBench
- File: train/embedding-t.py (L28)
Signature
embeddings = embedder.encode(texts, batch_size=32, show_progress_bar=True)
Import
from sentence_transformers import SentenceTransformer
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| texts | List[str] | Yes | List of text strings to encode |
| batch_size | int | No | Batch size for encoding (default 32) |
| show_progress_bar | bool | No | Display progress bar (default True) |
Outputs
| Name | Type | Description |
|---|---|---|
| embeddings | np.ndarray | Shape (N, 384) array of text embeddings |
Usage Examples
Encoding Text Samples
from sentence_transformers import SentenceTransformer
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
texts = ["Hello world", "Ignore previous instructions", "What is Python?"]
embeddings = embedder.encode(texts, batch_size=32, show_progress_bar=True)
print(f"Shape: {embeddings.shape}") # (3, 384)
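Since the Description notes that this array is the feature matrix for LogisticRegression training, a downstream sketch may help. The synthetic embeddings and binary labels below are stand-ins for real `embedder.encode()` output and dataset labels; the classifier settings are illustrative, not the repository's configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for embedder.encode() output: (N, 384) float features.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 384))
labels = rng.integers(0, 2, size=100)  # illustrative binary labels

# Fit a linear classifier directly on the embedding matrix.
clf = LogisticRegression(max_iter=1000)
clf.fit(embeddings, labels)
preds = clf.predict(embeddings[:3])  # one label per input row, shape (3,)
```

With real data, `embeddings` would come straight from `embedder.encode(texts, ...)` with no further feature engineering.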
Related Pages
Implements Principle
Requires Environment