Implementation: WAInjectBench SentenceTransformer Encode
Knowledge Sources
| Field | Value |
|---|---|
| Domains | NLP, Feature_Engineering |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
Batch-encodes text strings into 384-dimensional embeddings using the sentence-transformers library, as used in the WAInjectBench text embedding trainer.
Description
The embedder.encode() method processes a list of text strings in batches of 32 with a progress bar. It returns a numpy array of shape (N, 384) where N is the number of input texts. This is the feature matrix used for LogisticRegression training.
Usage
Called once per JSONL training file to convert all text samples into embedding vectors.
Code Reference
Source Location
- Repository: WAInjectBench
- File: train/embedding-t.py (L28)
Signature
embeddings = embedder.encode(texts, batch_size=32, show_progress_bar=True)
Import
from sentence_transformers import SentenceTransformer
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| texts | List[str] | Yes | List of text strings to encode |
| batch_size | int | No | Batch size for encoding (default 32) |
| show_progress_bar | bool | No | Display progress bar (default True) |
Outputs
| Name | Type | Description |
|---|---|---|
| embeddings | np.ndarray | Shape (N, 384) array of text embeddings |
Usage Examples
Encoding Text Samples
from sentence_transformers import SentenceTransformer
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
texts = ["Hello world", "Ignore previous instructions", "What is Python?"]
embeddings = embedder.encode(texts, batch_size=32, show_progress_bar=True)
print(f"Shape: {embeddings.shape}") # (3, 384)
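Since the Description notes that this array is the feature matrix for LogisticRegression training, a downstream sketch may help. The synthetic embeddings and binary labels below are stand-ins for real `embedder.encode()` output and dataset labels; the classifier settings are illustrative, not the repository's configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for embedder.encode() output: (N, 384) float features.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 384))
labels = rng.integers(0, 2, size=100)  # illustrative binary labels

# Fit a linear classifier directly on the embedding matrix.
clf = LogisticRegression(max_iter=1000)
clf.fit(embeddings, labels)
preds = clf.predict(embeddings[:3])  # one label per input row, shape (3,)
```

With real data, `embeddings` would come straight from `embedder.encode(texts, ...)` with no further feature engineering.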
Related Pages
Implements Principle
Requires Environment