Implementation:Vespa engine Vespa Embedder Embed
| Knowledge Sources | |
|---|---|
| Domains | NLP, Text_Processing, Machine_Learning |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Interface for transforming text into dense vector representations (tensors), provided by Vespa's linguistics library. It defines the contract that all embedding model implementations must fulfill: single-text embedding, batch embedding, token ID generation, and token decoding.
Description
The Embedder interface defines the contract for text embedding in Vespa's linguistics framework. This page documents an interface definition rather than a concrete implementation: concrete implementations such as ONNX-based embedders and HuggingFace tokenizer embedders implement this interface to provide the actual embedding functionality.
The interface defines four core operations:
- embed(String text, Context context) -> List<Integer>: Converts input text into a list of token IDs according to the embedding model's vocabulary. This is the tokenization step specific to the embedding model (not to be confused with Vespa's general-purpose tokenization).
- embed(String text, Context context, TensorType tensorType) -> Tensor: The primary embedding method. Converts input text into a dense tensor (vector) of the specified type and dimensionality. This performs the full embedding pipeline: model-specific tokenization, neural network inference, and pooling.
- embed(List<String> texts, Context context, TensorType tensorType) -> List<Tensor>: Batch embedding for multiple texts. The default implementation calls the single-text method in a loop, but concrete implementations can override this for batched GPU inference.
- decode(List<Integer> tokens, Context context) -> String: Reverse operation that converts a list of token IDs back to human-readable text. Useful for debugging and token inspection.
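The pooling step that ends the embedding pipeline can be illustrated without any Vespa classes. The following is a minimal sketch that assumes mean pooling over per-token vectors; PoolingSketch and meanPool are hypothetical names, and a real embedder may use a different pooling strategy depending on the model.

```java
// Hypothetical sketch: not part of Vespa's API.
final class PoolingSketch {

    /** Averages per-token vectors (one row per token) into a single vector. */
    static float[] meanPool(float[][] tokenVectors) {
        if (tokenVectors.length == 0) {
            throw new IllegalArgumentException("At least one token vector is required");
        }
        int dims = tokenVectors[0].length;
        float[] pooled = new float[dims];
        for (float[] tokenVector : tokenVectors) {
            if (tokenVector.length != dims) {
                throw new IllegalArgumentException("All token vectors must have the same length");
            }
            // Accumulate the sum for each dimension
            for (int d = 0; d < dims; d++) {
                pooled[d] += tokenVector[d];
            }
        }
        // Divide each sum by the number of tokens to get the mean
        for (int d = 0; d < dims; d++) {
            pooled[d] /= tokenVectors.length;
        }
        return pooled;
    }
}
```

For example, pooling the two token vectors {1, 2} and {3, 4} yields the single vector {2, 3}, whose length equals the per-token dimensionality rather than the number of tokens.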
Key design elements:
- defaultEmbedderId: A constant string "default" used as the identifier when only one embedder is configured.
- throwsOnUse: A sentinel FailingEmbedder instance that throws an exception on any method call. This is used as a placeholder when no real embedder is configured, providing clear error messages rather than null pointer exceptions.
- Context: A nested class that carries metadata about the embedding request, including the destination document type, destination tensor name, and the embedding model identifier.
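The sentinel idea behind throwsOnUse can be sketched in isolation. The example below uses a hypothetical, single-method stand-in interface (SentinelSketchEmbedder) rather than the real Embedder. The exception type and message are assumptions chosen for illustration, not taken from FailingEmbedder.

```java
import java.util.List;

// Hypothetical stand-in for the real interface, reduced to one method.
interface SentinelSketchEmbedder {

    List<Integer> embed(String text);

    // Placeholder used when nothing is configured: it fails loudly on use
    // instead of letting a null reference surface later.
    // (Exception type and message are assumptions.)
    SentinelSketchEmbedder throwsOnUse = text -> {
        throw new IllegalStateException("No embedder has been configured");
    };
}
```

Code that holds SentinelSketchEmbedder.throwsOnUse can be wired up and passed around safely; only an actual call to embed fails, and it fails with a message that names the real problem.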
Usage
Use the Embedder interface when:
- Configuring embedding in Vespa schemas: Declare an embedder in the application package to enable automatic embedding of text fields into tensor fields.
- Implementing a custom embedder: Create a class that implements Embedder to integrate a new embedding model (e.g., a custom ONNX model or an API-based embedding service).
- Programmatic embedding: Call embed() directly to generate embeddings for search queries at query time.
The Embedder interface is the extension point for all embedding functionality in Vespa. Concrete implementations include:
- ONNX-based embedders that run transformer models locally.
- HuggingFace tokenizer-based embedders.
- ColBERT embedders for late-interaction retrieval.
- Custom embedders that call external embedding APIs.
Code Reference
Source Location
- Repository: Vespa
- File: linguistics/src/main/java/com/yahoo/language/process/Embedder.java
- Lines: 66
Signature
Tensor embed(String text, Context context, TensorType tensorType)
Interface Declaration
public interface Embedder
Package
package com.yahoo.language.process;
Key Constants
String defaultEmbedderId = "default";
Embedder throwsOnUse = new FailingEmbedder();
Full Interface Methods
// Convert text to model-specific token IDs
List<Integer> embed(String text, Context context);
// Convert text to a dense tensor embedding
Tensor embed(String text, Context context, TensorType tensorType);
// Batch embedding for multiple texts (default: sequential delegation)
default List<Tensor> embed(List<String> texts, Context context, TensorType tensorType);
// Decode token IDs back to text
default String decode(List<Integer> tokens, Context context);
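The "sequential delegation" noted for the batch method can be sketched as follows. This is a simplified illustration, not the source of Embedder: BatchSketchEmbedder is a hypothetical stand-in in which float[] replaces Tensor and the Context and TensorType parameters are omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: float[] replaces Tensor; Context and TensorType are omitted.
interface BatchSketchEmbedder {

    // Single-text embedding, supplied by each implementation
    float[] embed(String text);

    // Sequential delegation: one single-text call per input, input order preserved.
    // An implementation with real batched inference would override this method.
    default List<float[]> embed(List<String> texts) {
        List<float[]> results = new ArrayList<>(texts.size());
        for (String text : texts) {
            results.add(embed(text));
        }
        return results;
    }
}
```

With this shape, an implementation only has to provide the single-text method to get working batch behavior, and the batch call costs the same as calling the single-text method once per input.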
Import
import com.yahoo.language.process.Embedder;
I/O Contract
Inputs (Primary embed Method)
| Name | Type | Required | Description |
|---|---|---|---|
| text | String | Yes | The input text to embed. The text is processed by the embedding model's internal tokenizer, which may differ from Vespa's general-purpose tokenizer. |
| context | Embedder.Context | Yes | Metadata about the embedding request including: the destination document type, the target tensor field name, and the embedder identifier. Used by the embedding infrastructure for routing and caching. |
| tensorType | TensorType | Yes | The desired output tensor type, specifying the dimensionality and element type (float, bfloat16, int8, etc.) of the resulting embedding vector. |
Outputs
| Name | Type | Description |
|---|---|---|
| (return value) | Tensor | A dense tensor representing the embedding of the input text. The tensor conforms to the specified tensorType in terms of dimensions and element precision. For a typical embedding model producing 384-dimensional vectors, this would be a tensor of shape [384]. |
Additional Method I/O
| Method | Inputs | Output | Description |
|---|---|---|---|
| embed(String, Context) | text, context | List<Integer> | Returns the model-specific token IDs for the input text. |
| embed(List<String>, Context, TensorType) | texts, context, tensorType | List<Tensor> | Batch embeds multiple texts, returning one tensor per input text. |
| decode(List<Integer>, Context) | tokens, context | String | Converts token IDs back to human-readable text. |
Usage Examples
Basic Embedding
import com.yahoo.language.process.Embedder;
import com.yahoo.tensor.Tensor;
import com.yahoo.tensor.TensorType;
// Obtain an embedder instance (typically via dependency injection)
Embedder embedder = getConfiguredEmbedder();
// Create context for the embedding request
Embedder.Context context = new Embedder.Context("document_type");
context.setEmbedderId("default");
// Define the output tensor type: a 384-dimensional float vector
TensorType tensorType = TensorType.fromSpec("tensor<float>(x[384])");
// Generate embedding
Tensor embedding = embedder.embed("The quick brown fox jumps over the lazy dog", context, tensorType);
// embedding is a 384-dimensional dense tensor
Token ID Generation
import com.yahoo.language.process.Embedder;
import java.util.List;
Embedder embedder = getConfiguredEmbedder();
Embedder.Context context = new Embedder.Context("document_type");
// Get model-specific token IDs
List<Integer> tokenIds = embedder.embed("Hello world", context);
// tokenIds -> [101, 7592, 2088, 102] (example BERT token IDs)
// Decode back to text
String decoded = embedder.decode(tokenIds, context);
// decoded -> "hello world" (reconstructed from token IDs)
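The round trip above does not necessarily return the original string, because many vocabularies normalize text (for example by lowercasing) before looking up IDs. The toy tokenizer below is a hypothetical illustration of that effect; ToyTokenizer, its three-entry vocabulary, and its whitespace splitting are assumptions for this sketch and do not reflect any specific Vespa embedder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical toy tokenizer: a three-entry vocabulary where ID 0 marks unknown words.
final class ToyTokenizer {

    private static final int UNKNOWN_ID = 0;
    private static final List<String> VOCABULARY = List.of("[UNK]", "hello", "world");

    /** Lowercases the text, splits on whitespace, and maps each word to its vocabulary index. */
    static List<Integer> encode(String text) {
        List<Integer> ids = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            int id = VOCABULARY.indexOf(word);
            ids.add(id < 0 ? UNKNOWN_ID : id);
        }
        return ids;
    }

    /** Maps each ID back to its vocabulary entry and joins the entries with spaces. */
    static String decode(List<Integer> ids) {
        List<String> words = new ArrayList<>();
        for (int id : ids) {
            words.add(VOCABULARY.get(id));
        }
        return String.join(" ", words);
    }
}
```

Here ToyTokenizer.encode("Hello world") returns [1, 2], and decoding those IDs gives "hello world": the capital letter is lost because normalization happens before the vocabulary lookup.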
Batch Embedding
import com.yahoo.language.process.Embedder;
import com.yahoo.tensor.Tensor;
import com.yahoo.tensor.TensorType;
import java.util.List;
Embedder embedder = getConfiguredEmbedder();
Embedder.Context context = new Embedder.Context("document_type");
TensorType tensorType = TensorType.fromSpec("tensor<float>(x[384])");
// Embed multiple texts in one call (faster than individual calls only when the
// implementation overrides the default, which embeds the texts one by one)
List<String> texts = List.of(
    "First document text",
    "Second document text",
    "Third document text"
);
List<Tensor> embeddings = embedder.embed(texts, context, tensorType);
// embeddings.size() == 3, one tensor per input text
Implementing a Custom Embedder
import com.yahoo.language.process.Embedder;
import com.yahoo.tensor.Tensor;
import com.yahoo.tensor.TensorType;
import java.util.List;
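public class MyCustomEmbedder implements Embedder {

    // myTokenizer and myModel are placeholders for your own tokenizer and
    // inference components; they are not part of Vespa.

    @Override
    public List<Integer> embed(String text, Context context) {
        // Tokenize using the model-specific vocabulary
        return myTokenizer.encode(text);
    }

    @Override
    public Tensor embed(String text, Context context, TensorType tensorType) {
        // Run the full embedding pipeline: tokenize, infer, then build the tensor
        List<Integer> tokens = embed(text, context);
        float[] vector = myModel.infer(tokens);
        // Copy the values into a tensor of the requested type
        // (assumes tensorType has a single indexed dimension of matching size)
        Tensor.Builder builder = Tensor.Builder.of(tensorType);
        for (int i = 0; i < vector.length; i++) {
            builder.cell(vector[i], i);
        }
        return builder.build();
    }

    @Override
    public String decode(List<Integer> tokens, Context context) {
        // Map token IDs back to text using the same vocabulary
        return myTokenizer.decode(tokens);
    }
}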