Implementation:Vespa engine Vespa Embedder Embed

Knowledge Sources	Vespa
Domains	NLP, Text_Processing, Machine_Learning
Last Updated	2026-02-09 00:00 GMT

Overview

Interface for transforming text into dense vector representations (tensors), provided by Vespa's linguistics library. It defines the contract that all embedding model implementations must fulfill: single-text embedding, batch embedding, token ID generation, and token decoding.

Description

The Embedder interface defines the contract for text embedding in Vespa's linguistics framework. This page documents an interface definition rather than a concrete implementation. Concrete implementations, such as ONNX-based embedders and HuggingFace tokenizer embedders, implement this interface to provide the actual embedding functionality.

The interface defines four core operations:

embed(String text, Context context) -> List<Integer>: Converts input text into a list of token IDs according to the embedding model's vocabulary. This is the tokenization step specific to the embedding model (not to be confused with Vespa's general-purpose tokenization).
embed(String text, Context context, TensorType tensorType) -> Tensor: The primary embedding method. Converts input text into a dense tensor (vector) of the specified type and dimensionality. This performs the full embedding pipeline: model-specific tokenization, neural network inference, and pooling.
embed(List<String> texts, Context context, TensorType tensorType) -> List<Tensor>: Batch embedding for multiple texts. The default implementation calls the single-text method in a loop, but concrete implementations can override this for batched GPU inference.
decode(List<Integer> tokens, Context context) -> String: Reverse operation that converts a list of token IDs back to human-readable text. Useful for debugging and token inspection.
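
The sequential delegation described for the batch method can be sketched as follows. This is a minimal illustration, not Vespa's code: `SketchEmbedder`, `BatchDelegationSketch`, and `LENGTH_EMBEDDER` are hypothetical names, and `float[]` with an `int` dimension count stands in for `Tensor` and `TensorType` so the sketch compiles without Vespa on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the interface described above; float[] replaces Tensor.
interface SketchEmbedder {

    // Single-text embedding: the method a concrete implementation must provide.
    float[] embed(String text, int dimensions);

    // Batch embedding by sequential delegation, one single-text call per input.
    // An implementation with batched inference could override this method.
    default List<float[]> embed(List<String> texts, int dimensions) {
        List<float[]> result = new ArrayList<>(texts.size());
        for (String text : texts) {
            result.add(embed(text, dimensions));
        }
        return result;
    }
}

class BatchDelegationSketch {

    // Toy embedder: stores the text length in the first cell, zeros elsewhere.
    static final SketchEmbedder LENGTH_EMBEDDER = (text, dimensions) -> {
        float[] vector = new float[dimensions];
        vector[0] = text.length();
        return vector;
    };

    public static void main(String[] args) {
        List<float[]> vectors = LENGTH_EMBEDDER.embed(List.of("ab", "abcd"), 3);
        System.out.println(vectors.size() + " vectors, first cells: "
                + vectors.get(0)[0] + ", " + vectors.get(1)[0]);
    }
}
```

With this shape, callers always get one result per input text in input order, whether or not the implementation overrides the batch method.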

Key design elements:

defaultEmbedderId: A constant string "default" used as the identifier when only one embedder is configured.
throwsOnUse: A sentinel FailingEmbedder instance that throws an exception on any method call. This is used as a placeholder when no real embedder is configured, providing clear error messages rather than null pointer exceptions.
Context: A nested class that carries metadata about the embedding request, including the destination document type, destination tensor name, and the embedding model identifier.
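
The sentinel idea behind throwsOnUse can be sketched in isolation. The names below (`TokenEmbedder`, `FailingTokenEmbedder`, `SentinelSketch`, `orSentinel`) and the exception type and message are assumptions made for illustration; the source only states that the real FailingEmbedder throws on any method call.

```java
import java.util.List;

// Hypothetical stand-in interface; the real Embedder has more methods and uses Tensor.
interface TokenEmbedder {

    List<Integer> embed(String text);

    // Sentinel used when nothing is configured: every call fails with a clear message.
    TokenEmbedder THROWS_ON_USE = new FailingTokenEmbedder();
}

// Assumed shape of a failing placeholder; the message text is illustrative only.
class FailingTokenEmbedder implements TokenEmbedder {

    @Override
    public List<Integer> embed(String text) {
        throw new IllegalStateException("No embedder has been configured");
    }
}

class SentinelSketch {

    // Returns the configured embedder, or the sentinel instead of null.
    static TokenEmbedder orSentinel(TokenEmbedder configured) {
        return configured != null ? configured : TokenEmbedder.THROWS_ON_USE;
    }

    public static void main(String[] args) {
        TokenEmbedder embedder = orSentinel(null);
        try {
            embedder.embed("hello");
        } catch (IllegalStateException e) {
            System.out.println("Failed as intended: " + e.getMessage());
        }
    }
}
```

The benefit of the pattern is that the failure surfaces at the first call with a descriptive message, instead of as a null pointer exception somewhere downstream.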

Usage

Use the Embedder interface when:

Configuring embedding in Vespa schemas: Declare an embedder in the application package to enable automatic embedding of text fields into tensor fields.
Implementing a custom embedder: Create a class that implements Embedder to integrate a new embedding model (e.g., a custom ONNX model or an API-based embedding service).
Programmatic embedding: Call embed() directly to generate embeddings for search queries at query time.

The Embedder interface is the extension point for all embedding functionality in Vespa. Concrete implementations include:

ONNX-based embedders that run transformer models locally.
HuggingFace tokenizer-based embedders.
ColBERT embedders for late-interaction retrieval.
Custom embedders that call external embedding APIs.

Code Reference

Source Location

Repository: Vespa
File: linguistics/src/main/java/com/yahoo/language/process/Embedder.java
Lines: 66

Signature

Tensor embed(String text, Context context, TensorType tensorType)

Interface Declaration

public interface Embedder

Package

package com.yahoo.language.process;

Key Constants

String defaultEmbedderId = "default";
Embedder throwsOnUse = new FailingEmbedder();

Full Interface Methods

// Convert text to model-specific token IDs
List<Integer> embed(String text, Context context);

// Convert text to a dense tensor embedding
Tensor embed(String text, Context context, TensorType tensorType);

// Batch embedding for multiple texts (default method: sequential delegation; body omitted here)
default List<Tensor> embed(List<String> texts, Context context, TensorType tensorType);

// Decode token IDs back to text (default method; body omitted here)
default String decode(List<Integer> tokens, Context context);

Import

import com.yahoo.language.process.Embedder;

I/O Contract

Inputs (Primary embed Method)

Name	Type	Required	Description
text	`String`	Yes	The input text to embed. The text is processed by the embedding model's internal tokenizer, which may differ from Vespa's general-purpose tokenizer.
context	`Embedder.Context`	Yes	Metadata about the embedding request including: the destination document type, the target tensor field name, and the embedder identifier. Used by the embedding infrastructure for routing and caching.
tensorType	`TensorType`	Yes	The desired output tensor type, specifying the dimensionality and element type (float, bfloat16, int8, etc.) of the resulting embedding vector.

Outputs

Name	Type	Description
(return value)	`Tensor`	A dense tensor representing the embedding of the input text. The tensor conforms to the specified `tensorType` in terms of dimensions and element precision. For a typical embedding model producing 384-dimensional vectors, this would be a tensor of shape `[384]`.
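
The output has a fixed size regardless of input length because the pipeline's pooling step collapses per-token vectors into one vector. The source does not say which pooling strategy a given implementation uses; the sketch below assumes mean pooling purely for illustration, and `MeanPoolingSketch` and `meanPool` are hypothetical names.

```java
// Mean pooling sketch: averages per-token vectors into one fixed-size vector.
class MeanPoolingSketch {

    // tokenVectors[i] is the (hypothetical) model output for token i;
    // all rows are assumed to have the same length.
    static float[] meanPool(float[][] tokenVectors) {
        if (tokenVectors.length == 0) {
            throw new IllegalArgumentException("No token vectors to pool");
        }
        int dimensions = tokenVectors[0].length;
        float[] pooled = new float[dimensions];
        // Sum each dimension across all tokens.
        for (float[] tokenVector : tokenVectors) {
            for (int d = 0; d < dimensions; d++) {
                pooled[d] += tokenVector[d];
            }
        }
        // Divide by the token count to get the mean.
        for (int d = 0; d < dimensions; d++) {
            pooled[d] /= tokenVectors.length;
        }
        return pooled;
    }

    public static void main(String[] args) {
        float[] pooled = meanPool(new float[][] { {1f, 2f}, {3f, 6f} });
        System.out.println(pooled[0] + ", " + pooled[1]);
    }
}
```

Two token vectors of length 2 pool to one vector of length 2; a 384-dimensional model would pool any number of token vectors to a single vector of length 384 in the same way.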

Additional Method I/O

Method	Inputs	Output	Description
`embed(String, Context)`	text, context	`List<Integer>`	Returns the model-specific token IDs for the input text.
`embed(List<String>, Context, TensorType)`	texts, context, tensorType	`List<Tensor>`	Batch embeds multiple texts, returning one tensor per input text.
`decode(List<Integer>, Context)`	tokens, context	`String`	Converts token IDs back to human-readable text.
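
The relationship between the token ID method and decode can be illustrated with a toy vocabulary. This is a sketch under stated assumptions: `ToyVocabulary`, its three-entry vocabulary, and the whitespace-and-lowercase tokenization are all invented for illustration and do not reflect any real model's tokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy vocabulary tokenizer; a hypothetical stand-in for a model-specific tokenizer.
class ToyVocabulary {

    private static final int UNKNOWN_ID = 0;
    // Illustrative ids, not taken from any real model vocabulary.
    private static final Map<String, Integer> IDS = Map.of("hello", 1, "world", 2);
    private static final List<String> WORDS = List.of("[UNK]", "hello", "world");

    // Text -> token ids: lowercase, split on whitespace, look up each word.
    static List<Integer> encode(String text) {
        List<Integer> ids = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!word.isEmpty()) {
                ids.add(IDS.getOrDefault(word, UNKNOWN_ID));
            }
        }
        return ids;
    }

    // Token ids -> text: map each id back to its word and join with spaces.
    static String decode(List<Integer> ids) {
        List<String> words = new ArrayList<>();
        for (int id : ids) {
            words.add(WORDS.get(id));
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        List<Integer> ids = encode("Hello world");
        System.out.println(ids + " -> " + decode(ids));
    }
}
```

The round trip is lossy in this sketch: casing is dropped and out-of-vocabulary words come back as `[UNK]`, which is why decoded text is best treated as a debugging aid rather than a reconstruction of the input.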

Usage Examples

Basic Embedding

import com.yahoo.language.process.Embedder;
import com.yahoo.tensor.Tensor;
import com.yahoo.tensor.TensorType;

// Obtain an embedder instance (typically via dependency injection)
Embedder embedder = getConfiguredEmbedder();

// Create context for the embedding request
Embedder.Context context = new Embedder.Context("document_type");
context.setEmbedderId("default");

// Define the output tensor type: a 384-dimensional float vector
TensorType tensorType = TensorType.fromSpec("tensor<float>(x[384])");

// Generate embedding
Tensor embedding = embedder.embed("The quick brown fox jumps over the lazy dog", context, tensorType);
// embedding is a 384-dimensional dense tensor

Token ID Generation

import com.yahoo.language.process.Embedder;
import java.util.List;

Embedder embedder = getConfiguredEmbedder();
Embedder.Context context = new Embedder.Context("document_type");

// Get model-specific token IDs
List<Integer> tokenIds = embedder.embed("Hello world", context);
// tokenIds -> [101, 7592, 2088, 102]  (example BERT token IDs)

// Decode back to text
String decoded = embedder.decode(tokenIds, context);
// decoded -> "hello world"  (reconstructed from token IDs)

Batch Embedding

import com.yahoo.language.process.Embedder;
import com.yahoo.tensor.Tensor;
import com.yahoo.tensor.TensorType;
import java.util.List;

Embedder embedder = getConfiguredEmbedder();
Embedder.Context context = new Embedder.Context("document_type");
TensorType tensorType = TensorType.fromSpec("tensor<float>(x[384])");

// Embed multiple texts in one call (faster than individual calls only if the
// implementation overrides the batch method; the default delegates sequentially)
List<String> texts = List.of(
    "First document text",
    "Second document text",
    "Third document text"
);

List<Tensor> embeddings = embedder.embed(texts, context, tensorType);
// embeddings.size() == 3, one tensor per input text

Implementing a Custom Embedder

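import com.yahoo.language.process.Embedder;
import com.yahoo.tensor.Tensor;
import com.yahoo.tensor.TensorType;
import java.util.List;

public class MyCustomEmbedder implements Embedder {

    // MyTokenizer and MyModel are hypothetical application classes, not part of Vespa
    private final MyTokenizer myTokenizer;
    private final MyModel myModel;

    public MyCustomEmbedder(MyTokenizer myTokenizer, MyModel myModel) {
        this.myTokenizer = myTokenizer;
        this.myModel = myModel;
    }

    @Override
    public List<Integer> embed(String text, Context context) {
        // Tokenize using model-specific vocabulary
        return myTokenizer.encode(text);
    }

    @Override
    public Tensor embed(String text, Context context, TensorType tensorType) {
        // Run the full embedding pipeline: tokenize, infer, then copy into a tensor
        List<Integer> tokens = embed(text, context);
        float[] vector = myModel.infer(tokens);
        // Assumes tensorType has a single indexed dimension of size vector.length
        Tensor.Builder builder = Tensor.Builder.of(tensorType);
        for (int i = 0; i < vector.length; i++) {
            builder.cell(vector[i], i);
        }
        return builder.build();
    }

    @Override
    public String decode(List<Integer> tokens, Context context) {
        return myTokenizer.decode(tokens);
    }
}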

Related Pages

Implements Principle

Principle:Vespa_engine_Vespa_Embedding_Generation
