Implementation:Vespa engine Vespa Embedder Embed

From Leeroopedia


Knowledge Sources
Domains: NLP, Text_Processing, Machine_Learning
Last Updated: 2026-02-09 00:00 GMT

Overview

The Embedder interface, provided by Vespa's linguistics library, is the API for transforming text into dense vector representations (tensors). It defines the contract that all embedding model implementations must fulfill: single-text embedding, batch embedding, token ID generation, and token decoding.

Description

The Embedder interface defines the contract for text embedding in Vespa's linguistics framework. This page documents an interface definition rather than a concrete implementation: concrete implementations, such as ONNX-based embedders and HuggingFace tokenizer embedders, implement this interface to provide the actual embedding functionality.

The interface defines four core operations:

  1. embed(String text, Context context) -> List<Integer>: Converts input text into a list of token IDs according to the embedding model's vocabulary. This is the tokenization step specific to the embedding model (not to be confused with Vespa's general-purpose tokenization).
  2. embed(String text, Context context, TensorType tensorType) -> Tensor: The primary embedding method. Converts input text into a dense tensor (vector) of the specified type and dimensionality. This performs the full embedding pipeline: model-specific tokenization, neural network inference, and pooling.
  3. embed(List<String> texts, Context context, TensorType tensorType) -> List<Tensor>: Batch embedding for multiple texts. The default implementation calls the single-text method in a loop, but concrete implementations can override this for batched GPU inference.
  4. decode(List<Integer> tokens, Context context) -> String: Reverse operation that converts a list of token IDs back to human-readable text. Useful for debugging and token inspection.
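The relationship between the single-text and batch operations above can be sketched as follows. This is a minimal illustration under stated assumptions: `SketchEmbedder`, `embedAll`, and `LengthEmbedder` are hypothetical stand-ins written for this example, with plain `float[]` in place of `Tensor` and no `Context`, so the block runs without Vespa on the classpath. It is not Vespa's own source.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for illustration only; this is NOT Vespa's Embedder interface.
interface SketchEmbedder {

    // Single-text embedding; each implementation supplies the model-specific work.
    float[] embed(String text);

    // Batch embedding: by default, delegate to the single-text method in a loop.
    // An implementation with real batched inference would override this method.
    default List<float[]> embedAll(List<String> texts) {
        List<float[]> results = new ArrayList<>(texts.size());
        for (String text : texts) {
            results.add(embed(text));
        }
        return results;
    }
}

// Toy implementation: a one-dimensional "embedding" that holds the text length.
class LengthEmbedder implements SketchEmbedder {
    @Override
    public float[] embed(String text) {
        return new float[] { text.length() };
    }
}
```

With this shape, callers can always use the batch method, and only implementations that benefit from true batching need to override it.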

Key design elements:

  • defaultEmbedderId: A constant string "default" used as the identifier when only one embedder is configured.
  • throwsOnUse: A sentinel FailingEmbedder instance that throws an exception on any method call. This is used as a placeholder when no real embedder is configured, providing clear error messages rather than null pointer exceptions.
  • Context: A nested class that carries metadata about the embedding request, including the destination document type, destination tensor name, and the embedding model identifier.
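The sentinel idea behind throwsOnUse can be sketched in isolation. The names below (`TextEncoder`, `FailingTextEncoder`, `QueryHandler`, and the exception message) are hypothetical and chosen for this example only; the sketch shows the general pattern of a placeholder that fails loudly when used, and does not reproduce Vespa's FailingEmbedder.

```java
// Simplified stand-in for illustration only; this is NOT Vespa's Embedder interface.
interface TextEncoder {

    // Shared sentinel: safe to store and pass around, fails only when actually used.
    TextEncoder THROWS_ON_USE = new FailingTextEncoder();

    float[] encode(String text);
}

// Placeholder used when nothing is configured; every call fails with a clear message.
class FailingTextEncoder implements TextEncoder {
    @Override
    public float[] encode(String text) {
        throw new IllegalStateException("No encoder has been configured");
    }
}

// Component that falls back to the sentinel instead of holding a null reference.
class QueryHandler {
    private final TextEncoder encoder;

    QueryHandler(TextEncoder encoder) {
        this.encoder = (encoder != null) ? encoder : TextEncoder.THROWS_ON_USE;
    }

    float[] handle(String query) {
        return encoder.encode(query);
    }
}
```

The benefit is that a missing configuration surfaces as a descriptive exception at the point of use, rather than as a NullPointerException somewhere downstream.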

Usage

Use the Embedder interface when:

  • Configuring embedding in Vespa schemas: Declare an embedder in the application package to enable automatic embedding of text fields into tensor fields.
  • Implementing a custom embedder: Create a class that implements Embedder to integrate a new embedding model (e.g., a custom ONNX model or an API-based embedding service).
  • Programmatic embedding: Call embed() directly to generate embeddings for search queries at query time.

The Embedder interface is the extension point for all embedding functionality in Vespa. Concrete implementations include:

  • ONNX-based embedders that run transformer models locally.
  • HuggingFace tokenizer-based embedders.
  • ColBERT embedders for late-interaction retrieval.
  • Custom embedders that call external embedding APIs.

Code Reference

Source Location

  • Repository: Vespa
  • File: linguistics/src/main/java/com/yahoo/language/process/Embedder.java
  • Lines: 66

Signature

Tensor embed(String text, Context context, TensorType tensorType)

Interface Declaration

public interface Embedder

Package

package com.yahoo.language.process;

Key Constants

String defaultEmbedderId = "default";
Embedder throwsOnUse = new FailingEmbedder();

Full Interface Methods

// Convert text to model-specific token IDs
List<Integer> embed(String text, Context context);

// Convert text to a dense tensor embedding
Tensor embed(String text, Context context, TensorType tensorType);

// Batch embedding for multiple texts (has a default implementation: sequential delegation)
List<Tensor> embed(List<String> texts, Context context, TensorType tensorType);

// Decode token IDs back to text (has a default implementation)
String decode(List<Integer> tokens, Context context);

Import

import com.yahoo.language.process.Embedder;

I/O Contract

Inputs (Primary embed Method)

Name | Type | Required | Description
text | String | Yes | The input text to embed. The text is processed by the embedding model's internal tokenizer, which may differ from Vespa's general-purpose tokenizer.
context | Embedder.Context | Yes | Metadata about the embedding request including: the destination document type, the target tensor field name, and the embedder identifier. Used by the embedding infrastructure for routing and caching.
tensorType | TensorType | Yes | The desired output tensor type, specifying the dimensionality and element type (float, bfloat16, int8, etc.) of the resulting embedding vector.

Outputs

Name | Type | Description
(return value) | Tensor | A dense tensor representing the embedding of the input text. The tensor conforms to the specified tensorType in terms of dimensions and element precision. For a typical embedding model producing 384-dimensional vectors, this would be a tensor of shape [384].
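A fixed-size output like the one described above typically comes from the pooling step of the embedding pipeline, which reduces one vector per token to a single vector. The following is a minimal sketch of mean pooling on plain arrays. `MeanPooling` is a hypothetical helper written for this example, and the choice of mean pooling is an assumption: a given model or embedder may pool differently (for example, by taking the first token's vector).

```java
// Illustrative helper; not part of Vespa's API.
final class MeanPooling {

    private MeanPooling() {}

    // Averages per-token vectors (one row per token) into a single vector.
    static float[] pool(float[][] tokenVectors) {
        if (tokenVectors.length == 0) {
            throw new IllegalArgumentException("At least one token vector is required");
        }
        int dimensions = tokenVectors[0].length;
        float[] pooled = new float[dimensions];
        for (float[] tokenVector : tokenVectors) {
            if (tokenVector.length != dimensions) {
                throw new IllegalArgumentException("All token vectors must have the same length");
            }
            for (int d = 0; d < dimensions; d++) {
                pooled[d] += tokenVector[d];
            }
        }
        // Divide the per-dimension sums by the token count to obtain the mean.
        for (int d = 0; d < dimensions; d++) {
            pooled[d] /= tokenVectors.length;
        }
        return pooled;
    }
}
```

The output length equals the model's hidden dimension regardless of how many tokens the input text produced, which is why the result can match a fixed tensorType.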

Additional Method I/O

Method | Inputs | Output | Description
embed(String, Context) | text, context | List<Integer> | Returns the model-specific token IDs for the input text.
embed(List<String>, Context, TensorType) | texts, context, tensorType | List<Tensor> | Batch embeds multiple texts, returning one tensor per input text.
decode(List<Integer>, Context) | tokens, context | String | Converts token IDs back to human-readable text.
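The token ID and decode operations in the table can be illustrated with a toy round trip. This sketch assumes a hypothetical three-entry vocabulary and whitespace splitting; `ToyVocabulary` and its IDs are invented for this example and do not correspond to any real model's tokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy whitespace tokenizer with a hypothetical vocabulary; not Vespa code.
final class ToyVocabulary {

    // The list index doubles as the token ID; ID 0 is reserved for unknown words.
    private static final List<String> WORDS = List.of("[unk]", "hello", "world");

    private ToyVocabulary() {}

    // Lowercases the text, splits on whitespace, and maps each word to its ID.
    static List<Integer> encode(String text) {
        List<Integer> ids = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            int id = WORDS.indexOf(word);
            ids.add(id >= 0 ? id : 0);
        }
        return ids;
    }

    // Maps each ID back to its vocabulary entry and joins the words with spaces.
    static String decode(List<Integer> ids) {
        List<String> words = new ArrayList<>();
        for (int id : ids) {
            words.add(WORDS.get(id));
        }
        return String.join(" ", words);
    }
}
```

Note that the round trip is lossy: casing is discarded and out-of-vocabulary words collapse to the unknown token, so decoded text is useful for inspection rather than exact reconstruction.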

Usage Examples

Basic Embedding

import com.yahoo.language.process.Embedder;
import com.yahoo.tensor.Tensor;
import com.yahoo.tensor.TensorType;

// Obtain an embedder instance (typically via dependency injection);
// getConfiguredEmbedder() is a placeholder for however your component receives it
Embedder embedder = getConfiguredEmbedder();

// Create context for the embedding request
Embedder.Context context = new Embedder.Context("document_type");
context.setEmbedderId("default");

// Define the output tensor type: a 384-dimensional float vector
TensorType tensorType = TensorType.fromSpec("tensor<float>(x[384])");

// Generate embedding
Tensor embedding = embedder.embed("The quick brown fox jumps over the lazy dog", context, tensorType);
// embedding is a 384-dimensional dense tensor

Token ID Generation

import com.yahoo.language.process.Embedder;
import java.util.List;

Embedder embedder = getConfiguredEmbedder();
Embedder.Context context = new Embedder.Context("document_type");

// Get model-specific token IDs
List<Integer> tokenIds = embedder.embed("Hello world", context);
// tokenIds -> [101, 7592, 2088, 102]  (example BERT token IDs)

// Decode back to text
String decoded = embedder.decode(tokenIds, context);
// decoded -> "hello world"  (illustrative; exact output depends on the tokenizer, e.g. casing and special tokens)

Batch Embedding

import com.yahoo.language.process.Embedder;
import com.yahoo.tensor.Tensor;
import com.yahoo.tensor.TensorType;
import java.util.List;

Embedder embedder = getConfiguredEmbedder();
Embedder.Context context = new Embedder.Context("document_type");
TensorType tensorType = TensorType.fromSpec("tensor<float>(x[384])");

// Embed multiple texts in one call; this is only faster than individual calls
// when the implementation overrides the default sequential batch method
List<String> texts = List.of(
    "First document text",
    "Second document text",
    "Third document text"
);

List<Tensor> embeddings = embedder.embed(texts, context, tensorType);
// embeddings.size() == 3, one tensor per input text

Implementing a Custom Embedder

import com.yahoo.language.process.Embedder;
import com.yahoo.tensor.Tensor;
import com.yahoo.tensor.TensorType;
import java.util.List;

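public class MyCustomEmbedder implements Embedder {

    // MyTokenizer and MyModel are hypothetical application classes, not part of Vespa
    private final MyTokenizer myTokenizer;
    private final MyModel myModel;

    public MyCustomEmbedder(MyTokenizer myTokenizer, MyModel myModel) {
        this.myTokenizer = myTokenizer;
        this.myModel = myModel;
    }
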
    @Override
    public List<Integer> embed(String text, Context context) {
        // Tokenize using model-specific vocabulary
        return myTokenizer.encode(text);
    }

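    @Override
    public Tensor embed(String text, Context context, TensorType tensorType) {
        // Run the full embedding pipeline: tokenize, infer, then copy into a tensor
        List<Integer> tokens = embed(text, context);
        float[] vector = myModel.infer(tokens);
        // Assumes tensorType is a one-dimensional indexed type, e.g. tensor<float>(x[384])
        Tensor.Builder builder = Tensor.Builder.of(tensorType);
        for (int i = 0; i < vector.length; i++) {
            builder.cell(vector[i], i);
        }
        return builder.build();
    }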

    @Override
    public String decode(List<Integer> tokens, Context context) {
        return myTokenizer.decode(tokens);
    }
}

Related Pages

Implements Principle
