Implementation:AnswerDotAI RAGatouille Export To Vespa ONNX

From Leeroopedia
Knowledge Sources
Domains Model_Export, Deployment, NLP
Last Updated 2026-02-12 12:00 GMT

Overview

A concrete tool, provided by the RAGatouille library, for converting a ColBERT model checkpoint to ONNX format for deployment on Vespa search infrastructure.

Description

The export_to_vespa_onnx function converts a trained ColBERT model into an ONNX file suitable for deployment in Vespa. It uses the VespaColBERT helper class, which wraps a BERT model with a linear projection layer to produce L2-normalized token embeddings (typically 128-dimensional). The ONNX export declares dynamic axes so that the model accepts variable batch sizes and sequence lengths at inference time, and it targets ONNX opset version 17.
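The export settings described above map naturally onto keyword arguments of `torch.onnx.export`. The sketch below only builds that argument dictionary, so it runs without PyTorch installed. The helper name `build_export_kwargs`, the output name `"embeddings"`, and the axis labels are assumptions made for illustration and may not match what RAGatouille uses internally.

```python
from typing import Any, Dict


def build_export_kwargs(opset_version: int = 17) -> Dict[str, Any]:
    """Assemble keyword arguments for torch.onnx.export (hypothetical helper)."""
    # Input names follow VespaColBERT.forward; the output name is an assumption.
    input_names = ["input_ids", "attention_mask"]
    output_names = ["embeddings"]

    # Axis 0 is the batch dimension and axis 1 the sequence dimension.
    # Marking both as dynamic lets the exported graph accept any batch size
    # and any sequence length instead of the shapes of the tracing inputs.
    dynamic = {0: "batch", 1: "sequence"}
    dynamic_axes = {name: dict(dynamic) for name in input_names + output_names}

    return {
        "input_names": input_names,
        "output_names": output_names,
        "dynamic_axes": dynamic_axes,
        "opset_version": opset_version,
    }


export_kwargs = build_export_kwargs()
print(export_kwargs["dynamic_axes"]["input_ids"])
```

With PyTorch available, a dictionary like this could be unpacked into a call such as `torch.onnx.export(model, (input_ids, attention_mask), str(onnx_path), **export_kwargs)`, where `model`, the example tensors, and `onnx_path` are placeholders.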

Usage

Use this function when deploying a trained ColBERT model to a Vespa search cluster. It enables production-scale serving of ColBERT late-interaction retrieval without requiring PyTorch at inference time. It is typically called either directly or via export_to_huggingface_hub with the export_vespa_onnx=True flag.
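For context on how the exported embeddings are used at serving time: late interaction scores a query against a document by taking, for each query token, the maximum similarity over all document tokens and summing those maxima (MaxSim). The dependency-free sketch below illustrates the idea on toy vectors. The function names are hypothetical, and in a real deployment this scoring would be expressed in Vespa's ranking configuration rather than in Python.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def l2_normalize(vec: Vector) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the best dot product with any document token."""
    score = 0.0
    for q in query_tokens:
        score += max(sum(a * b for a, b in zip(q, d)) for d in doc_tokens)
    return score


# Toy 2-dimensional embeddings; real ColBERT embeddings are typically 128-dimensional.
query = [l2_normalize(v) for v in ([1.0, 0.0], [0.0, 2.0])]
doc = [l2_normalize(v) for v in ([3.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
print(maxsim_score(query, doc))
```

Because the embeddings are L2-normalized, each dot product is a cosine similarity, so every query token contributes at most 1.0 to the score.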

Code Reference

Source Location

Signature

from pathlib import Path
from typing import Union

from transformers import BertPreTrainedModel


class VespaColBERT(BertPreTrainedModel):
    """Wrapper model that produces normalized ColBERT embeddings for Vespa."""
    def __init__(self, config, dim):
        """
        Args:
            config: HuggingFace BERT config.
            dim: Output embedding dimension (typically 128).
        """

    def forward(self, input_ids, attention_mask):
        """
        Args:
            input_ids: Token IDs tensor of shape (batch, seq_len).
            attention_mask: Attention mask tensor of shape (batch, seq_len).
        Returns:
            L2-normalized embeddings of shape (batch, seq_len, dim).
        """


def export_to_vespa_onnx(
    colbert_path: Union[str, Path],
    out_path: Union[str, Path],
    out_file_name: str = "vespa_colbert.onnx",
) -> None:
    """
    Export a ColBERT checkpoint to Vespa-compatible ONNX format.

    Args:
        colbert_path: Path to the ColBERT model checkpoint.
        out_path: Directory to write the ONNX file.
        out_file_name: Output filename (default: 'vespa_colbert.onnx').
    """

Import

from ragatouille.models.utils import export_to_vespa_onnx

I/O Contract

Inputs

Name | Type | Required | Description
colbert_path | Union[str, Path] | Yes | Path to the ColBERT model checkpoint directory
out_path | Union[str, Path] | Yes | Directory where the ONNX file will be written
out_file_name | str | No | Name of the output ONNX file (default: 'vespa_colbert.onnx')

Outputs

Name | Type | Description
return | None | No return value; prints status messages to stdout
side effect | ONNX file | Writes '{out_file_name}' to the '{out_path}' directory
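
The inputs and the side effect listed above combine into a single output location. A minimal sketch of that path handling with `pathlib` follows. `resolve_onnx_path` is a hypothetical helper, and creating the output directory when it is missing is an assumption about desirable behavior, not a statement about what `export_to_vespa_onnx` does internally.

```python
import tempfile
from pathlib import Path
from typing import Union


def resolve_onnx_path(
    out_path: Union[str, Path],
    out_file_name: str = "vespa_colbert.onnx",
) -> Path:
    """Return the full path of the ONNX file, creating the directory if needed."""
    out_dir = Path(out_path)
    # Assumption: create missing parent directories rather than failing.
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / out_file_name


# Demonstrate inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    target = resolve_onnx_path(Path(tmp) / "vespa_export")
    print(target.name, target.parent.is_dir())
```

Accepting both `str` and `Path` and converting once at the top mirrors the `Union[str, Path]` types in the signature.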

Usage Examples

Direct ONNX Export

from ragatouille.models.utils import export_to_vespa_onnx

# Convert a ColBERT checkpoint to Vespa ONNX format
export_to_vespa_onnx(
    colbert_path="experiments/colbert/none/2024-01/checkpoints/colbert",
    out_path="./vespa_export",
    out_file_name="vespa_colbert.onnx",
)
# Output file: ./vespa_export/vespa_colbert.onnx
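
After an export like the one above, a quick file-level check can catch a missing or empty artifact before it is copied into a Vespa application package. The sketch below uses only the standard library. `looks_like_onnx_export` is a hypothetical helper, and it does not validate the ONNX graph itself (a runtime such as ONNX Runtime would be needed for that).

```python
from pathlib import Path
from typing import Union


def looks_like_onnx_export(path: Union[str, Path]) -> bool:
    """Return True if the path is an existing, non-empty file ending in .onnx."""
    candidate = Path(path)
    return (
        candidate.is_file()
        and candidate.suffix == ".onnx"
        and candidate.stat().st_size > 0
    )


# Path taken from the example above; prints False if no export has been run.
print(looks_like_onnx_export("./vespa_export/vespa_colbert.onnx"))
```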

Via Hugging Face Hub Export

from ragatouille.models.utils import export_to_huggingface_hub

# Export to both HF Hub and Vespa ONNX in one step
export_to_huggingface_hub(
    colbert_path="experiments/colbert/none/2024-01/checkpoints/colbert",
    huggingface_repo_name="myuser/my-colbert-model",
    export_vespa_onnx=True,   # Also produce Vespa ONNX
    use_tmp_dir=True,
)
