Implementation:AnswerDotAI RAGatouille Export To Vespa ONNX
| Knowledge Sources | |
|---|---|
| Domains | Model_Export, Deployment, NLP |
| Last Updated | 2026-02-12 12:00 GMT |
Overview
Concrete tool, provided by the RAGatouille library, for converting a ColBERT model checkpoint to ONNX format for deployment in Vespa search infrastructure.
Description
The export_to_vespa_onnx function converts a trained ColBERT model into an ONNX file suitable for deployment in Vespa. It uses the VespaColBERT helper class, which wraps a BERT model with a linear projection layer to produce 128-dimensional, L2-normalized token embeddings. The ONNX export declares dynamic axes so that the model accepts variable batch sizes and sequence lengths at inference time, and it targets ONNX opset version 17.
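To make the dynamic-axes idea concrete, the sketch below builds the kind of mapping that torch.onnx.export accepts through its dynamic_axes argument: tensor name to {axis index: symbolic axis name}. The helper build_dynamic_axes, the axis labels, and the output tensor name "embeddings" are illustrative assumptions, not names taken from RAGatouille.

```python
from typing import Dict, List


def build_dynamic_axes(tensor_names: List[str]) -> Dict[str, Dict[int, str]]:
    """Mark axis 0 (batch) and axis 1 (sequence) as dynamic for each tensor."""
    return {name: {0: "batch", 1: "sequence"} for name in tensor_names}


# Input names follow the forward() arguments documented on this page;
# "embeddings" is a placeholder output name (assumption).
dynamic_axes = build_dynamic_axes(["input_ids", "attention_mask", "embeddings"])
print(dynamic_axes["input_ids"])
```

A mapping of this shape would be passed as dynamic_axes=... together with opset_version=17. The embedding dimension (axis 2 of the output) is left out of the mapping, so it stays fixed.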
Usage
Use this function when deploying a trained ColBERT model to a Vespa search cluster. It enables production-scale serving of ColBERT late-interaction retrieval without requiring PyTorch at inference time. It is typically called either directly or via export_to_huggingface_hub with the export_vespa_onnx=True flag.
Code Reference
Source Location
- Repository: AnswerDotAI_RAGatouille
- File: ragatouille/models/utils.py
- Lines: 103-143
Signature
class VespaColBERT(BertPreTrainedModel):
    """Wrapper model that produces normalized ColBERT embeddings for Vespa."""

    def __init__(self, config, dim):
        """
        Args:
            config: HuggingFace BERT config.
            dim: Output embedding dimension (typically 128).
        """

    def forward(self, input_ids, attention_mask):
        """
        Args:
            input_ids: Token IDs tensor of shape (batch, seq_len).
            attention_mask: Attention mask tensor of shape (batch, seq_len).
        Returns:
            L2-normalized embeddings of shape (batch, seq_len, dim).
        """


def export_to_vespa_onnx(
    colbert_path: Union[str, Path],
    out_path: Union[str, Path],
    out_file_name: str = "vespa_colbert.onnx",
) -> None:
    """
    Export a ColBERT checkpoint to Vespa-compatible ONNX format.

    Args:
        colbert_path: Path to the ColBERT model checkpoint.
        out_path: Directory to write the ONNX file.
        out_file_name: Output filename (default: 'vespa_colbert.onnx').
    """
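The forward contract states that each token vector is L2-normalized. As a minimal, framework-free sketch of that operation, the block below uses plain Python lists in place of tensors. The function l2_normalize_tokens and its eps guard are illustrative assumptions, not the library's code.

```python
import math
from typing import List

Vector = List[float]


def l2_normalize_tokens(
    batch: List[List[Vector]], eps: float = 1e-12
) -> List[List[Vector]]:
    """Scale every token vector in a (batch, seq_len, dim) nested list to unit L2 norm."""
    normalized = []
    for sequence in batch:
        out_seq = []
        for token in sequence:
            norm = math.sqrt(sum(x * x for x in token))
            # eps avoids division by zero for an all-zero vector.
            scale = max(norm, eps)
            out_seq.append([x / scale for x in token])
        normalized.append(out_seq)
    return normalized


# One sequence of two tokens with dim=2 (the page describes dim=128 as typical).
example = l2_normalize_tokens([[[3.0, 4.0], [0.0, 2.0]]])
```

Because every vector ends up with unit norm, the dot product between two token embeddings equals their cosine similarity.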
Import
from ragatouille.models.utils import export_to_vespa_onnx
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| colbert_path | Union[str, Path] | Yes | Path to the ColBERT model checkpoint directory |
| out_path | Union[str, Path] | Yes | Directory where the ONNX file will be written |
| out_file_name | str | No | Name of the output ONNX file (default: 'vespa_colbert.onnx') |
Outputs
| Name | Type | Description |
|---|---|---|
| return | None | No return value; prints status messages to stdout |
| side effect | ONNX file | Writes '{out_file_name}' to the '{out_path}' directory |
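The output contract implies that the final artifact lives at out_path / out_file_name. The sketch below computes that path up front and makes sure the directory exists. Whether export_to_vespa_onnx creates the directory itself is not stated on this page, so creating it beforehand is a defensive assumption, and resolve_onnx_target is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import Union


def resolve_onnx_target(
    out_path: Union[str, Path],
    out_file_name: str = "vespa_colbert.onnx",
) -> Path:
    """Return the expected ONNX file path, creating the output directory if needed."""
    out_dir = Path(out_path)
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / out_file_name


# Demonstrated inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    target = resolve_onnx_target(Path(tmp) / "vespa_export")
    target_name = target.name
    target_dir_exists = target.parent.is_dir()
```

After a real export, checking target.is_file() on the returned path is a cheap way to confirm that the side effect happened.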
Usage Examples
Direct ONNX Export
from ragatouille.models.utils import export_to_vespa_onnx
# Convert a ColBERT checkpoint to Vespa ONNX format
export_to_vespa_onnx(
    colbert_path="experiments/colbert/none/2024-01/checkpoints/colbert",
    out_path="./vespa_export",
    out_file_name="vespa_colbert.onnx",
)
# Output file: ./vespa_export/vespa_colbert.onnx
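After an export like this, a quick smoke test can confirm that the file loads and accepts variable input shapes. The sketch below assumes the optional onnxruntime and numpy packages are installed (neither is mentioned on this page), imports them lazily, and discovers the input names from the session rather than hard-coding them. Both smoke_test_onnx and ones_batch are hypothetical helpers.

```python
from pathlib import Path
from typing import List, Union


def ones_batch(batch: int, seq_len: int) -> List[List[int]]:
    """Build a (batch, seq_len) grid of ones to use as dummy token IDs or mask."""
    return [[1] * seq_len for _ in range(batch)]


def smoke_test_onnx(onnx_file: Union[str, Path], batch: int = 2, seq_len: int = 16):
    """Run one forward pass through the exported model and return the output shape."""
    # Imported lazily so this module loads even without the optional packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(str(onnx_file), providers=["CPUExecutionProvider"])
    dummy = np.asarray(ones_batch(batch, seq_len), dtype=np.int64)
    # Feed the same dummy tensor to every declared input
    # (assumption: all inputs are 2-D int64 tensors).
    feeds = {inp.name: dummy for inp in session.get_inputs()}
    outputs = session.run(None, feeds)
    return outputs[0].shape
```

If the export declared dynamic axes as described, calling smoke_test_onnx("./vespa_export/vespa_colbert.onnx", batch=3, seq_len=7) would be expected to return a shape of the form (3, 7, dim).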
Via Hugging Face Hub Export
from ragatouille.models.utils import export_to_huggingface_hub
# Export to both HF Hub and Vespa ONNX in one step
export_to_huggingface_hub(
    colbert_path="experiments/colbert/none/2024-01/checkpoints/colbert",
    huggingface_repo_name="myuser/my-colbert-model",
    export_vespa_onnx=True,  # Also produce Vespa ONNX
    use_tmp_dir=True,
)