# Implementation:Mlc_ai_Web_llm_Cosine_Similarity_Vector_Store
## Overview
Cosine_Similarity_Vector_Store is an External Tool Doc that documents two approaches for computing cosine similarity and performing vector search with web-llm embeddings: manual dot product computation and integration with LangChain's MemoryVectorStore. Neither approach is implemented within web-llm itself; both are user-level patterns demonstrated in the official examples.
## Code Reference
### Approach 1: Manual Dot Product Computation
From `examples/embeddings/src/embeddings.ts` at lines 94-108, the example demonstrates computing pairwise similarity using the `MemoryVectorStore.similarity()` method:
```typescript
// Calculate similarity (we use langchain here, but any method works)
const vectorStore = await MemoryVectorStore.fromExistingIndex(
  new WebLLMEmbeddings(engine, selectedModel),
);

// See score
for (let i = 0; i < queries_og.length; i++) {
  console.log(`Similarity with: ${queries_og[i]}`);
  for (let j = 0; j < documents_og.length; j++) {
    const similarity = vectorStore.similarity(
      queryReply.data[i].embedding,
      docReply.data[j].embedding,
    );
    console.log(`${documents_og[j]}: ${similarity}`);
  }
}
```
A standalone TypeScript implementation without LangChain:
```typescript
// Pure dot product (equivalent to cosine similarity for L2-normalized vectors)
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
  }
  return dot;
}
```
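The dot-product shortcut is only valid when the embedding model emits unit-norm vectors. If that guarantee is absent, the dot product must be divided by both vector norms; a minimal sketch of the full formula (the `cosineSimilarityFull` name is illustrative, not part of web-llm):

```typescript
// Full cosine similarity: dot(a, b) / (||a|| * ||b||).
// Unlike the plain dot product, this stays in [-1, 1] even when the
// input vectors are not L2-normalized.
function cosineSimilarityFull(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

For unit vectors the two functions agree exactly; for scaled vectors only the full formula remains a valid cosine similarity.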
### Approach 2: LangChain MemoryVectorStore Integration
From `examples/embeddings/src/embeddings.ts` at lines 112-154:
```typescript
// LangChain EmbeddingsInterface adapter for web-llm
class WebLLMEmbeddings implements EmbeddingsInterface {
  engine: webllm.MLCEngineInterface;
  modelId: string;

  constructor(engine: webllm.MLCEngineInterface, modelId: string) {
    this.engine = engine;
    this.modelId = modelId;
  }

  async _embed(texts: string[]): Promise<number[][]> {
    const reply = await this.engine.embeddings.create({
      input: texts,
      model: this.modelId,
    });
    const result: number[][] = [];
    for (let i = 0; i < texts.length; i++) {
      result.push(reply.data[i].embedding);
    }
    return result;
  }

  async embedQuery(document: string): Promise<number[]> {
    return this._embed([document]).then((embeddings) => embeddings[0]);
  }

  async embedDocuments(documents: string[]): Promise<number[][]> {
    return this._embed(documents);
  }
}
```
This adapter bridges web-llm's `engine.embeddings.create()` to LangChain's `EmbeddingsInterface`, so it can be used with any LangChain vector store.
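Because the adapter only touches `engine.embeddings.create()`, its `embedQuery`/`embedDocuments` contract can be exercised without downloading a model by stubbing that one call. The sketch below is illustrative: the `EngineLike` interface, the stub engine, and its fixed 3-dimensional vectors are assumptions for testing, not part of web-llm.

```typescript
// Minimal structural stand-in for the slice of the engine API the
// adapter uses (web-llm's real type is MLCEngineInterface).
interface EngineLike {
  embeddings: {
    create(req: {
      input: string[];
      model: string;
    }): Promise<{ data: { embedding: number[] }[] }>;
  };
}

// Same shape as the WebLLMEmbeddings adapter above, typed against the stub.
class WebLLMStyleEmbeddings {
  constructor(
    private engine: EngineLike,
    private modelId: string,
  ) {}

  private async _embed(texts: string[]): Promise<number[][]> {
    const reply = await this.engine.embeddings.create({
      input: texts,
      model: this.modelId,
    });
    return reply.data.map((d) => d.embedding);
  }

  embedQuery(text: string): Promise<number[]> {
    return this._embed([text]).then((embeddings) => embeddings[0]);
  }

  embedDocuments(texts: string[]): Promise<number[][]> {
    return this._embed(texts);
  }
}

// Stub engine: "embeds" each text as [text.length, 0, 0].
const stubEngine: EngineLike = {
  embeddings: {
    create: async ({ input }) => ({
      data: input.map((t) => ({ embedding: [t.length, 0, 0] })),
    }),
  },
};

const adapter = new WebLLMStyleEmbeddings(stubEngine, "stub-model");
adapter.embedQuery("abc").then((vec) => console.log(vec)); // vec is [3, 0, 0]
```

Swapping the stub for a real `MLCEngine` requires no changes to the adapter body, which is the point of coding against the narrow interface.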
## External Dependencies
| Package | Purpose | Required? |
|---|---|---|
| `langchain` | `MemoryVectorStore`, `formatDocumentsAsString` | Optional (only for LangChain integration) |
| `@langchain/core` | `EmbeddingsInterface`, `Document`, `PromptTemplate`, `RunnableSequence` | Optional (only for LangChain integration) |
Install with:
```shell
npm install langchain @langchain/core
```
## I/O Contract
### Manual Dot Product
Input:

- Two `number[]` vectors of equal length (the embedding dimension)

Output:

- A single `number` representing the cosine similarity (range: [-1, 1] for L2-normalized vectors)
### `MemoryVectorStore.similaritySearch()`
Input:

- `query: string` -- the query text (will be embedded via `embedQuery()`)
- `k: number` -- number of results to return

Output:

- `Document[]` -- array of LangChain `Document` objects sorted by descending similarity
### `MemoryVectorStore.similarity()`
Input:

- Two `number[]` vectors

Output:

- `number` -- cosine similarity score
## Usage Examples
### Complete Manual Vector Search
```typescript
import { CreateMLCEngine, CreateEmbeddingResponse } from "@mlc-ai/web-llm";

const engine = await CreateMLCEngine("snowflake-arctic-embed-m-q0f32-MLC-b4");
const QUERY_PREFIX =
  "Represent this sentence for searching relevant passages: ";

// Build a simple in-memory vector index
interface VectorDocument {
  text: string;
  embedding: number[];
}

class SimpleVectorStore {
  private documents: VectorDocument[] = [];

  add(text: string, embedding: number[]): void {
    this.documents.push({ text, embedding });
  }

  search(
    queryEmbedding: number[],
    topK: number,
  ): Array<{ text: string; score: number }> {
    const results = this.documents.map((doc) => ({
      text: doc.text,
      score: this.dotProduct(queryEmbedding, doc.embedding),
    }));
    results.sort((a, b) => b.score - a.score);
    return results.slice(0, topK);
  }

  private dotProduct(a: number[], b: number[]): number {
    let sum = 0;
    for (let i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }
}

// Index documents
const store = new SimpleVectorStore();
const docs = [
  "JavaScript is a programming language for the web.",
  "Python is popular for data science and machine learning.",
  "Rust provides memory safety without garbage collection.",
  "WebGPU enables GPU-accelerated computation in browsers.",
];
const docFormatted = docs.map((d) => `[CLS] ${d} [SEP]`);
const docEmbeddings: CreateEmbeddingResponse = await engine.embeddings.create({
  input: docFormatted,
});
for (let i = 0; i < docs.length; i++) {
  store.add(docs[i], docEmbeddings.data[i].embedding);
}

// Search
const query = "GPU computing in web browsers";
const queryFormatted = `[CLS] ${QUERY_PREFIX}${query} [SEP]`;
const queryEmbedding: CreateEmbeddingResponse = await engine.embeddings.create({
  input: queryFormatted,
});
const results = store.search(queryEmbedding.data[0].embedding, 2);
for (const r of results) {
  console.log(`[${r.score.toFixed(4)}] ${r.text}`);
}
```
### LangChain MemoryVectorStore with Document Addition
```typescript
import * as webllm from "@mlc-ai/web-llm";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import type { EmbeddingsInterface } from "@langchain/core/embeddings";
import type { Document } from "@langchain/core/documents";

class WebLLMEmbeddings implements EmbeddingsInterface {
  engine: webllm.MLCEngineInterface;
  modelId: string;

  constructor(engine: webllm.MLCEngineInterface, modelId: string) {
    this.engine = engine;
    this.modelId = modelId;
  }

  async embedQuery(text: string): Promise<number[]> {
    const reply = await this.engine.embeddings.create({
      input: [text],
      model: this.modelId,
    });
    return reply.data[0].embedding;
  }

  async embedDocuments(texts: string[]): Promise<number[][]> {
    const reply = await this.engine.embeddings.create({
      input: texts,
      model: this.modelId,
    });
    return reply.data.map((d) => d.embedding);
  }
}

const modelId = "snowflake-arctic-embed-m-q0f32-MLC-b4";
const engine = await webllm.CreateMLCEngine(modelId);
const embeddings = new WebLLMEmbeddings(engine, modelId);

// Create store and add documents incrementally
const vectorStore = await MemoryVectorStore.fromExistingIndex(embeddings);
const documents: Document[] = [
  { pageContent: "[CLS] The Data Cloud! [SEP]", metadata: { source: "doc1" } },
  {
    pageContent: "[CLS] Mexico City of Course! [SEP]",
    metadata: { source: "doc2" },
  },
];
await vectorStore.addDocuments(documents);

// Perform similarity search
const prefix =
  "Represent this sentence for searching relevant passages: ";
const searchResults = await vectorStore.similaritySearch(
  `[CLS] ${prefix}what is snowflake? [SEP]`,
  1,
);
console.log("Most similar:", searchResults[0].pageContent);
console.log("Metadata:", searchResults[0].metadata);
```
### Similarity Score Matrix
```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateMLCEngine("snowflake-arctic-embed-m-q0f32-MLC-b4");

// Compute a full similarity matrix between two sets of texts
async function similarityMatrix(
  queriesFormatted: string[],
  docsFormatted: string[],
): Promise<number[][]> {
  const qResult = await engine.embeddings.create({ input: queriesFormatted });
  const dResult = await engine.embeddings.create({ input: docsFormatted });
  const matrix: number[][] = [];
  for (let i = 0; i < qResult.data.length; i++) {
    const row: number[] = [];
    for (let j = 0; j < dResult.data.length; j++) {
      let dot = 0;
      const qVec = qResult.data[i].embedding;
      const dVec = dResult.data[j].embedding;
      for (let k = 0; k < qVec.length; k++) {
        dot += qVec[k] * dVec[k];
      }
      row.push(dot);
    }
    matrix.push(row);
  }
  return matrix;
}

const PREFIX = "Represent this sentence for searching relevant passages: ";
const queries = [
  `[CLS] ${PREFIX}what is snowflake? [SEP]`,
  `[CLS] ${PREFIX}best tacos? [SEP]`,
];
const docs = [
  "[CLS] The Data Cloud! [SEP]",
  "[CLS] Mexico City of Course! [SEP]",
];

const matrix = await similarityMatrix(queries, docs);
console.log("Similarity matrix:");
console.log(matrix);
// Expected: matrix[0][0] > matrix[0][1] (snowflake -> Data Cloud)
// Expected: matrix[1][1] > matrix[1][0] (tacos -> Mexico City)
```
## Related Pages
- Principle:Mlc_ai_Web_llm_Cosine_Similarity_Search -- the underlying cosine similarity search principle
- Implementation:Mlc_ai_Web_llm_Embeddings_Create -- generating the embedding vectors
- Implementation:Mlc_ai_Web_llm_Embedding_Input_Format -- formatting inputs before embedding
- Implementation:Mlc_ai_Web_llm_Multi_Model_RAG_Engine -- complete RAG pipeline using vector search