
Implementation:Mlc ai Web llm Embedding Input Format

From Leeroopedia


Overview

Embedding_Input_Format is a Pattern Doc describing the concrete input-formatting conventions used in web-llm's embedding examples: the query prefix and special-token wrapping required by the Snowflake Arctic Embed model family. This formatting is user-side code and must be applied before calling engine.embeddings.create().

Code Reference

Source Example: Input Preparation

From examples/embeddings/src/embeddings.ts at lines 54-67:

// Prepare inputs
const documents_og = ["The Data Cloud!", "Mexico City of Course!"];
const queries_og = ["what is snowflake?", "Where can I get the best tacos?"];
const documents: string[] = [];
const queries: string[] = [];
const query_prefix =
  "Represent this sentence for searching relevant passages: ";
// Process according to Snowflake model
documents_og.forEach(function (item, index) {
  documents[index] = `[CLS] ${item} [SEP]`;
});
queries_og.forEach(function (item, index) {
  queries[index] = `[CLS] ${query_prefix}${item} [SEP]`;
});

Formatting Rules by Model

Model Family | Role | Format Template | Example
Snowflake Arctic Embed | Query | [CLS] Represent this sentence for searching relevant passages: {text} [SEP] | [CLS] Represent this sentence for searching relevant passages: what is snowflake? [SEP]
Snowflake Arctic Embed | Document | [CLS] {text} [SEP] | [CLS] The Data Cloud! [SEP]

Key Constants

// Query prefix for Snowflake Arctic Embed models
const SNOWFLAKE_QUERY_PREFIX =
  "Represent this sentence for searching relevant passages: ";

// Special tokens
const CLS_TOKEN = "[CLS]";
const SEP_TOKEN = "[SEP]";
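
These constants compose into the two templates. A self-contained sketch (the `wrap` and `formatQuery` helper names are illustrative, not part of web-llm):

```typescript
const SNOWFLAKE_QUERY_PREFIX =
  "Represent this sentence for searching relevant passages: ";
const CLS_TOKEN = "[CLS]";
const SEP_TOKEN = "[SEP]";

// Wrap any text in the special tokens (document format).
function wrap(text: string): string {
  return `${CLS_TOKEN} ${text} ${SEP_TOKEN}`;
}

// Prepend the query prefix before wrapping (query format).
function formatQuery(text: string): string {
  return wrap(`${SNOWFLAKE_QUERY_PREFIX}${text}`);
}

console.log(wrap("The Data Cloud!"));
// "[CLS] The Data Cloud! [SEP]"
console.log(formatQuery("what is snowflake?"));
// "[CLS] Represent this sentence for searching relevant passages: what is snowflake? [SEP]"
```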

I/O Contract

Input:

  • Raw text strings (queries and/or documents)

Output:

  • Formatted text strings ready for engine.embeddings.create()

Format Specifications:

Input Type | Snowflake Format
Raw query "what is X?" | "[CLS] Represent this sentence for searching relevant passages: what is X? [SEP]"
Raw document "X is a thing." | "[CLS] X is a thing. [SEP]"
Symmetric similarity input | "[CLS] some text [SEP]" (use the document format for both sides)
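
One practical hazard of this contract is double-wrapping text that has already been formatted. A minimal guard that enforces the contract before calling engine.embeddings.create() (the helper names here are hypothetical, not part of web-llm):

```typescript
const PREFIX =
  "Represent this sentence for searching relevant passages: ";

// True if the text already carries the special-token wrapping.
function isFormatted(text: string): boolean {
  return text.startsWith("[CLS] ") && text.endsWith(" [SEP]");
}

// Apply the Snowflake format for the given role, passing through
// text that is already wrapped.
function toSnowflakeInput(
  text: string,
  role: "query" | "document",
): string {
  if (isFormatted(text)) return text;
  const body = role === "query" ? `${PREFIX}${text}` : text;
  return `[CLS] ${body} [SEP]`;
}
```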

Usage Examples

Basic Query and Document Formatting

import { CreateMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateMLCEngine("snowflake-arctic-embed-m-q0f32-MLC-b4");

const QUERY_PREFIX =
  "Represent this sentence for searching relevant passages: ";

// Format and embed documents
const rawDocs = [
  "Photosynthesis converts light energy into chemical energy.",
  "The mitochondria is the powerhouse of the cell.",
  "DNA contains the genetic instructions for all living organisms.",
];
const formattedDocs = rawDocs.map((doc) => `[CLS] ${doc} [SEP]`);
const docResult = await engine.embeddings.create({ input: formattedDocs });

// Format and embed a query
const rawQuery = "How do cells produce energy?";
const formattedQuery = `[CLS] ${QUERY_PREFIX}${rawQuery} [SEP]`;
const queryResult = await engine.embeddings.create({ input: formattedQuery });

// Compute similarity scores
const queryVec = queryResult.data[0].embedding;
for (let i = 0; i < rawDocs.length; i++) {
  const docVec = docResult.data[i].embedding;
  const similarity = queryVec.reduce(
    (sum, val, idx) => sum + val * docVec[idx],
    0,
  );
  console.log(`"${rawDocs[i]}" => score: ${similarity.toFixed(4)}`);
}
// Expected: "The mitochondria is the powerhouse of the cell." has highest score
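
The loop above scores with a raw dot product, which equals cosine similarity only if the returned embeddings are unit-normalized. If you are unsure whether that holds for a given model build, a normalization-independent cosine similarity is a safe substitute (a sketch, not part of web-llm):

```typescript
// Cosine similarity without assuming unit-normalized vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Swap it in for the reduce-based dot product when rankings look inconsistent across models.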

Reusable Formatting Utility

import { CreateMLCEngine, CreateEmbeddingResponse } from "@mlc-ai/web-llm";

/**
 * Formatting utility for Snowflake Arctic Embed models.
 * Encapsulates the asymmetric prefix logic.
 */
class ArcticEmbedFormatter {
  private static readonly PREFIX =
    "Represent this sentence for searching relevant passages: ";

  /**
   * Format a single text for the given role.
   */
  static format(text: string, role: "query" | "document"): string {
    if (role === "query") {
      return `[CLS] ${ArcticEmbedFormatter.PREFIX}${text} [SEP]`;
    }
    return `[CLS] ${text} [SEP]`;
  }

  /**
   * Format a batch of texts for the given role.
   */
  static formatBatch(texts: string[], role: "query" | "document"): string[] {
    return texts.map((t) => ArcticEmbedFormatter.format(t, role));
  }
}

// Usage
const engine = await CreateMLCEngine("snowflake-arctic-embed-m-q0f32-MLC-b4");

const documents = ["WebGPU is a modern API.", "TensorFlow is an ML framework."];
const queries = ["What is WebGPU?"];

const docEmbeddings: CreateEmbeddingResponse = await engine.embeddings.create({
  input: ArcticEmbedFormatter.formatBatch(documents, "document"),
});

const queryEmbeddings: CreateEmbeddingResponse = await engine.embeddings.create({
  input: ArcticEmbedFormatter.formatBatch(queries, "query"),
});

console.log("Document embeddings count:", docEmbeddings.data.length);
console.log("Query embeddings count:", queryEmbeddings.data.length);

LangChain Integration with Formatting

import * as webllm from "@mlc-ai/web-llm";
import type { EmbeddingsInterface } from "@langchain/core/embeddings";

// Custom LangChain embeddings wrapper that applies Snowflake formatting
class FormattedWebLLMEmbeddings implements EmbeddingsInterface {
  private engine: webllm.MLCEngineInterface;
  private modelId: string;
  private queryPrefix =
    "Represent this sentence for searching relevant passages: ";

  constructor(engine: webllm.MLCEngineInterface, modelId: string) {
    this.engine = engine;
    this.modelId = modelId;
  }

  // LangChain calls this for queries
  async embedQuery(text: string): Promise<number[]> {
    const formatted = `[CLS] ${this.queryPrefix}${text} [SEP]`;
    const reply = await this.engine.embeddings.create({
      input: [formatted],
      model: this.modelId,
    });
    return reply.data[0].embedding;
  }

  // LangChain calls this for documents
  async embedDocuments(texts: string[]): Promise<number[][]> {
    const formatted = texts.map((t) => `[CLS] ${t} [SEP]`);
    const reply = await this.engine.embeddings.create({
      input: formatted,
      model: this.modelId,
    });
    return reply.data.map((d) => d.embedding);
  }
}

// Usage with MemoryVectorStore
const engine = await webllm.CreateMLCEngine(
  "snowflake-arctic-embed-m-q0f32-MLC-b4",
);
const embeddings = new FormattedWebLLMEmbeddings(
  engine,
  "snowflake-arctic-embed-m-q0f32-MLC-b4",
);
// Pass `embeddings` to MemoryVectorStore.fromTexts() or similar
