
Principle: MLC-AI web-llm Page Content Access

From Leeroopedia


Overview

Pattern for accessing web page content from a Chrome Extension to use as context for LLM inference. This enables use cases such as page summarization, question answering about page content, and context-aware AI assistance. The pattern uses Chrome's content scripts API to inject a script into the page DOM that extracts text content and sends it to the extension popup via chrome.runtime port-based messaging.

Description

Page content access in web-llm Chrome extensions follows a three-component message passing pattern:

1. Content Script (injected into web pages): A small JavaScript file declared in manifest.json under content_scripts. This script runs in the context of every matched web page and has access to the page's DOM. It listens for incoming port connections from the popup and responds with the page's text content.

2. Popup Script (initiator): The popup script uses chrome.tabs.connect() to establish a port connection to the content script running in the active tab. It sends an empty message to trigger the content script, which responds with the extracted page text.

3. LLM Context Injection: Once the popup has the page text, it can prepend it to the user's message as context for the LLM. The repository examples show two approaches:

  • The service worker example (chrome-extension-webgpu-service-worker) stores the page contents and logs them, with a useContext flag that is set to false by default
  • The non-service-worker example (chrome-extension) stores the content in a context variable and uses it to construct a RAG-style prompt: "Use only the following context when answering the question..."

Security considerations: Content scripts run in an isolated world with access to the page DOM but not the page's JavaScript context. This means:

  • They can read document.body.innerText or document.body.innerHTML
  • They cannot access JavaScript variables or functions defined by the page
  • Communication with the popup is restricted to the Chrome messaging API

Manifest requirements: The content script must be declared in manifest.json with URL match patterns, and the extension needs the tabs permission to use chrome.tabs.connect().

Usage

Use this when building extensions that need to process the content of the currently active web page for LLM inference.

When to apply:

  • Building a "summarize this page" feature
  • Implementing question-answering about the current page content
  • Creating context-aware chat that references what the user is reading
  • Any extension feature that combines page DOM content with LLM inference

When not to apply:

  • Extensions that only need user-typed input (no page context)
  • Extensions that access page content via other means (e.g., reading from clipboard)
  • Background-only extensions with no user-facing UI

Implementation checklist:

  1. Declare content_scripts in manifest.json with appropriate matches patterns
  2. Add "tabs" to the permissions array in the manifest
  3. Create a content script that listens for port connections and responds with DOM text
  4. In the popup script, use chrome.tabs.connect() to request page content
  5. Inject the received text as context in the LLM prompt

Theoretical Basis

Chrome Extensions use a multi-context security model:

  • Extension pages (popup, options, background) run in the extension's own origin and can access Chrome APIs
  • Content scripts run in the web page's DOM context but in an isolated JavaScript world
  • Web pages run in their own context with no direct access to extension APIs

Communication between these contexts uses Chrome's message passing APIs:

  • chrome.runtime.connect() / chrome.runtime.onConnect for long-lived port connections
  • chrome.runtime.sendMessage() / chrome.runtime.onMessage for one-shot messages
  • chrome.tabs.connect() for popup-to-content-script port connections

The web-llm extensions use chrome.tabs.connect() for page content extraction because a long-lived port supports an asynchronous request/response exchange: the popup sends an empty trigger message and receives the extracted text whenever the content script replies. The content script uses chrome.runtime.onConnect to listen for these connections.
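
The port exchange above can be wrapped in a Promise so popup code can await the page text. This is an illustrative sketch, not code from the repository: requestPageContents is a hypothetical helper, and the chrome.tabs API is passed in as a parameter (typed here as TabsLike) so the logic can be exercised outside an extension context.

```typescript
// Minimal structural types for the parts of chrome.tabs this sketch uses.
interface PortLike {
  postMessage(msg: object): void;
  onMessage: { addListener(cb: (msg: { contents: string }) => void): void };
}
interface TabsLike {
  query(info: object, cb: (tabs: { id?: number }[]) => void): void;
  connect(tabId: number, info: { name: string }): PortLike;
}

// Hypothetical helper: resolve with the active tab's extracted text.
function requestPageContents(tabs: TabsLike): Promise<string> {
  return new Promise((resolve, reject) => {
    tabs.query({ currentWindow: true, active: true }, (found) => {
      const tabId = found[0]?.id;
      if (tabId === undefined) return reject(new Error("no active tab"));
      const port = tabs.connect(tabId, { name: "channelName" });
      // Register the listener before posting, so a fast reply is not missed.
      port.onMessage.addListener((msg) => resolve(msg.contents));
      port.postMessage({}); // empty message triggers the content script
    });
  });
}
```

In a real popup this would be called as `const text = await requestPageContents(chrome.tabs);`.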

The pattern of extracting page text and using it as LLM context is a simple form of Retrieval-Augmented Generation (RAG), where the "retrieval" step is replaced by direct DOM access. This approach has limitations (no chunking, no semantic search, full page text may exceed context window), but it is effective for small-to-medium pages.
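
One cheap mitigation for the context-window limitation is truncating the page text to a character budget before injection. The sketch below is an assumption, not repository code; the four-characters-per-token ratio is a rough heuristic, not something web-llm defines.

```typescript
// Hypothetical helper: cap page text at a rough token budget before
// prepending it to the LLM prompt. Assumes ~4 characters per token.
function truncateContext(pageText: string, maxTokens: number): string {
  const maxChars = maxTokens * 4; // rough chars-per-token estimate
  if (pageText.length <= maxChars) return pageText;
  return pageText.slice(0, maxChars) + "\n[...page truncated...]";
}
```

A production extension would more likely chunk the page and retrieve relevant chunks, but truncation keeps small models within budget with no extra machinery.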

I/O Contract

Content Script Input: An empty message {} received via port.onMessage.

Content Script Output: An object { contents: string } containing either:

  • document.body.innerHTML (service worker example - includes HTML markup)
  • document.body.innerText (non-service-worker example - plain text only)

Popup-to-Content-Script Communication:

  1. Query active tab: chrome.tabs.query({ currentWindow: true, active: true }, callback) (Popup -> Chrome API)
  2. Connect to tab: chrome.tabs.connect(tabId, { name: "channelName" }) (Popup -> Content Script)
  3. Request content: port.postMessage({}) (Popup -> Content Script)
  4. Receive content: port.onMessage.addListener(callback) (Content Script -> Popup)

Context injection into LLM prompt:

  • Direct context: prepend the page text before the user message, as a system or user message
  • RAG-style (from the non-service-worker example): "Use only the following context when answering the question at the end. Don't use any other knowledge.\n" + context + "\n\nQuestion: " + message + "\n\nHelpful Answer: "
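
The RAG-style template can be factored into a small pure function. buildRagPrompt is a hypothetical name introduced here for illustration; the template string itself matches the non-service-worker example quoted above.

```typescript
// Hypothetical helper wrapping the RAG-style prompt template.
// Falls back to the bare question when no page context is available.
function buildRagPrompt(context: string, question: string): string {
  if (context.length === 0) return question;
  return (
    "Use only the following context when answering the question at the end. " +
    "Don't use any other knowledge.\n" +
    context +
    "\n\nQuestion: " +
    question +
    "\n\nHelpful Answer: "
  );
}
```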

Usage Examples

Content script (content.js) - HTML extraction:

// Only the content script is able to access the DOM
chrome.runtime.onConnect.addListener(function (port) {
  port.onMessage.addListener(function (msg) {
    port.postMessage({ contents: document.body.innerHTML });
  });
});

Content script (content.js) - Plain text extraction:

// Only the content script is able to access the DOM
chrome.runtime.onConnect.addListener(function (port) {
  port.onMessage.addListener(function (msg) {
    port.postMessage({ contents: document.body.innerText });
  });
});

Popup script - Fetching page contents:

function fetchPageContents() {
  chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
    if (tabs[0]?.id) {
      const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
      port.postMessage({});
      port.onMessage.addListener(function (msg) {
        console.log("Page contents:", msg.contents);
        // Use msg.contents as context for LLM inference
      });
    }
  });
}

// Fetch page contents when popup opens
window.onload = function () {
  fetchPageContents();
};

Using page content as LLM context (RAG-style prompt):

import {
  CreateExtensionServiceWorkerMLCEngine,
  ChatCompletionMessageParam,
} from "@mlc-ai/web-llm";

let pageContext = "";

// Fetch page content on load
function fetchPageContents() {
  chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
    if (tabs[0]?.id) {
      const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
      port.postMessage({});
      port.onMessage.addListener(function (msg) {
        pageContext = msg.contents;
      });
    }
  });
}

// Create the engine
const engine = await CreateExtensionServiceWorkerMLCEngine(
  "Qwen2-0.5B-Instruct-q4f16_1-MLC",
  { initProgressCallback: (report) => console.log(report.text) },
);

// Build a context-aware prompt
async function askAboutPage(userQuestion: string) {
  let prompt = userQuestion;
  if (pageContext.length > 0) {
    prompt =
      "Use only the following context when answering the question at the end. " +
      "Don't use any other knowledge.\n" +
      pageContext +
      "\n\nQuestion: " +
      userQuestion +
      "\n\nHelpful Answer: ";
  }

  const chatHistory: ChatCompletionMessageParam[] = [
    { role: "user", content: prompt },
  ];

  const completion = await engine.chat.completions.create({
    stream: true,
    messages: chatHistory,
  });

  let response = "";
  for await (const chunk of completion) {
    const delta = chunk.choices[0].delta.content;
    if (delta) response += delta;
  }
  return response;
}

Manifest declaration for content scripts:

{
  "content_scripts": [
    {
      "matches": ["<all_urls>"],
      "js": ["content.js"]
    }
  ],
  "permissions": ["storage", "tabs", "webNavigation"]
}
