
Principle: MLC-AI web-llm Page Content Access

From Leeroopedia


Overview

Pattern for accessing web page content from a Chrome Extension to use as context for LLM inference. This enables use cases such as page summarization, question answering about page content, and context-aware AI assistance. The pattern uses Chrome's content scripts API to inject a script into the page DOM that extracts text content and sends it to the extension popup via chrome.runtime port-based messaging.

Description

Page content access in web-llm Chrome extensions follows a three-component message passing pattern:

1. Content Script (injected into web pages): A small JavaScript file declared in manifest.json under content_scripts. This script runs in the context of every matched web page and has access to the page's DOM. It listens for incoming port connections from the popup and responds with the page's text content.

2. Popup Script (initiator): The popup script uses chrome.tabs.connect() to establish a port connection to the content script running in the active tab. It sends an empty message to trigger the content script, which responds with the extracted page text.

3. LLM Context Injection: Once the popup has the page text, it can prepend it to the user's message as context for the LLM. The repository examples show two approaches:

  • The service worker example (chrome-extension-webgpu-service-worker) stores the page contents and logs them, with a useContext flag that is set to false by default
  • The non-service-worker example (chrome-extension) stores the content in a context variable and uses it to construct a RAG-style prompt: "Use only the following context when answering the question..."

Security considerations: Content scripts run in an isolated world with access to the page DOM but not the page's JavaScript context. This means:

  • They can read document.body.innerText or document.body.innerHTML
  • They cannot access JavaScript variables or functions defined by the page
  • Communication with the popup is restricted to the Chrome messaging API

Manifest requirements: The content script must be declared in manifest.json with URL match patterns, and the extension needs the tabs permission to use chrome.tabs.connect().

Usage

Use this when building extensions that need to process the content of the currently active web page for LLM inference.

When to apply:

  • Building a "summarize this page" feature
  • Implementing question-answering about the current page content
  • Creating context-aware chat that references what the user is reading
  • Any extension feature that combines page DOM content with LLM inference

When not to apply:

  • Extensions that only need user-typed input (no page context)
  • Extensions that access page content via other means (e.g., reading from clipboard)
  • Background-only extensions with no user-facing UI

Implementation checklist:

  1. Declare content_scripts in manifest.json with appropriate matches patterns
  2. Add "tabs" to the permissions array in the manifest
  3. Create a content script that listens for port connections and responds with DOM text
  4. In the popup script, use chrome.tabs.connect() to request page content
  5. Inject the received text as context in the LLM prompt

Theoretical Basis

Chrome Extensions use a multi-context security model:

  • Extension pages (popup, options, background) run in the extension's own origin and can access Chrome APIs
  • Content scripts run in the web page's DOM context but in an isolated JavaScript world
  • Web pages run in their own context with no direct access to extension APIs

Communication between these contexts uses Chrome's message passing APIs:

  • chrome.runtime.connect() / chrome.runtime.onConnect for long-lived port connections
  • chrome.runtime.sendMessage() / chrome.runtime.onMessage for one-shot messages
  • chrome.tabs.connect() for popup-to-content-script port connections

The web-llm extensions use chrome.tabs.connect() for page content extraction because a long-lived port supports an asynchronous request/response exchange: the popup sends an empty trigger message and receives the extracted text whenever the content script replies. The content script uses chrome.runtime.onConnect to listen for these connections.
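
The port exchange above can be wrapped in a Promise so popup code can await the page text. This is an illustrative sketch, not code from the repository: requestPageContents is a hypothetical helper, and the chrome.tabs API is passed in as a parameter (typed here as TabsLike) so the logic can be exercised outside an extension context.

```typescript
// Minimal structural types for the parts of chrome.tabs this sketch uses.
interface PortLike {
  postMessage(msg: object): void;
  onMessage: { addListener(cb: (msg: { contents: string }) => void): void };
}
interface TabsLike {
  query(info: object, cb: (tabs: { id?: number }[]) => void): void;
  connect(tabId: number, info: { name: string }): PortLike;
}

// Hypothetical helper: resolve with the active tab's extracted text.
function requestPageContents(tabs: TabsLike): Promise<string> {
  return new Promise((resolve, reject) => {
    tabs.query({ currentWindow: true, active: true }, (found) => {
      const tabId = found[0]?.id;
      if (tabId === undefined) return reject(new Error("no active tab"));
      const port = tabs.connect(tabId, { name: "channelName" });
      // Register the listener before posting, so a fast reply is not missed.
      port.onMessage.addListener((msg) => resolve(msg.contents));
      port.postMessage({}); // empty message triggers the content script
    });
  });
}
```

In a real popup this would be called as `const text = await requestPageContents(chrome.tabs);`.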

The pattern of extracting page text and using it as LLM context is a simple form of Retrieval-Augmented Generation (RAG), where the "retrieval" step is replaced by direct DOM access. This approach has limitations (no chunking, no semantic search, full page text may exceed context window), but it is effective for small-to-medium pages.
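
One cheap mitigation for the context-window limitation is truncating the page text to a character budget before injection. The sketch below is an assumption, not repository code; the four-characters-per-token ratio is a rough heuristic, not something web-llm defines.

```typescript
// Hypothetical helper: cap page text at a rough token budget before
// prepending it to the LLM prompt. Assumes ~4 characters per token.
function truncateContext(pageText: string, maxTokens: number): string {
  const maxChars = maxTokens * 4; // rough chars-per-token estimate
  if (pageText.length <= maxChars) return pageText;
  return pageText.slice(0, maxChars) + "\n[...page truncated...]";
}
```

A production extension would more likely chunk the page and retrieve relevant chunks, but truncation keeps small models within budget with no extra machinery.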

I/O Contract

Content Script Input: An empty message {} received via port.onMessage.

Content Script Output: An object { contents: string } containing either:

  • document.body.innerHTML (service worker example - includes HTML markup)
  • document.body.innerText (non-service-worker example - plain text only)

Popup-to-Content-Script Communication:

  1. Query active tab: chrome.tabs.query({ currentWindow: true, active: true }, callback) (Popup -> Chrome API)
  2. Connect to tab: chrome.tabs.connect(tabId, { name: "channelName" }) (Popup -> Content Script)
  3. Request content: port.postMessage({}) (Popup -> Content Script)
  4. Receive content: port.onMessage.addListener(callback) (Content Script -> Popup)

Context injection into LLM prompt:

  • Direct context: prepend the page text before the user message, as a system or user message
  • RAG-style (from the non-service-worker example): "Use only the following context when answering the question at the end. Don't use any other knowledge.\n" + context + "\n\nQuestion: " + message + "\n\nHelpful Answer: "
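
The RAG-style template can be factored into a small pure function. buildRagPrompt is a hypothetical name introduced here for illustration; the template string itself matches the non-service-worker example quoted above.

```typescript
// Hypothetical helper wrapping the RAG-style prompt template.
// Falls back to the bare question when no page context is available.
function buildRagPrompt(context: string, question: string): string {
  if (context.length === 0) return question;
  return (
    "Use only the following context when answering the question at the end. " +
    "Don't use any other knowledge.\n" +
    context +
    "\n\nQuestion: " +
    question +
    "\n\nHelpful Answer: "
  );
}
```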

Usage Examples

Content script (content.js) - HTML extraction:

// Only the content script is able to access the DOM
chrome.runtime.onConnect.addListener(function (port) {
  port.onMessage.addListener(function (msg) {
    port.postMessage({ contents: document.body.innerHTML });
  });
});

Content script (content.js) - Plain text extraction:

// Only the content script is able to access the DOM
chrome.runtime.onConnect.addListener(function (port) {
  port.onMessage.addListener(function (msg) {
    port.postMessage({ contents: document.body.innerText });
  });
});

Popup script - Fetching page contents:

function fetchPageContents() {
  chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
    if (tabs[0]?.id) {
      const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
      port.postMessage({});
      port.onMessage.addListener(function (msg) {
        console.log("Page contents:", msg.contents);
        // Use msg.contents as context for LLM inference
      });
    }
  });
}

// Fetch page contents when popup opens
window.onload = function () {
  fetchPageContents();
};

Using page content as LLM context (RAG-style prompt):

import {
  CreateExtensionServiceWorkerMLCEngine,
  ChatCompletionMessageParam,
} from "@mlc-ai/web-llm";

let pageContext = "";

// Fetch page content on load
function fetchPageContents() {
  chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
    if (tabs[0]?.id) {
      const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
      port.postMessage({});
      port.onMessage.addListener(function (msg) {
        pageContext = msg.contents;
      });
    }
  });
}

// Create the engine
const engine = await CreateExtensionServiceWorkerMLCEngine(
  "Qwen2-0.5B-Instruct-q4f16_1-MLC",
  { initProgressCallback: (report) => console.log(report.text) },
);

// Build a context-aware prompt
async function askAboutPage(userQuestion: string) {
  let prompt = userQuestion;
  if (pageContext.length > 0) {
    prompt =
      "Use only the following context when answering the question at the end. " +
      "Don't use any other knowledge.\n" +
      pageContext +
      "\n\nQuestion: " +
      userQuestion +
      "\n\nHelpful Answer: ";
  }

  const chatHistory: ChatCompletionMessageParam[] = [
    { role: "user", content: prompt },
  ];

  const completion = await engine.chat.completions.create({
    stream: true,
    messages: chatHistory,
  });

  let response = "";
  for await (const chunk of completion) {
    const delta = chunk.choices[0].delta.content;
    if (delta) response += delta;
  }
  return response;
}

Manifest declaration for content scripts:

{
  "content_scripts": [
    {
      "matches": ["<all_urls>"],
      "js": ["content.js"]
    }
  ],
  "permissions": ["storage", "tabs", "webNavigation"]
}
