Principle: Mlc_ai_Web_llm_Page_Content_Access
Overview
Pattern for accessing web page content from a Chrome Extension to use as context for LLM inference. This enables use cases such as page summarization, question answering about page content, and context-aware AI assistance. The pattern uses Chrome's content scripts API to inject a script into the page DOM that extracts text content and sends it to the extension popup via chrome.runtime port-based messaging.
Description
Page content access in web-llm Chrome extensions follows a three-component message passing pattern:
1. Content Script (injected into web pages): A small JavaScript file declared in manifest.json under content_scripts. This script runs in the context of every matched web page and has access to the page's DOM. It listens for incoming port connections from the popup and responds with the page's text content.
2. Popup Script (initiator): The popup script uses chrome.tabs.connect() to establish a port connection to the content script running in the active tab. It sends an empty message to trigger the content script, which responds with the extracted page text.
3. LLM Context Injection: Once the popup has the page text, it can prepend it to the user's message as context for the LLM. The repository examples show two approaches:
- The service worker example (chrome-extension-webgpu-service-worker) stores the page contents and logs them, with a useContext flag that is set to false by default
- The non-service-worker example (chrome-extension) stores the content in a context variable and uses it to construct a RAG-style prompt: "Use only the following context when answering the question..."
Security considerations: Content scripts run in an isolated world with access to the page DOM but not the page's JavaScript context. This means:
- They can read document.body.innerText or document.body.innerHTML
- They cannot access JavaScript variables or functions defined by the page
- Communication with the popup is restricted to the Chrome messaging API
Manifest requirements: The content script must be declared in manifest.json with URL match patterns, and the extension needs the tabs permission to use chrome.tabs.connect().
Usage
Use this when building extensions that need to process the content of the currently active web page for LLM inference.
When to apply:
- Building a "summarize this page" feature
- Implementing question-answering about the current page content
- Creating context-aware chat that references what the user is reading
- Any extension feature that combines page DOM content with LLM inference
When not to apply:
- Extensions that only need user-typed input (no page context)
- Extensions that access page content via other means (e.g., reading from clipboard)
- Background-only extensions with no user-facing UI
Implementation checklist:
- Declare content_scripts in manifest.json with appropriate matches patterns
- Add "tabs" to the permissions array in the manifest
- Create a content script that listens for port connections and responds with DOM text
- In the popup script, use chrome.tabs.connect() to request page content
- Inject the received text as context in the LLM prompt
Theoretical Basis
Chrome Extensions use a multi-context security model:
- Extension pages (popup, options, background) run in the extension's own origin and can access Chrome APIs
- Content scripts run in the web page's DOM context but in an isolated JavaScript world
- Web pages run in their own context with no direct access to extension APIs
Communication between these contexts uses Chrome's message passing APIs:
- chrome.runtime.connect() / chrome.runtime.onConnect for long-lived port connections
- chrome.runtime.sendMessage() / chrome.runtime.onMessage for one-shot messages
- chrome.tabs.connect() for popup-to-content-script port connections
The web-llm extensions use chrome.tabs.connect() for page content extraction because it establishes a port that can handle the asynchronous nature of DOM reading. The content script uses chrome.runtime.onConnect to listen for these connections.
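The request/response flow over a port can be sketched outside a browser by mocking the two port endpoints. Everything below is illustrative: makePortPair and fakePageText are invented for this sketch, and in a real extension the port objects come from chrome.tabs.connect() on the popup side and chrome.runtime.onConnect on the content-script side.

```javascript
// In-file mock of Chrome's paired port objects (illustration only).
// Real ports are created by chrome.tabs.connect() / chrome.runtime.onConnect.
function makePortPair() {
  const listeners = { a: [], b: [] };
  const portA = {
    postMessage: (msg) => listeners.b.forEach((fn) => fn(msg)),
    onMessage: { addListener: (fn) => listeners.a.push(fn) },
  };
  const portB = {
    postMessage: (msg) => listeners.a.forEach((fn) => fn(msg)),
    onMessage: { addListener: (fn) => listeners.b.push(fn) },
  };
  return [portA, portB];
}

const [popupPort, contentPort] = makePortPair();

// "Content script" side: reply to any incoming message with the page text.
const fakePageText = "Example Domain. This domain is for use in examples.";
contentPort.onMessage.addListener(() => {
  contentPort.postMessage({ contents: fakePageText });
});

// "Popup" side: register a listener, then send the empty trigger message.
let received = "";
popupPort.onMessage.addListener((msg) => {
  received = msg.contents;
});
popupPort.postMessage({});

console.log(received); // the fake page text round-trips back to the popup
```

The key property the mock demonstrates is that the popup must register its onMessage listener before (or synchronously with) posting the trigger message, since the content script's reply can arrive as soon as the empty message is delivered.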
The pattern of extracting page text and using it as LLM context is a simple form of Retrieval-Augmented Generation (RAG), where the "retrieval" step is replaced by direct DOM access. This approach has limitations (no chunking, no semantic search, full page text may exceed context window), but it is effective for small-to-medium pages.
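The context-window limitation mentioned above can be partially mitigated with a simple character budget before injecting the page text. The helper below is a hypothetical sketch (truncateContext is not part of web-llm), using the rough heuristic of about four characters per token.

```javascript
// Hypothetical helper (not part of web-llm): clip page text to a rough
// character budget so the final prompt stays within the model's context
// window. ~4 characters per token is a common rough heuristic.
function truncateContext(pageText, maxTokens = 1024, charsPerToken = 4) {
  const budget = maxTokens * charsPerToken;
  if (pageText.length <= budget) return pageText;
  // Cut at the last whitespace before the budget to avoid splitting a word.
  const clipped = pageText.slice(0, budget);
  const lastSpace = clipped.lastIndexOf(" ");
  return (lastSpace > 0 ? clipped.slice(0, lastSpace) : clipped) + " …";
}

console.log(truncateContext("short page"));           // "short page"
console.log(truncateContext("word ".repeat(100), 4)); // "word word word …"
```

A production version would count real tokens with the model's tokenizer and could chunk rather than truncate, but a budget like this is enough to keep small models from silently dropping the user's question.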
I/O Contract
Content Script Input: An empty message {} received via port.onMessage.
Content Script Output: An object { contents: string } containing either:
- document.body.innerHTML (service worker example; includes HTML markup)
- document.body.innerText (non-service-worker example; plain text only)
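When the content script returns innerHTML, the popup may want to strip markup before prompting, since tags waste context tokens. The stripHtml function below is a naive, hypothetical sketch (a real HTML parser handles edge cases like comments and entities far better):

```javascript
// Hypothetical, naive tag stripper (illustration only): drops scripts,
// styles, and tags, then collapses whitespace. For robust conversion,
// parse the HTML properly instead of using regexes.
function stripHtml(html) {
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, " ") // drop inline scripts
    .replace(/<style[\s\S]*?<\/style>/gi, " ")   // drop inline styles
    .replace(/<[^>]+>/g, " ")                    // drop remaining tags
    .replace(/\s+/g, " ")                        // collapse whitespace
    .trim();
}

console.log(stripHtml("<p>Hello <b>world</b></p>")); // "Hello world"
```

Returning innerText from the content script, as the non-service-worker example does, avoids this post-processing entirely.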
Popup-to-Content-Script Communication:
| Step | API Call | Direction |
|---|---|---|
| 1. Query active tab | chrome.tabs.query({ currentWindow: true, active: true }, callback) | Popup -> Chrome API |
| 2. Connect to tab | chrome.tabs.connect(tabId, { name: "channelName" }) | Popup -> Content Script |
| 3. Request content | port.postMessage({}) | Popup -> Content Script |
| 4. Receive content | port.onMessage.addListener(callback) | Content Script -> Popup |
Context injection into LLM prompt:
| Approach | Prompt Template |
|---|---|
| Direct context | Prepend page text before user message as a system or user message |
| RAG-style (from the non-service-worker example) | "Use only the following context when answering the question at the end. Don't use any other knowledge.\n" + context + "\n\nQuestion: " + message + "\n\nHelpful Answer: " |
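The RAG-style template can be isolated into a small pure function. buildRagPrompt is a hypothetical name for this sketch (the repository example builds the string inline); it falls back to the bare question when no page context is available:

```javascript
// Hypothetical helper mirroring the non-service-worker example's inline
// template; returns the bare question when there is no page context.
function buildRagPrompt(context, question) {
  if (!context || context.length === 0) return question;
  return (
    "Use only the following context when answering the question at the end. " +
    "Don't use any other knowledge.\n" +
    context +
    "\n\nQuestion: " +
    question +
    "\n\nHelpful Answer: "
  );
}

console.log(buildRagPrompt("", "What is this page about?"));
// "What is this page about?"
```

Keeping the template in one function makes it easy to swap in a different instruction wording, or to combine it with a context-truncation step, without touching the messaging code.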
Usage Examples
Content script (content.js) - HTML extraction:
// Only the content script is able to access the DOM
chrome.runtime.onConnect.addListener(function (port) {
port.onMessage.addListener(function (msg) {
port.postMessage({ contents: document.body.innerHTML });
});
});
Content script (content.js) - Plain text extraction:
// Only the content script is able to access the DOM
chrome.runtime.onConnect.addListener(function (port) {
port.onMessage.addListener(function (msg) {
port.postMessage({ contents: document.body.innerText });
});
});
Popup script - Fetching page contents:
function fetchPageContents() {
chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
if (tabs[0]?.id) {
const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
port.postMessage({});
port.onMessage.addListener(function (msg) {
console.log("Page contents:", msg.contents);
// Use msg.contents as context for LLM inference
});
}
});
}
// Fetch page contents when popup opens
window.onload = function () {
fetchPageContents();
};
Using page content as LLM context (RAG-style prompt):
import {
CreateExtensionServiceWorkerMLCEngine,
ChatCompletionMessageParam,
} from "@mlc-ai/web-llm";
let pageContext = "";
// Fetch page content on load
function fetchPageContents() {
chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
if (tabs[0]?.id) {
const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
port.postMessage({});
port.onMessage.addListener(function (msg) {
pageContext = msg.contents;
});
}
});
}
// Create the engine
const engine = await CreateExtensionServiceWorkerMLCEngine(
"Qwen2-0.5B-Instruct-q4f16_1-MLC",
{ initProgressCallback: (report) => console.log(report.text) },
);
// Build a context-aware prompt
async function askAboutPage(userQuestion: string) {
let prompt = userQuestion;
if (pageContext.length > 0) {
prompt =
"Use only the following context when answering the question at the end. " +
"Don't use any other knowledge.\n" +
pageContext +
"\n\nQuestion: " +
userQuestion +
"\n\nHelpful Answer: ";
}
const chatHistory: ChatCompletionMessageParam[] = [
{ role: "user", content: prompt },
];
const completion = await engine.chat.completions.create({
stream: true,
messages: chatHistory,
});
let response = "";
for await (const chunk of completion) {
const delta = chunk.choices[0].delta.content;
if (delta) response += delta;
}
return response;
}
Manifest declaration for content scripts:
{
"content_scripts": [
{
"matches": ["<all_urls>"],
"js": ["content.js"]
}
],
"permissions": ["storage", "tabs", "webNavigation"]
}
Related Pages
- Implementation:Mlc_ai_Web_llm_Chrome_Tabs_Connect
- Mlc_ai_Web_llm_Chrome_Extension_Manifest - Manifest configuration where content scripts are declared
- Mlc_ai_Web_llm_Extension_Client_Engine - The engine proxy in the popup that uses page content as LLM context
- Mlc_ai_Web_llm_Extension_Service_Worker - The service worker that performs inference with the page context