Implementation:Mlc ai Web llm Chrome Tabs Connect
Overview
External tool documentation for the Chrome Extensions content script and messaging pattern used to extract web page content for LLM inference context. This implementation uses Chrome's chrome.tabs.connect() API, chrome.runtime.onConnect listener, and port-based messaging to transfer DOM text from web pages to the extension popup, where it can be injected into LLM prompts.
Description
This implementation documents three Chrome Extension APIs working together to enable page content extraction:
1. chrome.tabs.connect(tabId, connectInfo) - Called from the popup script to open a long-lived port connection to the content script running in the specified tab. Returns a chrome.runtime.Port.
2. chrome.runtime.onConnect - Listened to in the content script. Fires when the popup establishes a connection via chrome.tabs.connect(). Provides a chrome.runtime.Port for bidirectional messaging.
3. port.postMessage() / port.onMessage.addListener() - Used for the actual data exchange. The popup sends an empty trigger message; the content script responds with page text.
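The request/response exchange described above can be wrapped in a Promise so the caller can await the page text and also observe failure. The following is a minimal sketch, not code from the repository: `PortLike`, `PortEvent`, `requestPageContents`, and the `timeoutMs` default are hypothetical names and values introduced for illustration. `PortLike` is assumed to be a structural subset of `chrome.runtime.Port`, which lets the function be exercised with a fake port outside the browser.

```typescript
// Hypothetical names throughout: PortEvent, PortLike, requestPageContents.
// PortLike lists only the chrome.runtime.Port members this sketch relies on.
interface PortEvent<T extends (...args: any[]) => void> {
  addListener(callback: T): void;
}

interface PortLike {
  postMessage(message: unknown): void;
  disconnect(): void;
  onMessage: PortEvent<(message: any) => void>;
  onDisconnect: PortEvent<() => void>;
}

function requestPageContents(port: PortLike, timeoutMs = 2000): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    let settled = false;
    let timer: ReturnType<typeof setTimeout> | undefined;

    // Run `action` at most once and cancel the pending timeout.
    const finish = (action: () => void): void => {
      if (settled) return;
      settled = true;
      if (timer !== undefined) clearTimeout(timer);
      action();
    };

    // Give up if neither a reply nor a disconnect arrives in time (assumed default: 2 s).
    timer = setTimeout(() => {
      finish(() => {
        port.disconnect();
        reject(new Error("Timed out waiting for page contents"));
      });
    }, timeoutMs);

    // Register both listeners before sending the trigger message.
    port.onMessage.addListener((msg) => {
      finish(() => {
        port.disconnect();
        resolve(typeof msg?.contents === "string" ? msg.contents : "");
      });
    });
    port.onDisconnect.addListener(() => {
      finish(() => reject(new Error("Port disconnected before a reply arrived")));
    });

    // Empty trigger message, matching the protocol used by the examples.
    port.postMessage({});
  });
}
```

In a popup, the port returned by chrome.tabs.connect(tabId, { name: "channelName" }) would be passed as the first argument. Rejecting on disconnect is a design choice of this sketch; it surfaces the case where no content script is listening in the tab instead of leaving the caller waiting.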
The repository provides two content script variants:
- Service worker example: Extracts document.body.innerHTML (preserves HTML structure)
- Non-service-worker example: Extracts document.body.innerText (plain text only)
Code Reference
Content Script (Service Worker Example)
Source: examples/chrome-extension-webgpu-service-worker/src/content.js (full file)
// Only the content script is able to access the DOM
chrome.runtime.onConnect.addListener(function (port) {
  port.onMessage.addListener(function (msg) {
    port.postMessage({ contents: document.body.innerHTML });
  });
});
Content Script (Non-Service-Worker Example)
Source: examples/chrome-extension/src/content.js (full file)
// Only the content script is able to access the DOM
chrome.runtime.onConnect.addListener(function (port) {
  port.onMessage.addListener(function (msg) {
    port.postMessage({ contents: document.body.innerText });
  });
});
Popup Script - fetchPageContents (Service Worker Example)
Source: examples/chrome-extension-webgpu-service-worker/src/popup.ts, Lines 149-160
function fetchPageContents() {
  chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
    if (tabs[0]?.id) {
      const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
      port.postMessage({});
      port.onMessage.addListener(function (msg) {
        console.log("Page contents:", msg.contents);
        chrome.runtime.sendMessage({ context: msg.contents });
      });
    }
  });
}
Popup Script - fetchPageContents (Non-Service-Worker Example)
Source: examples/chrome-extension/src/popup.ts, Lines 289-298
function fetchPageContents() {
  chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
    const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
    port.postMessage({});
    port.onMessage.addListener(function (msg) {
      console.log("Page contents:", msg.contents);
      context = msg.contents;
    });
  });
}
Context Injection into LLM Prompt (Non-Service-Worker Example)
Source: examples/chrome-extension/src/popup.ts, Lines 160-168
// Inside handleClick():
let inp = message;
if (context.length > 0) {
  inp =
    "Use only the following context when answering the question at the end. Don't use any other knowledge.\n" +
    context +
    "\n\nQuestion: " +
    message +
    "\n\nHelpful Answer: ";
}
chatHistory.push({ role: "user", content: inp });
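The prompt assembly above can be isolated into a pure function, which makes the empty-context branch and the size of the injected text easy to check. This is a sketch under stated assumptions: `buildPrompt` and `maxContextChars` are hypothetical names, and the 4000-character default is borrowed from the truncation used in the summarization example on this page, not from the popup source.

```typescript
// Hypothetical helper: buildPrompt and maxContextChars are not part of the example code.
function buildPrompt(message: string, context: string, maxContextChars = 4000): string {
  const trimmed = context.trim();
  if (trimmed.length === 0) {
    // No page context was captured: send the question unchanged.
    return message;
  }
  // Bound the injected context by character count (a rough proxy for tokens).
  const bounded = trimmed.slice(0, maxContextChars);
  return (
    "Use only the following context when answering the question at the end. " +
    "Don't use any other knowledge.\n" +
    bounded +
    "\n\nQuestion: " +
    message +
    "\n\nHelpful Answer: "
  );
}
```

The result would be pushed to the chat history as the user message content, in place of the inline concatenation. Character counts only approximate the model's token budget, so the limit is best treated as a conservative guess.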
I/O Contract
chrome.tabs.query()
| Parameter | Type | Description |
|---|---|---|
| queryInfo | { currentWindow: true, active: true } | Selects the currently active tab in the current window |
Returns: Callback receives Tab[] where tabs[0].id is the active tab ID.
chrome.tabs.connect()
| Parameter | Type | Description |
|---|---|---|
| tabId | number | The ID of the tab to connect to (from tabs[0].id) |
| connectInfo | { name: string } | Port name identifier (e.g. "channelName") |
Returns: chrome.runtime.Port - a bidirectional communication channel with the content script.
Content Script Message Protocol
| Direction | Message Format | Description |
|---|---|---|
| Popup -> Content Script | {} (empty object) | Trigger message requesting page content |
| Content Script -> Popup | { contents: string } | Page text content (HTML or plain text) |
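The two message shapes in the table above can be written down as types, with a runtime guard for the reply, since port messages arrive untyped. This is an illustrative sketch: `TriggerMessage`, `ContentsMessage`, and `isContentsMessage` are hypothetical names, and the optional `error` field is assumed only because the error-handling content script under Usage Examples sends one.

```typescript
// Hypothetical type names for the protocol; neither example defines them.
type TriggerMessage = Record<string, never>; // popup -> content script: {}

interface ContentsMessage {
  contents: string; // innerText or innerHTML, depending on the variant
  error?: string;   // assumed: only sent by the error-handling variant
}

// Runtime check for a value received through port.onMessage.
function isContentsMessage(msg: unknown): msg is ContentsMessage {
  if (typeof msg !== "object" || msg === null) return false;
  const candidate = msg as { contents?: unknown; error?: unknown };
  return (
    typeof candidate.contents === "string" &&
    (candidate.error === undefined || typeof candidate.error === "string")
  );
}

// The trigger carries no data; its arrival alone is the request.
const trigger: TriggerMessage = {};
```

A popup listener could call isContentsMessage(msg) before reading msg.contents and ignore anything else that arrives on the port.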
Manifest Declaration
| Field | Value | Description |
|---|---|---|
| content_scripts[].matches | ["<all_urls>"] | URL patterns where the content script is injected |
| content_scripts[].js | ["content.js"] | Path to the content script file |
| permissions | Includes "tabs" | Grants access to sensitive Tab fields (url, title, favIconUrl). chrome.tabs.query() and chrome.tabs.connect() themselves do not require it |
Usage Examples
Complete content script with error handling:
// content.js - Injected into web pages by Chrome
chrome.runtime.onConnect.addListener(function (port) {
  if (port.name === "channelName") {
    port.onMessage.addListener(function (msg) {
      try {
        // Extract plain text (preferred for LLM context)
        const pageText = document.body.innerText;
        port.postMessage({ contents: pageText });
      } catch (error) {
        port.postMessage({ contents: "", error: error.message });
      }
    });
  }
});
Popup script with conditional context usage:
// Whether or not to use the content from the active tab as the context
const useContext = false;
let pageContext = "";

function fetchPageContents() {
  chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
    if (tabs[0]?.id) {
      const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
      port.postMessage({});
      port.onMessage.addListener(function (msg) {
        console.log("Page contents:", msg.contents);
        pageContext = msg.contents;
      });
    }
  });
}

// Grab the page contents when the popup is opened
window.onload = function () {
  if (useContext) {
    fetchPageContents();
  }
};
Complete example: page summarization with web-llm:
import {
  CreateExtensionServiceWorkerMLCEngine,
  ChatCompletionMessageParam,
} from "@mlc-ai/web-llm";

// Step 1: Create engine
const engine = await CreateExtensionServiceWorkerMLCEngine(
  "Qwen2-0.5B-Instruct-q4f16_1-MLC",
  { initProgressCallback: (r) => console.log(r.text) },
);

// Step 2: Fetch page content
function getPageContent(): Promise<string> {
  return new Promise((resolve) => {
    chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
      if (tabs[0]?.id) {
        const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
        port.onMessage.addListener(function (msg) {
          resolve(msg.contents);
        });
        // Resolve with an empty string if the port closes without a reply,
        // e.g. when no content script is running in the tab. Without this,
        // the promise would never settle on such pages.
        port.onDisconnect.addListener(function () {
          resolve("");
        });
        port.postMessage({});
      } else {
        resolve("");
      }
    });
  });
}

// Step 3: Summarize the page
async function summarizePage() {
  const pageContent = await getPageContent();
  if (!pageContent) {
    console.log("No page content available");
    return;
  }

  // Truncate if too long for the model's context window
  const truncated = pageContent.substring(0, 4000);

  const messages: ChatCompletionMessageParam[] = [
    {
      role: "system",
      content: "You are a helpful assistant that summarizes web pages concisely.",
    },
    {
      role: "user",
      content: "Please summarize the following web page content:\n\n" + truncated,
    },
  ];

  const completion = await engine.chat.completions.create({
    stream: true,
    messages: messages,
  });

  // Accumulate the streamed deltas into the final summary
  let summary = "";
  for await (const chunk of completion) {
    const delta = chunk.choices[0].delta.content;
    if (delta) summary += delta;
  }
  console.log("Summary:", summary);
}
External Dependencies
| API | Chrome Version | Documentation |
|---|---|---|
| chrome.tabs.connect() | Chrome 26+ | chrome.tabs.connect |
| chrome.tabs.query() | Chrome 16+ | chrome.tabs.query |
| chrome.runtime.onConnect | Chrome 26+ | chrome.runtime.onConnect |
| chrome.runtime.Port | Chrome 26+ | chrome.runtime.Port |
| Content Scripts API | Chrome 88+ (MV3) | Content Scripts |
Known Limitations
- No chunking: The content script sends the entire page text in one message. For very large pages, this may exceed message size limits or the model's context window.
- No filtering: The raw innerText or innerHTML includes navigation, headers, footers, and other non-content elements. A production extension would benefit from content extraction heuristics.
- Runtime errors on special pages: Content scripts cannot be injected into chrome:// pages, chrome-extension:// pages, or the Chrome Web Store. Calling chrome.tabs.connect() on these pages does not throw; the returned port disconnects immediately, and chrome.runtime.lastError is set inside the port's onDisconnect listener.
- Timing dependency: If the popup calls fetchPageContents() before the content script has loaded in the tab (e.g., on a freshly navigated page), the connection may fail silently.
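One way to address the chunking limitation is to split the page text into bounded pieces before it is sent or placed in a prompt. The sketch below is not part of the repository: `chunkText` and `maxChars` are hypothetical, and breaking on the space character is a simplification (other whitespace is not considered).

```typescript
// Hypothetical helper: splits text into pieces of at most maxChars characters,
// preferring to break just before a space so words stay whole.
function chunkText(text: string, maxChars: number): string[] {
  if (!Number.isInteger(maxChars) || maxChars <= 0) {
    throw new RangeError("maxChars must be a positive integer");
  }
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + maxChars, text.length);
    if (end < text.length) {
      // Find the last space at or before the window's end.
      const breakAt = text.lastIndexOf(" ", end);
      // Only use it if it makes progress; otherwise cut mid-word.
      if (breakAt > start) end = breakAt;
    }
    chunks.push(text.slice(start, end));
    start = end;
  }
  return chunks;
}
```

The split is lossless: joining the chunks with an empty string reproduces the input. A content script could post each chunk as its own message on the port, but that would extend the single-reply protocol documented here and the popup would need to reassemble the pieces. Character counts are only a rough stand-in for the model's token limit.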
Related Pages
- Principle:Mlc_ai_Web_llm_Page_Content_Access
- Mlc_ai_Web_llm_Manifest_V3_Configuration - Manifest where content scripts and permissions are declared
- Mlc_ai_Web_llm_Create_Service_Worker_MLC_Engine - Engine factory used in the popup alongside page content extraction
- Mlc_ai_Web_llm_Chrome_Extension_Manifest - Principle for manifest configuration including content script declaration
- Environment:Mlc_ai_Web_llm_Chrome_Extension_Manifest_V3