Implementation:Mlc ai Web llm Chrome Tabs Connect

From Leeroopedia

Overview

External tool documentation for the Chrome Extensions content script and messaging pattern used to extract web page content for LLM inference context. This implementation uses Chrome's chrome.tabs.connect() API, chrome.runtime.onConnect listener, and port-based messaging to transfer DOM text from web pages to the extension popup, where it can be injected into LLM prompts.

Description

This implementation documents three Chrome Extension APIs working together to enable page content extraction:

1. chrome.tabs.connect(tabId, connectInfo) - Called from the popup script to open a long-lived port connection to the content script running in the specified tab. Returns a chrome.runtime.Port.

2. chrome.runtime.onConnect - Registered in the content script. Fires when the popup opens a connection via chrome.tabs.connect() and passes the listener the content-script end of the chrome.runtime.Port for bidirectional messaging.

3. port.postMessage() / port.onMessage.addListener() - Used for the actual data exchange. The popup sends an empty trigger message; the content script responds with the page content (HTML or plain text, depending on the variant).

The repository provides two content script variants:

  • Service worker example: Extracts document.body.innerHTML (preserves HTML structure)
  • Non-service-worker example: Extracts document.body.innerText (plain text only)

Code Reference

Content Script (Service Worker Example)

Source: examples/chrome-extension-webgpu-service-worker/src/content.js (full file)

// Only the content script is able to access the DOM
chrome.runtime.onConnect.addListener(function (port) {
  port.onMessage.addListener(function (msg) {
    port.postMessage({ contents: document.body.innerHTML });
  });
});

Content Script (Non-Service-Worker Example)

Source: examples/chrome-extension/src/content.js (full file)

// Only the content script is able to access the DOM
chrome.runtime.onConnect.addListener(function (port) {
  port.onMessage.addListener(function (msg) {
    port.postMessage({ contents: document.body.innerText });
  });
});

Popup Script - fetchPageContents (Service Worker Example)

Source: examples/chrome-extension-webgpu-service-worker/src/popup.ts, Lines 149-160

function fetchPageContents() {
  chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
    if (tabs[0]?.id) {
      const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
      port.postMessage({});
      port.onMessage.addListener(function (msg) {
        console.log("Page contents:", msg.contents);
        chrome.runtime.sendMessage({ context: msg.contents });
      });
    }
  });
}

Popup Script - fetchPageContents (Non-Service-Worker Example)

Source: examples/chrome-extension/src/popup.ts, Lines 289-298

function fetchPageContents() {
  chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
    const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
    port.postMessage({});
    port.onMessage.addListener(function (msg) {
      console.log("Page contents:", msg.contents);
      context = msg.contents;
    });
  });
}

Context Injection into LLM Prompt (Non-Service-Worker Example)

Source: examples/chrome-extension/src/popup.ts, Lines 160-168

// Inside handleClick():
let inp = message;
if (context.length > 0) {
  inp =
    "Use only the following context when answering the question at the end. Don't use any other knowledge.\n" +
    context +
    "\n\nQuestion: " +
    message +
    "\n\nHelpful Answer: ";
}
chatHistory.push({ role: "user", content: inp });

I/O Contract

chrome.tabs.query()

Parameter | Type | Description
queryInfo | { currentWindow: true, active: true } | Selects the currently active tab in the current window

Returns: Callback receives Tab[] where tabs[0].id is the active tab ID.

chrome.tabs.connect()

Parameter | Type | Description
tabId | number | The ID of the tab to connect to (from tabs[0].id)
connectInfo | { name: string } | Port name identifier (e.g. "channelName")

Returns: chrome.runtime.Port - a bidirectional communication channel with the content script.

Content Script Message Protocol

Direction | Message Format | Description
Popup -> Content Script | {} (empty object) | Trigger message requesting page content
Content Script -> Popup | { contents: string } | Page content (HTML or plain text)
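
The protocol above can be made explicit with types and a runtime check, since messages arriving on a port are untyped. The following is a minimal sketch; the names PageContentRequest, PageContentResponse, TRIGGER, and isPageContentResponse are hypothetical and do not appear in the repository. The optional error field is an assumption borrowed from the error-handling usage example on this page.

```typescript
// Hypothetical type names; the repository's scripts exchange untyped objects.
type PageContentRequest = Record<string, never>; // the empty trigger message {}

interface PageContentResponse {
  contents: string; // page HTML or plain text
  error?: string; // optional; only the error-handling usage example sends it
}

// The trigger message the popup posts to request page content.
const TRIGGER: PageContentRequest = {};

// Runtime check for a message received on the port.
function isPageContentResponse(msg: unknown): msg is PageContentResponse {
  if (typeof msg !== "object" || msg === null) return false;
  const candidate = msg as { contents?: unknown; error?: unknown };
  return (
    typeof candidate.contents === "string" &&
    (candidate.error === undefined || typeof candidate.error === "string")
  );
}
```

In the popup, such a guard could wrap the body of the port.onMessage listener so that malformed messages are ignored rather than passed into the prompt.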

Manifest Declaration

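Field | Value | Description
content_scripts[].matches | ["<all_urls>"] | URL patterns where the content script is injected
content_scripts[].js | ["content.js"] | Path to the content script file
permissions | "tabs" | Grants access to sensitive Tab fields (url, pendingUrl, title, favIconUrl). chrome.tabs.query() and chrome.tabs.connect() can be called without it, because only tabs[0].id is used here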
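
For orientation, the fields in the table could sit in a Manifest V3 file roughly as follows. This is a sketch written as a TypeScript object rather than the repository's actual manifest.json; the name, version, and popup.html values are placeholders, and any other fields the real examples declare are omitted.

```typescript
// Sketch of a Manifest V3 file expressed as a TypeScript object.
// "name", "version", and "popup.html" are placeholder values (assumptions).
const manifest = {
  manifest_version: 3,
  name: "Page Context Example",
  version: "0.1.0",
  action: { default_popup: "popup.html" },
  permissions: ["tabs"], // as listed in the table above
  content_scripts: [
    {
      matches: ["<all_urls>"], // inject into every page Chrome allows
      js: ["content.js"],
    },
  ],
};

// Serialize to the JSON text that would be saved as manifest.json.
const manifestJson = JSON.stringify(manifest, null, 2);
```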

Usage Examples

Complete content script with error handling:

// content.js - Injected into web pages by Chrome
chrome.runtime.onConnect.addListener(function (port) {
  if (port.name === "channelName") {
    port.onMessage.addListener(function (msg) {
      try {
        // Extract plain text (preferred for LLM context)
        const pageText = document.body.innerText;
        port.postMessage({ contents: pageText });
      } catch (error) {
        port.postMessage({ contents: "", error: error.message });
      }
    });
  }
});

Popup script with conditional context usage:

// Whether or not to use the content from the active tab as the context
const useContext = false;

let pageContext = "";

function fetchPageContents() {
  chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
    if (tabs[0]?.id) {
      const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
      port.postMessage({});
      port.onMessage.addListener(function (msg) {
        console.log("Page contents:", msg.contents);
        pageContext = msg.contents;
      });
    }
  });
}

// Grab the page contents when the popup is opened
window.onload = function () {
  if (useContext) {
    fetchPageContents();
  }
};

Complete example: page summarization with web-llm:

import {
  CreateExtensionServiceWorkerMLCEngine,
  ChatCompletionMessageParam,
} from "@mlc-ai/web-llm";

// Step 1: Create engine
const engine = await CreateExtensionServiceWorkerMLCEngine(
  "Qwen2-0.5B-Instruct-q4f16_1-MLC",
  { initProgressCallback: (r) => console.log(r.text) },
);

// Step 2: Fetch page content
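function getPageContent(): Promise<string> {
  return new Promise((resolve) => {
    chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
      if (tabs[0]?.id) {
        const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
        // Register listeners before sending the trigger message
        port.onMessage.addListener(function (msg) {
          resolve(msg.contents);
        });
        // Fires when no content script is listening in the tab (e.g. chrome:// pages).
        // Resolve with an empty string so the caller does not wait forever.
        port.onDisconnect.addListener(function () {
          const err = chrome.runtime.lastError;
          if (err) console.warn("Could not reach content script:", err.message);
          resolve("");
        });
        port.postMessage({});
      } else {
        resolve("");
      }
    });
  });
}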

// Step 3: Summarize the page
async function summarizePage() {
  const pageContent = await getPageContent();
  if (!pageContent) {
    console.log("No page content available");
    return;
  }

  // Truncate if too long for the model's context window
  const truncated = pageContent.substring(0, 4000);

  const messages: ChatCompletionMessageParam[] = [
    {
      role: "system",
      content: "You are a helpful assistant that summarizes web pages concisely.",
    },
    {
      role: "user",
      content: "Please summarize the following web page content:\n\n" + truncated,
    },
  ];

  const completion = await engine.chat.completions.create({
    stream: true,
    messages: messages,
  });

  let summary = "";
  for await (const chunk of completion) {
    const delta = chunk.choices[0].delta.content;
    if (delta) summary += delta;
  }
  console.log("Summary:", summary);
}
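
The summarization example above keeps only the first 4000 characters and discards the rest. One alternative is to split the page text into fixed-size pieces and process them one at a time. The helper below is a minimal sketch; chunkText is a hypothetical name, and the limit counts characters (UTF-16 code units), not model tokens, so it only approximates a context-window budget.

```typescript
// Split text into consecutive chunks of at most maxChars characters.
// Note: this counts UTF-16 code units, not model tokens.
function chunkText(text: string, maxChars: number): string[] {
  if (!Number.isInteger(maxChars) || maxChars <= 0) {
    throw new RangeError("maxChars must be a positive integer");
  }
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += maxChars) {
    chunks.push(text.slice(start, start + maxChars));
  }
  return chunks;
}
```

Each chunk could then be summarized separately and the partial summaries combined in a final request. Whether that gives better results than plain truncation depends on the model and the page.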

External Dependencies

API | Chrome Version | Documentation
chrome.tabs.connect() | Chrome 26+ | chrome.tabs.connect
chrome.tabs.query() | Chrome 16+ | chrome.tabs.query
chrome.runtime.onConnect | Chrome 26+ | chrome.runtime.onConnect
chrome.runtime.Port | Chrome 26+ | chrome.runtime.Port
Content Scripts API | Chrome 88+ (MV3) | Content Scripts

Known Limitations

  • No chunking: The content script sends the entire page text in one message. For very large pages, this may exceed message size limits or the model's context window.
  • No filtering: The raw innerText or innerHTML includes navigation, headers, footers, and other non-content elements. A production extension would benefit from content extraction heuristics.
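  • Runtime errors on special pages: Content scripts cannot be injected into chrome:// pages, chrome-extension:// pages, or the Chrome Web Store. Calling chrome.tabs.connect() on these pages does not throw; it returns a port that disconnects immediately, and chrome.runtime.lastError is set inside the port's onDisconnect listener.
  • Timing dependency: If the popup calls fetchPageContents() before the content script has loaded in the tab (e.g., on a freshly navigated page), the port disconnects without delivering a response. Because the repository examples register no onDisconnect listener, the failure goes unnoticed and the context stays empty.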
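
The last two limitations can be surfaced to the caller by wrapping the port exchange in a promise that rejects on disconnect or timeout. The sketch below is not code from the repository. PortLike is a hypothetical interface covering only the members of chrome.runtime.Port that the helper uses, which also lets it run outside an extension. requestPageContents and the 2000 ms default are likewise assumptions.

```typescript
// Minimal structural subset of chrome.runtime.Port used by this sketch.
interface PortLike {
  postMessage(message: unknown): void;
  disconnect(): void;
  onMessage: { addListener(cb: (msg: unknown) => void): void };
  onDisconnect: { addListener(cb: () => void): void };
}

// Hypothetical helper: resolves with the page contents, or rejects if the
// port disconnects (no content script listening) or no reply arrives in time.
function requestPageContents(port: PortLike, timeoutMs = 2000): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    let settled = false;
    // Run the given action once; ignore any later events.
    const finish = (action: () => void): void => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      action();
    };

    const timer = setTimeout(() => {
      finish(() => {
        port.disconnect();
        reject(new Error("Timed out waiting for page contents"));
      });
    }, timeoutMs);

    port.onMessage.addListener((msg) => {
      const contents = (msg as { contents?: unknown } | null)?.contents;
      finish(() =>
        typeof contents === "string"
          ? resolve(contents)
          : reject(new Error("Unexpected message shape")),
      );
    });

    port.onDisconnect.addListener(() => {
      // In an extension, chrome.runtime.lastError?.message can be read here.
      finish(() => reject(new Error("Port disconnected before a response arrived")));
    });

    // Send the empty trigger message only after the listeners are in place.
    port.postMessage({});
  });
}
```

In the popup, the port returned by chrome.tabs.connect(tabId, { name: "channelName" }) could be passed as the first argument. A rejected promise then signals that the page has no reachable content script, and the caller can retry or fall back to an empty context.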
