Implementation:Mlc_ai_Web_llm_Service_Worker_MLC_Engine_Handler
Overview
Source code implementation of the ServiceWorkerMLCEngineHandler class, which extends WebWorkerMLCEngineHandler to operate within a Chrome Extension service worker context. This class replaces the Web Worker postMessage API with chrome.runtime.Port communication and adds model caching logic to avoid redundant model reloads when the popup reconnects.
Description
ServiceWorkerMLCEngineHandler is defined in src/extension_service_worker.ts at lines 34-101. It inherits all message routing, task handling, and engine management from WebWorkerMLCEngineHandler (defined in src/web_worker.ts) and overrides only the communication and reload logic.
The class is exported under two names from the package:
- `ServiceWorkerMLCEngineHandler` (canonical name from src/extension_service_worker.ts)
- `ExtensionServiceWorkerMLCEngineHandler` (alias from the package index for backward compatibility)
Code Reference
Source: src/extension_service_worker.ts, Lines 34-101
Class Signature
export class ServiceWorkerMLCEngineHandler extends WebWorkerMLCEngineHandler {
  port: chrome.runtime.Port | null;
  constructor(port: chrome.runtime.Port);
  postMessage(msg: any): void;
  setPort(port: chrome.runtime.Port): void;
  onPortDisconnect(port: chrome.runtime.Port): void;
  onmessage(event: any): void;
}
Import
// Canonical import
import { ServiceWorkerMLCEngineHandler } from "@mlc-ai/web-llm";
// Backward-compatible alias
import { ExtensionServiceWorkerMLCEngineHandler } from "@mlc-ai/web-llm";
Full Implementation
export class ServiceWorkerMLCEngineHandler extends WebWorkerMLCEngineHandler {
  port: chrome.runtime.Port | null;

  constructor(port: chrome.runtime.Port) {
    super();
    this.port = port;
    port.onDisconnect.addListener(() => this.onPortDisconnect(port));
  }

  postMessage(msg: any) {
    this.port?.postMessage(msg);
  }

  setPort(port: chrome.runtime.Port) {
    this.port = port;
    port.onDisconnect.addListener(() => this.onPortDisconnect(port));
  }

  onPortDisconnect(port: chrome.runtime.Port) {
    if (port === this.port) {
      this.port = null;
    }
  }

  onmessage(event: any): void {
    if (event.type === "keepAlive") {
      return;
    }
    const msg = event as WorkerRequest;
    if (msg.kind === "reload") {
      this.handleTask(msg.uuid, async () => {
        const params = msg.content as ReloadParams;
        // If the modelId, chatOpts, and appConfig are the same, immediately return
        if (
          areArraysEqual(this.modelId, params.modelId) &&
          areChatOptionsListEqual(this.chatOpts, params.chatOpts)
        ) {
          log.info("Already loaded the model. Skip loading");
          const gpuDetectOutput = await tvmjs.detectGPUDevice();
          if (gpuDetectOutput == undefined) {
            throw new WebGPUNotFoundError();
          }
          let gpuLabel = "WebGPU";
          if (gpuDetectOutput.adapterInfo.description.length != 0) {
            gpuLabel += " - " + gpuDetectOutput.adapterInfo.description;
          } else {
            gpuLabel += " - " + gpuDetectOutput.adapterInfo.vendor;
          }
          this.engine.getInitProgressCallback()?.({
            progress: 1,
            timeElapsed: 0,
            text: "Finish loading on " + gpuLabel,
          });
          return null;
        }
        await this.engine.reload(params.modelId, params.chatOpts);
        this.modelId = params.modelId;
        this.chatOpts = params.chatOpts;
        return null;
      });
      return;
    }
    // All rest of message handling are the same as WebWorkerMLCEngineHandler
    super.onmessage(event);
  }
}
I/O Contract
Constructor
| Parameter | Type | Description |
|---|---|---|
| `port` | `chrome.runtime.Port` | The port from `chrome.runtime.onConnect`; used for all bidirectional communication |
The constructor calls `super()`, which creates a new `MLCEngine` instance and sets up the `initProgressCallback` to forward progress messages via `postMessage`.
postMessage(msg)
| Parameter | Type | Description |
|---|---|---|
| `msg` | `any` | A `WorkerResponse` message to send to the popup via the port |
Overrides `WebWorkerMLCEngineHandler.postMessage()`, which uses the global `postMessage`. Uses optional chaining (`this.port?.postMessage`) so that messages are silently dropped if the port is disconnected.
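The silent-drop behavior can be illustrated with a minimal sketch. `MockPort` and `PortSender` below are hypothetical stand-ins for `chrome.runtime.Port` and the handler, used only to show the optional-chaining pattern:

```typescript
// Stand-in for chrome.runtime.Port (illustrative only).
interface MockPort {
  postMessage(msg: unknown): void;
}

class PortSender {
  port: MockPort | null = null;
  sent: unknown[] = [];

  postMessage(msg: unknown) {
    // Mirrors the handler: no-op when the port is disconnected (null).
    this.port?.postMessage(msg);
  }
}

const sender = new PortSender();
sender.postMessage({ kind: "return" }); // port is null -> silently dropped

sender.port = { postMessage: (m) => sender.sent.push(m) };
sender.postMessage({ kind: "return" }); // delivered

console.log(sender.sent.length); // 1
```

The drop-on-null design means a response produced while the popup is closed is simply lost; the popup re-requests state when it reconnects.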
setPort(port)
| Parameter | Type | Description |
|---|---|---|
| `port` | `chrome.runtime.Port` | A new port from a reconnecting popup |
Updates the handler's port reference and registers a disconnect listener. Called when the popup reconnects to an already-initialized handler.
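The reconnect flow, and the reason `onPortDisconnect` guards with `port === this.port`, can be sketched with mock ports (the `MockPort` and `PortHolder` classes here are hypothetical, not the library's types):

```typescript
type DisconnectListener = () => void;

// Stand-in for chrome.runtime.Port with a manually triggerable disconnect.
class MockPort {
  private listeners: DisconnectListener[] = [];
  onDisconnect = {
    addListener: (fn: DisconnectListener) => {
      this.listeners.push(fn);
    },
  };
  disconnect() {
    this.listeners.forEach((fn) => fn());
  }
}

class PortHolder {
  port: MockPort | null = null;

  setPort(port: MockPort) {
    this.port = port;
    port.onDisconnect.addListener(() => this.onPortDisconnect(port));
  }

  onPortDisconnect(port: MockPort) {
    // Only clear if the disconnecting port is still current; a stale
    // port's late disconnect must not clobber a newer connection.
    if (port === this.port) {
      this.port = null;
    }
  }
}

const holder = new PortHolder();
const first = new MockPort();
holder.setPort(first);

const second = new MockPort(); // popup reopened before the old port fired
holder.setPort(second);
first.disconnect(); // stale disconnect: current port is preserved
console.log(holder.port === second); // true
```

Without the identity check, a delayed disconnect event from the old popup would null out the freshly reconnected port.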
onmessage(event)
| Parameter | Type | Description |
|---|---|---|
| `event` | `any` | Either a `WorkerRequest` object or a `{ type: "keepAlive" }` heartbeat |
Return behavior:
- For `keepAlive` messages: returns immediately (no-op)
- For `reload` messages with a matching model: sends `initProgressCallback` with `progress: 1`, then sends a `return` response
- For `reload` messages with a different model: delegates to `engine.reload()`, sends a `return` response on completion
- For all other message kinds: delegates to `super.onmessage(event)`
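The heartbeat filter at the top of `onmessage` can be sketched in isolation. This simplified dispatcher (the `Incoming` type and `handled` list are illustrative, not library code) shows how `keepAlive` messages are dropped before any request routing happens:

```typescript
// Simplified union of what arrives over the port: a heartbeat or a request.
type Incoming = { type: "keepAlive" } | { kind: string; uuid: string };

const handled: string[] = [];

function onmessage(event: Incoming): void {
  // Heartbeats only exist to keep the MV3 service worker alive; drop them.
  if ("type" in event && event.type === "keepAlive") {
    return;
  }
  // Everything else is treated as a real WorkerRequest and dispatched.
  handled.push((event as { kind: string }).kind);
}

onmessage({ type: "keepAlive" });
onmessage({ kind: "reload", uuid: "1" });
console.log(handled); // ["reload"]
```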
Usage Examples
Standard background script setup:
import { ExtensionServiceWorkerMLCEngineHandler } from "@mlc-ai/web-llm";

let handler;

chrome.runtime.onConnect.addListener(function (port) {
  console.assert(port.name === "web_llm_service_worker");
  if (handler === undefined) {
    handler = new ExtensionServiceWorkerMLCEngineHandler(port);
  } else {
    handler.setPort(port);
  }
  port.onMessage.addListener(handler.onmessage.bind(handler));
});
Important: The `.bind(handler)` call is essential. Without it, `this` inside `onmessage` would refer to the port's listener context rather than the handler instance, causing `this.engine` and `this.port` to be `undefined`.
Background script with custom logit processor:
import { ExtensionServiceWorkerMLCEngineHandler } from "@mlc-ai/web-llm";

let handler;

chrome.runtime.onConnect.addListener(function (port) {
  if (handler === undefined) {
    handler = new ExtensionServiceWorkerMLCEngineHandler(port);
    // Register a custom logit processor for controlled generation
    handler.setLogitProcessorRegistry(
      new Map([["myModel", myCustomLogitProcessor]]),
    );
  } else {
    handler.setPort(port);
  }
  port.onMessage.addListener(handler.onmessage.bind(handler));
});
Understanding the reload caching flow:
Popup opens (first time):
1. popup.ts calls CreateExtensionServiceWorkerMLCEngine("Qwen2-0.5B-Instruct-q4f16_1-MLC")
2. This calls chrome.runtime.connect() -> port created
3. background.ts receives port -> creates handler
4. popup sends reload message with modelId
5. handler.modelId is undefined -> performs full engine.reload()
6. Model downloads, compiles, loads into GPU memory (~30s)
7. handler.modelId set to ["Qwen2-0.5B-Instruct-q4f16_1-MLC"]
Popup closes and reopens (service worker still alive):
1. popup.ts calls CreateExtensionServiceWorkerMLCEngine("Qwen2-0.5B-Instruct-q4f16_1-MLC")
2. New port created -> handler.setPort(port)
3. popup sends reload message with same modelId
4. handler.modelId matches -> skips reload
5. Immediately reports progress: 1 with GPU label
6. Model is ready instantly
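The cache check that distinguishes these two flows boils down to an array comparison. The `arraysEqual` helper below is a simplified re-implementation approximating `areArraysEqual` from src/utils.ts, shown only to make the skip logic concrete:

```typescript
// Simplified stand-in for areArraysEqual (src/utils.ts); illustrative only.
function arraysEqual(a?: string[], b?: string[]): boolean {
  if (a === undefined || b === undefined) return a === b;
  return a.length === b.length && a.every((v, i) => v === b[i]);
}

let cachedModelId: string[] | undefined = undefined;
const requested = ["Qwen2-0.5B-Instruct-q4f16_1-MLC"];

// First popup open: nothing cached -> the full engine.reload() path runs.
const firstLoadSkipped = arraysEqual(cachedModelId, requested); // false
cachedModelId = requested.slice(); // reload completed; model ID cached

// Popup reopens with the same model -> reload is skipped entirely.
const secondLoadSkipped = arraysEqual(cachedModelId, requested); // true
console.log(firstLoadSkipped, secondLoadSkipped);
```

In the real handler the chat options are compared as well (via `areChatOptionsListEqual`), so changing either the model ID or the chat options triggers a full reload.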
Internal Dependencies
| Import | Source | Purpose |
|---|---|---|
| `WebWorkerMLCEngineHandler` | src/web_worker.ts | Base class providing engine creation, message routing, and task handling |
| `tvmjs.detectGPUDevice` | @mlc-ai/web-runtime | Used during the skip-reload path to detect the GPU and report its label |
| `areArraysEqual` | src/utils.ts | Compares model ID arrays to determine if reload can be skipped |
| `areChatOptionsListEqual` | src/utils.ts | Compares chat options to determine if reload can be skipped |
| `WebGPUNotFoundError` | src/error.ts | Thrown when WebGPU is unavailable during skip-reload GPU detection |
| `WorkerRequest`, `ReloadParams` | src/message.ts | Message protocol types |
Related Pages
- Principle:Mlc_ai_Web_llm_Extension_Service_Worker
- Mlc_ai_Web_llm_Manifest_V3_Configuration - Manifest that registers the service worker
- Mlc_ai_Web_llm_Create_Service_Worker_MLC_Engine - Popup-side factory that connects to this handler
- Mlc_ai_Web_llm_Chrome_Extension_Manifest - Configuration principle for the extension manifest
- Environment:Mlc_ai_Web_llm_Chrome_Extension_Manifest_V3
- Heuristic:Mlc_ai_Web_llm_Service_Worker_Keep_Alive