Implementation:Mlc_ai_Web_llm_Create_Service_Worker_MLC_Engine
Overview
Source code implementation of the `CreateServiceWorkerMLCEngine` factory function and its supporting types (`ServiceWorkerMLCEngine`, `PortAdapter`, `ExtensionMLCEngineConfig`). These components create a popup-side engine proxy that connects to the background service worker via `chrome.runtime.Port`, implements the full `MLCEngineInterface`, and keeps the service worker alive with periodic heartbeat messages.
Description
The implementation spans lines 13-16 (config interface) and 118-195 (factory, adapter, and engine class) of `src/extension_service_worker.ts`. It consists of four parts:

- `ExtensionMLCEngineConfig` - Interface extending `MLCEngineConfig` with extension-specific fields
- `CreateServiceWorkerMLCEngine` - Async factory function (the primary public API)
- `PortAdapter` - Internal adapter class bridging `chrome.runtime.Port` to `ChatWorker`
- `ServiceWorkerMLCEngine` - Engine class extending `WebWorkerMLCEngine`

The library re-exports these under alias names from the package index:

- `CreateServiceWorkerMLCEngine` is aliased as `CreateExtensionServiceWorkerMLCEngine`
- `ServiceWorkerMLCEngine` is aliased as `ExtensionServiceWorkerMLCEngine`
Code Reference
Source: `src/extension_service_worker.ts`, Lines 13-16 and 118-195
ExtensionMLCEngineConfig
```typescript
export interface ExtensionMLCEngineConfig extends MLCEngineConfig {
  extensionId?: string;
  onDisconnect?: () => void;
}
```
CreateServiceWorkerMLCEngine (Factory Function)
```typescript
export async function CreateServiceWorkerMLCEngine(
  modelId: string | string[],
  engineConfig?: ExtensionMLCEngineConfig,
  chatOpts?: ChatOptions | ChatOptions[],
  keepAliveMs = 10000,
): Promise<ServiceWorkerMLCEngine> {
  const serviceWorkerMLCEngine = new ServiceWorkerMLCEngine(
    engineConfig,
    keepAliveMs,
  );
  await serviceWorkerMLCEngine.reload(modelId, chatOpts);
  return serviceWorkerMLCEngine;
}
```
PortAdapter (Internal Class)
```typescript
class PortAdapter implements ChatWorker {
  port: chrome.runtime.Port;
  private _onmessage!: (message: any) => void;

  constructor(port: chrome.runtime.Port) {
    this.port = port;
    this.port.onMessage.addListener(this.handleMessage.bind(this));
  }

  // Wrapper to handle incoming messages and delegate to onmessage if available
  private handleMessage(message: any) {
    if (this._onmessage) {
      this._onmessage(message);
    }
  }

  // Getter and setter for onmessage to manage adding/removing listeners
  get onmessage(): (message: any) => void {
    return this._onmessage;
  }

  set onmessage(listener: (message: any) => void) {
    this._onmessage = listener;
  }

  // Wrap port.postMessage to maintain 'this' context
  postMessage = (message: any): void => {
    this.port.postMessage(message);
  };
}
```
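The getter/setter indirection exists presumably so that the parent `WebWorkerMLCEngine` can assign its own handler to `onmessage` after the adapter is constructed; the adapter stores that callback and forwards every `port.onMessage` event into it. A minimal sketch of the resulting contract (illustrative only; `PortAdapter` is internal and not exported):

```typescript
// Anything with postMessage and an assignable onmessage satisfies ChatWorker,
// so a chrome.runtime.Port can stand in for a Web Worker.
const port = chrome.runtime.connect({ name: "web_llm_service_worker" });
const worker: ChatWorker = new PortAdapter(port);
worker.onmessage = (msg: any) => console.log("from service worker:", msg);
worker.postMessage({ kind: "keepAlive" }); // same message shape the engine's heartbeat uses
```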
ServiceWorkerMLCEngine (Client Engine)
```typescript
export class ServiceWorkerMLCEngine extends WebWorkerMLCEngine {
  port: chrome.runtime.Port;
  extensionId?: string;

  constructor(engineConfig?: ExtensionMLCEngineConfig, keepAliveMs = 10000) {
    const extensionId = engineConfig?.extensionId;
    const onDisconnect = engineConfig?.onDisconnect;
    const port = extensionId
      ? chrome.runtime.connect(extensionId, {
          name: "web_llm_service_worker",
        })
      : chrome.runtime.connect({ name: "web_llm_service_worker" });
    const chatWorker = new PortAdapter(port);
    super(chatWorker, engineConfig);
    this.port = port;
    this.extensionId = extensionId;

    // Keep alive through periodical heartbeat signals
    const keepAliveTimer = setInterval(() => {
      this.worker.postMessage({ kind: "keepAlive" });
    }, keepAliveMs);
    port.onDisconnect.addListener(() => {
      clearInterval(keepAliveTimer);
      if (onDisconnect) {
        onDisconnect();
      }
    });
  }
}
```
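Chrome shuts down idle Manifest V3 service workers (historically after about 30 seconds), so the heartbeat interval must stay comfortably below that limit; the 10-second default does. If a different interval is needed, `keepAliveMs` is the fourth factory parameter. A brief sketch:

```typescript
// Send a heartbeat every 5 seconds instead of the default 10.
const engine = await CreateServiceWorkerMLCEngine(
  "Qwen2-0.5B-Instruct-q4f16_1-MLC",
  { initProgressCallback: (report) => console.log(report.text) },
  undefined, // chatOpts: keep the model's bundled mlc-chat-config.json
  5000, // keepAliveMs
);
```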
Import
```typescript
// Canonical imports
import {
  CreateServiceWorkerMLCEngine,
  ServiceWorkerMLCEngine,
  ExtensionMLCEngineConfig,
} from "@mlc-ai/web-llm";

// Backward-compatible aliases
import {
  CreateExtensionServiceWorkerMLCEngine,
  ExtensionServiceWorkerMLCEngine,
} from "@mlc-ai/web-llm";
```
I/O Contract
CreateServiceWorkerMLCEngine
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `modelId` | `string \| string[]` | Yes | - | Model ID(s) to load. Must be in `prebuiltAppConfig` or `engineConfig.appConfig` |
| `engineConfig` | `ExtensionMLCEngineConfig` | No | `undefined` | Engine configuration with optional `extensionId` and `onDisconnect` |
| `chatOpts` | `ChatOptions \| ChatOptions[]` | No | `undefined` | Overrides for `mlc-chat-config.json`; array size must match `modelId` if both are arrays |
| `keepAliveMs` | `number` | No | `10000` | Heartbeat interval in ms to keep the service worker alive |

Returns: `Promise<ServiceWorkerMLCEngine>` - resolves after the model is loaded (or skip-loaded if already cached in the service worker).
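Because `modelId` and `chatOpts` accept parallel arrays, several models can be loaded in one call. A hedged sketch (both model IDs must exist in `prebuiltAppConfig`; the second one here is illustrative):

```typescript
const engine = await CreateServiceWorkerMLCEngine(
  ["Qwen2-0.5B-Instruct-q4f16_1-MLC", "Llama-3.1-8B-Instruct-q4f32_1-MLC"],
  { initProgressCallback: (report) => console.log(report.text) },
  [{ temperature: 0.7 }, { temperature: 0.2 }], // one ChatOptions entry per model ID
);
```

Subsequent completion requests can then select among the loaded models through the request's `model` field.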
ExtensionMLCEngineConfig
| Field | Type | Inherited From | Description |
|---|---|---|---|
| `initProgressCallback` | `(report: InitProgressReport) => void` | `MLCEngineConfig` | Called during model loading with progress updates |
| `appConfig` | `AppConfig` | `MLCEngineConfig` | Custom app configuration with model list |
| `logLevel` | `LogLevel` | `MLCEngineConfig` | Logging verbosity |
| `extensionId` | `string` | Extension-specific | Target extension ID for cross-extension connections |
| `onDisconnect` | `() => void` | Extension-specific | Callback when port disconnects |
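The two extension-specific fields only matter for cross-extension connections and disconnect handling. A sketch of a config that surfaces a lost connection (the extension ID shown is a placeholder):

```typescript
const engineConfig: ExtensionMLCEngineConfig = {
  initProgressCallback: (report) => console.log(report.text),
  // Placeholder ID; omit extensionId to connect to this extension's own
  // background service worker.
  extensionId: "abcdefghijklmnopabcdefghijklmnop",
  onDisconnect: () => {
    // The keepAlive timer has already been cleared at this point; surface
    // the failure to the UI or recreate the engine here.
    console.warn("Lost connection to the web-llm service worker");
  },
};
```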
ServiceWorkerMLCEngine API Surface
Inherits the full `MLCEngineInterface` from `WebWorkerMLCEngine`:

| Method | Description |
|---|---|
| `chat.completions.create(request)` | OpenAI-compatible chat completion (streaming and non-streaming) |
| `completions.create(request)` | Text completion (streaming and non-streaming) |
| `embeddings.create(request)` | Text embeddings |
| `reload(modelId, chatOpts)` | Load a model (skipped if already loaded in service worker) |
| `unload()` | Unload the current model from the service worker |
| `resetChat(keepStats, modelId)` | Reset chat state |
| `interruptGenerate()` | Interrupt ongoing generation |
| `getMessage(modelId)` | Get the last generated message |
| `runtimeStatsText(modelId)` | Get runtime performance statistics |
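These methods compose into a simple lifecycle. A sketch, assuming `engine` was created as in the examples below:

```typescript
// Swap models without recreating the engine (a no-op if already loaded):
await engine.reload("Qwen2-0.5B-Instruct-q4f16_1-MLC");

// Inspect prefill/decode throughput after a completion:
console.log(await engine.runtimeStatsText());

// Clear conversation state while keeping the model weights resident:
await engine.resetChat();

// Release the model inside the service worker when done:
await engine.unload();
```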
Usage Examples
Complete popup script (from the repository example):
```typescript
import {
  ChatCompletionMessageParam,
  CreateExtensionServiceWorkerMLCEngine,
  MLCEngineInterface,
  InitProgressReport,
} from "@mlc-ai/web-llm";

// Set up progress callback for the loading bar
const initProgressCallback = (report: InitProgressReport) => {
  progressBar.animate(report.progress, { duration: 50 });
  if (report.progress == 1.0) {
    enableInputs();
  }
};

// Create the engine - connects to the background service worker
const engine: MLCEngineInterface = await CreateExtensionServiceWorkerMLCEngine(
  "Qwen2-0.5B-Instruct-q4f16_1-MLC",
  { initProgressCallback: initProgressCallback },
);

// Use the engine for streaming chat completion
const chatHistory: ChatCompletionMessageParam[] = [];
chatHistory.push({ role: "user", content: "Hello!" });

let curMessage = "";
const completion = await engine.chat.completions.create({
  stream: true,
  messages: chatHistory,
});
for await (const chunk of completion) {
  const curDelta = chunk.choices[0].delta.content;
  if (curDelta) {
    curMessage += curDelta;
  }
  updateUI(curMessage);
}
chatHistory.push({ role: "assistant", content: await engine.getMessage() });
```
Non-streaming chat completion:
```typescript
const engine = await CreateExtensionServiceWorkerMLCEngine(
  "Qwen2-0.5B-Instruct-q4f16_1-MLC",
  { initProgressCallback: (report) => console.log(report.text) },
);

const result = await engine.chat.completions.create({
  stream: false,
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Summarize quantum computing in one sentence." },
  ],
});
console.log(result.choices[0].message.content);
```
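Interrupting a streamed generation (a sketch; the two-second timeout is arbitrary):

```typescript
const stream = await engine.chat.completions.create({
  stream: true,
  messages: [{ role: "user", content: "Write a very long story." }],
});

// Stop generation early; the stream then completes normally with the
// tokens produced so far.
setTimeout(() => engine.interruptGenerate(), 2000);

let partial = "";
for await (const chunk of stream) {
  partial += chunk.choices[0].delta.content ?? "";
}
console.log(partial);
```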
Internal Architecture
The data flow through the component layers:
```
Popup Script
     |
     v
CreateExtensionServiceWorkerMLCEngine()
     |
     v
ServiceWorkerMLCEngine (extends WebWorkerMLCEngine)
     |  - Manages keepAlive timer
     |  - Holds chrome.runtime.Port reference
     |
     v
PortAdapter (implements ChatWorker)
     |  - Adapts chrome.runtime.Port to Worker-like interface
     |  - Maps port.onMessage -> onmessage setter
     |  - Maps postMessage -> port.postMessage
     |
     v
chrome.runtime.Port
     |  - Long-lived connection to background service worker
     |  - Name: "web_llm_service_worker"
     |
     v
ServiceWorkerMLCEngineHandler (in background.ts)
     |  - Receives messages via port.onMessage
     |  - Routes to MLCEngine methods
     |  - Caches loaded model state
     |
     v
MLCEngine (actual inference engine)
     |  - WebGPU/WASM model execution
     |  - TVM runtime
```
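For completeness, the service-worker side of this pipeline is typically only a few lines. A sketch of `background.ts` modeled on the repository's chrome-extension example (it assumes the handler is exported under the alias `ExtensionServiceWorkerMLCEngineHandler` and exposes a `setPort` method for reconnects):

```typescript
import { ExtensionServiceWorkerMLCEngineHandler } from "@mlc-ai/web-llm";

let handler: ExtensionServiceWorkerMLCEngineHandler | undefined;

chrome.runtime.onConnect.addListener((port) => {
  console.assert(port.name === "web_llm_service_worker");
  if (handler === undefined) {
    handler = new ExtensionServiceWorkerMLCEngineHandler(port);
  } else {
    // Reuse the existing handler (and its cached model) when the popup reopens.
    handler.setPort(port);
  }
});
```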
Related Pages
- Principle:Mlc_ai_Web_llm_Extension_Client_Engine
- Mlc_ai_Web_llm_Service_Worker_MLC_Engine_Handler - The service worker handler this engine connects to
- Mlc_ai_Web_llm_Manifest_V3_Configuration - Manifest configuration required for port connections
- Mlc_ai_Web_llm_Chrome_Tabs_Connect - Content script for page content that can be used as inference context
- Environment:Mlc_ai_Web_llm_Chrome_Extension_Manifest_V3
- Heuristic:Mlc_ai_Web_llm_Service_Worker_Keep_Alive