
Implementation:Mlc ai Web llm Create Service Worker MLC Engine

From Leeroopedia


Overview

Source code implementation of the CreateServiceWorkerMLCEngine factory function and its supporting classes (ServiceWorkerMLCEngine, PortAdapter, ExtensionMLCEngineConfig). These components create a popup-side engine proxy that connects to the background service worker via chrome.runtime.Port, implements the full MLCEngineInterface, and keeps the service worker alive with periodic heartbeat messages.

Description

The implementation spans lines 13-16 (config interface) and 118-195 (factory, adapter, and engine class) of src/extension_service_worker.ts. It consists of four parts:

  1. ExtensionMLCEngineConfig - Interface extending MLCEngineConfig with extension-specific fields
  2. CreateServiceWorkerMLCEngine - Async factory function (the primary public API)
  3. PortAdapter - Internal adapter class bridging chrome.runtime.Port to ChatWorker
  4. ServiceWorkerMLCEngine - Engine class extending WebWorkerMLCEngine

The library re-exports these under alias names from the package index:

  • CreateServiceWorkerMLCEngine is aliased as CreateExtensionServiceWorkerMLCEngine
  • ServiceWorkerMLCEngine is aliased as ExtensionServiceWorkerMLCEngine

Code Reference

Source: src/extension_service_worker.ts, Lines 13-16 and 118-195

ExtensionMLCEngineConfig

export interface ExtensionMLCEngineConfig extends MLCEngineConfig {
  extensionId?: string;
  onDisconnect?: () => void;
}

CreateServiceWorkerMLCEngine (Factory Function)

export async function CreateServiceWorkerMLCEngine(
  modelId: string | string[],
  engineConfig?: ExtensionMLCEngineConfig,
  chatOpts?: ChatOptions | ChatOptions[],
  keepAliveMs = 10000,
): Promise<ServiceWorkerMLCEngine> {
  const serviceWorkerMLCEngine = new ServiceWorkerMLCEngine(
    engineConfig,
    keepAliveMs,
  );
  await serviceWorkerMLCEngine.reload(modelId, chatOpts);
  return serviceWorkerMLCEngine;
}

PortAdapter (Internal Class)

class PortAdapter implements ChatWorker {
  port: chrome.runtime.Port;
  private _onmessage!: (message: any) => void;

  constructor(port: chrome.runtime.Port) {
    this.port = port;
    this.port.onMessage.addListener(this.handleMessage.bind(this));
  }

  // Wrapper to handle incoming messages and delegate to onmessage if available
  private handleMessage(message: any) {
    if (this._onmessage) {
      this._onmessage(message);
    }
  }

  // Getter and setter for onmessage to manage adding/removing listeners
  get onmessage(): (message: any) => void {
    return this._onmessage;
  }

  set onmessage(listener: (message: any) => void) {
    this._onmessage = listener;
  }

  // Wrap port.postMessage to maintain 'this' context
  postMessage = (message: any): void => {
    this.port.postMessage(message);
  };
}
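
The adapter's behavior can be exercised outside Chrome by substituting a message-recording stand-in for the port. A minimal sketch, assuming nothing beyond the adapter shape shown above; `FakePort` and `DemoPortAdapter` are hypothetical names for illustration and are not part of web-llm:

```typescript
// Illustrative stand-in for chrome.runtime.Port, so the adapter pattern
// can run outside Chrome. Not part of web-llm.
type Listener = (message: any) => void;

class FakePort {
  private listeners: Listener[] = [];
  sent: any[] = [];
  onMessage = {
    addListener: (l: Listener): void => {
      this.listeners.push(l);
    },
  };
  postMessage(message: any): void {
    this.sent.push(message);
  }
  // Simulate the service worker sending a message back over the port.
  emit(message: any): void {
    for (const l of this.listeners) l(message);
  }
}

// Same shape as PortAdapter above, with FakePort in place of chrome.runtime.Port.
class DemoPortAdapter {
  private _onmessage?: Listener;
  constructor(private port: FakePort) {
    this.port.onMessage.addListener((m) => this._onmessage?.(m));
  }
  set onmessage(listener: Listener) {
    this._onmessage = listener;
  }
  // Arrow function keeps 'this' bound when passed around, as in the original.
  postMessage = (message: any): void => {
    this.port.postMessage(message);
  };
}

const port = new FakePort();
const adapter = new DemoPortAdapter(port);

const received: any[] = [];
adapter.onmessage = (m) => received.push(m);

port.emit({ kind: "initProgressCallback" }); // worker -> engine direction
adapter.postMessage({ kind: "keepAlive" });  // engine -> worker direction
```

The key design point the adapter encodes: `Worker` exposes an `onmessage` property, while `chrome.runtime.Port` exposes an `onMessage.addListener` API, so the adapter registers one permanent port listener and forwards to whatever `onmessage` is currently set.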

ServiceWorkerMLCEngine (Client Engine)

export class ServiceWorkerMLCEngine extends WebWorkerMLCEngine {
  port: chrome.runtime.Port;
  extensionId?: string;

  constructor(engineConfig?: ExtensionMLCEngineConfig, keepAliveMs = 10000) {
    const extensionId = engineConfig?.extensionId;
    const onDisconnect = engineConfig?.onDisconnect;
    const port = extensionId
      ? chrome.runtime.connect(extensionId, {
          name: "web_llm_service_worker",
        })
      : chrome.runtime.connect({ name: "web_llm_service_worker" });
    const chatWorker = new PortAdapter(port);
    super(chatWorker, engineConfig);
    this.port = port;
    this.extensionId = extensionId;

    // Keep alive through periodical heartbeat signals
    const keepAliveTimer = setInterval(() => {
      this.worker.postMessage({ kind: "keepAlive" });
    }, keepAliveMs);

    port.onDisconnect.addListener(() => {
      clearInterval(keepAliveTimer);
      if (onDisconnect) {
        onDisconnect();
      }
    });
  }
}

Import

// Canonical imports
import {
  CreateServiceWorkerMLCEngine,
  ServiceWorkerMLCEngine,
  ExtensionMLCEngineConfig,
} from "@mlc-ai/web-llm";

// Backward-compatible aliases
import {
  CreateExtensionServiceWorkerMLCEngine,
  ExtensionServiceWorkerMLCEngine,
} from "@mlc-ai/web-llm";

I/O Contract

CreateServiceWorkerMLCEngine

Parameter     Type                         Required  Default    Description
modelId       string | string[]            Yes       -          Model ID(s) to load. Must be in prebuiltAppConfig or engineConfig.appConfig
engineConfig  ExtensionMLCEngineConfig     No        undefined  Engine configuration with optional extensionId and onDisconnect
chatOpts      ChatOptions | ChatOptions[]  No        undefined  Overrides for mlc-chat-config.json; array length must match modelId if both are arrays
keepAliveMs   number                       No        10000      Heartbeat interval in ms to keep the service worker alive

Returns: Promise<ServiceWorkerMLCEngine> - resolves once the model is loaded (loading is skipped if the service worker already has that model cached).
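
When both `modelId` and `chatOpts` are arrays, they are parallel: `chatOpts[i]` applies to `modelId[i]`. A sketch of that arity constraint only; the second model ID is an illustrative placeholder, and the actual factory call is left as a comment because it requires the extension runtime:

```typescript
// Parallel arrays: chatOpts[i] applies to modelId[i].
const modelId: string[] = [
  "Qwen2-0.5B-Instruct-q4f16_1-MLC",
  "Another-Model-q4f16_1-MLC", // placeholder ID for illustration
];
const chatOpts = [{ temperature: 0.7 }, { temperature: 1.0 }];

// Per the I/O contract above, the lengths must match when both are arrays,
// so it is worth validating before calling the factory:
if (modelId.length !== chatOpts.length) {
  throw new Error("chatOpts array must match modelId array in length");
}

// In a popup script, the engine would then be created as:
// const engine = await CreateServiceWorkerMLCEngine(modelId, engineConfig, chatOpts);
```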

ExtensionMLCEngineConfig

Field Type Inherited From Description
initProgressCallback (report: InitProgressReport) => void MLCEngineConfig Called during model loading with progress updates
appConfig AppConfig MLCEngineConfig Custom app configuration with model list
logLevel LogLevel MLCEngineConfig Logging verbosity
extensionId string Extension-specific Target extension ID for cross-extension connections
onDisconnect () => void Extension-specific Callback when port disconnects

ServiceWorkerMLCEngine API Surface

Inherits the full MLCEngineInterface from WebWorkerMLCEngine:

Method Description
chat.completions.create(request) OpenAI-compatible chat completion (streaming and non-streaming)
completions.create(request) Text completion (streaming and non-streaming)
embeddings.create(request) Text embeddings
reload(modelId, chatOpts) Load a model (skipped if already loaded in service worker)
unload() Unload the current model from the service worker
resetChat(keepStats, modelId) Reset chat state
interruptGenerate() Interrupt ongoing generation
getMessage(modelId) Get the last generated message
runtimeStatsText(modelId) Get runtime performance statistics

Usage Examples

Complete popup script (from the repository example):

import {
  ChatCompletionMessageParam,
  CreateExtensionServiceWorkerMLCEngine,
  MLCEngineInterface,
  InitProgressReport,
} from "@mlc-ai/web-llm";

// Set up progress callback for the loading bar
const initProgressCallback = (report: InitProgressReport) => {
  progressBar.animate(report.progress, { duration: 50 });
  if (report.progress == 1.0) {
    enableInputs();
  }
};

// Create the engine - connects to the background service worker
const engine: MLCEngineInterface = await CreateExtensionServiceWorkerMLCEngine(
  "Qwen2-0.5B-Instruct-q4f16_1-MLC",
  { initProgressCallback: initProgressCallback },
);

// Use the engine for streaming chat completion
const chatHistory: ChatCompletionMessageParam[] = [];
chatHistory.push({ role: "user", content: "Hello!" });

let curMessage = "";
const completion = await engine.chat.completions.create({
  stream: true,
  messages: chatHistory,
});

for await (const chunk of completion) {
  const curDelta = chunk.choices[0].delta.content;
  if (curDelta) {
    curMessage += curDelta;
  }
  updateUI(curMessage);
}
chatHistory.push({ role: "assistant", content: await engine.getMessage() });

Non-streaming chat completion:

const engine = await CreateExtensionServiceWorkerMLCEngine(
  "Qwen2-0.5B-Instruct-q4f16_1-MLC",
  { initProgressCallback: (report) => console.log(report.text) },
);

const result = await engine.chat.completions.create({
  stream: false,
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Summarize quantum computing in one sentence." },
  ],
});

console.log(result.choices[0].message.content);

Internal Architecture

The data flow through the component layers:

Popup Script
    |
    v
CreateExtensionServiceWorkerMLCEngine()
    |
    v
ServiceWorkerMLCEngine (extends WebWorkerMLCEngine)
    |  - Manages keepAlive timer
    |  - Holds chrome.runtime.Port reference
    |
    v
PortAdapter (implements ChatWorker)
    |  - Adapts chrome.runtime.Port to Worker-like interface
    |  - Maps port.onMessage -> onmessage setter
    |  - Maps postMessage -> port.postMessage
    |
    v
chrome.runtime.Port
    |  - Long-lived connection to background service worker
    |  - Name: "web_llm_service_worker"
    |
    v
ServiceWorkerMLCEngineHandler (in background.ts)
    |  - Receives messages via port.onMessage
    |  - Routes to MLCEngine methods
    |  - Caches loaded model state
    |
    v
MLCEngine (actual inference engine)
    |  - WebGPU/WASM model execution
    |  - TVM runtime
