
Implementation:Mlc ai Web llm Service Worker MLC Engine Handler

From Leeroopedia

Overview

Source code implementation of the ServiceWorkerMLCEngineHandler class, which extends WebWorkerMLCEngineHandler to operate within a Chrome Extension service worker context. This class replaces the Web Worker postMessage API with chrome.runtime.Port communication and adds model caching logic to avoid redundant model reloads when the popup reconnects.

Description

ServiceWorkerMLCEngineHandler is defined in src/extension_service_worker.ts at lines 34-101. It inherits all message routing, task handling, and engine management from WebWorkerMLCEngineHandler (defined in src/web_worker.ts) and overrides only the communication and reload logic.

The class is exported under two names from the package:

  • ServiceWorkerMLCEngineHandler (canonical name from src/extension_service_worker.ts)
  • ExtensionServiceWorkerMLCEngineHandler (alias from the package index for backward compatibility)

Code Reference

Source: src/extension_service_worker.ts, Lines 34-101

Class Signature

export class ServiceWorkerMLCEngineHandler extends WebWorkerMLCEngineHandler {
  port: chrome.runtime.Port | null;

  constructor(port: chrome.runtime.Port);
  postMessage(msg: any): void;
  setPort(port: chrome.runtime.Port): void;
  onPortDisconnect(port: chrome.runtime.Port): void;
  onmessage(event: any): void;
}

Import

// Canonical import
import { ServiceWorkerMLCEngineHandler } from "@mlc-ai/web-llm";

// Backward-compatible alias
import { ExtensionServiceWorkerMLCEngineHandler } from "@mlc-ai/web-llm";

Full Implementation

export class ServiceWorkerMLCEngineHandler extends WebWorkerMLCEngineHandler {
  port: chrome.runtime.Port | null;

  constructor(port: chrome.runtime.Port) {
    super();
    this.port = port;
    port.onDisconnect.addListener(() => this.onPortDisconnect(port));
  }

  postMessage(msg: any) {
    this.port?.postMessage(msg);
  }

  setPort(port: chrome.runtime.Port) {
    this.port = port;
    port.onDisconnect.addListener(() => this.onPortDisconnect(port));
  }

  onPortDisconnect(port: chrome.runtime.Port) {
    if (port === this.port) {
      this.port = null;
    }
  }

  onmessage(event: any): void {
    if (event.type === "keepAlive") {
      return;
    }

    const msg = event as WorkerRequest;
    if (msg.kind === "reload") {
      this.handleTask(msg.uuid, async () => {
        const params = msg.content as ReloadParams;
        // If the modelId and chatOpts are unchanged, skip the reload and report completion
        if (
          areArraysEqual(this.modelId, params.modelId) &&
          areChatOptionsListEqual(this.chatOpts, params.chatOpts)
        ) {
          log.info("Already loaded the model. Skip loading");
          const gpuDetectOutput = await tvmjs.detectGPUDevice();
          if (gpuDetectOutput == undefined) {
            throw new WebGPUNotFoundError();
          }
          let gpuLabel = "WebGPU";
          if (gpuDetectOutput.adapterInfo.description.length != 0) {
            gpuLabel += " - " + gpuDetectOutput.adapterInfo.description;
          } else {
            gpuLabel += " - " + gpuDetectOutput.adapterInfo.vendor;
          }
          this.engine.getInitProgressCallback()?.({
            progress: 1,
            timeElapsed: 0,
            text: "Finish loading on " + gpuLabel,
          });
          return null;
        }

        await this.engine.reload(params.modelId, params.chatOpts);
        this.modelId = params.modelId;
        this.chatOpts = params.chatOpts;
        return null;
      });
      return;
    }

    // All other message kinds are handled the same way as in WebWorkerMLCEngineHandler
    super.onmessage(event);
  }
}

I/O Contract

Constructor

  • port (chrome.runtime.Port): The port from chrome.runtime.onConnect; used for all bidirectional communication.

The constructor calls super(), which creates a new MLCEngine instance and sets up the initProgressCallback to forward progress messages via postMessage.
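The forwarding set up by super() can be sketched as follows. This is an illustrative stand-in, not web-llm's actual internals: the class and field names are hypothetical, though the "initProgressCallback" message kind matches the protocol described above.

```typescript
// Illustrative sketch of the progress forwarding wired up in super().
// ProgressSketch and its fields are hypothetical names.
type InitProgressReport = { progress: number; timeElapsed: number; text: string };

class ProgressSketch {
  forwarded: InitProgressReport[] = [];

  constructor() {
    // The base constructor hands a callback like this to its MLCEngine so
    // that download/compile progress reaches the popup as port messages.
    const initProgressCallback = (report: InitProgressReport) => {
      this.postMessage({ kind: "initProgressCallback", content: report });
    };
    initProgressCallback({ progress: 0.5, timeElapsed: 3, text: "Fetching params" });
  }

  // In the real handler this goes over the chrome.runtime.Port.
  postMessage(msg: { kind: string; content: InitProgressReport }) {
    this.forwarded.push(msg.content);
  }
}

console.log(new ProgressSketch().forwarded[0].progress); // 0.5
```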

postMessage(msg)

  • msg (any): A WorkerResponse message to send to the popup via the port.

Overrides WebWorkerMLCEngineHandler.postMessage(), which uses the global postMessage function. Uses optional chaining (this.port?.postMessage) so that messages are silently dropped if the port is disconnected.

setPort(port)

  • port (chrome.runtime.Port): A new port from a reconnecting popup.

Updates the handler's port reference and registers a disconnect listener. Called when the popup reconnects to an already-initialized handler.

onmessage(event)

  • event (any): Either a WorkerRequest object or a { type: "keepAlive" } heartbeat.

Return behavior:

  • For keepAlive messages: returns immediately (no-op)
  • For reload messages with matching model: sends initProgressCallback with progress: 1, then sends return response
  • For reload messages with different model: delegates to engine.reload(), sends return response on completion
  • For all other message kinds: delegates to super.onmessage(event)
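The dispatch above can be condensed into a small routing sketch. The route function and its return strings are illustrative; areArraysEqual is re-implemented here for self-containment (the real one lives in src/utils.ts), and the chatOpts comparison is omitted for brevity:

```typescript
type IncomingMsg = {
  type?: string;
  kind?: string;
  content?: { modelId: string[] };
};

// Simplified re-implementation for illustration; web-llm's version is in src/utils.ts.
function areArraysEqual(a?: string[], b?: string[]): boolean {
  if (a === undefined || b === undefined || a.length !== b.length) return false;
  return a.every((v, i) => v === b[i]);
}

// Returns which path onmessage takes for a given message.
function route(loadedModelId: string[] | undefined, msg: IncomingMsg): string {
  if (msg.type === "keepAlive") return "noop";
  if (msg.kind === "reload") {
    return areArraysEqual(loadedModelId, msg.content?.modelId)
      ? "skip-reload"
      : "full-reload";
  }
  return "delegate-to-super";
}

console.log(route(undefined, { kind: "reload", content: { modelId: ["m"] } })); // full-reload
console.log(route(["m"], { kind: "reload", content: { modelId: ["m"] } }));     // skip-reload
console.log(route(["m"], { type: "keepAlive" }));                               // noop
```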

Usage Examples

Standard background script setup:

import { ExtensionServiceWorkerMLCEngineHandler } from "@mlc-ai/web-llm";

let handler;

chrome.runtime.onConnect.addListener(function (port) {
  console.assert(port.name === "web_llm_service_worker");
  if (handler === undefined) {
    handler = new ExtensionServiceWorkerMLCEngineHandler(port);
  } else {
    handler.setPort(port);
  }
  port.onMessage.addListener(handler.onmessage.bind(handler));
});

Important: The .bind(handler) call is essential. Without it, this inside onmessage would not refer to the handler instance (in strict mode it would be undefined), leaving this.engine and this.port inaccessible.
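The pitfall is plain JavaScript this-binding, reproducible without any Chrome APIs. BindSketch below is a hypothetical stand-in for the handler; class bodies always run in strict mode, so a method called without a receiver sees this as undefined:

```typescript
class BindSketch {
  engine = "mlc-engine";

  // `this: any` lets us call the method unbound for demonstration.
  describe(this: any): string {
    return this?.engine ?? "this was lost";
  }
}

const handlerSketch = new BindSketch();
const unbound = handlerSketch.describe;             // like passing handler.onmessage directly
const bound = handlerSketch.describe.bind(handlerSketch);

console.log(unbound()); // this was lost
console.log(bound());   // mlc-engine
```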

Background script with custom logit processor:

import { ExtensionServiceWorkerMLCEngineHandler } from "@mlc-ai/web-llm";

let handler;

chrome.runtime.onConnect.addListener(function (port) {
  if (handler === undefined) {
    handler = new ExtensionServiceWorkerMLCEngineHandler(port);
    // Register a custom logit processor for controlled generation
    handler.setLogitProcessorRegistry(
      new Map([
        ["myModel", myCustomLogitProcessor],
      ])
    );
  } else {
    handler.setPort(port);
  }
  port.onMessage.addListener(handler.onmessage.bind(handler));
});
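A minimal processor for the registry above might look like the following sketch. The LogitProcessor interface shape (processLogits, processSampledToken, resetState) is assumed from web-llm's exported API and should be verified against your installed version; BanTokenProcessor and the banned token id are hypothetical. Note that the registry key must match the model id passed to reload.

```typescript
// Interface shape assumed from web-llm's exported LogitProcessor API.
interface LogitProcessor {
  processLogits(logits: Float32Array): Float32Array;
  processSampledToken(token: number): void;
  resetState(): void;
}

// Hypothetical example: mask one token id out of sampling entirely.
class BanTokenProcessor implements LogitProcessor {
  bannedTokenId: number;

  constructor(bannedTokenId: number) {
    this.bannedTokenId = bannedTokenId;
  }

  processLogits(logits: Float32Array): Float32Array {
    // A logit of -Infinity gives the token zero probability after softmax.
    logits[this.bannedTokenId] = Number.NEGATIVE_INFINITY;
    return logits;
  }

  processSampledToken(_token: number): void {
    // Could track sampled tokens for stateful constraints; no-op here.
  }

  resetState(): void {}
}

const processor = new BanTokenProcessor(1);
const maskedLogits = processor.processLogits(new Float32Array([0.5, 2.0, 0.1]));
console.log(maskedLogits[1]); // -Infinity
```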

Understanding the reload caching flow:

Popup opens (first time):
  1. popup.ts calls CreateExtensionServiceWorkerMLCEngine("Qwen2-0.5B-Instruct-q4f16_1-MLC")
  2. This calls chrome.runtime.connect() -> port created
  3. background.ts receives port -> creates handler
  4. popup sends reload message with modelId
  5. handler.modelId is undefined -> performs full engine.reload()
  6. Model downloads, compiles, loads into GPU memory (~30s)
  7. handler.modelId set to ["Qwen2-0.5B-Instruct-q4f16_1-MLC"]

Popup closes and reopens (service worker still alive):
  1. popup.ts calls CreateExtensionServiceWorkerMLCEngine("Qwen2-0.5B-Instruct-q4f16_1-MLC")
  2. New port created -> handler.setPort(port)
  3. popup sends reload message with same modelId
  4. handler.modelId matches -> skips reload
  5. Immediately reports progress: 1 with GPU label
  6. Model is ready instantly

Internal Dependencies

  • WebWorkerMLCEngineHandler (src/web_worker.ts): Base class providing engine creation, message routing, and task handling.
  • tvmjs.detectGPUDevice (@mlc-ai/web-runtime): Used during the skip-reload path to detect the GPU and report its label.
  • areArraysEqual (src/utils.ts): Compares model ID arrays to determine whether the reload can be skipped.
  • areChatOptionsListEqual (src/utils.ts): Compares chat options to determine whether the reload can be skipped.
  • WebGPUNotFoundError (src/error.ts): Thrown when WebGPU is unavailable during skip-reload GPU detection.
  • WorkerRequest, ReloadParams (src/message.ts): Message protocol types.
