
Implementation:Mlc ai Web llm Service Worker MLC Engine Handler

From Leeroopedia

Overview

Source code implementation of the ServiceWorkerMLCEngineHandler class, which extends WebWorkerMLCEngineHandler to operate within a Chrome Extension service worker context. This class replaces the Web Worker postMessage API with chrome.runtime.Port communication and adds model caching logic to avoid redundant model reloads when the popup reconnects.

Description

ServiceWorkerMLCEngineHandler is defined in src/extension_service_worker.ts at lines 34-101. It inherits all message routing, task handling, and engine management from WebWorkerMLCEngineHandler (defined in src/web_worker.ts) and overrides only the communication and reload logic.

The class is exported under two names from the package:

  • ServiceWorkerMLCEngineHandler (canonical name from src/extension_service_worker.ts)
  • ExtensionServiceWorkerMLCEngineHandler (alias from the package index for backward compatibility)

Code Reference

Source: src/extension_service_worker.ts, Lines 34-101

Class Signature

export class ServiceWorkerMLCEngineHandler extends WebWorkerMLCEngineHandler {
  port: chrome.runtime.Port | null;

  constructor(port: chrome.runtime.Port);
  postMessage(msg: any): void;
  setPort(port: chrome.runtime.Port): void;
  onPortDisconnect(port: chrome.runtime.Port): void;
  onmessage(event: any): void;
}

Import

// Canonical import
import { ServiceWorkerMLCEngineHandler } from "@mlc-ai/web-llm";

// Backward-compatible alias
import { ExtensionServiceWorkerMLCEngineHandler } from "@mlc-ai/web-llm";

Full Implementation

export class ServiceWorkerMLCEngineHandler extends WebWorkerMLCEngineHandler {
  port: chrome.runtime.Port | null;

  constructor(port: chrome.runtime.Port) {
    super();
    this.port = port;
    port.onDisconnect.addListener(() => this.onPortDisconnect(port));
  }

  postMessage(msg: any) {
    this.port?.postMessage(msg);
  }

  setPort(port: chrome.runtime.Port) {
    this.port = port;
    port.onDisconnect.addListener(() => this.onPortDisconnect(port));
  }

  onPortDisconnect(port: chrome.runtime.Port) {
    if (port === this.port) {
      this.port = null;
    }
  }

  onmessage(event: any): void {
    if (event.type === "keepAlive") {
      return;
    }

    const msg = event as WorkerRequest;
    if (msg.kind === "reload") {
      this.handleTask(msg.uuid, async () => {
        const params = msg.content as ReloadParams;
        // If the modelId and chatOpts are unchanged, skip the reload and report completion
        if (
          areArraysEqual(this.modelId, params.modelId) &&
          areChatOptionsListEqual(this.chatOpts, params.chatOpts)
        ) {
          log.info("Already loaded the model. Skip loading");
          const gpuDetectOutput = await tvmjs.detectGPUDevice();
          if (gpuDetectOutput == undefined) {
            throw new WebGPUNotFoundError();
          }
          let gpuLabel = "WebGPU";
          if (gpuDetectOutput.adapterInfo.description.length != 0) {
            gpuLabel += " - " + gpuDetectOutput.adapterInfo.description;
          } else {
            gpuLabel += " - " + gpuDetectOutput.adapterInfo.vendor;
          }
          this.engine.getInitProgressCallback()?.({
            progress: 1,
            timeElapsed: 0,
            text: "Finish loading on " + gpuLabel,
          });
          return null;
        }

        await this.engine.reload(params.modelId, params.chatOpts);
        this.modelId = params.modelId;
        this.chatOpts = params.chatOpts;
        return null;
      });
      return;
    }

    // All other message kinds are handled the same way as in WebWorkerMLCEngineHandler
    super.onmessage(event);
  }
}

I/O Contract

Constructor

  • port (chrome.runtime.Port): The port from chrome.runtime.onConnect; used for all bidirectional communication.

The constructor calls super(), which creates a new MLCEngine instance and sets up the initProgressCallback to forward progress messages via postMessage.
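The forwarding set up by super() can be sketched as follows. This is an illustrative stand-in, not web-llm's actual internals: the class and field names are hypothetical, though the "initProgressCallback" message kind matches the protocol described above.

```typescript
// Illustrative sketch of the progress forwarding wired up in super().
// ProgressSketch and its fields are hypothetical names.
type InitProgressReport = { progress: number; timeElapsed: number; text: string };

class ProgressSketch {
  forwarded: InitProgressReport[] = [];

  constructor() {
    // The base constructor hands a callback like this to its MLCEngine so
    // that download/compile progress reaches the popup as port messages.
    const initProgressCallback = (report: InitProgressReport) => {
      this.postMessage({ kind: "initProgressCallback", content: report });
    };
    initProgressCallback({ progress: 0.5, timeElapsed: 3, text: "Fetching params" });
  }

  // In the real handler this goes over the chrome.runtime.Port.
  postMessage(msg: { kind: string; content: InitProgressReport }) {
    this.forwarded.push(msg.content);
  }
}

console.log(new ProgressSketch().forwarded[0].progress); // 0.5
```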

postMessage(msg)

  • msg (any): A WorkerResponse message to send to the popup via the port.

Overrides WebWorkerMLCEngineHandler.postMessage(), which uses the global postMessage function. Uses optional chaining (this.port?.postMessage) so that messages are silently dropped if the port is disconnected.

setPort(port)

  • port (chrome.runtime.Port): A new port from a reconnecting popup.

Updates the handler's port reference and registers a disconnect listener. Called when the popup reconnects to an already-initialized handler.

onmessage(event)

  • event (any): Either a WorkerRequest object or a { type: "keepAlive" } heartbeat.

Return behavior:

  • For keepAlive messages: returns immediately (no-op)
  • For reload messages with matching model: sends initProgressCallback with progress: 1, then sends return response
  • For reload messages with different model: delegates to engine.reload(), sends return response on completion
  • For all other message kinds: delegates to super.onmessage(event)
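The dispatch above can be condensed into a small routing sketch. The route function and its return strings are illustrative; areArraysEqual is re-implemented here for self-containment (the real one lives in src/utils.ts), and the chatOpts comparison is omitted for brevity:

```typescript
type IncomingMsg = {
  type?: string;
  kind?: string;
  content?: { modelId: string[] };
};

// Simplified re-implementation for illustration; web-llm's version is in src/utils.ts.
function areArraysEqual(a?: string[], b?: string[]): boolean {
  if (a === undefined || b === undefined || a.length !== b.length) return false;
  return a.every((v, i) => v === b[i]);
}

// Returns which path onmessage takes for a given message.
function route(loadedModelId: string[] | undefined, msg: IncomingMsg): string {
  if (msg.type === "keepAlive") return "noop";
  if (msg.kind === "reload") {
    return areArraysEqual(loadedModelId, msg.content?.modelId)
      ? "skip-reload"
      : "full-reload";
  }
  return "delegate-to-super";
}

console.log(route(undefined, { kind: "reload", content: { modelId: ["m"] } })); // full-reload
console.log(route(["m"], { kind: "reload", content: { modelId: ["m"] } }));     // skip-reload
console.log(route(["m"], { type: "keepAlive" }));                               // noop
```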

Usage Examples

Standard background script setup:

import { ExtensionServiceWorkerMLCEngineHandler } from "@mlc-ai/web-llm";

let handler;

chrome.runtime.onConnect.addListener(function (port) {
  console.assert(port.name === "web_llm_service_worker");
  if (handler === undefined) {
    handler = new ExtensionServiceWorkerMLCEngineHandler(port);
  } else {
    handler.setPort(port);
  }
  port.onMessage.addListener(handler.onmessage.bind(handler));
});

Important: The .bind(handler) call is essential. Without it, this inside onmessage would not refer to the handler instance (in strict mode it would be undefined), leaving this.engine and this.port inaccessible.
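The pitfall is plain JavaScript this-binding, reproducible without any Chrome APIs. BindSketch below is a hypothetical stand-in for the handler; class bodies always run in strict mode, so a method called without a receiver sees this as undefined:

```typescript
class BindSketch {
  engine = "mlc-engine";

  // `this: any` lets us call the method unbound for demonstration.
  describe(this: any): string {
    return this?.engine ?? "this was lost";
  }
}

const handlerSketch = new BindSketch();
const unbound = handlerSketch.describe;             // like passing handler.onmessage directly
const bound = handlerSketch.describe.bind(handlerSketch);

console.log(unbound()); // this was lost
console.log(bound());   // mlc-engine
```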

Background script with custom logit processor:

import { ExtensionServiceWorkerMLCEngineHandler } from "@mlc-ai/web-llm";

let handler;

chrome.runtime.onConnect.addListener(function (port) {
  if (handler === undefined) {
    handler = new ExtensionServiceWorkerMLCEngineHandler(port);
    // Register a custom logit processor for controlled generation
    handler.setLogitProcessorRegistry(
      new Map([
        ["myModel", myCustomLogitProcessor],
      ])
    );
  } else {
    handler.setPort(port);
  }
  port.onMessage.addListener(handler.onmessage.bind(handler));
});
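A minimal processor for the registry above might look like the following sketch. The LogitProcessor interface shape (processLogits, processSampledToken, resetState) is assumed from web-llm's exported API and should be verified against your installed version; BanTokenProcessor and the banned token id are hypothetical. Note that the registry key must match the model id passed to reload.

```typescript
// Interface shape assumed from web-llm's exported LogitProcessor API.
interface LogitProcessor {
  processLogits(logits: Float32Array): Float32Array;
  processSampledToken(token: number): void;
  resetState(): void;
}

// Hypothetical example: mask one token id out of sampling entirely.
class BanTokenProcessor implements LogitProcessor {
  bannedTokenId: number;

  constructor(bannedTokenId: number) {
    this.bannedTokenId = bannedTokenId;
  }

  processLogits(logits: Float32Array): Float32Array {
    // A logit of -Infinity gives the token zero probability after softmax.
    logits[this.bannedTokenId] = Number.NEGATIVE_INFINITY;
    return logits;
  }

  processSampledToken(_token: number): void {
    // Could track sampled tokens for stateful constraints; no-op here.
  }

  resetState(): void {}
}

const processor = new BanTokenProcessor(1);
const maskedLogits = processor.processLogits(new Float32Array([0.5, 2.0, 0.1]));
console.log(maskedLogits[1]); // -Infinity
```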

Understanding the reload caching flow:

Popup opens (first time):
  1. popup.ts calls CreateExtensionServiceWorkerMLCEngine("Qwen2-0.5B-Instruct-q4f16_1-MLC")
  2. This calls chrome.runtime.connect() -> port created
  3. background.ts receives port -> creates handler
  4. popup sends reload message with modelId
  5. handler.modelId is undefined -> performs full engine.reload()
  6. Model downloads, compiles, loads into GPU memory (~30s)
  7. handler.modelId set to ["Qwen2-0.5B-Instruct-q4f16_1-MLC"]

Popup closes and reopens (service worker still alive):
  1. popup.ts calls CreateExtensionServiceWorkerMLCEngine("Qwen2-0.5B-Instruct-q4f16_1-MLC")
  2. New port created -> handler.setPort(port)
  3. popup sends reload message with same modelId
  4. handler.modelId matches -> skips reload
  5. Immediately reports progress: 1 with GPU label
  6. Model is ready instantly

Internal Dependencies

  • WebWorkerMLCEngineHandler (src/web_worker.ts): Base class providing engine creation, message routing, and task handling.
  • tvmjs.detectGPUDevice (@mlc-ai/web-runtime): Used during the skip-reload path to detect the GPU and report its label.
  • areArraysEqual (src/utils.ts): Compares model ID arrays to determine whether the reload can be skipped.
  • areChatOptionsListEqual (src/utils.ts): Compares chat options to determine whether the reload can be skipped.
  • WebGPUNotFoundError (src/error.ts): Thrown when WebGPU is unavailable during skip-reload GPU detection.
  • WorkerRequest, ReloadParams (src/message.ts): Message protocol types.
