Implementation:Mlc_ai_Web_llm_Web_Worker_MLC_Engine_Handler
Overview
WebWorkerMLCEngineHandler is the concrete class provided by @mlc-ai/web-llm for handling LLM inference requests inside a Web Worker. It creates an internal MLCEngine, sets up an InitProgressCallback that forwards loading progress via postMessage, and provides an onmessage handler that routes incoming WorkerRequest messages to the appropriate engine methods.
Description
The WebWorkerMLCEngineHandler class is designed to run inside a Web Worker script. On construction, it instantiates a private MLCEngine and registers an init progress callback that serializes InitProgressReport objects back to the main thread. The core routing logic lives in the onmessage method, which uses a switch statement on msg.kind to dispatch to engine methods.
Key internal state:
- engine: MLCEngine -- The actual inference engine running in the worker
- modelId?: string[] -- Currently loaded model IDs (kept in sync with the engine)
- chatOpts?: ChatOptions[] -- Chat options for each loaded model
- loadedModelIdToAsyncGenerator: Map<string, AsyncGenerator> -- Per-model streaming generators
The handler also implements a reloadIfUnmatched guard: before processing inference requests, it checks whether the expected model (sent by the main-thread proxy) matches the actually loaded model. If not (e.g., due to an unexpectedly killed service worker), it automatically reloads the correct model. This provides resilience against browser-level worker lifecycle issues.
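The comparison behind this guard can be sketched as follows. This is a simplified illustration of the check described above, not the library's actual source; `needsReload` is a hypothetical helper name:

```typescript
// Hypothetical helper illustrating the reloadIfUnmatched check.
// Returns true when the worker must reload before serving the request.
function needsReload(
  expectedModelId: string[],
  loadedModelId?: string[],
): boolean {
  // Nothing loaded in this worker (e.g. the service worker was killed
  // and restarted), so a reload is required.
  if (loadedModelId === undefined) return true;
  // The loaded model list must match the expected list exactly, in order.
  return (
    expectedModelId.length !== loadedModelId.length ||
    expectedModelId.some((id, i) => id !== loadedModelId[i])
  );
}

console.log(needsReload(["model-a"], undefined)); // true: nothing loaded
console.log(needsReload(["model-a"], ["model-a"])); // false: already loaded
console.log(needsReload(["model-a"], ["model-b"])); // true: mismatch
```

When the helper returns true, the real handler reloads the expected model(s) before dispatching the inference request.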
Code Reference
Source: src/web_worker.ts, Lines 61-378
export class WebWorkerMLCEngineHandler {
  modelId?: string[];
  chatOpts?: ChatOptions[];
  public engine: MLCEngine;
  protected loadedModelIdToAsyncGenerator: Map<
    string,
    AsyncGenerator<ChatCompletionChunk | Completion, void, void>
  >;

  constructor();
  postMessage(msg: any): void;
  setLogitProcessorRegistry(logitProcessorRegistry?: Map<string, LogitProcessor>): void;
  handleTask<T extends MessageContent>(uuid: string, task: () => Promise<T>): Promise<void>;
  onmessage(event: any, onComplete?: (value: any) => void, onError?: () => void): void;
  reloadIfUnmatched(expectedModelId: string[], expectedChatOpts?: ChatOptions[]): Promise<void>;
}
I/O Contract
Input: WorkerRequest messages received via the Web Worker's onmessage event. Each request has:
- kind -- A RequestKind string identifying the operation
- uuid -- A unique identifier for correlating requests with responses
- content -- A MessageContent union type carrying operation-specific parameters
Output: WorkerResponse messages sent via postMessage. Each response has:
- kind -- Either "return" (success), "throw" (error), or "initProgressCallback" (progress)
- uuid -- Matches the originating request's uuid
- content -- The result value, error string, or InitProgressReport
Error Handling: All errors thrown during task execution are caught in handleTask and serialized as "throw"-kind responses with the error stringified as content. Unknown message kinds throw an UnknownMessageKindError.
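The uuid-correlated request/response flow is a standard worker RPC pattern. The sketch below is a self-contained illustration of that pattern, not the library's actual handleTask source; the message shapes are reduced to the fields listed in the contract above, and postMessage is injected so the demo can record responses:

```typescript
// Simplified response shapes matching the I/O contract
// (the real library's types carry more variants and fields).
type WorkerResponse =
  | { kind: "return"; uuid: string; content: unknown }
  | { kind: "throw"; uuid: string; content: string };

// Sketch of the handleTask pattern: run the task, then post either a
// "return" or a "throw" response tagged with the originating uuid.
async function handleTaskSketch<T>(
  uuid: string,
  task: () => Promise<T>,
  postMessage: (msg: WorkerResponse) => void,
): Promise<void> {
  try {
    const content = await task();
    postMessage({ kind: "return", uuid, content });
  } catch (err) {
    // Errors are stringified so they survive structured cloning.
    postMessage({ kind: "throw", uuid, content: String(err) });
  }
}

// Demo with a mock postMessage that records responses in order.
const sent: WorkerResponse[] = [];
await handleTaskSketch("req-1", async () => 42, (m) => sent.push(m));
await handleTaskSketch(
  "req-2",
  async () => {
    throw new Error("boom");
  },
  (m) => sent.push(m),
);
console.log(sent[0].kind); // first response succeeds with kind "return"
console.log(sent[1].kind); // second response fails with kind "throw"
```

The main-thread proxy uses the echoed uuid to resolve or reject the promise associated with each pending request.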
Import
import { WebWorkerMLCEngineHandler } from "@mlc-ai/web-llm";
Usage Examples
Setting up a worker script (worker.ts):
// worker.ts
import { WebWorkerMLCEngineHandler } from "@mlc-ai/web-llm";
const handler = new WebWorkerMLCEngineHandler();
// Route all incoming messages to the handler
self.onmessage = (msg: MessageEvent) => {
handler.onmessage(msg);
};
Setting up a worker with a custom logit processor:
// worker.ts
import { WebWorkerMLCEngineHandler, LogitProcessor } from "@mlc-ai/web-llm";
const myLogitProcessor: LogitProcessor = {
processLogits: (logits: Float32Array) => {
// Custom logit processing logic
return logits;
},
processSampledToken: (token: number) => { /* update state */ },
resetState: () => { /* reset internal state */ },
};
const handler = new WebWorkerMLCEngineHandler();
handler.setLogitProcessorRegistry(
new Map([["my-model-id", myLogitProcessor]])
);
self.onmessage = (msg: MessageEvent) => {
handler.onmessage(msg);
};
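As a concrete (hypothetical) example of what a logit processor might do, the sketch below bans a single token id by forcing its logit to -Infinity before sampling. The interface shape is reproduced locally so the snippet is self-contained; in a real worker script, import LogitProcessor from "@mlc-ai/web-llm" instead:

```typescript
// Local copy of the interface shape for a self-contained sketch.
interface LogitProcessor {
  processLogits(logits: Float32Array): Float32Array;
  processSampledToken(token: number): void;
  resetState(): void;
}

// Hypothetical processor: a logit of -Infinity gives the banned token
// zero probability after softmax, so it can never be sampled.
function makeTokenBanProcessor(bannedTokenId: number): LogitProcessor {
  let sampledCount = 0; // example of per-request internal state
  return {
    processLogits(logits: Float32Array): Float32Array {
      if (bannedTokenId < logits.length) {
        logits[bannedTokenId] = -Infinity;
      }
      return logits;
    },
    processSampledToken(_token: number): void {
      sampledCount += 1; // track how many tokens were sampled
    },
    resetState(): void {
      sampledCount = 0; // clear state between requests
    },
  };
}

const processor = makeTokenBanProcessor(2);
const masked = processor.processLogits(new Float32Array([0.1, 0.5, 0.9, 0.2]));
console.log(masked[2]); // -Infinity
```

A processor built this way would be registered under its model id via setLogitProcessorRegistry, exactly as in the example above.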
Corresponding main thread code:
// main.ts
import { CreateWebWorkerMLCEngine } from "@mlc-ai/web-llm";
const worker = new Worker(
new URL("./worker.ts", import.meta.url),
{ type: "module" }
);
const engine = await CreateWebWorkerMLCEngine(
worker,
"Llama-3.1-8B-Instruct-q4f16_1-MLC",
{
initProgressCallback: (progress) => {
console.log(`Loading: ${(progress.progress * 100).toFixed(1)}%`);
},
}
);
Related Pages
- Principle:Mlc_ai_Web_llm_Web_Worker_Engine_Handler -- The principle this implements
- Implementation:Mlc_ai_Web_llm_Create_Web_Worker_MLC_Engine -- The factory function that creates the main-thread proxy
- Implementation:Mlc_ai_Web_llm_Web_Worker_Chat_Completion -- Chat completion forwarding
- Implementation:Mlc_ai_Web_llm_Async_Generate -- Streaming generator implementation