

Implementation:Mlc ai Web llm Tool Execution Pattern

From Leeroopedia
Revision as of 13:17, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Mlc_ai_Web_llm_Tool_Execution_Pattern.md)


Overview

Tool_Execution_Pattern implements the Mlc_ai_Web_llm_Tool_Execution_Loop principle. It is a pattern document: it specifies the TypeScript interfaces and message types used to execute tool calls and return their results to the conversation. It covers ChatCompletionMessageToolCall (the tool call output), ChatCompletionToolMessageParam (the tool result input), and ChatCompletionAssistantMessageParam, which carries tool calls in the conversation history.

Source Reference

  • ChatCompletionMessageToolCall: src/openai_api_protocols/chat_completion.ts, Lines 649-683
  • ChatCompletionToolMessageParam: src/openai_api_protocols/chat_completion.ts, Lines 767-782
  • ChatCompletionAssistantMessageParam: src/openai_api_protocols/chat_completion.ts, Lines 742-765
  • ChatCompletionMessageParam union: src/openai_api_protocols/chat_completion.ts, Lines 784-788
  • Example (OpenAI style): examples/function-calling/function-calling-openai/src/function_calling_openai.ts
  • Example (Manual style): examples/function-calling/function-calling-manual/src/function_calling_manual.ts
  • Repository: mlc-ai/web-llm

Code Reference

Tool Call Output (from the model)

// src/openai_api_protocols/chat_completion.ts L649-683

export interface ChatCompletionMessageToolCall {
  /**
   * The ID of the tool call. In WebLLM, it is used as the index of the tool call
   * among all the tools calls in this request generation.
   */
  id: string;

  /**
   * The function that the model called.
   */
  function: ChatCompletionMessageToolCall.Function;

  /**
   * The type of the tool. Currently, only `function` is supported.
   */
  type: "function";
}

export namespace ChatCompletionMessageToolCall {
  /**
   * The function that the model called.
   */
  export interface Function {
    /**
     * The arguments to call the function with, as generated by the model in JSON
     * format.
     */
    arguments: string;

    /**
     * The name of the function to call.
     */
    name: string;
  }
}
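
Because function.arguments is a model-generated JSON string, it may be malformed or may not decode to an object. The following is a minimal sketch of defensive parsing, not part of web-llm: ToolCallLike is a local mirror of the interface above, and parseToolArguments and ParsedArgs are hypothetical names introduced for illustration.

```typescript
// Local mirror of the fields used here; in an application the type would be
// imported from "@mlc-ai/web-llm" instead.
interface ToolCallLike {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

type ParsedArgs =
  | { ok: true; name: string; args: Record<string, unknown> }
  | { ok: false; name: string; error: string };

// Hypothetical helper (not a web-llm API): decode the arguments string and
// report failures as values instead of throwing.
function parseToolArguments(call: ToolCallLike): ParsedArgs {
  const name = call.function.name;
  try {
    const value: unknown = JSON.parse(call.function.arguments);
    // Tool parameters are declared as a JSON object, so reject arrays,
    // primitives, and null.
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
      return { ok: false, name, error: "arguments is not a JSON object" };
    }
    return { ok: true, name, args: value as Record<string, unknown> };
  } catch (e) {
    return { ok: false, name, error: e instanceof Error ? e.message : String(e) };
  }
}
```

A failed parse can be reported back to the model as a tool message, in the same way as any other tool execution error.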

Tool Result Input (from the application)

// src/openai_api_protocols/chat_completion.ts L767-782

export interface ChatCompletionToolMessageParam {
  /**
   * The contents of the tool message.
   */
  content: string;

  /**
   * The role of the messages author, in this case `tool`.
   */
  role: "tool";

  /**
   * Tool call that this message is responding to.
   */
  tool_call_id: string;
}

Assistant Message with Tool Calls

// src/openai_api_protocols/chat_completion.ts L742-765

export interface ChatCompletionAssistantMessageParam {
  /**
   * The role of the messages author, in this case `assistant`.
   */
  role: "assistant";

  /**
   * The contents of the assistant message. Required unless `tool_calls` is specified.
   */
  content?: string | null;

  /**
   * An optional name for the participant.
   */
  name?: string;

  /**
   * The tool calls generated by the model, such as function calls.
   */
  tool_calls?: Array<ChatCompletionMessageToolCall>;
}

Message Param Union Type

// src/openai_api_protocols/chat_completion.ts L784-788

export type ChatCompletionMessageParam =
  | ChatCompletionSystemMessageParam
  | ChatCompletionUserMessageParam
  | ChatCompletionAssistantMessageParam
  | ChatCompletionToolMessageParam;

I/O Contract

The tool execution pattern involves three message types in sequence:

Step | Message Type | Role | Key Fields | Source
1. Model generates tool call | ChatCompletionMessageToolCall | (inside assistant message) | id, function.name, function.arguments | Engine output
2. Assistant message with tool calls | ChatCompletionAssistantMessageParam | "assistant" | content: null, tool_calls: [...] | App adds to history
3. Tool result message | ChatCompletionToolMessageParam | "tool" | content, tool_call_id | App adds to history

Critical invariants:

  • tool_call_id in ChatCompletionToolMessageParam must match the id field of the corresponding ChatCompletionMessageToolCall.
  • In web-llm, tool call IDs are array indices as strings: "0", "1", etc.
  • ChatCompletionToolMessageParam.content must be a string. Serialize objects with JSON.stringify().
  • When an assistant message has tool_calls, its content should be null.
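
The ID-matching and null-content invariants above can be checked before a history is sent back to the engine. The sketch below is an illustration under assumptions, not a web-llm API: HistoryMsg is a simplified local stand-in for ChatCompletionMessageParam, and checkToolInvariants is a hypothetical helper. The string-content invariant is left to the type system.

```typescript
// Simplified local stand-ins for the message types documented above.
interface ToolCallRef {
  id: string;
}

type HistoryMsg =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content?: string | null; tool_calls?: ToolCallRef[] }
  | { role: "tool"; content: string; tool_call_id: string };

// Hypothetical helper: returns a list of human-readable problems
// (an empty list means the checked invariants hold).
function checkToolInvariants(messages: HistoryMsg[]): string[] {
  const problems: string[] = [];
  // IDs announced by the most recent assistant message.
  let openIds = new Set<string>();

  messages.forEach((msg, i) => {
    if (msg.role === "assistant") {
      const calls = msg.tool_calls ?? [];
      openIds = new Set(calls.map((c) => c.id));
      if (calls.length > 0 && msg.content != null) {
        problems.push(`message ${i}: content should be null when tool_calls is set`);
      }
    } else if (msg.role === "tool") {
      if (!openIds.has(msg.tool_call_id)) {
        problems.push(`message ${i}: tool_call_id "${msg.tool_call_id}" matches no preceding tool call`);
      }
    }
  });
  return problems;
}
```

Running such a check in development builds is one way to catch a mismatched tool_call_id before it reaches the model.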

Conversation message ordering:

messages = [
  { role: "system", content: "..." },              // System prompt (auto-injected for Hermes-2-Pro)
  { role: "user", content: "..." },                 // User query
  { role: "assistant", content: null,               // Assistant response with tool calls
    tool_calls: [{ id: "0", function: {...} }] },
  { role: "tool", content: "...",                   // Tool result for call "0"
    tool_call_id: "0" },
  { role: "assistant", content: "Final answer" },   // Final text response
]

Usage Examples

Complete pattern with OpenAI-style function calling:

import * as webllm from "@mlc-ai/web-llm";

const engine = await webllm.CreateMLCEngine(
  "Hermes-2-Pro-Llama-3-8B-q4f16_1-MLC",
);

const tools: Array<webllm.ChatCompletionTool> = [
  {
    type: "function",
    function: {
      name: "get_current_weather",
      description: "Get the current weather in a given location",
      parameters: {
        type: "object",
        properties: {
          location: {
            type: "string",
            description: "The city and state, e.g. San Francisco, CA",
          },
          unit: { type: "string", enum: ["celsius", "fahrenheit"] },
        },
        required: ["location"],
      },
    },
  },
];

// Step 1: Send initial request
const messages: webllm.ChatCompletionMessageParam[] = [
  {
    role: "user",
    content: "What is the current weather in celsius in Pittsburgh and Tokyo?",
  },
];

const reply = await engine.chat.completions.create({
  stream: false,
  messages: messages,
  tool_choice: "auto",
  tools: tools,
});

// Step 2: Process tool calls
if (reply.choices[0].finish_reason === "tool_calls") {
  const toolCalls = reply.choices[0].message.tool_calls!;

  // Step 2a: Add assistant message with tool_calls to conversation history
  messages.push({
    role: "assistant",
    content: null,
    tool_calls: toolCalls,
  });

  // Step 2b: Execute each tool and add result messages
  for (const toolCall of toolCalls) {
    const args = JSON.parse(toolCall.function.arguments);

    // Execute the actual function
    let result: string;
    if (toolCall.function.name === "get_current_weather") {
      result = JSON.stringify({
        location: args.location,
        temperature: args.location.includes("Pittsburgh") ? 18.5 : 25.0,
        unit: args.unit || "celsius",
      });
    } else {
      result = JSON.stringify({ error: `Unknown function: ${toolCall.function.name}` });
    }

    // Add tool result message -- tool_call_id MUST match toolCall.id
    messages.push({
      role: "tool",
      content: result,
      tool_call_id: toolCall.id,
    });
  }

  // Step 3: Get final natural language response
  const finalReply = await engine.chat.completions.create({
    stream: false,
    messages: messages,
    tool_choice: "auto",
    tools: tools,
  });

  console.log(finalReply.choices[0].message.content);
  // "The current weather in Pittsburgh is 18.5 degrees Celsius,
  //  and in Tokyo it is 25.0 degrees Celsius."
}
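
The example above handles a single round of tool calls. A model may request tools again after seeing the results, so applications often wrap the steps in a bounded loop. The following is a sketch under assumptions rather than the web-llm implementation: runToolLoop, LoopMsg, and LoopReply are hypothetical names, and the complete callback stands in for a call such as engine.chat.completions.create(...) reduced to the fields the loop needs.

```typescript
interface LoopToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

type LoopMsg =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string | null; tool_calls?: LoopToolCall[] }
  | { role: "tool"; content: string; tool_call_id: string };

// Reduced view of one completion: finish reason plus the assistant message fields.
interface LoopReply {
  finish_reason: string;
  content: string | null;
  tool_calls?: LoopToolCall[];
}

// Hypothetical helper: repeat "complete, execute tools, append results" until
// the model returns a plain answer or maxRounds is exhausted.
async function runToolLoop(
  complete: (messages: LoopMsg[]) => Promise<LoopReply>,
  execute: (call: LoopToolCall) => Promise<string>,
  messages: LoopMsg[],
  maxRounds = 3,
): Promise<string | null> {
  for (let round = 0; round < maxRounds; round++) {
    const reply = await complete(messages);
    const calls = reply.tool_calls ?? [];
    if (reply.finish_reason !== "tool_calls" || calls.length === 0) {
      return reply.content; // final natural language response
    }
    // Assistant message first, then one tool message per call, in order.
    messages.push({ role: "assistant", content: null, tool_calls: calls });
    for (const call of calls) {
      messages.push({ role: "tool", content: await execute(call), tool_call_id: call.id });
    }
  }
  throw new Error(`no final answer after ${maxRounds} rounds`);
}
```

The round limit guards against a model that keeps requesting tools; the value 3 is an arbitrary default chosen for the sketch.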

Streaming with tool calls (from the official example):

const request: webllm.ChatCompletionRequest = {
  stream: true,
  stream_options: { include_usage: true },
  messages: [
    {
      role: "user",
      content: "What is the current weather in celsius in Pittsburgh and Tokyo?",
    },
  ],
  tool_choice: "auto",
  tools: tools,
};

const asyncChunkGenerator = await engine.chat.completions.create(request);
let message = "";
let lastChunk: webllm.ChatCompletionChunk | undefined;
for await (const chunk of asyncChunkGenerator) {
  message += chunk.choices[0]?.delta?.content || "";
  if (!chunk.usage) {
    lastChunk = chunk;
  }
}

// The last non-usage chunk contains tool_calls in its delta
if (lastChunk?.choices[0]?.delta?.tool_calls) {
  const toolCalls = lastChunk.choices[0].delta.tool_calls;
  for (const call of toolCalls) {
    console.log(`Function: ${call.function?.name}`);
    console.log(`Arguments: ${call.function?.arguments}`);
    console.log(`Index: ${call.index}`);
  }
}
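
To continue the conversation after streaming, the tool calls taken from the last chunk need to be placed in an assistant message, which expects full tool call objects rather than deltas. The following sketch shows one possible conversion. It assumes the delta entries have optional id, type, and function fields plus a required index, and it falls back to the index as the ID, following the invariant that web-llm tool call IDs are array indices as strings. DeltaToolCallLike, FullToolCall, and toHistoryToolCalls are hypothetical names, not web-llm exports.

```typescript
// Assumed shape of a streamed tool call entry (local stand-in type).
interface DeltaToolCallLike {
  index: number;
  id?: string;
  type?: "function";
  function?: { name?: string; arguments?: string };
}

// Local mirror of ChatCompletionMessageToolCall.
interface FullToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

// Hypothetical helper: turn streamed tool call entries into objects that can
// be stored in an assistant message's tool_calls array.
function toHistoryToolCalls(deltas: DeltaToolCallLike[]): FullToolCall[] {
  return deltas.map((d): FullToolCall => {
    const name = d.function?.name;
    if (!name) {
      throw new Error(`tool call ${d.index} has no function name`);
    }
    return {
      // Assumption: when no id is present, the index doubles as the id.
      id: d.id ?? String(d.index),
      type: "function",
      function: { name, arguments: d.function?.arguments ?? "{}" },
    };
  });
}
```

The result can then be pushed as { role: "assistant", content: null, tool_calls: ... } before the tool result messages are added.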

Manual function calling pattern (Hermes-2 style):

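// From examples/function-calling/function-calling-manual
// Assumes `engine` has already been created and `messages` already holds the
// system prompt (which describes the available tools) and the user query.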

// Turn 1: Model outputs tool call as text content (manual parsing)
const reply1 = await engine.chat.completions.create({
  stream: false,
  messages: messages,
});
const response1 = reply1.choices[0].message.content;
messages.push({ role: "assistant", content: response1 });

// Turn 2: Provide tool response with tool_call_id
const toolResponse = JSON.stringify({
  symbol: "TSLA",
  company_name: "Tesla, Inc.",
  sector: "Consumer Cyclical",
});
messages.push({
  role: "tool",
  content: toolResponse,
  tool_call_id: "0",   // Matches the first (and only) tool call
});

// Turn 3: Model synthesizes natural language answer
const reply2 = await engine.chat.completions.create({
  stream: false,
  messages: messages,
});
console.log(reply2.choices[0].message.content);
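
In the manual pattern the application must extract the tool call from the text in response1 itself. The following is a minimal sketch of that parsing step. It assumes the model wraps each call in <tool_call>...</tool_call> tags containing a JSON object with name and arguments fields, as in the Hermes-2-Pro prompt format; the exact format depends on the system prompt in use. ManualToolCall and extractToolCalls are hypothetical names.

```typescript
interface ManualToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Hypothetical helper: collect every well-formed <tool_call> block in the
// model's text output and silently skip malformed ones.
function extractToolCalls(text: string): ManualToolCall[] {
  const calls: ManualToolCall[] = [];
  const pattern = /<tool_call>([\s\S]*?)<\/tool_call>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    try {
      const parsed: unknown = JSON.parse(match[1] ?? "");
      if (typeof parsed !== "object" || parsed === null) continue;
      const record = parsed as Record<string, unknown>;
      const name = record["name"];
      const args = record["arguments"];
      if (typeof name === "string" && typeof args === "object" && args !== null && !Array.isArray(args)) {
        calls.push({ name, arguments: args as Record<string, unknown> });
      }
    } catch {
      // Not valid JSON: ignore this block and keep scanning.
    }
  }
  return calls;
}
```

The position of each extracted call in the returned array can serve as its tool_call_id ("0" for the first call), matching the indexing convention used above.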

Handling tool execution errors:

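// `toolCalls` comes from reply.choices[0].message.tool_calls, as in the first example.
// `executeFunction` is an application-defined async dispatcher that returns a string;
// it is not part of web-llm.
for (const toolCall of toolCalls) {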
  let result: string;
  try {
    const args = JSON.parse(toolCall.function.arguments);
    result = await executeFunction(toolCall.function.name, args);
  } catch (error) {
    // Return the error as a tool message so the model can adapt
    result = JSON.stringify({
      error: true,
      message: `Function ${toolCall.function.name} failed: ${error}`,
    });
  }

  messages.push({
    role: "tool",
    content: result,
    tool_call_id: toolCall.id,
  });
}
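
The executeFunction helper used above is left to the application. One possible shape, shown here as a sketch under assumptions, is a registry that maps function names to handlers and serializes each result to a string. ToolHandler and toolHandlers are hypothetical names, and the weather handler returns the same placeholder temperatures as the earlier example.

```typescript
// A handler receives the parsed arguments and may return a value or a promise.
type ToolHandler = (args: Record<string, unknown>) => unknown;

// Hypothetical registry of tool implementations, keyed by function name.
// A Map avoids accidental hits on inherited object properties such as "toString".
const toolHandlers = new Map<string, ToolHandler>([
  [
    "get_current_weather",
    (args) => ({
      location: args["location"],
      // Placeholder values, as in the example above; a real tool would query a service.
      temperature: String(args["location"]).includes("Pittsburgh") ? 18.5 : 25.0,
      unit: args["unit"] ?? "celsius",
    }),
  ],
]);

// Look up the handler, run it, and return a string suitable for the
// `content` field of a tool message. Throws for unknown function names.
async function executeFunction(name: string, args: Record<string, unknown>): Promise<string> {
  const handler = toolHandlers.get(name);
  if (!handler) {
    throw new Error(`Unknown function: ${name}`);
  }
  const value = await handler(args);
  return typeof value === "string" ? value : JSON.stringify(value);
}
```

Because unknown names and handler failures surface as thrown errors, the try/catch in the loop above converts them into tool messages that the model can react to.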
