Implementation:Mlc ai Web llm Tool Execution Pattern
Overview
Tool_Execution_Pattern implements the Mlc_ai_Web_llm_Tool_Execution_Loop principle. It is a pattern document: it specifies the TypeScript interfaces and message types used to execute tool calls and return their results to the conversation. It covers ChatCompletionMessageToolCall (the tool call emitted by the model), ChatCompletionToolMessageParam (the tool result supplied by the application), and ChatCompletionAssistantMessageParam (the assistant message that carries tool calls in the conversation history).
Source Reference
- ChatCompletionMessageToolCall: src/openai_api_protocols/chat_completion.ts, Lines 649-683
- ChatCompletionToolMessageParam: src/openai_api_protocols/chat_completion.ts, Lines 767-782
- ChatCompletionAssistantMessageParam: src/openai_api_protocols/chat_completion.ts, Lines 742-765
- ChatCompletionMessageParam union: src/openai_api_protocols/chat_completion.ts, Lines 784-788
- Example (OpenAI style): examples/function-calling/function-calling-openai/src/function_calling_openai.ts
- Example (Manual style): examples/function-calling/function-calling-manual/src/function_calling_manual.ts
- Repository: mlc-ai/web-llm
Code Reference
Tool Call Output (from the model)
// src/openai_api_protocols/chat_completion.ts L649-683
export interface ChatCompletionMessageToolCall {
/**
* The ID of the tool call. In WebLLM, it is used as the index of the tool call
* among all the tools calls in this request generation.
*/
id: string;
/**
* The function that the model called.
*/
function: ChatCompletionMessageToolCall.Function;
/**
* The type of the tool. Currently, only `function` is supported.
*/
type: "function";
}
export namespace ChatCompletionMessageToolCall {
/**
* The function that the model called.
*/
export interface Function {
/**
* The arguments to call the function with, as generated by the model in JSON
* format.
*/
arguments: string;
/**
* The name of the function to call.
*/
name: string;
}
}
Tool Result Input (from the application)
// src/openai_api_protocols/chat_completion.ts L767-782
export interface ChatCompletionToolMessageParam {
/**
* The contents of the tool message.
*/
content: string;
/**
* The role of the messages author, in this case `tool`.
*/
role: "tool";
/**
* Tool call that this message is responding to.
*/
tool_call_id: string;
}
Assistant Message with Tool Calls
// src/openai_api_protocols/chat_completion.ts L742-765
export interface ChatCompletionAssistantMessageParam {
/**
* The role of the messages author, in this case `assistant`.
*/
role: "assistant";
/**
* The contents of the assistant message. Required unless `tool_calls` is specified.
*/
content?: string | null;
/**
* An optional name for the participant.
*/
name?: string;
/**
* The tool calls generated by the model, such as function calls.
*/
tool_calls?: Array<ChatCompletionMessageToolCall>;
}
Message Param Union Type
// src/openai_api_protocols/chat_completion.ts L784-788
export type ChatCompletionMessageParam =
| ChatCompletionSystemMessageParam
| ChatCompletionUserMessageParam
| ChatCompletionAssistantMessageParam
| ChatCompletionToolMessageParam;
I/O Contract
The tool execution pattern involves three message types in sequence:
| Step | Message Type | Role | Key Fields | Source |
|---|---|---|---|---|
| 1. Model generates tool call | ChatCompletionMessageToolCall | (inside assistant message) | id, function.name, function.arguments | Engine output |
| 2. Assistant message with tool calls | ChatCompletionAssistantMessageParam | "assistant" | content: null, tool_calls: [...] | App adds to history |
| 3. Tool result message | ChatCompletionToolMessageParam | "tool" | content, tool_call_id | App adds to history |
Critical invariants:
- tool_call_id in ChatCompletionToolMessageParam must match the id field of the corresponding ChatCompletionMessageToolCall.
- In web-llm, tool call IDs are array indices as strings: "0", "1", etc.
- ChatCompletionToolMessageParam.content must be a string. Serialize objects with JSON.stringify().
- When an assistant message has tool_calls, its content should be null.
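The invariants above can be checked mechanically before the history is sent back to the engine. The following is a minimal sketch, not part of web-llm: the `ToolCall` and `Message` types are simplified local stand-ins for the interfaces quoted earlier, and `findInvariantViolations` is a hypothetical helper name.

```typescript
// Simplified local stand-ins for the web-llm types (assumption: only the
// fields relevant to the invariants are modeled).
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

type Message =
  | { role: "system"; content: string }
  | { role: "user"; content: string }
  | { role: "assistant"; content?: string | null; tool_calls?: ToolCall[] }
  | { role: "tool"; content: string; tool_call_id: string };

// Hypothetical helper: returns one human-readable entry per violation.
function findInvariantViolations(messages: Message[]): string[] {
  const violations: string[] = [];
  // IDs of the tool calls issued by the most recent assistant message.
  let pendingIds = new Set<string>();

  messages.forEach((msg, i) => {
    if (msg.role === "assistant") {
      pendingIds = new Set((msg.tool_calls ?? []).map((c) => c.id));
      if (msg.tool_calls?.length && msg.content != null) {
        violations.push(`message ${i}: content should be null when tool_calls is set`);
      }
    } else if (msg.role === "tool") {
      // Runtime guard for data that bypassed the type checker.
      if (typeof msg.content !== "string") {
        violations.push(`message ${i}: tool content must be a string`);
      }
      if (!pendingIds.has(msg.tool_call_id)) {
        violations.push(`message ${i}: tool_call_id "${msg.tool_call_id}" matches no tool call`);
      }
    }
  });
  return violations;
}
```

An empty result means the history satisfies the invariants as modeled here; the check only compares tool messages against the most recent assistant message, which is a simplifying assumption.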
Conversation message ordering:
messages = [
{ role: "system", content: "..." }, // System prompt (auto-injected for Hermes-2-Pro)
{ role: "user", content: "..." }, // User query
{ role: "assistant", content: null, // Assistant response with tool calls
tool_calls: [{ id: "0", type: "function", function: {...} }] },
{ role: "tool", content: "...", // Tool result for call "0"
tool_call_id: "0" },
{ role: "assistant", content: "Final answer" }, // Final text response
]
Usage Examples
Complete pattern with OpenAI-style function calling:
import * as webllm from "@mlc-ai/web-llm";
const engine = await webllm.CreateMLCEngine(
"Hermes-2-Pro-Llama-3-8B-q4f16_1-MLC",
);
const tools: Array<webllm.ChatCompletionTool> = [
{
type: "function",
function: {
name: "get_current_weather",
description: "Get the current weather in a given location",
parameters: {
type: "object",
properties: {
location: {
type: "string",
description: "The city and state, e.g. San Francisco, CA",
},
unit: { type: "string", enum: ["celsius", "fahrenheit"] },
},
required: ["location"],
},
},
},
];
// Step 1: Send initial request
const messages: webllm.ChatCompletionMessageParam[] = [
{
role: "user",
content: "What is the current weather in celsius in Pittsburgh and Tokyo?",
},
];
const reply = await engine.chat.completions.create({
stream: false,
messages: messages,
tool_choice: "auto",
tools: tools,
});
// Step 2: Process tool calls
if (reply.choices[0].finish_reason === "tool_calls") {
const toolCalls = reply.choices[0].message.tool_calls!;
// Step 2a: Add assistant message with tool_calls to conversation history
messages.push({
role: "assistant",
content: null,
tool_calls: toolCalls,
});
// Step 2b: Execute each tool and add result messages
for (const toolCall of toolCalls) {
const args = JSON.parse(toolCall.function.arguments);
// Execute the actual function
let result: string;
if (toolCall.function.name === "get_current_weather") {
result = JSON.stringify({
location: args.location,
temperature: args.location.includes("Pittsburgh") ? 18.5 : 25.0,
unit: args.unit || "celsius",
});
} else {
result = JSON.stringify({ error: `Unknown function: ${toolCall.function.name}` });
}
// Add tool result message -- tool_call_id MUST match toolCall.id
messages.push({
role: "tool",
content: result,
tool_call_id: toolCall.id,
});
}
// Step 3: Get final natural language response
const finalReply = await engine.chat.completions.create({
stream: false,
messages: messages,
tool_choice: "auto",
tools: tools,
});
console.log(finalReply.choices[0].message.content);
// "The current weather in Pittsburgh is 18.5 degrees Celsius,
// and in Tokyo it is 25.0 degrees Celsius."
}
Streaming with tool calls (from the official example):
const request: webllm.ChatCompletionRequest = {
stream: true,
stream_options: { include_usage: true },
messages: [
{
role: "user",
content: "What is the current weather in celsius in Pittsburgh and Tokyo?",
},
],
tool_choice: "auto",
tools: tools,
};
const asyncChunkGenerator = await engine.chat.completions.create(request);
let message = "";
let lastChunk: webllm.ChatCompletionChunk | undefined;
for await (const chunk of asyncChunkGenerator) {
message += chunk.choices[0]?.delta?.content || "";
if (!chunk.usage) {
lastChunk = chunk;
}
}
// The last non-usage chunk contains tool_calls in its delta
if (lastChunk?.choices[0]?.delta?.tool_calls) {
const toolCalls = lastChunk.choices[0].delta.tool_calls;
for (const call of toolCalls) {
console.log(`Function: ${call.function?.name}`);
console.log(`Arguments: ${call.function?.arguments}`);
console.log(`Index: ${call.index}`);
}
}
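To continue the conversation after streaming, the tool calls found in the last chunk need to be placed into an assistant message. The sketch below shows one way to do that conversion. It uses local stand-in types rather than the web-llm exports, `toMessageToolCalls` is a hypothetical helper, and the fallbacks (index as ID, `"{}"` as arguments) are assumptions based on the index-as-string ID convention described in the I/O Contract.

```typescript
// Local stand-ins (assumption) for the streamed delta tool call and the
// complete tool call; only the fields used on this page are modeled.
interface DeltaToolCall {
  index: number;
  id?: string;
  type?: "function";
  function?: { name?: string; arguments?: string };
}

interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

// Hypothetical helper: turns delta tool calls into complete tool calls that
// can be pushed into the history inside an assistant message.
function toMessageToolCalls(deltas: DeltaToolCall[]): ToolCall[] {
  return deltas.map((delta) => {
    const name = delta.function?.name;
    if (!name) {
      throw new Error(`tool call ${delta.index} has no function name`);
    }
    return {
      // Assumption: fall back to the index-as-string ID when no id is present.
      id: delta.id ?? String(delta.index),
      type: "function" as const,
      function: {
        name,
        // Assumption: treat missing arguments as an empty JSON object.
        arguments: delta.function?.arguments ?? "{}",
      },
    };
  });
}
```

The result can then be used as the `tool_calls` field of an assistant message with `content: null`, followed by one tool message per call, exactly as in the non-streaming example.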
Manual function calling pattern (Hermes-2 style):
// From examples/function-calling/function-calling-manual
// Turn 1: Model outputs tool call as text content (manual parsing)
const reply1 = await engine.chat.completions.create({
stream: false,
messages: messages,
});
const response1 = reply1.choices[0].message.content;
messages.push({ role: "assistant", content: response1 });
// Turn 2: Provide tool response with tool_call_id
const toolResponse = JSON.stringify({
symbol: "TSLA",
company_name: "Tesla, Inc.",
sector: "Consumer Cyclical",
});
messages.push({
role: "tool",
content: toolResponse,
tool_call_id: "0", // Matches the first (and only) tool call
});
// Turn 3: Model synthesizes natural language answer
const reply2 = await engine.chat.completions.create({
stream: false,
messages: messages,
});
console.log(reply2.choices[0].message.content);
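The manual pattern leaves parsing of the Turn 1 text to the application. The sketch below is one possible parser, under the assumption that the model wraps each call in `<tool_call>...</tool_call>` tags around a JSON object with `name` and `arguments` fields, as Hermes-2-Pro style prompts typically request. Check this against the system prompt actually in use. `parseHermesToolCalls` and `ParsedToolCall` are hypothetical names, not web-llm APIs.

```typescript
// Hypothetical result shape for a manually parsed tool call.
interface ParsedToolCall {
  id: string; // index among the parsed calls, as a string ("0", "1", ...)
  name: string;
  arguments: Record<string, unknown>;
}

// Hypothetical parser for text that embeds calls as
// <tool_call>{"name": "...", "arguments": {...}}</tool_call> (assumed format).
function parseHermesToolCalls(text: string): ParsedToolCall[] {
  const pattern = /<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/g;
  const calls: ParsedToolCall[] = [];
  let match: RegExpExecArray | null;

  while ((match = pattern.exec(text)) !== null) {
    try {
      const parsed = JSON.parse(match[1]) as { name?: unknown; arguments?: unknown };
      if (typeof parsed.name !== "string") {
        continue; // skip blocks without a usable function name
      }
      calls.push({
        id: String(calls.length),
        name: parsed.name,
        arguments: (parsed.arguments ?? {}) as Record<string, unknown>,
      });
    } catch {
      continue; // skip blocks that are not valid JSON
    }
  }
  return calls;
}
```

Each parsed `id` can then be used as the `tool_call_id` of the corresponding tool message in Turn 2.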
Handling tool execution errors:
for (const toolCall of toolCalls) {
let result: string;
try {
const args = JSON.parse(toolCall.function.arguments);
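// executeFunction is an application-defined dispatcher (not part of web-llm)
// that runs the named tool and returns its result as a string.
result = await executeFunction(toolCall.function.name, args);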
} catch (error) {
// Return the error as a tool message so the model can adapt
result = JSON.stringify({
error: true,
message: `Function ${toolCall.function.name} failed: ${error}`,
});
}
messages.push({
role: "tool",
content: result,
tool_call_id: toolCall.id,
});
}
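The error-handling loop above calls `executeFunction`, which the application must supply. A minimal sketch of such a dispatcher follows. The `ToolHandler` type, the `toolRegistry` map, and the stubbed weather handler are all hypothetical and only illustrate the shape: look up a handler by name, run it, and return a string suitable for a tool message's `content`.

```typescript
// Hypothetical handler type: receives parsed arguments and returns a
// JSON-serializable value, either directly or as a Promise.
type ToolHandler = (args: Record<string, unknown>) => unknown | Promise<unknown>;

// Hypothetical registry. A Map avoids accidental matches on inherited
// object properties such as "constructor".
const toolRegistry = new Map<string, ToolHandler>([
  [
    "get_current_weather",
    // Stub: a real application would query a weather service here.
    (args) => ({
      location: args.location,
      temperature: 18.5,
      unit: args.unit ?? "celsius",
    }),
  ],
]);

// Hypothetical dispatcher: always resolves to a string, and rejects for
// unknown tool names so the caller's try/catch can report the failure.
async function executeFunction(
  name: string,
  args: Record<string, unknown>,
): Promise<string> {
  const handler = toolRegistry.get(name);
  if (!handler) {
    throw new Error(`Unknown function: ${name}`);
  }
  const value = await handler(args);
  return typeof value === "string" ? value : JSON.stringify(value);
}
```

Rejecting on unknown names, instead of returning an error string, keeps all failure reporting in the single `catch` branch of the loop.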
Related Pages
- Principle:Mlc_ai_Web_llm_Tool_Execution_Loop
- Mlc_ai_Web_llm_Chat_Completion_Tool -- Tool definitions used in the execution pattern
- Mlc_ai_Web_llm_Tool_Choice_Request -- Tool choice configuration for each loop iteration
- Mlc_ai_Web_llm_Get_Tool_Call_From_Output -- How tool calls are extracted before execution
- Mlc_ai_Web_llm_Function_Calling_Model_Ids -- Models and system prompts that drive the pattern