Principle:Mlc ai Web llm Tool Choice Configuration

Overview

Tool Choice Configuration controls whether a language model may invoke tools when processing a request and, if so, which one. It lets developers switch between letting the model decide autonomously, forcing a specific function call, and disabling tool use entirely.

Description

Tool choice configuration governs the model's tool-calling behavior through three distinct modes:

- "none" -- The model will never call a function and will always generate a text message instead. This is the default when no tools are provided in the request.
- "auto" -- The model decides whether to call a function or generate a text message based on the user query and available tools. This is the default when tools are present in the request.
- Named tool choice -- Forces the model to call a specific function by name, specified via a ChatCompletionNamedToolChoice object with the structure {"type": "function", "function": {"name": "my_function"}}.

When tools are provided in the request, the web-llm engine automatically performs several configuration steps:

- Model validation -- Checks that the current model is in the functionCallingModelIds list.
- System prompt injection -- Prepends the Hermes-2 function calling system prompt with tool definitions wrapped in <tools></tools> XML tags.
- Response format enforcement -- Sets response_format to json_object with the official Hermes-2 function call schema array, ensuring the model produces valid JSON output.

These steps happen transparently in the request preprocessing pipeline, so the user only needs to set tool_choice and tools.
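
The three steps above can be pictured as a single preprocessing function. The following is a minimal sketch under stated assumptions, not web-llm's actual code: `prepareToolRequest`, `SUPPORTED_MODEL_IDS`, the prompt wording, and the schema placeholder are hypothetical stand-ins. Only the order of the steps and the <tools></tools> wrapping come from the description above.

```typescript
// Local, simplified types for illustration; the real ones live in @mlc-ai/web-llm.
interface ToolDef {
  type: "function";
  function: { name: string; description?: string; parameters?: Record<string, unknown> };
}

interface SketchRequest {
  messages: { role: string; content: string }[];
  tools?: ToolDef[];
  response_format?: { type: string; schema?: string };
}

// Hypothetical stand-in for the engine's functionCallingModelIds list.
const SUPPORTED_MODEL_IDS = ["Hermes-2-Pro-Llama-3-8B-q4f16_1-MLC"];

function prepareToolRequest(modelId: string, request: SketchRequest): SketchRequest {
  // No tools in the request: nothing to configure.
  if (!request.tools || request.tools.length === 0) return request;

  // Step 1: model validation.
  if (!SUPPORTED_MODEL_IDS.includes(modelId)) {
    throw new Error(`Model ${modelId} does not support function calling`);
  }

  // Step 2: system prompt injection, with tool definitions inside <tools> tags.
  // The leading sentence is placeholder wording, not the real Hermes-2 prompt.
  const systemPrompt =
    "You are a function calling AI model. " +
    `<tools>${JSON.stringify(request.tools)}</tools>`;

  // Step 3: response format enforcement (the schema here is a placeholder string).
  return {
    ...request,
    messages: [{ role: "system", content: systemPrompt }, ...request.messages],
    response_format: { type: "json_object", schema: "<function call schema>" },
  };
}
```

The sketch returns a new request object rather than mutating its input, which makes the effect of each step easy to inspect. Whether the engine mutates or copies the request is not stated here.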

Usage

Choosing the right mode:

- Use "auto" (default when tools present) to let the model decide whether to call a tool or respond with text. This is appropriate for most conversational agents.
- Use named tool choice to force invocation of a specific function. This is useful when you know exactly which function should be called (e.g., after validation or in a deterministic workflow).
- Use "none" to disable tool calling even when tools are defined. This is useful when you want to temporarily suppress tool use in a multi-turn conversation where tools were previously available.
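
In a multi-turn application, the mode is often chosen per request rather than fixed once. The sketch below shows one possible policy. `chooseToolChoice` and its `forcedFunction` parameter are hypothetical names, and the rule "switch to "none" right after a tool result" is an illustrative assumption rather than something web-llm prescribes.

```typescript
type ToolChoice = "none" | "auto" | { type: "function"; function: { name: string } };

interface TurnMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Hypothetical policy: pick the tool_choice value for the next request.
function chooseToolChoice(messages: TurnMessage[], forcedFunction?: string): ToolChoice {
  const last = messages[messages.length - 1];

  // A tool result was just appended: ask for a text summary, not another call.
  if (last !== undefined && last.role === "tool") return "none";

  // The workflow already knows which function must run.
  if (forcedFunction !== undefined) {
    return { type: "function", function: { name: forcedFunction } };
  }

  // Otherwise let the model decide.
  return "auto";
}
```

The returned value can be assigned directly to the `tool_choice` field of the next request.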

Constraints:

- When tools are provided for Hermes-2-Pro models, the engine rejects custom response_format values (throws CustomResponseFormatError).
- When tools are provided for Hermes-2-Pro models, the engine rejects custom system messages (throws CustomSystemPromptError).
- Tool choice has no effect if the tools array is not provided or is empty.
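
Because the first two constraints surface as thrown errors, a caller may prefer to detect conflicts before sending the request. The following is a rough sketch of such a preflight check. `findToolRequestConflicts` is a hypothetical helper, it does not check the model family, and the engine's real checks may be stricter or looser than this approximation.

```typescript
interface PreflightRequest {
  messages: { role: string; content: string }[];
  tools?: unknown[];
  tool_choice?: unknown;
  response_format?: unknown;
}

// Hypothetical helper: returns human-readable problems instead of throwing.
function findToolRequestConflicts(request: PreflightRequest): string[] {
  const problems: string[] = [];
  const hasTools = Array.isArray(request.tools) && request.tools.length > 0;

  if (!hasTools) {
    // Third constraint: tool_choice is ignored without tools.
    if (request.tool_choice !== undefined) {
      problems.push("tool_choice is set but no tools are provided; it will have no effect");
    }
    return problems;
  }

  // First constraint: a custom response_format conflicts with tools.
  if (request.response_format !== undefined) {
    problems.push("custom response_format conflicts with tools (CustomResponseFormatError)");
  }

  // Second constraint: a custom system message conflicts with tools.
  if (request.messages.some((m) => m.role === "system")) {
    problems.push("custom system message conflicts with tools (CustomSystemPromptError)");
  }

  return problems;
}
```

An empty result means none of the three listed constraints is violated; it does not guarantee that the engine will accept the request.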

Theoretical Basis

Tool choice configuration provides a control mechanism for the balance between model autonomy and application determinism:

- Autonomous mode ("auto") -- The model applies its trained reasoning to determine if a tool call is the best response. This maximizes flexibility but may produce unexpected tool calls or miss intended tool use.
- Forced mode (named) -- The application overrides the model's judgment and mandates a specific function call. This guarantees tool invocation but may result in suboptimal argument generation if the query does not naturally map to the forced function.
- Disabled mode ("none") -- The application prevents all tool use, regardless of model intent. This is useful for follow-up turns where the model should synthesize results rather than make additional calls.

The default behavior follows the OpenAI convention: "none" when no functions are present, "auto" when functions are present. This ensures backward compatibility for non-tool-use requests while enabling tool use by default when tools are defined.
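
The defaulting rule can be written as a small function. This is a sketch of the convention described above, not the engine's code: `resolveToolChoice` is a hypothetical name, and treating an empty tools array the same as a missing one is an assumption.

```typescript
type ToolChoiceOption = "none" | "auto" | { type: "function"; function: { name: string } };

// Hypothetical helper: returns the effective tool_choice for a request.
function resolveToolChoice(
  toolChoice: ToolChoiceOption | undefined,
  tools: unknown[] | undefined,
): ToolChoiceOption {
  // An explicit value always wins.
  if (toolChoice !== undefined) return toolChoice;

  // Assumption: an empty array counts as "no tools".
  const hasTools = tools !== undefined && tools.length > 0;
  return hasTools ? "auto" : "none";
}
```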

I/O Contract

Input:

tool_choice field on ChatCompletionRequest, typed as ChatCompletionToolChoiceOption:
- "none" -- string literal
- "auto" -- string literal
- ChatCompletionNamedToolChoice -- object with type: "function" and function: { name: string }

Behavior:

tool_choice value	tools provided	Model behavior
`"none"`	Yes/No	Model generates text only; no tool calls
`"auto"`	Yes	Model decides whether to call a tool or generate text
`"auto"`	No	Model generates text only (no tools available)
`{ type: "function", function: { name: "X" } }`	Yes (must include "X")	Model is forced to call function "X"
undefined	Yes	Defaults to `"auto"`
undefined	No	Defaults to `"none"`
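
The named row of the table requires that the forced function actually appears in tools. A caller can check this up front, as in the sketch below. `assertToolChoiceIsSatisfiable` and the simplified local types are hypothetical, and how web-llm itself reacts to a mismatched name is not covered here.

```typescript
interface NamedToolChoice {
  type: "function";
  function: { name: string };
}
type ToolChoiceValue = "none" | "auto" | NamedToolChoice;

interface ToolSpec {
  type: "function";
  function: { name: string };
}

// Hypothetical helper: throws if a named tool_choice refers to an undefined tool.
function assertToolChoiceIsSatisfiable(
  toolChoice: ToolChoiceValue | undefined,
  tools: ToolSpec[] = [],
): void {
  // "none", "auto", and undefined place no requirement on the tool list.
  if (toolChoice === undefined || typeof toolChoice === "string") return;

  const names = tools.map((t) => t.function.name);
  if (!names.includes(toolChoice.function.name)) {
    throw new Error(
      `tool_choice names "${toolChoice.function.name}", ` +
        `but tools only defines: ${names.join(", ") || "(none)"}`,
    );
  }
}
```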

Output:

- When a tool is called: finish_reason is set to "tool_calls" and message.tool_calls is populated.
- When no tool is called: finish_reason is typically "stop" (or "length" if generation was cut off by the token limit) and message.content contains the text response.
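
A caller usually branches on these two outcomes. The sketch below uses simplified local types that mirror the OpenAI-style response shape; it assumes each tool call carries its arguments as a JSON string in `function.arguments`. `interpretChoice` and `Outcome` are hypothetical names.

```typescript
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface Choice {
  finish_reason: string | null;
  message: { content: string | null; tool_calls?: ToolCall[] };
}

type Outcome =
  | { kind: "tool_calls"; calls: { name: string; args: unknown }[] }
  | { kind: "text"; text: string };

// Hypothetical helper: normalizes one choice into a tool-call or text outcome.
function interpretChoice(choice: Choice): Outcome {
  if (choice.finish_reason === "tool_calls" && choice.message.tool_calls) {
    return {
      kind: "tool_calls",
      calls: choice.message.tool_calls.map((call) => ({
        name: call.function.name,
        // Assumption: arguments is a JSON-encoded string.
        args: JSON.parse(call.function.arguments),
      })),
    };
  }
  // Any other finish_reason is treated as a plain text response.
  return { kind: "text", text: choice.message.content ?? "" };
}
```

`JSON.parse` throws on malformed input, so production code may want to wrap the parsing step in its own error handling.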

Usage Examples

Auto mode (default with tools):

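import * as webllm from "@mlc-ai/web-llm";

// Assumption: any function-calling-capable model works here; this model ID is one example.
const engine = await webllm.CreateMLCEngine("Hermes-2-Pro-Llama-3-8B-q4f16_1-MLC");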

const request: webllm.ChatCompletionRequest = {
  stream: false,
  messages: [
    { role: "user", content: "What is the weather in Paris?" },
  ],
  tool_choice: "auto",  // Model decides whether to call a tool
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get weather for a location",
        parameters: {
          type: "object",
          properties: {
            location: { type: "string" },
          },
          required: ["location"],
        },
      },
    },
  ],
};

const reply = await engine.chat.completions.create(request);
if (reply.choices[0].finish_reason === "tool_calls") {
  console.log("Model chose to call a tool:", reply.choices[0].message.tool_calls);
} else {
  console.log("Model responded with text:", reply.choices[0].message.content);
}

Forcing a specific function call:

// Assumes `webllm` is imported and `engine` is an already-initialized engine.
const request: webllm.ChatCompletionRequest = {
  stream: false,
  messages: [
    { role: "user", content: "Tell me about the weather" },
  ],
  tool_choice: {
    type: "function",
    function: { name: "get_weather" },
  },
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get weather for a location",
        parameters: {
          type: "object",
          properties: {
            location: { type: "string" },
          },
          required: ["location"],
        },
      },
    },
  ],
};

const reply = await engine.chat.completions.create(request);
// Model is forced to call get_weather
console.log(reply.choices[0].message.tool_calls);

Disabling tool calling:

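// Assumes `engine` is initialized and `tools` holds the same tool array used in the earlier examples.
const request: webllm.ChatCompletionRequest = {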
  stream: false,
  messages: [
    { role: "user", content: "Summarize the weather data I gave you" },
  ],
  tool_choice: "none",  // Prevent tool calls even though tools are defined
  tools: tools,         // Tools still in the request but will not be invoked
};

const reply = await engine.chat.completions.create(request);
// Model generates text only
console.log(reply.choices[0].message.content);

Related Pages

Implementation:Mlc_ai_Web_llm_Tool_Choice_Request
Mlc_ai_Web_llm_Tool_Definition -- Defining the tools that tool_choice governs
Mlc_ai_Web_llm_Function_Calling_Model_Selection -- Model must support function calling
Mlc_ai_Web_llm_Tool_Call_Extraction -- Parsing tool calls when tool_choice results in invocation
Mlc_ai_Web_llm_Tool_Execution_Loop -- Executing the tool calls after model response
