Principle:Mlc ai Web llm Tool Choice Configuration

From Leeroopedia

Overview

Tool Choice Configuration controls whether a language model invokes tools when processing a request and, if so, which ones. Developers can let the model decide autonomously, force a specific function call, or disable tool use entirely.

Description

Tool choice configuration governs the model's tool-calling behavior through three distinct modes:

  • "none" -- The model will never call a function and will always generate a text message instead. This is the default when no tools are provided in the request.
  • "auto" -- The model decides whether to call a function or generate a text message based on the user query and available tools. This is the default when tools are present in the request.
  • Named tool choice -- Forces the model to call a specific function by name, specified via a ChatCompletionNamedToolChoice object with the structure {"type": "function", "function": {"name": "my_function"}}.

When tools are provided in the request, the web-llm engine automatically performs several configuration steps:

  1. Model validation -- Checks that the current model is in the functionCallingModelIds list.
  2. System prompt injection -- Prepends the Hermes-2 function calling system prompt with tool definitions wrapped in <tools></tools> XML tags.
  3. Response format enforcement -- Sets response_format to json_object with the official Hermes-2 function call schema array, ensuring the model produces valid JSON output.

These steps happen transparently in the request preprocessing pipeline, so the user only needs to set tool_choice and tools.
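
The three preprocessing steps above can be sketched roughly as follows. This is an illustrative approximation, not web-llm's actual source: `preprocess`, `SketchRequest`, `supportedModelIds`, and both placeholder constants are hypothetical names, the model ID is only an assumed example entry, and the real Hermes-2 prompt text and schema are deliberately not reproduced.

```typescript
// Hypothetical sketch of the preprocessing pipeline; not web-llm internals.
interface ToolDef {
  type: "function";
  function: { name: string; description?: string; parameters?: Record<string, unknown> };
}

interface SketchRequest {
  messages: { role: string; content: string }[];
  tools?: ToolDef[];
  response_format?: { type: string; schema?: string };
}

// Assumed stand-in for the engine's functionCallingModelIds list.
const supportedModelIds: string[] = ["Hermes-2-Pro-Llama-3-8B-q4f16_1-MLC"];

// Placeholders: the real prompt text and schema live inside the library.
const PROMPT_PLACEHOLDER = "[Hermes-2 function calling system prompt]";
const SCHEMA_PLACEHOLDER = "[Hermes-2 function call schema array]";

function preprocess(request: SketchRequest, modelId: string): SketchRequest {
  // Nothing to configure when no tools are supplied.
  if (!request.tools || request.tools.length === 0) {
    return request;
  }
  // 1. Model validation.
  if (!supportedModelIds.includes(modelId)) {
    throw new Error(`${modelId} is not in the function-calling model list`);
  }
  // 2. System prompt injection: tool definitions wrapped in <tools></tools>.
  const toolsBlock = `<tools>${JSON.stringify(request.tools)}</tools>`;
  const systemMessage = { role: "system", content: `${PROMPT_PLACEHOLDER}\n${toolsBlock}` };
  // 3. Response format enforcement.
  return {
    ...request,
    messages: [systemMessage, ...request.messages],
    response_format: { type: "json_object", schema: SCHEMA_PLACEHOLDER },
  };
}
```

The sketch returns a new request object rather than mutating its input; whether the engine itself mutates the request is not stated here.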

Usage

Choosing the right mode:

  • Use "auto" (default when tools present) to let the model decide whether to call a tool or respond with text. This is appropriate for most conversational agents.
  • Use named tool choice to force invocation of a specific function. This is useful when you know exactly which function should be called (e.g., after validation or in a deterministic workflow).
  • Use "none" to disable tool calling even when tools are defined. This is useful when you want to temporarily suppress tool use in a multi-turn conversation where tools were previously available.
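
One way to apply this guidance in a multi-turn agent is to pick the mode per turn. The helper below is a minimal sketch under an assumed policy (suppress tool calls right after a tool result, otherwise let the model decide); `pickToolChoice`, `Turn`, and `ToolChoice` are hypothetical names, not part of web-llm.

```typescript
// Hypothetical per-turn policy; the types mirror the three modes described above.
type ToolChoice = "none" | "auto" | { type: "function"; function: { name: string } };

interface Turn {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

function pickToolChoice(history: Turn[], forcedFunction?: string): ToolChoice {
  // Deterministic workflow: the application already knows which function to call.
  if (forcedFunction !== undefined) {
    return { type: "function", function: { name: forcedFunction } };
  }
  // After a tool result, ask the model to synthesize text instead of calling again.
  const last = history[history.length - 1];
  if (last !== undefined && last.role === "tool") {
    return "none";
  }
  // Otherwise let the model decide.
  return "auto";
}
```

The returned value can be assigned directly to the request's tool_choice field.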

Constraints:

  • When tools are provided for Hermes-2-Pro models, the engine rejects custom response_format values (throws CustomResponseFormatError).
  • When tools are provided for Hermes-2-Pro models, the engine rejects custom system messages (throws CustomSystemPromptError).
  • Tool choice has no effect if the tools array is not provided or is empty.
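
The first two constraints can be caught before a request reaches the engine. The following pre-flight check is a sketch only: `findToolConflicts` and `PreflightRequest` are hypothetical names, it does not call web-llm, and it assumes the target is a Hermes-2-Pro model as described above.

```typescript
// Hypothetical pre-flight check for requests aimed at a Hermes-2-Pro model.
interface PreflightRequest {
  messages: { role: string; content: string }[];
  tools?: unknown[];
  response_format?: unknown;
}

function findToolConflicts(request: PreflightRequest): string[] {
  const problems: string[] = [];
  const hasTools = Array.isArray(request.tools) && request.tools.length > 0;
  // Without tools, neither constraint applies.
  if (!hasTools) {
    return problems;
  }
  // The engine sets response_format itself, so a custom one conflicts.
  if (request.response_format !== undefined) {
    problems.push("custom response_format is not allowed together with tools");
  }
  // The engine injects its own system prompt, so a custom one conflicts.
  if (request.messages.some((m) => m.role === "system")) {
    problems.push("custom system message is not allowed together with tools");
  }
  return problems;
}
```

An empty result means the request avoids both documented conflicts; it does not guarantee that the engine accepts the request.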

Theoretical Basis

Tool choice configuration provides a control mechanism for the balance between model autonomy and application determinism:

  • Autonomous mode ("auto") -- The model applies its trained reasoning to determine if a tool call is the best response. This maximizes flexibility but may produce unexpected tool calls or miss intended tool use.
  • Forced mode (named) -- The application overrides the model's judgment and mandates a specific function call. This guarantees tool invocation but may result in suboptimal argument generation if the query does not naturally map to the forced function.
  • Disabled mode ("none") -- The application prevents all tool use, regardless of model intent. This is useful for follow-up turns where the model should synthesize results rather than make additional calls.

The default behavior follows the OpenAI convention: "none" when no functions are present, "auto" when functions are present. This ensures backward compatibility for non-tool-use requests while enabling tool use by default when tools are defined.

I/O Contract

Input:

  • tool_choice field on ChatCompletionRequest, typed as ChatCompletionToolChoiceOption:
    • "none" -- string literal
    • "auto" -- string literal
    • ChatCompletionNamedToolChoice -- object with type: "function" and function: { name: string }

Behavior:

tool_choice value                              | tools provided         | Model behavior
"none"                                         | Yes/No                 | Model generates text only; no tool calls
"auto"                                         | Yes                    | Model decides whether to call a tool or generate text
"auto"                                         | No                     | Model generates text only (no tools available)
{ type: "function", function: { name: "X" } }  | Yes (must include "X") | Model is forced to call function "X"
undefined                                      | Yes                    | Defaults to "auto"
undefined                                      | No                     | Defaults to "none"
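
The table can be read as a small decision function. The sketch below encodes it, including the default resolution for an undefined tool_choice; `effectiveBehavior` and `Behavior` are hypothetical names, and throwing when the forced function is missing from tools is an assumption of this sketch rather than documented engine behavior.

```typescript
// Hypothetical encoding of the behavior table; not the engine's code.
type NamedToolChoice = { type: "function"; function: { name: string } };
type ToolChoiceOption = "none" | "auto" | NamedToolChoice;

type Behavior =
  | { mode: "text-only" }
  | { mode: "model-decides" }
  | { mode: "forced"; name: string };

function effectiveBehavior(
  choice: ToolChoiceOption | undefined,
  toolNames: string[],
): Behavior {
  const hasTools = toolNames.length > 0;
  // Defaults: "auto" when tools are present, "none" otherwise.
  const resolved: ToolChoiceOption = choice ?? (hasTools ? "auto" : "none");
  // Without tools, or with "none", the model can only produce text.
  if (resolved === "none" || !hasTools) {
    return { mode: "text-only" };
  }
  if (resolved === "auto") {
    return { mode: "model-decides" };
  }
  // Named choice: the forced function must be one of the supplied tools.
  if (!toolNames.includes(resolved.function.name)) {
    throw new Error(`Forced function "${resolved.function.name}" is not in tools`);
  }
  return { mode: "forced", name: resolved.function.name };
}
```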

Output:

  • When a tool is called: finish_reason is set to "tool_calls" and message.tool_calls is populated.
  • When no tool is called: finish_reason is "stop" and message.content contains the text response.
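
A caller typically branches on finish_reason and then dispatches each tool call to local code. The sketch below assumes the OpenAI-style shape in which each tool call carries its arguments as a JSON string; `handleChoice`, `Handler`, and the local `Choice` and `ToolCall` interfaces are hypothetical simplifications, not web-llm's exported types.

```typescript
// Hypothetical dispatch over the output contract; the interfaces are simplified.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments assumed to be a JSON string
}

interface Choice {
  finish_reason: string | null;
  message: { content: string | null; tool_calls?: ToolCall[] };
}

type Handler = (args: Record<string, unknown>) => string;

function handleChoice(choice: Choice, handlers: Record<string, Handler>): string[] {
  // Tool path: run the matching local handler for every requested call.
  if (choice.finish_reason === "tool_calls" && choice.message.tool_calls) {
    return choice.message.tool_calls.map((call) => {
      const handler = handlers[call.function.name];
      if (!handler) {
        throw new Error(`No handler registered for ${call.function.name}`);
      }
      return handler(JSON.parse(call.function.arguments));
    });
  }
  // Text path: return the plain response (empty string if content is null).
  return [choice.message.content ?? ""];
}
```

In a real agent, each handler result would usually be appended to the conversation as a tool message before the next request.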

Usage Examples

Auto mode (default with tools):

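import * as webllm from "@mlc-ai/web-llm";

// Assumption: `engine` is an MLCEngine loaded with a function-calling model.
// The model ID below is one example and may differ in your setup.
const engine = await webllm.CreateMLCEngine("Hermes-2-Pro-Llama-3-8B-q4f16_1-MLC");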

const request: webllm.ChatCompletionRequest = {
  stream: false,
  messages: [
    { role: "user", content: "What is the weather in Paris?" },
  ],
  tool_choice: "auto",  // Model decides whether to call a tool
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get weather for a location",
        parameters: {
          type: "object",
          properties: {
            location: { type: "string" },
          },
          required: ["location"],
        },
      },
    },
  ],
};

const reply = await engine.chat.completions.create(request);
if (reply.choices[0].finish_reason === "tool_calls") {
  console.log("Model chose to call a tool:", reply.choices[0].message.tool_calls);
} else {
  console.log("Model responded with text:", reply.choices[0].message.content);
}

Forcing a specific function call:

// Assumes `webllm` and an initialized `engine` are already in scope.
const request: webllm.ChatCompletionRequest = {
  stream: false,
  messages: [
    { role: "user", content: "Tell me about the weather" },
  ],
  tool_choice: {
    type: "function",
    function: { name: "get_weather" },
  },
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get weather for a location",
        parameters: {
          type: "object",
          properties: {
            location: { type: "string" },
          },
          required: ["location"],
        },
      },
    },
  ],
};

const reply = await engine.chat.completions.create(request);
// Model is forced to call get_weather
console.log(reply.choices[0].message.tool_calls);

Disabling tool calling:

// Assumes `webllm`, an initialized `engine`, and a `tools` array
// (for example, the get_weather definition shown earlier) are in scope.
const request: webllm.ChatCompletionRequest = {
  stream: false,
  messages: [
    { role: "user", content: "Summarize the weather data I gave you" },
  ],
  tool_choice: "none",  // Prevent tool calls even though tools are defined
  tools,                // Tools stay in the request but will not be invoked
};

const reply = await engine.chat.completions.create(request);
// Model generates text only
console.log(reply.choices[0].message.content);
