Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Run llama Llama index FunctionCallingLLM

From Leeroopedia
Knowledge Sources
Domains LLM Integration, Function Calling, Tool Use
Last Updated 2026-02-11 19:00 GMT

Overview

This module defines the FunctionCallingLLM abstract base class, which extends the base LLM class with function/tool calling capabilities including chat with tools, streaming with tools, and predict-and-call workflows.

Description

The function_calling.py module provides FunctionCallingLLM, an abstract subclass of LLM that adds structured function calling support. This is the base class for all LLM integrations that support native tool/function calling (e.g., OpenAI, Anthropic, Google models).

FunctionCallingLLM provides several key method groups:

Chat with Tools Methods:

  • chat_with_tools - Synchronous method that prepares tool-augmented chat arguments via _prepare_chat_with_tools_compat, calls self.chat, and validates the response
  • achat_with_tools - Asynchronous counterpart using self.achat
  • stream_chat_with_tools - Streaming variant using self.stream_chat (no validation for streaming outputs)
  • astream_chat_with_tools - Async streaming variant using self.astream_chat

All four methods accept tools (a sequence of BaseTool), optional user_msg (string or ChatMessage), optional chat_history, verbose flag, allow_parallel_tool_calls flag, and tool_required flag.

Tool Preparation Methods:

  • _prepare_chat_with_tools - Abstract method that subclasses must implement to convert tools into the LLM-specific format and return a kwargs dictionary suitable for self.chat
  • _prepare_chat_with_tools_compat - Compatibility wrapper that checks (using the cached _supports_tool_required helper) whether the subclass's _prepare_chat_with_tools implementation supports the tool_required parameter, and omits it if not supported (with a logged warning)
  • _validate_chat_with_tools_response - Hook for subclasses to validate the chat response (default is passthrough)

Tool Extraction:

  • get_tool_calls_from_response - Extracts ToolSelection objects from a ChatResponse. Raises NotImplementedError by default; subclasses must override.

Predict and Call Methods:

  • predict_and_call - End-to-end synchronous method that calls chat_with_tools, extracts tool calls, executes them using call_tool_with_selection, and returns an AgentChatResponse. Handles parallel tool calls by concatenating outputs. Falls back to the parent LLM.predict_and_call if the model does not report is_function_calling_model.
  • apredict_and_call - Async counterpart that uses achat_with_tools and asyncio.gather for parallel tool execution via acall_tool_with_selection.

Both predict_and_call methods support error_on_no_tool_call and error_on_tool_error flags for controlling error behavior.

The module-level _supports_tool_required function is decorated with @functools.lru_cache(maxsize=1000) and uses inspect.signature to check whether a given subclass's _prepare_chat_with_tools method includes the tool_required parameter, providing backward compatibility with older LLM integrations.

Usage

Subclass FunctionCallingLLM when implementing an LLM integration that supports native function/tool calling. Implement _prepare_chat_with_tools and get_tool_calls_from_response at minimum. Use predict_and_call or apredict_and_call for end-to-end tool usage workflows in agents. Use chat_with_tools for lower-level control over the tool calling process.

Code Reference

Source Location

  • Repository: Run_llama_Llama_index
  • File: llama-index-core/llama_index/core/llms/function_calling.py
  • Lines: 1-347

Signature

class FunctionCallingLLM(LLM):
    def __init__(self, *args: Any, **kwargs: Any) -> None: ...

    def chat_with_tools(
        self,
        tools: Sequence["BaseTool"],
        user_msg: Optional[Union[str, ChatMessage]] = None,
        chat_history: Optional[List[ChatMessage]] = None,
        verbose: bool = False,
        allow_parallel_tool_calls: bool = False,
        tool_required: bool = False,
        **kwargs: Any,
    ) -> ChatResponse: ...

    async def achat_with_tools(
        self,
        tools: Sequence["BaseTool"],
        user_msg: Optional[Union[str, ChatMessage]] = None,
        chat_history: Optional[List[ChatMessage]] = None,
        verbose: bool = False,
        allow_parallel_tool_calls: bool = False,
        tool_required: bool = False,
        **kwargs: Any,
    ) -> ChatResponse: ...

    @abstractmethod
    def _prepare_chat_with_tools(
        self,
        tools: Sequence["BaseTool"],
        user_msg: Optional[Union[str, ChatMessage]] = None,
        chat_history: Optional[List[ChatMessage]] = None,
        verbose: bool = False,
        allow_parallel_tool_calls: bool = False,
        tool_required: bool = False,
        **kwargs: Any,
    ) -> Dict[str, Any]: ...

    def get_tool_calls_from_response(
        self,
        response: ChatResponse,
        error_on_no_tool_call: bool = True,
        **kwargs: Any,
    ) -> List[ToolSelection]: ...

    def predict_and_call(
        self,
        tools: Sequence["BaseTool"],
        user_msg: Optional[Union[str, ChatMessage]] = None,
        chat_history: Optional[List[ChatMessage]] = None,
        verbose: bool = False,
        allow_parallel_tool_calls: bool = False,
        error_on_no_tool_call: bool = True,
        error_on_tool_error: bool = False,
        **kwargs: Any,
    ) -> "AgentChatResponse": ...

    async def apredict_and_call(
        self,
        tools: Sequence["BaseTool"],
        user_msg: Optional[Union[str, ChatMessage]] = None,
        chat_history: Optional[List[ChatMessage]] = None,
        verbose: bool = False,
        allow_parallel_tool_calls: bool = False,
        error_on_no_tool_call: bool = True,
        error_on_tool_error: bool = False,
        **kwargs: Any,
    ) -> "AgentChatResponse": ...

@functools.lru_cache(maxsize=1000)
def _supports_tool_required(
    cls: Type[FunctionCallingLLM], tool_required: bool
) -> bool: ...

Import

from llama_index.core.llms.function_calling import FunctionCallingLLM

I/O Contract

Inputs

Name Type Required Description
tools Sequence[BaseTool] Yes The tools/functions available for the LLM to call
user_msg Optional[Union[str, ChatMessage]] No The user message to send to the LLM
chat_history Optional[List[ChatMessage]] No Previous chat messages for context
verbose bool No Whether to print verbose output (default False)
allow_parallel_tool_calls bool No Whether to allow the LLM to call multiple tools in one turn (default False)
tool_required bool No If True, the LLM should only call tools and not return a direct text response (default False)
error_on_no_tool_call bool No Whether to raise an error if no tool call is found in the response (default True)
error_on_tool_error bool No Whether to raise an error if a tool call returns an error (default False)
**kwargs Any No Additional keyword arguments passed to the underlying chat method

Outputs

Name Type Description
return (chat_with_tools) ChatResponse The LLM chat response potentially containing tool call information
return (stream_chat_with_tools) ChatResponseGen A synchronous generator of streaming chat response chunks
return (astream_chat_with_tools) ChatResponseAsyncGen An async generator of streaming chat response chunks
return (get_tool_calls_from_response) List[ToolSelection] List of tool selections extracted from the LLM response
return (predict_and_call) AgentChatResponse The agent response containing tool output text and source tool outputs

Usage Examples

Basic Usage

from llama_index.core.llms.function_calling import FunctionCallingLLM
from llama_index.core.tools import FunctionTool

# Define a tool
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the result."""
    return a * b

tool = FunctionTool.from_defaults(fn=multiply)

# Use with an OpenAI model (which extends FunctionCallingLLM)
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4")

# Chat with tools - low level
response = llm.chat_with_tools(
    tools=[tool],
    user_msg="What is 6 times 7?",
    verbose=True,
)
tool_calls = llm.get_tool_calls_from_response(response)

# Predict and call - end to end
agent_response = llm.predict_and_call(
    tools=[tool],
    user_msg="What is 6 times 7?",
    verbose=True,
)
print(agent_response.response)  # "42"

Async Parallel Tool Calls

import asyncio
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

tools = [
    FunctionTool.from_defaults(fn=add),
    FunctionTool.from_defaults(fn=multiply),
]

llm = OpenAI(model="gpt-4")

async def main():
    response = await llm.apredict_and_call(
        tools=tools,
        user_msg="Add 3 and 5, and also multiply 4 and 6.",
        allow_parallel_tool_calls=True,
        verbose=True,
    )
    print(response.response)
    print(response.sources)  # List of ToolOutput objects

asyncio.run(main())

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment