Implementation:Run llama Llama index FunctionCallingLLM

Knowledge Sources	Run_llama_Llama_index
Domains	LLM Integration, Function Calling, Tool Use
Last Updated	2026-02-11 19:00 GMT

Overview

This module defines the FunctionCallingLLM abstract base class, which extends the base LLM class with function/tool calling capabilities, including chat with tools, streaming with tools, and predict-and-call workflows.

Description

The function_calling.py module provides FunctionCallingLLM, an abstract subclass of LLM that adds structured function calling support. This is the base class for all LLM integrations that support native tool/function calling (e.g., OpenAI, Anthropic, Google models).

FunctionCallingLLM provides several key method groups:

Chat with Tools Methods:

chat_with_tools - Synchronous method that prepares tool-augmented chat arguments via _prepare_chat_with_tools_compat, calls self.chat, and validates the response
achat_with_tools - Asynchronous counterpart using self.achat
stream_chat_with_tools - Streaming variant using self.stream_chat (no validation for streaming outputs)
astream_chat_with_tools - Async streaming variant using self.astream_chat

All four methods accept tools (a sequence of BaseTool), an optional user_msg (string or ChatMessage), an optional chat_history, and three boolean flags: verbose, allow_parallel_tool_calls, and tool_required.

Tool Preparation Methods:

_prepare_chat_with_tools - Abstract method that subclasses must implement to convert tools into the LLM-specific format and return a kwargs dictionary suitable for self.chat
_prepare_chat_with_tools_compat - Compatibility wrapper that uses the cached _supports_tool_required helper to check whether the subclass's _prepare_chat_with_tools implementation accepts the tool_required parameter. If it does not, the wrapper logs a warning and omits the argument.
_validate_chat_with_tools_response - Hook for subclasses to validate the chat response (default is passthrough)
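The compatibility check described above can be sketched without the library. This is a minimal illustration, not the module's actual code: the names toy_supports_tool_required, toy_prepare_compat, LegacyLLM, and ModernLLM are hypothetical stand-ins, the helper takes only the class as its argument (a simplification), and the "tool_choice" key is an assumed placeholder for whatever provider-specific field an integration would set.

```python
import functools
import inspect
import logging
from typing import Any, Dict, Sequence

logger = logging.getLogger(__name__)


@functools.lru_cache(maxsize=1000)
def toy_supports_tool_required(cls: type) -> bool:
    """Return True if cls._prepare_chat_with_tools names a tool_required parameter."""
    params = inspect.signature(cls._prepare_chat_with_tools).parameters
    return "tool_required" in params


class LegacyLLM:
    """Hypothetical older integration whose method predates tool_required."""

    def _prepare_chat_with_tools(
        self, tools: Sequence[Any], **kwargs: Any
    ) -> Dict[str, Any]:
        return {"tools": list(tools), **kwargs}


class ModernLLM:
    """Hypothetical newer integration that accepts tool_required."""

    def _prepare_chat_with_tools(
        self, tools: Sequence[Any], tool_required: bool = False, **kwargs: Any
    ) -> Dict[str, Any]:
        # "tool_choice" is an illustrative key, not a documented one.
        choice = "required" if tool_required else "auto"
        return {"tools": list(tools), "tool_choice": choice, **kwargs}


def toy_prepare_compat(
    llm: Any, tools: Sequence[Any], tool_required: bool = False, **kwargs: Any
) -> Dict[str, Any]:
    """Forward tool_required only when the integration's signature declares it."""
    if toy_supports_tool_required(type(llm)):
        return llm._prepare_chat_with_tools(
            tools, tool_required=tool_required, **kwargs
        )
    logger.warning(
        "%s does not accept tool_required; omitting it.", type(llm).__name__
    )
    return llm._prepare_chat_with_tools(tools, **kwargs)
```

Caching on the class means the signature is inspected once per integration rather than on every chat call, which is presumably why the real helper is wrapped in lru_cache.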

Tool Extraction:

get_tool_calls_from_response - Extracts ToolSelection objects from a ChatResponse. Raises NotImplementedError by default; subclasses must override.
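As a rough illustration of what an override might do, the sketch below parses tool calls out of a response and honors error_on_no_tool_call. Everything here is assumed for the example: the response is modeled as a plain dict with a "tool_calls" list, ToySelection is a stand-in dataclass rather than the library's ToolSelection, and the JSON-encoded "arguments" field mirrors a common provider convention rather than anything this page specifies.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ToySelection:
    """Stand-in for a tool selection: which tool to call and with what arguments."""

    tool_id: str
    tool_name: str
    tool_kwargs: Dict[str, Any]


def toy_get_tool_calls(
    response: Dict[str, Any], error_on_no_tool_call: bool = True
) -> List[ToySelection]:
    """Extract selections from a dict-shaped response (hypothetical format)."""
    raw_calls = response.get("tool_calls") or []
    if not raw_calls:
        if error_on_no_tool_call:
            raise ValueError("Expected at least one tool call, but got none.")
        return []

    selections = []
    for call in raw_calls:
        selections.append(
            ToySelection(
                tool_id=call["id"],
                tool_name=call["name"],
                # Arguments are assumed to arrive as a JSON-encoded string.
                tool_kwargs=json.loads(call["arguments"]),
            )
        )
    return selections
```

With error_on_no_tool_call=False, a plain text reply yields an empty list instead of an exception, which lets a caller fall through to the model's direct answer.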

Predict and Call Methods:

predict_and_call - End-to-end synchronous method that calls chat_with_tools, extracts tool calls, executes them using call_tool_with_selection, and returns an AgentChatResponse. Handles parallel tool calls by concatenating outputs. Falls back to the parent LLM.predict_and_call if the model does not report is_function_calling_model.
apredict_and_call - Async counterpart that uses achat_with_tools and asyncio.gather for parallel tool execution via acall_tool_with_selection.

Both predict_and_call methods support error_on_no_tool_call and error_on_tool_error flags for controlling error behavior.
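The async execution step can be sketched as follows, assuming tool calls have already been extracted. This is not the library's implementation: ToyOutput, toy_acall_tool, and toy_run_tool_calls are hypothetical names, tools are modeled as a plain dict of callables, and the blank-line separator used when concatenating outputs is an assumption.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class ToyOutput:
    """Stand-in for a tool output: the text result plus an error marker."""

    tool_name: str
    content: str
    is_error: bool = False


async def toy_acall_tool(
    tools: Dict[str, Callable[..., Any]], name: str, kwargs: Dict[str, Any]
) -> ToyOutput:
    """Run one tool, capturing any exception as an error output."""
    try:
        return ToyOutput(name, str(tools[name](**kwargs)))
    except Exception as exc:
        return ToyOutput(name, f"Error: {exc}", is_error=True)


async def toy_run_tool_calls(
    tools: Dict[str, Callable[..., Any]],
    calls: List[Tuple[str, Dict[str, Any]]],
    error_on_tool_error: bool = False,
) -> Tuple[str, List[ToyOutput]]:
    """Execute all calls concurrently and concatenate their outputs."""
    # gather returns results in the same order as the awaitables passed in.
    outputs = list(
        await asyncio.gather(
            *(toy_acall_tool(tools, name, kwargs) for name, kwargs in calls)
        )
    )
    if error_on_tool_error:
        failed = [out for out in outputs if out.is_error]
        if failed:
            raise ValueError(
                f"Tool call failed: {failed[0].tool_name}: {failed[0].content}"
            )
    text = "\n\n".join(out.content for out in outputs)
    return text, outputs
```

Capturing exceptions inside each tool call, rather than letting gather propagate them, keeps one failing tool from discarding the results of the others unless error_on_tool_error asks for that.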

The module-level _supports_tool_required function is decorated with @functools.lru_cache(maxsize=1000) and uses inspect.signature to check whether a given subclass's _prepare_chat_with_tools method includes the tool_required parameter, providing backward compatibility with older LLM integrations.

Usage

Subclass FunctionCallingLLM when implementing an LLM integration that supports native function/tool calling. Implement _prepare_chat_with_tools and get_tool_calls_from_response at minimum. Use predict_and_call or apredict_and_call for end-to-end tool usage workflows in agents. Use chat_with_tools for lower-level control over the tool calling process.

Code Reference

Source Location

Repository: Run_llama_Llama_index
File: llama-index-core/llama_index/core/llms/function_calling.py
Lines: 1-347

Signature

class FunctionCallingLLM(LLM):
    def __init__(self, *args: Any, **kwargs: Any) -> None: ...

    def chat_with_tools(
        self,
        tools: Sequence["BaseTool"],
        user_msg: Optional[Union[str, ChatMessage]] = None,
        chat_history: Optional[List[ChatMessage]] = None,
        verbose: bool = False,
        allow_parallel_tool_calls: bool = False,
        tool_required: bool = False,
        **kwargs: Any,
    ) -> ChatResponse: ...

    async def achat_with_tools(
        self,
        tools: Sequence["BaseTool"],
        user_msg: Optional[Union[str, ChatMessage]] = None,
        chat_history: Optional[List[ChatMessage]] = None,
        verbose: bool = False,
        allow_parallel_tool_calls: bool = False,
        tool_required: bool = False,
        **kwargs: Any,
    ) -> ChatResponse: ...

    @abstractmethod
    def _prepare_chat_with_tools(
        self,
        tools: Sequence["BaseTool"],
        user_msg: Optional[Union[str, ChatMessage]] = None,
        chat_history: Optional[List[ChatMessage]] = None,
        verbose: bool = False,
        allow_parallel_tool_calls: bool = False,
        tool_required: bool = False,
        **kwargs: Any,
    ) -> Dict[str, Any]: ...

    def get_tool_calls_from_response(
        self,
        response: ChatResponse,
        error_on_no_tool_call: bool = True,
        **kwargs: Any,
    ) -> List[ToolSelection]: ...

    def predict_and_call(
        self,
        tools: Sequence["BaseTool"],
        user_msg: Optional[Union[str, ChatMessage]] = None,
        chat_history: Optional[List[ChatMessage]] = None,
        verbose: bool = False,
        allow_parallel_tool_calls: bool = False,
        error_on_no_tool_call: bool = True,
        error_on_tool_error: bool = False,
        **kwargs: Any,
    ) -> "AgentChatResponse": ...

    async def apredict_and_call(
        self,
        tools: Sequence["BaseTool"],
        user_msg: Optional[Union[str, ChatMessage]] = None,
        chat_history: Optional[List[ChatMessage]] = None,
        verbose: bool = False,
        allow_parallel_tool_calls: bool = False,
        error_on_no_tool_call: bool = True,
        error_on_tool_error: bool = False,
        **kwargs: Any,
    ) -> "AgentChatResponse": ...

@functools.lru_cache(maxsize=1000)
def _supports_tool_required(
    cls: Type[FunctionCallingLLM], tool_required: bool
) -> bool: ...

Import

from llama_index.core.llms.function_calling import FunctionCallingLLM

I/O Contract

Inputs

Name	Type	Required	Description
tools	Sequence[BaseTool]	Yes	The tools/functions available for the LLM to call
user_msg	Optional[Union[str, ChatMessage]]	No	The user message to send to the LLM
chat_history	Optional[List[ChatMessage]]	No	Previous chat messages for context
verbose	bool	No	Whether to print verbose output (default False)
allow_parallel_tool_calls	bool	No	Whether to allow the LLM to call multiple tools in one turn (default False)
tool_required	bool	No	If True, the LLM should only call tools and not return a direct text response (default False)
error_on_no_tool_call	bool	No	Whether to raise an error if no tool call is found in the response (default True)
error_on_tool_error	bool	No	Whether to raise an error if a tool call returns an error (default False)
**kwargs	Any	No	Additional keyword arguments passed to the underlying chat method

Outputs

Name	Type	Description
return (chat_with_tools, achat_with_tools)	ChatResponse	The LLM chat response, which may contain tool call information
return (stream_chat_with_tools)	ChatResponseGen	A synchronous generator of streaming chat response chunks
return (astream_chat_with_tools)	ChatResponseAsyncGen	An async generator of streaming chat response chunks
return (get_tool_calls_from_response)	List[ToolSelection]	List of tool selections extracted from the LLM response
return (predict_and_call, apredict_and_call)	AgentChatResponse	The agent response containing tool output text and source tool outputs

Usage Examples

Basic Usage

from llama_index.core.tools import FunctionTool
# The OpenAI integration ships in the separate llama-index-llms-openai package
# and reads the API key from the OPENAI_API_KEY environment variable.
from llama_index.llms.openai import OpenAI

# Define a tool
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the result."""
    return a * b

tool = FunctionTool.from_defaults(fn=multiply)

# OpenAI extends FunctionCallingLLM, so the tool-calling methods are available
llm = OpenAI(model="gpt-4")

# Chat with tools - low level
response = llm.chat_with_tools(
    tools=[tool],
    user_msg="What is 6 times 7?",
    verbose=True,
)
tool_calls = llm.get_tool_calls_from_response(response)

# Predict and call - end to end
agent_response = llm.predict_and_call(
    tools=[tool],
    user_msg="What is 6 times 7?",
    verbose=True,
)
print(agent_response.response)  # tool output as text, e.g. "42" if the model calls multiply(6, 7)

Async Parallel Tool Calls

import asyncio
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

tools = [
    FunctionTool.from_defaults(fn=add),
    FunctionTool.from_defaults(fn=multiply),
]

llm = OpenAI(model="gpt-4")

async def main():
    response = await llm.apredict_and_call(
        tools=tools,
        user_msg="Add 3 and 5, and also multiply 4 and 6.",
        allow_parallel_tool_calls=True,
        verbose=True,
    )
    print(response.response)
    print(response.sources)  # List of ToolOutput objects

asyncio.run(main())

Related Pages

Environment:Run_llama_Llama_index_Python_LlamaIndex_Core
