Implementation:Run llama Llama index MockLLM
| Knowledge Sources | |
|---|---|
| Domains | LLM, Testing, Mock |
| Last Updated | 2026-02-11 19:00 GMT |
Overview
Provides mock LLM implementations (MockLLM, MockLLMWithNonyieldingChatStream, MockLLMWithChatMemoryOfLastCall, and MockFunctionCallingLLM) for testing and development without making real LLM API calls.
Description
The mock.py module defines four mock LLM classes that extend the LlamaIndex LLM hierarchy for testing purposes:
- MockLLM extends CustomLLM and provides a simple deterministic LLM that either echoes the input prompt or generates repeated "text" tokens up to a configurable max_tokens limit. It supports both complete and stream_complete methods.
- MockLLMWithNonyieldingChatStream extends MockLLM and provides a stream_chat implementation that yields nothing, useful for testing empty stream handling.
- MockLLMWithChatMemoryOfLastCall extends MockLLM and records the last set of ChatMessage objects passed to any chat method (chat, stream_chat, achat, astream_chat), enabling test assertions on the messages that would have been sent to a real LLM.
- MockFunctionCallingLLM extends FunctionCallingLLM and simulates a function-calling LLM. It supports customizable response generation via a response_generator callback and a blocks_to_content_callback for converting multi-modal content blocks (text, images, audio, documents, video, thinking blocks, citations) into string content. It implements _prepare_chat_with_tools and get_tool_calls_from_response to simulate tool calling workflows.
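The echo-or-repeat behaviour of MockLLM and the call recording of MockLLMWithChatMemoryOfLastCall can be illustrated without the library. The following is a minimal stand-in sketch, not the code in mock.py: the class names EchoMock and RecordingMock are hypothetical, and plain strings are assumed in place of ChatMessage and CompletionResponse objects.

```python
from typing import List, Optional, Sequence


class EchoMock:
    """Hypothetical stand-in for the echo-or-repeat behaviour described for MockLLM."""

    def __init__(self, max_tokens: Optional[int] = None) -> None:
        self.max_tokens = max_tokens

    def complete(self, prompt: str) -> str:
        # No token limit configured: echo the prompt back unchanged.
        if self.max_tokens is None:
            return prompt
        # Token limit configured: ignore the prompt and repeat the word "text".
        return " ".join(["text"] * self.max_tokens)


class RecordingMock(EchoMock):
    """Hypothetical stand-in that remembers the last messages it was given."""

    def __init__(self, max_tokens: Optional[int] = None) -> None:
        super().__init__(max_tokens)
        self.last_chat_messages: Optional[Sequence[str]] = None
        self.last_called_chat_function: List[str] = []

    def chat(self, messages: Sequence[str]) -> str:
        # Record the messages and the method name, then reply deterministically.
        self.last_chat_messages = list(messages)
        self.last_called_chat_function.append("chat")
        return self.complete("\n".join(messages))
```

Because every reply is a pure function of the input and the configured limit, tests built on such a mock can assert on exact strings rather than on fuzzy model output.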
The module also includes two helper functions: _data_from_binary, which base64-decodes binary block data, and _default_blocks_to_content_callback, which serializes a list of ContentBlock objects into a single string.
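To make the idea of decoding binary blocks and flattening a block list into one string concrete, here is a rough sketch. It is an illustration under stated assumptions, not the module's implementation: TextPart, BinaryPart, data_from_binary, and blocks_to_content are hypothetical names, the payload is assumed to be UTF-8 text once decoded, and the newline separator is an arbitrary choice.

```python
import base64
from dataclasses import dataclass
from typing import List, Union


@dataclass
class TextPart:
    """Hypothetical text block (not the library's own block type)."""
    text: str


@dataclass
class BinaryPart:
    """Hypothetical binary block carrying base64-encoded bytes."""
    data: bytes


def data_from_binary(part: BinaryPart) -> str:
    # Decode the base64 payload, then interpret the raw bytes as UTF-8 text.
    return base64.b64decode(part.data).decode("utf-8")


def blocks_to_content(parts: List[Union[TextPart, BinaryPart]]) -> str:
    # Convert each block to a string and join the pieces with newlines.
    pieces = []
    for part in parts:
        if isinstance(part, TextPart):
            pieces.append(part.text)
        else:
            pieces.append(data_from_binary(part))
    return "\n".join(pieces)
```

A custom blocks_to_content_callback passed to MockFunctionCallingLLM would play the role of blocks_to_content here; its exact signature should be taken from the source file rather than from this sketch.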
Usage
Use these mock LLMs in unit tests, integration tests, and development scenarios where you want to verify pipeline behavior without incurring real LLM API costs or latency. MockLLM is suitable for simple completion testing, MockLLMWithChatMemoryOfLastCall for verifying chat message construction, and MockFunctionCallingLLM for testing tool-calling agent workflows.
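One way to apply this in a unit test is to pass the LLM into the code under test as a parameter, so a mock can be substituted. The sketch below assumes only that the object has a complete(prompt) method whose result exposes a .text attribute. The function answer_question and the class EchoStub are hypothetical and exist only so that the example runs without llama-index installed.

```python
from types import SimpleNamespace
from typing import Any, Protocol


class SupportsComplete(Protocol):
    """Anything exposing complete(prompt) whose result has a .text attribute."""

    def complete(self, prompt: str) -> Any: ...


def answer_question(llm: SupportsComplete, question: str) -> str:
    # Hypothetical pipeline step: build a prompt, call the LLM, tidy the output.
    prompt = f"Question: {question}\nAnswer:"
    return llm.complete(prompt).text.strip()


class EchoStub:
    """Hypothetical stand-in that echoes the prompt back as the response text."""

    def complete(self, prompt: str) -> SimpleNamespace:
        return SimpleNamespace(text=prompt)


def test_prompt_is_built_correctly() -> None:
    # Because the stub echoes its input, the result reveals the exact prompt sent.
    result = answer_question(EchoStub(), "What is 2 + 2?")
    assert result == "Question: What is 2 + 2?\nAnswer:"
```

With llama-index installed, MockLLM() in echo mode should be usable in place of EchoStub, since its complete() returns a CompletionResponse that exposes .text.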
Code Reference
Source Location
- Repository: Run_llama_Llama_index
- File: llama-index-core/llama_index/core/llms/mock.py
- Lines: 1-503
Signature
class MockLLM(CustomLLM):
    max_tokens: Optional[int]

    def __init__(
        self,
        max_tokens: Optional[int] = None,
        callback_manager: Optional[CallbackManager] = None,
        system_prompt: Optional[str] = None,
        messages_to_prompt: Optional[MessagesToPromptType] = None,
        completion_to_prompt: Optional[CompletionToPromptType] = None,
        pydantic_program_mode: PydanticProgramMode = PydanticProgramMode.DEFAULT,
    ) -> None: ...

class MockLLMWithChatMemoryOfLastCall(MockLLM):
    last_chat_messages: Optional[Sequence[ChatMessage]]
    last_called_chat_function: List[str]
    ...

class MockFunctionCallingLLM(FunctionCallingLLM):
    tool_calls: List[ToolCallBlock]
    blocks_to_content_callback: BlockToContentCallback

    def __init__(
        self,
        callback_manager: Optional[CallbackManager] = None,
        system_prompt: Optional[str] = None,
        messages_to_prompt: Optional[MessagesToPromptType] = None,
        completion_to_prompt: Optional[CompletionToPromptType] = None,
        pydantic_program_mode: PydanticProgramMode = PydanticProgramMode.DEFAULT,
        blocks_to_content_callback: Optional[BlockToContentCallback] = None,
        response_generator: Optional[ResponseGenerator] = None,
        **kwargs: Any,
    ) -> None: ...
Import
from llama_index.core.llms.mock import MockLLM
from llama_index.core.llms.mock import MockLLMWithChatMemoryOfLastCall
from llama_index.core.llms.mock import MockFunctionCallingLLM
I/O Contract
Inputs (MockLLM)
| Name | Type | Required | Description |
|---|---|---|---|
| max_tokens | Optional[int] | No | Maximum number of tokens to generate; if None the prompt is echoed back |
| callback_manager | Optional[CallbackManager] | No | Callback manager for LLM event hooks |
| system_prompt | Optional[str] | No | System prompt to prepend to conversations |
| messages_to_prompt | Optional[MessagesToPromptType] | No | Function to convert messages to a prompt string |
| completion_to_prompt | Optional[CompletionToPromptType] | No | Function to convert a completion string to a prompt |
| pydantic_program_mode | PydanticProgramMode | No | Mode for structured output generation (default: DEFAULT) |
Inputs (MockFunctionCallingLLM)
| Name | Type | Required | Description |
|---|---|---|---|
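| callback_manager | Optional[CallbackManager] | No | Callback manager for LLM event hooks |
| system_prompt | Optional[str] | No | System prompt to prepend to conversations |
| messages_to_prompt | Optional[MessagesToPromptType] | No | Function to convert messages to a prompt string |
| completion_to_prompt | Optional[CompletionToPromptType] | No | Function to convert a completion string to a prompt |
| pydantic_program_mode | PydanticProgramMode | No | Mode for structured output generation (default: DEFAULT) |
| blocks_to_content_callback | Optional[BlockToContentCallback] | No | Custom callback for converting content blocks to a string |
| response_generator | Optional[ResponseGenerator] | No | Custom function to generate mock chat responses from messages |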
Outputs
| Method | Return Type | Description |
|---|---|---|
| complete() | CompletionResponse | Returns a completion response with generated or echoed text |
| stream_complete() | CompletionResponseGen | Yields streaming completion response chunks |
| chat() | ChatResponse | Returns a chat response (MockFunctionCallingLLM) |
| get_tool_calls_from_response() | List[ToolSelection] | Extracts tool call selections from a chat response |
Usage Examples
Basic Usage
from llama_index.core.llms.mock import MockLLM
# Echo mode: returns the prompt as the response
llm = MockLLM()
response = llm.complete("What is the meaning of life?")
print(response.text) # "What is the meaning of life?"
# Token generation mode: generates repeated "text" tokens
llm = MockLLM(max_tokens=5)
response = llm.complete("ignored prompt")
print(response.text) # "text text text text text"
Using MockLLMWithChatMemoryOfLastCall
from llama_index.core.llms.mock import MockLLMWithChatMemoryOfLastCall
from llama_index.core.base.llms.types import ChatMessage
llm = MockLLMWithChatMemoryOfLastCall()
messages = [ChatMessage(role="user", content="Hello")]
llm.chat(messages)
# Inspect what messages were sent
assert llm.last_chat_messages is not None
assert llm.last_called_chat_function == ["chat"]
Using MockFunctionCallingLLM
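from llama_index.core.llms.mock import MockFunctionCallingLLM
from llama_index.core.base.llms.types import ChatMessage
llm = MockFunctionCallingLLM()
response = llm.chat([ChatMessage(role="user", content="Call a tool")])
# Extract any tool call selections from the mock chat response
tool_calls = llm.get_tool_calls_from_response(response)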