Implementation:Run llama Llama index MockLLM

From Leeroopedia
Knowledge Sources
Domains LLM, Testing, Mock
Last Updated 2026-02-11 19:00 GMT

Overview

Provides mock LLM implementations (MockLLM, MockLLMWithNonyieldingChatStream, MockLLMWithChatMemoryOfLastCall, and MockFunctionCallingLLM) for testing and development without making real LLM API calls.

Description

The mock.py module defines four mock LLM classes that extend the LlamaIndex LLM hierarchy for testing purposes:

  • MockLLM extends CustomLLM and provides a simple deterministic LLM that either echoes the input prompt or generates repeated "text" tokens up to a configurable max_tokens limit. It supports both complete and stream_complete methods.
  • MockLLMWithNonyieldingChatStream extends MockLLM and provides a stream_chat implementation that yields nothing, useful for testing empty stream handling.
  • MockLLMWithChatMemoryOfLastCall extends MockLLM and records the last set of ChatMessage objects passed to any chat method (chat, stream_chat, achat, astream_chat), enabling test assertions on the messages that would have been sent to a real LLM.
  • MockFunctionCallingLLM extends FunctionCallingLLM and simulates a function-calling LLM. It supports customizable response generation via a response_generator callback and a blocks_to_content_callback for converting multi-modal content blocks (text, images, audio, documents, video, thinking blocks, citations) into string content. It implements _prepare_chat_with_tools and get_tool_calls_from_response to simulate tool calling workflows.
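
The echo and fixed-length behaviors described for MockLLM can be illustrated with a small standalone sketch. This is not the library's code: TinyMockLLM and Completion are hypothetical stand-ins written only with the standard library, and the streaming chunk layout (one chunk per token or per character, each carrying the text so far) is an assumption.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Completion:
    """Hypothetical stand-in for a completion response object."""
    text: str
    delta: Optional[str] = None


class TinyMockLLM:
    """Illustrative stand-in for a deterministic mock LLM (not the library class)."""

    def __init__(self, max_tokens: Optional[int] = None) -> None:
        self.max_tokens = max_tokens

    def _generate_text(self, length: int) -> str:
        # Repeat the literal token "text", separated by single spaces.
        return " ".join("text" for _ in range(length))

    def complete(self, prompt: str) -> Completion:
        if self.max_tokens is None:
            # Echo mode: the prompt comes back unchanged.
            return Completion(text=prompt)
        # Fixed-length mode: the prompt is ignored.
        return Completion(text=self._generate_text(self.max_tokens))

    def stream_complete(self, prompt: str) -> Iterator[Completion]:
        # Assumed chunking: each chunk carries the accumulated text and a delta.
        if self.max_tokens is None:
            for i, ch in enumerate(prompt, 1):
                yield Completion(text=prompt[:i], delta=ch)
        else:
            for i in range(1, self.max_tokens + 1):
                yield Completion(text=self._generate_text(i), delta="text")


echo = TinyMockLLM()
fixed = TinyMockLLM(max_tokens=3)
chunks = list(fixed.stream_complete("ignored"))
```

Because the output depends only on the prompt and max_tokens, tests built on this kind of mock stay deterministic across runs.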

The module also includes two helper functions: _data_from_binary, which base64-decodes binary block data, and _default_blocks_to_content_callback, which serializes a list of ContentBlock objects into a single string.
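
To make the two helper roles concrete, here is a minimal sketch of the same idea using only the standard library. TextPart, BinaryPart, decode_binary, and blocks_to_text are hypothetical names, and the placeholder format used for binary blocks is an assumption; the actual helpers in mock.py may render blocks differently.

```python
import base64
from dataclasses import dataclass
from typing import List, Union


@dataclass
class TextPart:
    """Hypothetical text block."""
    text: str


@dataclass
class BinaryPart:
    """Hypothetical binary block holding a base64-encoded payload."""
    kind: str
    data: bytes


def decode_binary(part: BinaryPart) -> bytes:
    # Decode the base64 payload back into raw bytes.
    return base64.b64decode(part.data)


def blocks_to_text(blocks: List[Union[TextPart, BinaryPart]]) -> str:
    # Flatten mixed blocks into one string: text is kept verbatim,
    # binary blocks become a short placeholder (format is an assumption).
    pieces = []
    for block in blocks:
        if isinstance(block, TextPart):
            pieces.append(block.text)
        else:
            raw = decode_binary(block)
            pieces.append(f"<{block.kind}: {len(raw)} bytes>")
    return "\n".join(pieces)


blocks = [
    TextPart("Describe this image."),
    BinaryPart("image", base64.b64encode(b"\x89PNG")),
]
content = blocks_to_text(blocks)
```

Collapsing every block to a string keeps the mock's output easy to assert on, at the cost of discarding the binary content itself.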

Usage

Use these mock LLMs in unit tests, integration tests, and development scenarios where you want to verify pipeline behavior without incurring real LLM API costs or latency. MockLLM is suitable for simple completion testing, MockLLMWithChatMemoryOfLastCall for verifying chat message construction, and MockFunctionCallingLLM for testing tool-calling agent workflows.
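
The testing pattern described above can be sketched without any LLM dependency. In this illustration, RecordingLLM is a hypothetical stub that mimics the idea of remembering the last chat call, answer_question is a made-up pipeline step, and messages are simplified to (role, content) tuples; none of these names come from the library.

```python
from typing import List, Optional, Protocol, Sequence, Tuple

Message = Tuple[str, str]  # (role, content); a hypothetical simplification


class ChatLLM(Protocol):
    def chat(self, messages: Sequence[Message]) -> str: ...


class RecordingLLM:
    """Hypothetical stub that remembers the last chat call it received."""

    def __init__(self) -> None:
        self.last_chat_messages: Optional[List[Message]] = None
        self.last_called_chat_function: List[str] = []

    def chat(self, messages: Sequence[Message]) -> str:
        self.last_chat_messages = list(messages)
        self.last_called_chat_function.append("chat")
        # Echo the content of the last message as the reply.
        return messages[-1][1]


def answer_question(llm: ChatLLM, question: str) -> str:
    """Hypothetical pipeline step under test: builds messages and calls the LLM."""
    messages = [("system", "Answer briefly."), ("user", question.strip())]
    return llm.chat(messages)


def test_answer_question_builds_expected_messages() -> None:
    llm = RecordingLLM()
    reply = answer_question(llm, "  What is a mock?  ")
    assert reply == "What is a mock?"
    assert llm.last_called_chat_function == ["chat"]
    assert llm.last_chat_messages == [
        ("system", "Answer briefly."),
        ("user", "What is a mock?"),
    ]


test_answer_question_builds_expected_messages()
```

Passing the LLM in as a parameter is what makes the swap possible: the pipeline step never needs to know whether it is talking to a real model or a stub.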

Code Reference

Source Location

Signature

class MockLLM(CustomLLM):
    max_tokens: Optional[int]

    def __init__(
        self,
        max_tokens: Optional[int] = None,
        callback_manager: Optional[CallbackManager] = None,
        system_prompt: Optional[str] = None,
        messages_to_prompt: Optional[MessagesToPromptType] = None,
        completion_to_prompt: Optional[CompletionToPromptType] = None,
        pydantic_program_mode: PydanticProgramMode = PydanticProgramMode.DEFAULT,
    ) -> None: ...

class MockLLMWithChatMemoryOfLastCall(MockLLM):
    last_chat_messages: Optional[Sequence[ChatMessage]]
    last_called_chat_function: List[str]
    ...

class MockFunctionCallingLLM(FunctionCallingLLM):
    tool_calls: List[ToolCallBlock]
    blocks_to_content_callback: BlockToContentCallback

    def __init__(
        self,
        callback_manager: Optional[CallbackManager] = None,
        system_prompt: Optional[str] = None,
        messages_to_prompt: Optional[MessagesToPromptType] = None,
        completion_to_prompt: Optional[CompletionToPromptType] = None,
        pydantic_program_mode: PydanticProgramMode = PydanticProgramMode.DEFAULT,
        blocks_to_content_callback: Optional[BlockToContentCallback] = None,
        response_generator: Optional[ResponseGenerator] = None,
        **kwargs: Any,
    ) -> None: ...

Import

from llama_index.core.llms.mock import MockLLM
from llama_index.core.llms.mock import MockLLMWithChatMemoryOfLastCall
from llama_index.core.llms.mock import MockFunctionCallingLLM

I/O Contract

Inputs (MockLLM)

Name | Type | Required | Description
max_tokens | Optional[int] | No | Maximum number of tokens to generate; if None the prompt is echoed back
callback_manager | Optional[CallbackManager] | No | Callback manager for LLM event hooks
system_prompt | Optional[str] | No | System prompt to prepend to conversations
messages_to_prompt | Optional[MessagesToPromptType] | No | Function to convert messages to a prompt string
completion_to_prompt | Optional[CompletionToPromptType] | No | Function to convert a completion string to a prompt
pydantic_program_mode | PydanticProgramMode | No | Mode for structured output generation (default: DEFAULT)

Inputs (MockFunctionCallingLLM)

Name | Type | Required | Description
callback_manager | Optional[CallbackManager] | No | Callback manager for LLM event hooks
system_prompt | Optional[str] | No | System prompt to prepend to conversations
blocks_to_content_callback | Optional[BlockToContentCallback] | No | Custom callback for converting content blocks to string
response_generator | Optional[ResponseGenerator] | No | Custom function to generate mock chat responses from messages

Outputs

Name | Type | Description
complete() | CompletionResponse | Returns a completion response with generated or echoed text
stream_complete() | CompletionResponseGen | Yields streaming completion response chunks
chat() | ChatResponse | Returns a chat response (MockFunctionCallingLLM)
get_tool_calls_from_response() | List[ToolSelection] | Extracts tool call selections from a chat response

Usage Examples

Basic Usage

from llama_index.core.llms.mock import MockLLM

# Echo mode: returns the prompt as the response
llm = MockLLM()
response = llm.complete("What is the meaning of life?")
print(response.text)  # "What is the meaning of life?"

# Token generation mode: generates repeated "text" tokens
llm = MockLLM(max_tokens=5)
response = llm.complete("ignored prompt")
print(response.text)  # "text text text text text"

Using MockLLMWithChatMemoryOfLastCall

from llama_index.core.llms.mock import MockLLMWithChatMemoryOfLastCall
from llama_index.core.base.llms.types import ChatMessage

llm = MockLLMWithChatMemoryOfLastCall()
messages = [ChatMessage(role="user", content="Hello")]
llm.chat(messages)

# Inspect what messages were sent
assert llm.last_chat_messages is not None
assert llm.last_called_chat_function == ["chat"]

Using MockFunctionCallingLLM

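from llama_index.core.llms.mock import MockFunctionCallingLLM
from llama_index.core.base.llms.types import ChatMessage

# Send a chat message, then extract any tool calls from the mock response
llm = MockFunctionCallingLLM()
response = llm.chat([ChatMessage(role="user", content="Call a tool")])
tool_calls = llm.get_tool_calls_from_response(response)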
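
Conceptually, extracting tool calls amounts to filtering the tool-call blocks out of a response and converting each into a selection record. The sketch below shows that idea with hypothetical dataclasses (TextBlock, ToolCall, Response, Selection) and a hypothetical function name; the field names are assumptions and this is not the library's implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class TextBlock:
    """Hypothetical plain-text block in a response."""
    text: str


@dataclass
class ToolCall:
    """Hypothetical tool-call block in a response."""
    tool_call_id: str
    tool_name: str
    tool_kwargs: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Response:
    """Hypothetical chat response made of mixed blocks."""
    blocks: List[Union[TextBlock, ToolCall]]


@dataclass
class Selection:
    """Hypothetical record describing one tool to invoke."""
    tool_id: str
    tool_name: str
    tool_kwargs: Dict[str, Any]


def tool_calls_from_response(response: Response) -> List[Selection]:
    # Keep only tool-call blocks, preserving their order in the response.
    calls = [b for b in response.blocks if isinstance(b, ToolCall)]
    return [Selection(c.tool_call_id, c.tool_name, c.tool_kwargs) for c in calls]


resp = Response(
    blocks=[TextBlock("Calling a tool."), ToolCall("call-1", "add", {"a": 1, "b": 2})]
)
selections = tool_calls_from_response(resp)
```

In this sketch a response with no tool-call blocks yields an empty list; how the real mock handles that case is not stated here.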

Related Pages
