Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Principle:Microsoft Agent framework Agent Execution

From Leeroopedia
Property Value
Principle Name Agent Execution
SDK Microsoft Agent Framework
Repository Microsoft Agent Framework
Source Reference python/packages/core/agent_framework/_agents.py:L827-868
Import from agent_framework import Agent
Domains Agent_Architecture, NLP

Overview

Agent Execution is a pattern for sending messages to an AI agent and receiving responses, supporting both synchronous and streaming execution modes.

Description

Agent Execution is the core interaction pattern where a user query is sent to an agent, which processes it through the LLM, automatically invokes any required tools, and returns a response. The pattern supports multi-turn conversations via threads, allowing the agent to maintain conversational context across successive invocations.

The execution model provides two primary modes:

  1. Non-streaming (default): The caller awaits a single AgentResponse object containing the complete response text, message history, and any structured output value.
  2. Streaming: When stream=True is passed, the method returns a ResponseStream that yields incremental AgentResponseUpdate objects as the LLM generates tokens, enabling real-time display of partial results.

Multi-Turn Conversations

The AgentThread object serves as a conversation container. When passed to successive run() calls, it preserves the full message history, enabling the agent to reference prior exchanges. Each call appends the new user message and the agent's response to the thread.

Tool Invocation

During execution, if the LLM determines that a tool call is needed to satisfy the user's request, the agent automatically invokes the appropriate tool, incorporates the tool's output into the conversation, and continues generating the response. This tool invocation loop repeats until the LLM produces a final text response without further tool calls.

Per-Call Overrides

The run() method accepts optional tools and options parameters that override the agent's default configuration for a single invocation. This allows callers to dynamically adjust tool availability or model options without modifying the agent instance.

Theoretical Basis

The run() method implements a request-response cycle with an automatic tool invocation loop. The agent sends the user's message (along with conversation history and tool schemas) to the LLM, processes the response, and if the response includes tool calls, executes them and re-invokes the LLM with the updated context. This cycle continues until the LLM produces a final answer.

Setting stream=True enables real-time token streaming, where partial response tokens are yielded as they are generated by the LLM, rather than waiting for the complete response. This is implemented via an asynchronous iterator pattern that yields AgentResponseUpdate objects.

The dual return type (AgentResponse for non-streaming, ResponseStream for streaming) follows the Strategy Pattern, where the execution strategy is selected at call time via the stream parameter.

Related Pages

Sources

Type Name URL
Repo Microsoft Agent Framework https://github.com/microsoft/agent-framework

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment