Principle:Microsoft Agent framework Agent Execution

Property	Value
Principle Name	Agent Execution
SDK	Microsoft Agent Framework
Repository	Microsoft Agent Framework
Source Reference	`python/packages/core/agent_framework/_agents.py:L827-868`
Import	`from agent_framework import Agent`
Domains	Agent_Architecture, NLP

Overview

Agent Execution is a pattern for sending messages to an AI agent and receiving responses, supporting both synchronous and streaming execution modes.

Description

Agent Execution is the core interaction pattern where a user query is sent to an agent, which processes it through the LLM, automatically invokes any required tools, and returns a response. The pattern supports multi-turn conversations via threads, allowing the agent to maintain conversational context across successive invocations.

The execution model provides two primary modes:

Non-streaming (default): The caller awaits a single AgentResponse object containing the complete response text, message history, and any structured output value.
Streaming: When stream=True is passed, the method returns a ResponseStream that yields incremental AgentResponseUpdate objects as the LLM generates tokens, enabling real-time display of partial results.

Multi-Turn Conversations

The AgentThread object serves as a conversation container. When passed to successive run() calls, it preserves the full message history, enabling the agent to reference prior exchanges. Each call appends the new user message and the agent's response to the thread.

Tool Invocation

During execution, if the LLM determines that a tool call is needed to satisfy the user's request, the agent automatically invokes the appropriate tool, incorporates the tool's output into the conversation, and continues generating the response. This tool invocation loop repeats until the LLM produces a final text response without further tool calls.

Per-Call Overrides

The run() method accepts optional tools and options parameters that override the agent's default configuration for a single invocation. This allows callers to dynamically adjust tool availability or model options without modifying the agent instance.

Theoretical Basis

The run() method implements a request-response cycle with an automatic tool invocation loop. The agent sends the user's message (along with conversation history and tool schemas) to the LLM, processes the response, and if the response includes tool calls, executes them and re-invokes the LLM with the updated context. This cycle continues until the LLM produces a final answer.

Setting stream=True enables real-time token streaming, where partial response tokens are yielded as they are generated by the LLM, rather than waiting for the complete response. This is implemented via an asynchronous iterator pattern that yields AgentResponseUpdate objects.

The dual return type (AgentResponse for non-streaming, ResponseStream for streaming) follows the Strategy Pattern, where the execution strategy is selected at call time via the stream parameter.

Related Pages

Implementation:Microsoft_Agent_framework_Agent_Run

Sources

Type	Name	URL
Repo	Microsoft Agent Framework	https://github.com/microsoft/agent-framework

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment