Heuristic:Microsoft Autogen Agent Thread Safety
| Knowledge Sources | |
|---|---|
| Domains | Multi_Agent_Systems, Debugging, Concurrency |
| Last Updated | 2026-02-11 18:00 GMT |
Overview
AutoGen agents are not thread-safe or coroutine-safe — never share an agent instance between concurrent tasks, coroutines, or threads.
Description
The AssistantAgent (and by extension all agents inheriting from BaseChatAgent) maintains internal mutable state between method calls: model context (conversation history), tool state, and pending handoff data. This state is not protected by locks or other synchronization primitives, so concurrent access is unsafe. In addition, the caller must pass only the new messages on each call, because the agent tracks the full conversation history internally.
Usage
Use this heuristic whenever you are:
- Running multiple agent tasks in an asyncio application
- Using agents in a web server that handles concurrent requests
- Attempting to reuse an agent across multiple team runs without resetting
- Building custom orchestration that might invoke the same agent from multiple coroutines
The Insight (Rule of Thumb)
- Action: Create a separate agent instance per concurrent task or coroutine. Never share agent instances.
- Value: One agent instance per logical conversation.
- Trade-off: Slightly higher memory usage from multiple instances, but this is negligible compared to model inference costs.
- Corollary: Only pass new messages to `on_messages` / `run` — the agent maintains its own history.
```python
import asyncio

from autogen_agentchat.agents import AssistantAgent


async def wrong(client) -> None:
    # WRONG: Sharing one agent across concurrent tasks
    agent = AssistantAgent(name="shared", model_client=client)
    await asyncio.gather(
        agent.run(task="Task A"),  # Race condition!
        agent.run(task="Task B"),  # Race condition!
    )


async def correct(client) -> None:
    # CORRECT: Separate agent per task
    agent_a = AssistantAgent(name="agent_a", model_client=client)
    agent_b = AssistantAgent(name="agent_b", model_client=client)
    await asyncio.gather(
        agent_a.run(task="Task A"),
        agent_b.run(task="Task B"),
    )
```
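When the number of concurrent tasks is not known in advance (for example, one task per incoming web request), the same rule can be applied with a small factory-based helper. The sketch below is an illustration only: `run_isolated` and `make_agent` are hypothetical names, not part of AutoGen, and the helper assumes only that each agent exposes an async `run(task=...)` method as used in this section.

```python
import asyncio
from typing import Any, Callable, Sequence


async def run_isolated(
    make_agent: Callable[[int], Any],
    tasks: Sequence[str],
) -> list:
    """Run each task on its own freshly built agent (hypothetical helper)."""
    # Build one agent per task up front so that no instance is ever shared.
    agents = [make_agent(i) for i in range(len(tasks))]
    # Each coroutine touches only the state of its own agent.
    return await asyncio.gather(
        *(agent.run(task=task) for agent, task in zip(agents, tasks))
    )
```

With AutoGen, `make_agent` could be something like `lambda i: AssistantAgent(name=f"agent_{i}", model_client=client)`, where `client` is a model client created elsewhere. The results come back in the same order as `tasks`.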
Reasoning
From the AssistantAgent docstring at `_assistant_agent.py:111-120`:
```python
# The caller must only pass the new messages to the agent on each call
# to the on_messages, on_messages_stream, run, or run_stream methods.
# The agent maintains its state between calls to these methods.
# Do not pass the entire conversation history to the agent on each call.
# WARNING: The assistant agent is not thread-safe or coroutine-safe.
# It should not be shared between multiple tasks or coroutines, and it
# should not call its methods concurrently.
```
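The first half of that docstring (pass only new messages) can be illustrated with a toy model. `ToyAgent` below is a hypothetical stand-in written for this page, not the AutoGen class; it only mimics the behaviour of appending every received message to an internal context.

```python
class ToyAgent:
    """Hypothetical stand-in that keeps its own history between calls."""

    def __init__(self) -> None:
        self.context: list[str] = []

    def run(self, messages: list[str]) -> str:
        # Everything the caller passes is appended to the internal context.
        self.context.extend(messages)
        reply = f"reply {len(self.context)}"
        self.context.append(reply)
        return reply


def correct_usage() -> list[str]:
    agent = ToyAgent()
    agent.run(["hi"])
    agent.run(["next"])  # only the new message is passed
    return agent.context


def wrong_usage() -> list[str]:
    agent = ToyAgent()
    transcript = ["hi"]
    transcript.append(agent.run(transcript))
    transcript.append("next")
    agent.run(transcript)  # the whole transcript is passed again
    return agent.context
```

In this toy model, `correct_usage()` ends with four context entries, while `wrong_usage()` ends with six because the first exchange is stored twice. A real agent would send that duplicated context to the model on every subsequent call.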
The agent stores conversation history in a `ChatCompletionContext` object, tracks tool iteration counts, and manages handoff state. None of these are guarded by synchronization primitives. Concurrent access can lead to:
- Duplicated or missing messages in conversation history
- Tool call results being attributed to the wrong invocation
- Handoff state corruption where the wrong target agent is selected
- Model context growing incorrectly, leading to token limit errors
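The first of these failure modes can be reproduced without any model calls. The sketch below uses a hypothetical `ToyAgent` (not the AutoGen class) whose `run` method appends to an unguarded history, yields to the event loop where a real agent would await the model, and then reads the history back.

```python
import asyncio


class ToyAgent:
    """Hypothetical stand-in for a stateful agent with an unguarded history."""

    def __init__(self) -> None:
        self.history: list[tuple[str, str]] = []

    async def run(self, task: str) -> list[str]:
        self.history.append(("user", task))
        # Simulated model call: yields control so other coroutines can run.
        await asyncio.sleep(0)
        # By now another coroutine may have appended to the same history.
        seen = [text for role, text in self.history if role == "user"]
        self.history.append(("assistant", f"reply based on {seen}"))
        return seen


async def shared() -> list:
    agent = ToyAgent()
    return await asyncio.gather(agent.run("Task A"), agent.run("Task B"))


async def separate() -> list:
    agent_a, agent_b = ToyAgent(), ToyAgent()
    return await asyncio.gather(agent_a.run("Task A"), agent_b.run("Task B"))


if __name__ == "__main__":
    print(asyncio.run(shared()))    # [['Task A', 'Task B'], ['Task A', 'Task B']]
    print(asyncio.run(separate()))  # [['Task A'], ['Task B']]
```

With the shared instance, each task "sees" the other task's user message because both were appended before either simulated model call returned. With separate instances, each history contains only its own task.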
At the team level, the group chat runtime avoids these races through `SequentialRoutedAgent` (`_sequential_routed_agent.py:38-71`), which ensures that certain message types are processed one at a time, in FIFO order.
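The idea of sequential processing can be sketched with a plain `asyncio.Lock`. This is not how `SequentialRoutedAgent` is implemented; `SerializedAgent` and `ToyAgent` are hypothetical names used only to show that forcing calls to run one at a time keeps a shared history consistent. Separate instances per conversation remain the simpler option.

```python
import asyncio


class ToyAgent:
    """Hypothetical stateful agent with an unguarded history."""

    def __init__(self) -> None:
        self.history: list[str] = []

    async def run(self, task: str) -> None:
        self.history.append(f"user:{task}")
        await asyncio.sleep(0)  # simulated model call
        self.history.append(f"assistant:done {task}")


class SerializedAgent:
    """Hypothetical wrapper that lets only one call into the agent at a time."""

    def __init__(self, agent: ToyAgent) -> None:
        self._agent = agent
        # asyncio.Lock wakes waiters in the order they started waiting.
        self._lock = asyncio.Lock()

    async def run(self, task: str) -> None:
        async with self._lock:
            await self._agent.run(task)


async def demo() -> list[str]:
    agent = ToyAgent()
    wrapped = SerializedAgent(agent)
    await asyncio.gather(wrapped.run("A"), wrapped.run("B"))
    return agent.history
```

Here each user message is followed directly by its own reply, whereas calling the unwrapped `ToyAgent` concurrently would interleave the two user messages before either reply.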