Principle:Microsoft Autogen Playground Testing
| Knowledge Sources | |
|---|---|
| Domains | Agent Frameworks, Real-time Streaming, WebSocket Communication, Interactive Testing |
| Last Updated | 2026-02-11 00:00 GMT |
Overview
Playground testing exercises multi-agent teams interactively through a web-based playground that streams agent messages in real time over WebSocket connections, so developers can observe, interact with, and cancel running agent conversations.
Description
Playground testing provides a real-time, interactive environment for exercising multi-agent teams. Rather than running teams in batch mode and inspecting results after completion, the playground streams every agent message, tool call, and model event to the browser as it occurs. This live feedback loop is essential for understanding agent behavior, debugging conversation flow, and tuning system prompts.
The playground testing pattern involves several interacting concerns:
WebSocket Lifecycle: A persistent, bidirectional WebSocket connection is established between the browser and the server for each test run. The connection supports multiple message types: start (begin execution), stop (cancel execution), ping/pong (keep-alive), and input_response (human-in-the-loop input). This bidirectional channel enables both server-push streaming and client-initiated actions during execution.
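The client-to-server message types listed above lend themselves to a small dispatch table. The following is a minimal sketch only: the handler functions, the reply shapes (other than the `type` names taken from this page), and the `dispatch` helper are hypothetical and do not reproduce the playground's actual server code.

```python
import json
from typing import Callable, Dict

# Hypothetical handlers: each takes the decoded client message and returns a reply.
def handle_start(msg: dict) -> dict:
    return {"type": "system", "status": "started", "task": msg.get("task")}

def handle_stop(msg: dict) -> dict:
    return {"type": "system", "status": "stopping"}

def handle_ping(msg: dict) -> dict:
    return {"type": "pong"}

def handle_input_response(msg: dict) -> dict:
    return {"type": "system", "status": "input_received", "response": msg.get("response")}

# One entry per client message type named in the text.
HANDLERS: Dict[str, Callable[[dict], dict]] = {
    "start": handle_start,
    "stop": handle_stop,
    "ping": handle_ping,
    "input_response": handle_input_response,
}

def dispatch(raw: str) -> dict:
    """Decode one client frame and route it by its "type" field."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return {"type": "error", "error": "invalid JSON"}
    if not isinstance(msg, dict):
        return {"type": "error", "error": "message must be a JSON object"}
    handler = HANDLERS.get(msg.get("type"))
    if handler is None:
        return {"type": "error", "error": f"unknown message type: {msg.get('type')!r}"}
    return handler(msg)
```

Routing through a table keeps unknown or malformed frames on a single error path instead of scattering checks across handlers.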
Stream Orchestration: When a test run begins, the server instantiates a TeamManager, creates a cancellation token, and begins streaming the team execution. Each agent message, tool call event, LLM call event, and final result is formatted and pushed to the client through the WebSocket. The stream orchestrator also handles run lifecycle management: updating run status in the database, saving individual messages, and recording the final result.
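The run lifecycle described above (mark the run active, persist each message, push it to the client, record the final result) can be sketched as follows. This is an illustration under stated assumptions: `RunRecord` is an in-memory stand-in for the database, `fake_team_stream` stands in for the team's event stream, and the lowercase status strings are placeholders, not the playground's real schema.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, Awaitable, Callable, List, Optional

@dataclass
class RunRecord:
    """In-memory stand-in for a database row (hypothetical)."""
    status: str = "created"
    messages: List[dict] = field(default_factory=list)
    result: Optional[dict] = None

async def fake_team_stream(task: str) -> AsyncIterator[dict]:
    """Stand-in for the team's event stream; not the real TeamManager."""
    yield {"kind": "message", "source": "assistant", "content": f"working on: {task}"}
    yield {"kind": "result", "stop_reason": "finished"}

Send = Callable[[dict], Awaitable[None]]

async def orchestrate(task: str, run: RunRecord, send: Send) -> None:
    """Stream a run: update status, save each message, and push it to the client."""
    run.status = "active"
    try:
        async for event in fake_team_stream(task):
            if event["kind"] == "result":
                run.result = event          # record the final result
                run.status = "complete"
                await send({"type": "result", "data": event, "status": "complete"})
            else:
                run.messages.append(event)  # save the individual message
                await send({"type": "message", "data": event})
    except Exception as exc:
        run.status = "error"
        await send({"type": "error", "error": str(exc)})
```

Passing `send` as a callable keeps the orchestration logic independent of any particular WebSocket library, which also makes it straightforward to test with a plain list.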
Human-in-the-Loop: If the team includes a UserProxyAgent, the playground supports interactive input. When the agent requests user input, an input_request message is sent to the browser. The user types a response, which is sent back as an input_response message and injected into the running conversation. This is implemented through per-run async queues that bridge the WebSocket input with the agent's input function.
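The per-run queue bridge described above can be illustrated with `asyncio.Queue`. The `InputBridge` class, its method names, the `prompt` field, and the timeout value are assumptions made for this sketch; only the `input_request` and `input_response` message names come from this page.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Send = Callable[[dict], Awaitable[None]]

class InputBridge:
    """Hypothetical bridge between WebSocket input and an agent's input function."""

    def __init__(self) -> None:
        self._queues: Dict[str, asyncio.Queue] = {}

    def _queue_for(self, run_id: str) -> asyncio.Queue:
        # One queue per run, created on first use.
        if run_id not in self._queues:
            self._queues[run_id] = asyncio.Queue()
        return self._queues[run_id]

    async def request_input(self, run_id: str, prompt: str, send: Send,
                            timeout: float = 60.0) -> str:
        """Called on the agent side: ask the browser, then wait for the reply."""
        await send({"type": "input_request", "prompt": prompt})
        return await asyncio.wait_for(self._queue_for(run_id).get(), timeout)

    def on_input_response(self, run_id: str, response: str) -> None:
        """Called on the WebSocket side when an input_response frame arrives."""
        self._queue_for(run_id).put_nowait(response)

async def demo():
    bridge = InputBridge()
    sent = []

    async def send(message: dict) -> None:
        sent.append(message)

    waiter = asyncio.create_task(bridge.request_input("run-1", "Approve?", send))
    await asyncio.sleep(0)                      # let the request go out
    bridge.on_input_response("run-1", "yes")    # simulate the user's reply
    return await waiter, sent
```

The timeout guards against a run that waits forever when the browser disconnects before answering.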
Cancellation: Any running test can be cancelled by the user. The cancellation token propagates through the async execution chain, stopping the team execution gracefully and updating the run status to STOPPED.
Authentication: The WebSocket endpoint supports optional authentication. When enabled, the client must authenticate after connection establishment, and the server verifies that the authenticated user owns the run being accessed.
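The ownership check can be pictured as a single predicate. This is a sketch under loud assumptions: the token-to-user mapping, the `check_access` name, and all parameter names are hypothetical, and real token validation would be considerably more involved.

```python
from typing import Dict, Optional

def check_access(auth_enabled: bool, token: Optional[str],
                 known_tokens: Dict[str, str], run_owner: str) -> bool:
    """Return True if the connection may access the run.

    known_tokens maps a token to a user id (hypothetical stand-in for real
    token verification).
    """
    if not auth_enabled:
        return True                      # authentication is optional
    if token is None:
        return False                     # client never authenticated
    user_id = known_tokens.get(token)
    return user_id is not None and user_id == run_owner
```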
Usage
Use playground testing when:
- You need to observe agent conversations in real time as they execute
- You want to interactively test teams that include human-in-the-loop agents
- You need to debug agent behavior by watching message flow live
- You want the ability to cancel long-running or misbehaving agent executions
- You are developing and iterating on system prompts and team configurations
Theoretical Basis
The playground testing pattern follows the Observer Pattern combined with Event Streaming, where the browser acts as a subscriber to the stream of agent events:
WEBSOCKET LIFECYCLE:
1. Client opens WebSocket to /ws/runs/{run_id}
2. Server verifies run exists and is in CREATED/ACTIVE state
3. Server accepts connection, sends "connected" system message
4. (Optional) Authentication handshake
5. Client sends {"type": "start", "task": "...", "team_config": {...}}
6. Server creates async task for team execution stream
MESSAGE STREAMING LOOP:
  TeamManager.run_stream()
       |
       v
  for each message in stream:
       +-- TextMessage     --> {"type": "message", "data": {...}}
       +-- ToolCallRequest --> {"type": "message", "data": {...}}
       +-- ToolCallExec    --> {"type": "message", "data": {...}}
       +-- MultiModalMsg   --> {"type": "message", "data": {...}}  (images base64)
       +-- StreamingChunk  --> {"type": "message_chunk", "data": {...}}
       +-- TeamResult      --> {"type": "result", "data": {...}, "status": "complete"}
       |
       +-- Save to database
       +-- Send to WebSocket
BIDIRECTIONAL MESSAGES:
  Client -> Server:
    start          : Begin team execution
    stop           : Cancel execution
    ping           : Keep-alive
    input_response : Human-in-the-loop reply
  Server -> Client:
    connected      : Connection established
    message        : Agent message event
    message_chunk  : Streaming token chunk
    result         : Final team result
    input_request  : Request human input
    error          : Error notification
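The event-to-envelope mapping in the streaming loop above can be sketched as one formatting function. The event classes and the field names inside `data` are hypothetical placeholders chosen for illustration; only the envelope `type` values and the base64 encoding of images come from the diagram.

```python
import base64
from dataclasses import dataclass
from typing import Union

# Hypothetical event classes standing in for the framework's message types.
@dataclass
class TextEvent:
    source: str
    content: str

@dataclass
class ChunkEvent:
    source: str
    content: str

@dataclass
class ImageEvent:
    source: str
    image_bytes: bytes

@dataclass
class ResultEvent:
    stop_reason: str

Event = Union[TextEvent, ChunkEvent, ImageEvent, ResultEvent]

def to_envelope(event: Event) -> dict:
    """Wrap one stream event in the JSON envelope sent to the browser."""
    if isinstance(event, ChunkEvent):
        return {"type": "message_chunk",
                "data": {"source": event.source, "content": event.content}}
    if isinstance(event, ImageEvent):
        # Raw bytes are not JSON-serializable, so images travel as base64 text.
        encoded = base64.b64encode(event.image_bytes).decode("ascii")
        return {"type": "message",
                "data": {"source": event.source, "image_base64": encoded}}
    if isinstance(event, ResultEvent):
        return {"type": "result",
                "data": {"stop_reason": event.stop_reason},
                "status": "complete"}
    if isinstance(event, TextEvent):
        return {"type": "message",
                "data": {"source": event.source, "content": event.content}}
    raise TypeError(f"unsupported event: {type(event).__name__}")
```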
The pattern separates the transport layer (WebSocket management, connection tracking, message serialization) from the execution layer (TeamManager, team loading, agent execution). This separation allows the same team execution logic to be used both in the playground (WebSocket streaming) and in programmatic APIs (direct async generator consumption).
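A minimal sketch of this separation: one async generator plays the execution layer, and two independent consumers play the transport layer and a programmatic caller. The `run_team` generator and both consumer functions are illustrative stand-ins, not the real TeamManager API.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List

async def run_team(task: str) -> AsyncIterator[dict]:
    """Execution layer stand-in: yields events with no knowledge of any transport."""
    for step in range(2):
        yield {"type": "message", "data": {"content": f"{task}: step {step + 1}"}}
    yield {"type": "result", "data": {"task": task}, "status": "complete"}

async def stream_to_socket(task: str, send: Callable[[dict], Awaitable[None]]) -> None:
    """Transport layer: forwards each event to a client as it is produced."""
    async for event in run_team(task):
        await send(event)

async def collect(task: str) -> List[dict]:
    """Programmatic use: consume the same generator directly."""
    return [event async for event in run_team(task)]
```

Because `run_team` never touches the socket, swapping the transport (or removing it entirely in tests) requires no change to the execution code.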
The use of CancellationToken provides cooperative cancellation, where the running team periodically checks whether cancellation has been requested and winds down gracefully rather than being forcefully terminated.
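Cooperative cancellation can be illustrated with a toy token. `SimpleCancellationToken` below is a hypothetical stand-in written for this sketch, not AutoGen's own class; the "stopped" and "complete" return values are likewise placeholders.

```python
import asyncio
from typing import List

class SimpleCancellationToken:
    """Toy token (hypothetical): a flag the running work checks between steps."""

    def __init__(self) -> None:
        self._cancelled = False

    def cancel(self) -> None:
        self._cancelled = True

    def is_cancelled(self) -> bool:
        return self._cancelled

async def run_steps(token: SimpleCancellationToken, steps: int, log: List[str]) -> str:
    """Do work in small steps, checking the token before each one."""
    for i in range(steps):
        if token.is_cancelled():
            log.append("cleanup")        # wind down gracefully
            return "stopped"
        log.append(f"step {i}")
        await asyncio.sleep(0)           # yield control so a cancel request can arrive
    return "complete"

async def demo():
    token = SimpleCancellationToken()
    log: List[str] = []
    task = asyncio.create_task(run_steps(token, 100, log))
    await asyncio.sleep(0)               # let the work begin
    token.cancel()                       # simulate the user pressing stop
    status = await task
    return status, log
```

Unlike `task.cancel()`, which raises `CancelledError` at whatever await point the task happens to be suspended on, the token lets the work choose a safe point to stop and run its cleanup first.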