Principle:Microsoft Autogen Playground Testing

From Leeroopedia
Knowledge Sources
Domains: Agent Frameworks, Real-time Streaming, WebSocket Communication, Interactive Testing
Last Updated: 2026-02-11 00:00 GMT

Overview

Playground testing is the practice of exercising multi-agent teams interactively through a web-based playground. The playground streams agent messages in real time over WebSocket connections, so developers can observe, interact with, and cancel running agent conversations.

Description

Playground testing provides a real-time, interactive environment for exercising multi-agent teams. Rather than running teams in batch mode and inspecting results after completion, the playground streams every agent message, tool call, and model event to the browser as it occurs. This live feedback loop is essential for understanding agent behavior, debugging conversation flow, and tuning system prompts.

The playground testing pattern involves several interacting concerns:

WebSocket Lifecycle: A persistent, bidirectional WebSocket connection is established between the browser and the server for each test run. The connection supports multiple message types: start (begin execution), stop (cancel execution), ping/pong (keep-alive), and input_response (human-in-the-loop input). This bidirectional channel enables both server-push streaming and client-initiated actions during execution.
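The routing of client messages described above can be sketched as a small dispatcher. This is a minimal illustration, not the playground's actual handler: the function name, the returned "action" dictionaries, and the "response" field are assumptions made for the example. Only the four message type strings come from the text.

```python
import json


def handle_client_message(raw: str) -> dict:
    """Parse one client frame and describe what a server might do with it.

    Hypothetical helper: the returned shapes are illustrative only.
    """
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return {"type": "error", "error": "invalid JSON"}
    if not isinstance(msg, dict):
        return {"type": "error", "error": "expected a JSON object"}

    kind = msg.get("type")
    if kind == "start":
        # Begin execution of the team for the given task.
        return {"action": "start_stream", "task": msg.get("task")}
    if kind == "stop":
        # Request cancellation of the running execution.
        return {"action": "cancel_stream"}
    if kind == "ping":
        # Keep-alive: answer immediately.
        return {"type": "pong"}
    if kind == "input_response":
        # Human-in-the-loop reply ("response" is an assumed field name).
        return {"action": "deliver_input", "response": msg.get("response")}
    return {"type": "error", "error": f"unknown message type: {kind!r}"}
```

Keeping the dispatcher a pure function of the incoming frame makes it easy to unit test without opening a real socket.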

Stream Orchestration: When a test run begins, the server instantiates a TeamManager, creates a cancellation token, and begins streaming the team execution. Each agent message, tool call event, LLM call event, and final result is formatted and pushed to the client through the WebSocket. The stream orchestrator also handles run lifecycle management: updating run status in the database, saving individual messages, and recording the final result.
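A rough sketch of the orchestration loop follows. It assumes an in-memory dictionary in place of the database and a stand-in async generator in place of TeamManager. The names fake_team_stream and orchestrate, and the status values COMPLETE and ERROR, are hypothetical and chosen only for illustration.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


async def fake_team_stream(task: str) -> AsyncIterator[dict]:
    """Stand-in for a team's event stream (not the real TeamManager)."""
    yield {"type": "message", "data": {"source": "assistant", "content": f"Working on: {task}"}}
    yield {"type": "result", "data": {"stop_reason": "done"}, "status": "complete"}


async def orchestrate(run: dict, task: str, send: Callable[[dict], Awaitable[None]]) -> None:
    """Track run status, persist each event, and push it to the client."""
    run["status"] = "ACTIVE"
    try:
        async for event in fake_team_stream(task):
            if event["type"] == "result":
                run["result"] = event["data"]          # record the final result
            else:
                run["messages"].append(event["data"])  # save an individual message
            await send(event)                          # push to the WebSocket
        run["status"] = "COMPLETE"  # assumed status name
    except Exception as exc:
        run["status"] = "ERROR"     # assumed status name
        await send({"type": "error", "error": str(exc)})
```

Passing send in as a callable keeps the loop independent of any particular WebSocket library.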

Human-in-the-Loop: If the team includes a UserProxyAgent, the playground supports interactive input. When the agent requests user input, an input_request message is sent to the browser. The user types a response, which is sent back as an input_response message and injected into the running conversation. This is implemented through per-run async queues that bridge the WebSocket input with the agent's input function.
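The queue-based bridge can be illustrated with asyncio.Queue. This is a sketch under stated assumptions: the registry input_queues, the helpers make_input_func and on_input_response, and the "prompt" field are hypothetical names, not the playground's API.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# One queue per run id (hypothetical registry).
input_queues: Dict[str, asyncio.Queue] = {}


def make_input_func(run_id: str, send: Callable[[dict], Awaitable[None]]):
    """Build the async input function an agent would call when it needs the user."""

    async def input_func(prompt: str) -> str:
        queue = input_queues.setdefault(run_id, asyncio.Queue())
        # Ask the browser for input, then suspend until a reply is queued.
        await send({"type": "input_request", "prompt": prompt})
        return await queue.get()

    return input_func


async def on_input_response(run_id: str, response: str) -> None:
    """Called by the WebSocket handler when an input_response frame arrives."""
    queue = input_queues.setdefault(run_id, asyncio.Queue())
    await queue.put(response)
```

The agent side only awaits a string, so it does not need to know that the answer travelled over a WebSocket.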

Cancellation: Any running test can be cancelled by the user. The cancellation token propagates through the async execution chain, stopping the team execution gracefully and updating the run status to STOPPED.

Authentication: The WebSocket endpoint supports optional authentication. When enabled, the client must authenticate after connection establishment, and the server verifies that the authenticated user owns the run being accessed.
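The ownership rule might look like the following check. The Run dataclass, its user_id field, and the authorize function are assumptions introduced for this example. The actual data model is not shown in this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    """Hypothetical run record; only the owner relationship matters here."""
    id: str
    user_id: str


def authorize(run: Run, auth_enabled: bool, authenticated_user: Optional[str]) -> bool:
    """Return True if the connection may access the run."""
    if not auth_enabled:
        # Authentication is optional; when disabled, every connection is allowed.
        return True
    # When enabled, the client must be authenticated and must own the run.
    return authenticated_user is not None and authenticated_user == run.user_id
```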

Usage

Use playground testing when:

  • You need to observe agent conversations in real time as they execute
  • You want to interactively test teams that include human-in-the-loop agents
  • You need to debug agent behavior by watching message flow live
  • You want the ability to cancel long-running or misbehaving agent executions
  • You are developing and iterating on system prompts and team configurations

Theoretical Basis

The playground testing pattern follows the Observer Pattern combined with Event Streaming, where the browser acts as a subscriber to the stream of agent events:

WEBSOCKET LIFECYCLE:

1. Client opens WebSocket to /ws/runs/{run_id}
2. Server verifies run exists and is in CREATED/ACTIVE state
3. Server accepts connection, sends "connected" system message
4. (Optional) Authentication handshake
5. Client sends {"type": "start", "task": "...", "team_config": {...}}
6. Server creates async task for team execution stream

MESSAGE STREAMING LOOP:

    TeamManager.run_stream()
        |
        v
    for each message in stream:
        +-- TextMessage      --> {"type": "message", "data": {...}}
        +-- ToolCallRequest  --> {"type": "message", "data": {...}}
        +-- ToolCallExec     --> {"type": "message", "data": {...}}
        +-- MultiModalMsg    --> {"type": "message", "data": {...}} (images base64)
        +-- StreamingChunk   --> {"type": "message_chunk", "data": {...}}
        +-- TeamResult       --> {"type": "result", "data": {...}, "status": "complete"}
        |
        +-- Save to database
        +-- Send to WebSocket
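The mapping from event kind to wire envelope in the loop above can be sketched as a single formatting function. The dataclasses below are simplified stand-ins invented for this example, not the AutoGen message classes. Only the envelope "type" values and the "status": "complete" field are taken from the diagram.

```python
from dataclasses import asdict, dataclass


@dataclass
class TextMessage:        # stand-in for a complete agent message
    source: str
    content: str


@dataclass
class StreamingChunk:     # stand-in for a partial token chunk
    source: str
    content: str


@dataclass
class TeamResult:         # stand-in for the final result
    stop_reason: str


def format_event(event) -> dict:
    """Wrap an event in the envelope the client expects."""
    if isinstance(event, StreamingChunk):
        return {"type": "message_chunk", "data": asdict(event)}
    if isinstance(event, TeamResult):
        return {"type": "result", "data": asdict(event), "status": "complete"}
    # Text, tool call, and multimodal events all share the "message" envelope.
    return {"type": "message", "data": asdict(event)}
```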

BIDIRECTIONAL MESSAGES:

    Client -> Server:
        start          : Begin team execution
        stop           : Cancel execution
        ping           : Keep-alive
        input_response : Human-in-the-loop reply

    Server -> Client:
        connected     : Connection established
        pong          : Keep-alive reply to ping
        message       : Agent message event
        message_chunk : Streaming token chunk
        result        : Final team result
        input_request : Request human input
        error         : Error notification

The pattern separates the transport layer (WebSocket management, connection tracking, message serialization) from the execution layer (TeamManager, team loading, agent execution). This separation allows the same team execution logic to be used both in the playground (WebSocket streaming) and in programmatic APIs (direct async generator consumption).
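To make the separation concrete, the sketch below consumes one async generator in two ways: through a socket-like sender and through direct collection. All names here (run_team, stream_to_socket, collect) are hypothetical, and run_team is only a stand-in for real team execution.

```python
import asyncio
import json
from typing import AsyncIterator, Awaitable, Callable, List


async def run_team(task: str) -> AsyncIterator[dict]:
    """Execution layer stand-in: yields events and knows nothing about transport."""
    for step in ("plan", "act"):
        yield {"type": "message", "data": {"content": f"{step}: {task}"}}
    yield {"type": "result", "data": {"content": "done"}, "status": "complete"}


async def stream_to_socket(task: str, send_text: Callable[[str], Awaitable[None]]) -> None:
    """Transport layer: serialize each event and push it through a sender."""
    async for event in run_team(task):
        await send_text(json.dumps(event))


async def collect(task: str) -> List[dict]:
    """Programmatic use: consume the same generator directly."""
    return [event async for event in run_team(task)]
```

Because run_team never touches the socket, either consumer can be swapped or tested without changing the execution code.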

The use of CancellationToken provides cooperative cancellation, where the running team periodically checks whether cancellation has been requested and winds down gracefully rather than being forcefully terminated.
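Cooperative cancellation can be illustrated with a minimal token that the worker polls between steps. CancelToken below is a simplified stand-in written for this example, not the AutoGen CancellationToken class. The COMPLETE status and the completed_steps counter are likewise assumptions.

```python
import asyncio


class CancelToken:
    """Minimal stand-in for a cancellation token."""

    def __init__(self) -> None:
        self._cancelled = False

    def cancel(self) -> None:
        self._cancelled = True

    def is_cancelled(self) -> bool:
        return self._cancelled


async def run_steps(token: CancelToken, run: dict, steps: int = 100) -> None:
    """Do work in small steps, checking the token before each one."""
    run["status"] = "ACTIVE"
    for i in range(steps):
        if token.is_cancelled():
            run["status"] = "STOPPED"   # wind down gracefully
            return
        run["completed_steps"] = i + 1
        await asyncio.sleep(0)          # yield control so a stop request can be seen
    run["status"] = "COMPLETE"          # assumed status name


async def demo() -> dict:
    token = CancelToken()
    run = {"status": "CREATED", "completed_steps": 0}
    worker = asyncio.create_task(run_steps(token, run))
    await asyncio.sleep(0)  # let the worker make some progress
    token.cancel()          # what a "stop" message would trigger
    await worker
    return run
```

The worker is never interrupted mid-step. It notices the request at its next check, which is what distinguishes this approach from forceful termination.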

Related Pages

Implemented By
