Principle:Microsoft Autogen Team Deployment

From Leeroopedia
Revision as of 18:19, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/Microsoft_Autogen_Team_Deployment.md)
Knowledge Sources
Domains: Agent Frameworks, Deployment, Streaming Execution, Environment Configuration
Last Updated: 2026-02-11 00:00 GMT

Overview

Team deployment covers running multi-agent teams programmatically or as lightweight web services, with streaming output, environment variable injection, and both foreground and background execution modes.

Description

Team deployment is the process of taking a team configuration (defined as code, JSON, YAML, or a database record) and executing it against a task in a production-ready manner. This goes beyond simple team execution by adding concerns that matter in deployed environments: streaming output for real-time monitoring, environment variable injection for API keys and secrets, cancellation support for graceful shutdown, and multiple deployment modes for different operational contexts.

The deployment pattern supports two primary modes:

Programmatic Execution: A team manager loads a team configuration from any supported source (file path, dictionary, ComponentModel), instantiates the team and its agents, injects environment variables, and runs the team with streaming output. The caller receives an async generator that yields individual agent messages and a final result, enabling real-time processing of team output without waiting for completion.

Lightweight Web Service: A lite studio instance wraps a single team configuration in a minimal web interface. It handles team loading (from files, dictionaries, ComponentModel objects, or arbitrary serializable objects), environment setup, and uvicorn server management. The lite studio can run in the foreground (blocking, for CLI use) or background (non-blocking, for programmatic or notebook use), and supports context manager usage for automatic cleanup.

Both modes share a common pattern for team loading: accept multiple input formats (file paths, dictionaries, ComponentModel, or raw team objects), normalize them to a common representation, and use the component loading system to instantiate the team. This flexibility means the same deployment code works whether the team was defined in code, exported from the studio UI, or loaded from a database.
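
The normalization step described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `normalize_team_config` is a hypothetical helper, only JSON files are handled (YAML is omitted to keep the sketch dependency-free), and it assumes that `dump_component()` returns an object exposing `model_dump()`.

```python
import json
from pathlib import Path
from typing import Any, Dict, Union


def normalize_team_config(source: Union[str, Path, Dict[str, Any], Any]) -> Dict[str, Any]:
    """Return a plain dict for any supported input format (hypothetical helper)."""
    if isinstance(source, dict):
        # Already in the common representation.
        return source
    if isinstance(source, (str, Path)):
        # Strings and Path objects are treated as a path to a JSON file.
        return json.loads(Path(source).read_text(encoding="utf-8"))
    if hasattr(source, "model_dump"):
        # Pydantic-style models (such as ComponentModel) expose model_dump().
        return source.model_dump()
    if hasattr(source, "dump_component"):
        # Raw team objects: serialize to a model first, then to a dict.
        # Assumption: dump_component() returns an object with model_dump().
        return source.dump_component().model_dump()
    raise TypeError(f"Unsupported team config type: {type(source).__name__}")
```

Checking `dict` first keeps the common case cheap, and raising `TypeError` for anything unrecognized surfaces a bad input before any team component is instantiated.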

Environment variable injection is a critical deployment concern because agent teams typically require API keys for model clients (OpenAI, Anthropic, Azure) and tools (search APIs, code execution). The deployment layer accepts a list of environment variables and sets them in the process environment before team instantiation, ensuring that model clients and tools can access their required credentials.
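
A minimal sketch of the injection step, using only the standard library. The helper name `inject_env_vars`, the `{"name": ..., "value": ...}` shape of each entry, and the variable `EXAMPLE_API_KEY` are assumptions made for illustration; the actual deployment layer may represent environment variables differently.

```python
import os
from typing import Dict, Iterable, Mapping


def inject_env_vars(env_vars: Iterable[Mapping[str, str]]) -> Dict[str, str]:
    """Set each variable in os.environ and return what was applied (hypothetical helper)."""
    applied: Dict[str, str] = {}
    for item in env_vars:
        # Assumed entry shape: {"name": "...", "value": "..."}.
        name, value = item["name"], item["value"]
        os.environ[name] = value
        applied[name] = value
    return applied


# Call this before the team is instantiated so that clients reading
# credentials from the environment at construction time can find them.
applied = inject_env_vars([{"name": "EXAMPLE_API_KEY", "value": "placeholder"}])
```

Because `os.environ` is process-wide, values set this way remain visible to every other team running in the same process until they are overwritten or removed.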

Usage

Use team deployment when:

  • You need to run a team configuration programmatically with streaming output
  • You want to deploy a single team as a lightweight web service for testing or demonstration
  • You need to inject API keys and other secrets into the team execution environment
  • You want to run teams in background threads or as part of a larger application
  • You need to programmatically monitor team execution progress in real time

Theoretical Basis

Team deployment follows the Factory Pattern for team instantiation combined with Async Generator streaming for real-time output:

TEAM MANAGER EXECUTION FLOW:

1. Receive team_config (str | Path | Dict | ComponentModel)
2. Normalize config:
   a. If file path: load JSON/YAML via aiofiles
   b. If dict: use directly
   c. If ComponentModel: call model_dump()
3. Inject environment variables (API keys, etc.)
4. Load team via BaseGroupChat.load_component(config)
5. Wire up UserProxyAgent input_func if present
6. Execute team.run_stream(task, cancellation_token)
7. For each message:
   a. If TaskResult: wrap in TeamResult with duration
   b. Otherwise: yield raw message
   c. Also yield any queued LLMCallEvent messages
8. Cleanup: close all agent resources
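
Steps 6 through 8 of this flow can be sketched with a stub team. Everything named `Fake*` below, along with `run_team_stream`, is a hypothetical stand-in so the example runs without the framework installed; the queued `LLMCallEvent` handling from step 7c and the cancellation token are left out for brevity.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Any, AsyncGenerator, List


@dataclass
class FakeTaskResult:
    """Stand-in for the final TaskResult a team emits."""
    messages: List[str]


@dataclass
class FakeTeamResult:
    """Stand-in for TeamResult: the task result plus elapsed seconds."""
    task_result: FakeTaskResult
    duration: float


class FakeTeam:
    """Stub exposing run_stream() and close(), shaped like the flow above."""

    def __init__(self) -> None:
        self.closed = False

    async def run_stream(self, task: str) -> AsyncGenerator[Any, None]:
        messages = [f"planner: {task}", "writer: done"]
        for message in messages:
            await asyncio.sleep(0)  # give other tasks a chance to run
            yield message
        yield FakeTaskResult(messages=messages)

    async def close(self) -> None:
        self.closed = True


async def run_team_stream(team: FakeTeam, task: str) -> AsyncGenerator[Any, None]:
    start = time.monotonic()
    try:
        async for message in team.run_stream(task):
            if isinstance(message, FakeTaskResult):
                # Step 7a: wrap the final result together with its duration.
                yield FakeTeamResult(message, time.monotonic() - start)
            else:
                # Step 7b: pass intermediate messages through unchanged.
                yield message
    finally:
        # Step 8: release resources even if the stream fails.
        await team.close()


async def main():
    team = FakeTeam()
    items = [item async for item in run_team_stream(team, "summarize")]
    return team, items


team, items = asyncio.run(main())
```

Placing the cleanup in a `finally` block means it runs on normal completion and when the stream raises, which is the behavior step 8 implies.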

LITE STUDIO DEPLOYMENT FLOW:

1. Receive team (str | Path | Dict | ComponentModel | None)
2. Normalize to JSON file path:
   a. If None: create default team, save to temp file
   b. If file path: verify exists, use as-is
   c. If dict/ComponentModel: serialize to temp JSON file
   d. If object with dump_component(): serialize via that method
3. Setup environment variables for lite mode:
   - LITE_MODE=true, in-memory DB, auth disabled
   - LITE_TEAM_FILE=path to team JSON
4. Start uvicorn server:
   a. Foreground (blocking): direct uvicorn.run()
   b. Background (threaded): daemon thread with uvicorn.run()
5. Optionally auto-open browser to /lite endpoint

The AsyncGenerator pattern for streaming is fundamental to the deployment architecture. Rather than collecting all messages and returning them at the end, the team manager yields messages as they are produced. This enables several important capabilities:

  • Real-time monitoring: Consumers can process messages as they arrive
  • Early termination: Consumers can stop consuming and trigger cancellation
  • Resource efficiency: The manager hands each message to the consumer as it is produced instead of buffering the full run before returning
  • Composability: The generator can be consumed by different transports (WebSocket, SSE, logging, etc.)
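
The early-termination capability can be illustrated with a small standard-library sketch. Here an `asyncio.Event` stands in for the framework's cancellation token, and `produce_messages` and `consume_first` are hypothetical names; the point is only to show a consumer stopping a stream partway through.

```python
import asyncio
from typing import AsyncGenerator, List


async def produce_messages(cancel: asyncio.Event) -> AsyncGenerator[str, None]:
    """Yield messages until exhausted or until cancellation is requested."""
    for i in range(1000):
        if cancel.is_set():
            return
        await asyncio.sleep(0)  # simulate waiting on an agent
        yield f"message {i}"


async def consume_first(n: int) -> List[str]:
    cancel = asyncio.Event()  # stand-in for a cancellation token
    seen: List[str] = []
    stream = produce_messages(cancel)
    try:
        async for message in stream:
            seen.append(message)
            if len(seen) >= n:
                cancel.set()  # signal the producer to stop
                break
    finally:
        # Close the generator explicitly so its cleanup runs promptly.
        await stream.aclose()
    return seen


first_three = asyncio.run(consume_first(3))
```

Only the messages the consumer actually asked for are produced; the remaining iterations never run.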

The LiteStudio class follows the Context Manager Protocol (it supports the with statement), providing automatic server lifecycle management. This is particularly useful in Jupyter notebooks, where a typical session starts a background server, runs experiments, and then needs to shut the server down cleanly.
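
The background-server lifecycle can be sketched with the standard library alone. `BackgroundServer` is a hypothetical class, and `http.server` is used here as a stand-in for uvicorn; this shows the context manager pattern, not how LiteStudio itself is written.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class _Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


class BackgroundServer:
    """Runs an HTTP server in a daemon thread and stops it on exit."""

    def __init__(self, host: str = "127.0.0.1", port: int = 0) -> None:
        # Port 0 asks the operating system for any free port.
        self._server = ThreadingHTTPServer((host, port), _Handler)
        self._thread = None

    @property
    def address(self):
        return self._server.server_address[:2]

    @property
    def is_running(self) -> bool:
        return self._thread is not None and self._thread.is_alive()

    def start(self) -> None:
        self._thread = threading.Thread(target=self._server.serve_forever, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        self._server.shutdown()      # ask serve_forever() to return
        self._server.server_close()  # release the listening socket
        if self._thread is not None:
            self._thread.join(timeout=5)

    def __enter__(self) -> "BackgroundServer":
        self.start()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.stop()
        return False  # do not suppress exceptions from the with body


with BackgroundServer() as server:
    host, port = server.address
    conn = http.client.HTTPConnection(host, port, timeout=5)
    conn.request("GET", "/")
    response = conn.getresponse()
    status, body = response.status, response.read()
    conn.close()
```

Because `__exit__` always calls `stop()`, the server is shut down even when the code inside the `with` block raises, which is what makes the pattern convenient in notebooks.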

Related Pages

Implemented By
