Principle:Microsoft Autogen Team Deployment
| Knowledge Sources | |
|---|---|
| Domains | Agent Frameworks, Deployment, Streaming Execution, Environment Configuration |
| Last Updated | 2026-02-11 00:00 GMT |
Overview
Team deployment covers running multi-agent teams programmatically or as lightweight web services, with streaming output, environment variable injection, and both foreground and background execution modes.
Description
Team deployment is the process of taking a team configuration (defined as code, JSON, YAML, or a database record) and executing it against a task in a production-ready manner. This goes beyond simple team execution by adding concerns that matter in deployed environments: streaming output for real-time monitoring, environment variable injection for API keys and secrets, cancellation support for graceful shutdown, and multiple deployment modes for different operational contexts.
The deployment pattern supports two primary modes:
Programmatic Execution: A team manager loads a team configuration from any supported source (file path, dictionary, ComponentModel), instantiates the team and its agents, injects environment variables, and runs the team with streaming output. The caller receives an async generator that yields individual agent messages and a final result, enabling real-time processing of team output without waiting for completion.
Lightweight Web Service: A lite studio instance wraps a single team configuration in a minimal web interface. It handles team loading (from files, dictionaries, ComponentModel objects, or arbitrary serializable objects), environment setup, and uvicorn server management. The lite studio can run in the foreground (blocking, for CLI use) or background (non-blocking, for programmatic or notebook use), and supports context manager usage for automatic cleanup.
Both modes share a common pattern for team loading: accept multiple input formats (file paths, dictionaries, ComponentModel, or raw team objects), normalize them to a common representation, and use the component loading system to instantiate the team. This flexibility means the same deployment code works whether the team was defined in code, exported from the studio UI, or loaded from a database.
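The normalization step can be sketched as a single function that maps each accepted input to a plain dictionary. This is a minimal illustration, not the team manager's actual code: `normalize_team_config` is a hypothetical name, the file is read synchronously for brevity, and YAML support assumes PyYAML is installed.

```python
import json
from pathlib import Path
from typing import Any, Dict, Union


def normalize_team_config(team_config: Union[str, Path, Dict[str, Any], Any]) -> Dict[str, Any]:
    """Return a plain dict regardless of how the team was supplied (hypothetical helper)."""
    # Dictionaries are already in the common representation.
    if isinstance(team_config, dict):
        return team_config

    # File paths: parse JSON, or YAML when the extension says so.
    if isinstance(team_config, (str, Path)):
        path = Path(team_config)
        text = path.read_text(encoding="utf-8")
        if path.suffix.lower() in {".yaml", ".yml"}:
            import yaml  # PyYAML, imported lazily so JSON-only use has no extra dependency

            return yaml.safe_load(text)
        return json.loads(text)

    # Pydantic-style models (such as a ComponentModel) expose model_dump().
    if hasattr(team_config, "model_dump"):
        return team_config.model_dump()

    raise TypeError(f"Unsupported team config type: {type(team_config).__name__}")
```

The resulting dictionary is what the component loading system would then receive, so the caller does not need to know where the team definition came from.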
Environment variable injection is a critical deployment concern because agent teams typically require API keys for model clients (OpenAI, Anthropic, Azure) and tools (search APIs, code execution). The deployment layer accepts a list of environment variables and sets them in the process environment before team instantiation, ensuring that model clients and tools can access their required credentials.
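A minimal sketch of the injection step follows. The `EnvVar` record and the `inject_env_vars` function are hypothetical stand-ins for whatever types the deployment layer actually uses; the only assumption is that each variable carries a name and a value.

```python
import os
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class EnvVar:
    """Hypothetical stand-in for the deployment layer's environment variable record."""

    name: str
    value: str


def inject_env_vars(env_vars: Iterable[EnvVar]) -> List[str]:
    """Set each variable in the process environment and return the names that were set."""
    names: List[str] = []
    for var in env_vars:
        # os.environ is process-wide, so this must happen before the team
        # (and its model clients) are instantiated.
        os.environ[var.name] = var.value
        names.append(var.name)
    return names
```

Because the process environment is shared, values set this way remain visible to every component in the process after the run unless the caller removes them.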
Usage
Use team deployment when:
- You need to run a team configuration programmatically with streaming output
- You want to deploy a single team as a lightweight web service for testing or demonstration
- You need to inject API keys and other secrets into the team execution environment
- You want to run teams in background threads or as part of a larger application
- You need to programmatically monitor team execution progress in real time
Theoretical Basis
Team deployment follows the Factory Pattern for team instantiation combined with Async Generator streaming for real-time output:
TEAM MANAGER EXECUTION FLOW:
1. Receive team_config (str | Path | Dict | ComponentModel)
2. Normalize config:
   a. If file path: load JSON/YAML via aiofiles
   b. If dict: use directly
   c. If ComponentModel: call model_dump()
3. Inject environment variables (API keys, etc.)
4. Load team via BaseGroupChat.load_component(config)
5. Wire up UserProxyAgent input_func if present
6. Execute team.run_stream(task, cancellation_token)
7. For each message:
   a. If TaskResult: wrap in TeamResult with duration
   b. Otherwise: yield raw message
   c. Also yield any queued LLMCallEvent messages
8. Cleanup: close all agent resources
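Steps 6 to 8 of the flow above can be sketched with stand-in types. Everything here is illustrative: `EchoTeam`, the local `TaskResult` and `TeamResult` dataclasses, and `run_team_stream` are hypothetical and only mimic the shape of the real objects. The cancellation token and the queued LLMCallEvent messages are omitted to keep the sketch short.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Any, AsyncGenerator, List


@dataclass
class TaskResult:
    """Stand-in for the final result a team emits at the end of its stream."""

    messages: List[str]


@dataclass
class TeamResult:
    """Wrapper that adds the wall-clock duration of the run."""

    task_result: TaskResult
    duration: float


class EchoTeam:
    """Hypothetical team: two agents each emit one message, then a TaskResult."""

    def __init__(self) -> None:
        self.closed = False

    async def run_stream(self, task: str) -> AsyncGenerator[Any, None]:
        history: List[str] = []
        for speaker in ("planner", "writer"):
            await asyncio.sleep(0)  # yield control, as a real model call would
            message = f"{speaker}: {task}"
            history.append(message)
            yield message
        yield TaskResult(messages=history)

    async def close(self) -> None:
        self.closed = True


async def run_team_stream(team: EchoTeam, task: str) -> AsyncGenerator[Any, None]:
    """Yield messages as they arrive and wrap the final TaskResult with a duration."""
    start = time.monotonic()
    try:
        async for message in team.run_stream(task):
            if isinstance(message, TaskResult):
                yield TeamResult(task_result=message, duration=time.monotonic() - start)
            else:
                yield message
    finally:
        # Cleanup runs whether the stream finished, failed, or was closed early.
        await team.close()
```

A caller would consume this with `async for message in run_team_stream(team, task)` and treat the `TeamResult` as the end-of-run marker.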
LITE STUDIO DEPLOYMENT FLOW:
1. Receive team (str | Path | Dict | ComponentModel | None)
2. Normalize to JSON file path:
   a. If None: create default team, save to temp file
   b. If file path: verify exists, use as-is
   c. If dict/ComponentModel: serialize to temp JSON file
   d. If object with dump_component(): serialize via that method
3. Setup environment variables for lite mode:
   - LITE_MODE=true, in-memory DB, auth disabled
   - LITE_TEAM_FILE=path to team JSON
4. Start uvicorn server:
   a. Foreground (blocking): direct uvicorn.run()
   b. Background (threaded): daemon thread with uvicorn.run()
5. Optionally auto-open browser to /lite endpoint
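Steps 2 and 3 of the lite studio flow can be sketched as two small helpers. Both function names are hypothetical. The sketch assumes that `dump_component()` returns an object with `model_dump()`, skips the default-team case (step 2a), and sets only the two variables named in the flow; the settings for the in-memory database and disabled auth are not shown because their names are not given here.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict, Union


def team_to_json_file(team: Union[str, Path, Dict[str, Any], Any]) -> str:
    """Return a path to a JSON file describing the team (hypothetical helper)."""
    # Existing files are verified and used as-is.
    if isinstance(team, (str, Path)):
        path = Path(team)
        if not path.exists():
            raise FileNotFoundError(f"Team file not found: {path}")
        return str(path.resolve())

    if isinstance(team, dict):
        data = team
    elif hasattr(team, "model_dump"):  # ComponentModel-like object
        data = team.model_dump()
    elif hasattr(team, "dump_component"):  # live team object (assumed return type)
        data = team.dump_component().model_dump()
    else:
        raise TypeError(f"Unsupported team type: {type(team).__name__}")

    # delete=False keeps the file on disk so the server process can read it later.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".json", delete=False, encoding="utf-8"
    ) as handle:
        json.dump(data, handle)
        return handle.name


def configure_lite_environment(team_file: str) -> Dict[str, str]:
    """Set the lite-mode variables named in the flow and return them."""
    settings = {"LITE_MODE": "true", "LITE_TEAM_FILE": team_file}
    os.environ.update(settings)
    return settings
```

Writing the team to a file, rather than passing the object directly, lets the server read the same configuration through the environment regardless of whether it runs in the foreground or in a background thread.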
The AsyncGenerator pattern for streaming is fundamental to the deployment architecture. Rather than collecting all messages and returning them at the end, the team manager yields messages as they are produced. This enables several important capabilities:
- Real-time monitoring: Consumers can process messages as they arrive
- Early termination: Consumers can stop consuming and trigger cancellation
- Resource efficiency: The manager does not need to buffer the full message history before delivering output
- Composability: The generator can be consumed by different transports (WebSocket, SSE, logging, etc.)
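Early termination in particular is easy to illustrate. In the sketch below, an `asyncio.Event` plays the role of a cancellation token, and `message_stream` and `consume_first` are hypothetical names; a real deployment would pass the framework's own token to the team instead.

```python
import asyncio
from typing import AsyncGenerator, List


async def message_stream(cancel: asyncio.Event) -> AsyncGenerator[str, None]:
    """Produce messages until the consumer signals cancellation."""
    n = 0
    while not cancel.is_set():
        await asyncio.sleep(0)  # stands in for waiting on an agent
        yield f"message {n}"
        n += 1


async def consume_first(limit: int) -> List[str]:
    """Process messages as they arrive and stop after `limit` of them."""
    cancel = asyncio.Event()
    seen: List[str] = []
    stream = message_stream(cancel)
    try:
        async for message in stream:
            seen.append(message)  # real-time handling happens here
            if len(seen) >= limit:
                cancel.set()  # tell the producer to stop
                break
    finally:
        # Close the generator explicitly so its cleanup runs promptly.
        await stream.aclose()
    return seen
```

The producer never runs ahead of the consumer: each message is generated only when the consumer asks for the next one, which is what makes stopping early cheap.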
The LiteStudio class follows the Context Manager Protocol (it supports the with statement), providing automatic server lifecycle management. This is particularly useful in Jupyter notebooks, where a typical session starts a background server, runs experiments, and then cleans up.
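The lifecycle pattern can be sketched with the standard library alone. `BackgroundServer` below is a hypothetical stand-in, not the LiteStudio class: it uses `http.server` instead of uvicorn, but shows the same idea of a daemon thread that starts on entering the with block and shuts down on leaving it.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class _PingHandler(BaseHTTPRequestHandler):
    """Answer every GET with a fixed body so the server can be probed."""

    def do_GET(self) -> None:
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args: object) -> None:
        pass  # keep notebook output quiet


class BackgroundServer:
    """Hypothetical stand-in showing background start, stop, and with-statement use."""

    def __init__(self, host: str = "127.0.0.1", port: int = 0) -> None:
        # Port 0 asks the OS for any free port.
        self._server = ThreadingHTTPServer((host, port), _PingHandler)
        self._thread = threading.Thread(target=self._server.serve_forever, daemon=True)
        self.running = False

    @property
    def address(self) -> tuple:
        return self._server.server_address[:2]

    def start(self) -> None:
        self._thread.start()
        self.running = True

    def stop(self) -> None:
        if self.running:
            self._server.shutdown()  # stops serve_forever()
            self._server.server_close()  # releases the socket
            self._thread.join(timeout=5)
            self.running = False

    def __enter__(self) -> "BackgroundServer":
        self.start()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.stop()  # runs even if the with body raised


def ping(server: BackgroundServer) -> bytes:
    """Fetch the root path from the running server."""
    host, port = server.address
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/")
        return conn.getresponse().read()
    finally:
        conn.close()
```

In a notebook this reads as `with BackgroundServer() as server: ping(server)`, and the server is stopped when the block exits, including when an experiment inside it fails.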