Workflow:OpenHands OpenHands Conversation Lifecycle Management
| Knowledge Sources | |
|---|---|
| Domains | Container_Orchestration, AI_Agents, Distributed_Systems |
| Last Updated | 2026-02-11 21:00 GMT |
Overview
End-to-end process for creating, managing, and closing agent conversations in remote sandboxed containers with distributed state coordination across a multi-server cluster.
Description
This workflow covers the full lifecycle of an OpenHands agent conversation in the SaaS environment. It begins when a user or an integration triggers a new conversation. The system then provisions a remote runtime container, configures the nested agent server with settings and secrets, starts the agent loop, manages distributed state via Redis for multi-server routing, and handles graceful shutdown. Conversations can be user-initiated (GUI or CLI) or integration-initiated (GitHub, Slack, etc.).
Usage
Execute this workflow whenever a new agent conversation needs to be created in the OpenHands Cloud platform, whether initiated by a user through the web interface, the CLI, or triggered by an external integration webhook.
Execution Steps
Step 1: Conversation Initiation
A request arrives to start a new agent conversation. The system checks Redis to determine whether a conversation with this ID is already starting or running. If it is not, the system acquires the conversation slot by setting a Redis key with the pattern ohcnv:{user_id}:{conversation_id}, then launches a background task that runs the startup sequence.
Key considerations:
- Redis SET with NX flag ensures only one server can start a given conversation
- Returns AgentLoopInfo with status RUNNING, STARTING, or STOPPED
- Enforces per-user conversation count limits
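The slot acquisition described above can be sketched as follows. This is a minimal illustration, not the actual OpenHands implementation: it assumes a client whose `set` method accepts `nx` and `ex` keyword arguments in the way redis-py's `Redis.set` does. The `InMemoryRedis` stand-in, the `try_acquire_slot` helper, the stored server ID value, and the 15-second expiry are all hypothetical and exist only so the sketch runs without a Redis server.

```python
import time
from typing import Optional


class InMemoryRedis:
    """Hypothetical stand-in that mimics only the SET behaviour used below."""

    def __init__(self) -> None:
        # Maps key -> (value, monotonic expiry timestamp).
        self._data: dict[str, tuple[str, float]] = {}

    def set(self, name: str, value: str, nx: bool = False,
            ex: Optional[int] = None) -> Optional[bool]:
        now = time.monotonic()
        current = self._data.get(name)
        if current is not None and current[1] <= now:
            # The key has expired, so treat it as absent.
            del self._data[name]
            current = None
        if nx and current is not None:
            # NX: refuse to overwrite an existing key.
            return None
        expires_at = now + ex if ex is not None else float("inf")
        self._data[name] = (value, expires_at)
        return True


def conversation_key(user_id: str, conversation_id: str) -> str:
    # Key pattern taken from the step description.
    return f"ohcnv:{user_id}:{conversation_id}"


def try_acquire_slot(client, user_id: str, conversation_id: str,
                     server_id: str, ttl_seconds: int = 15) -> bool:
    """Return True only for the first caller that claims the slot."""
    key = conversation_key(user_id, conversation_id)
    # With nx=True the write succeeds only if the key does not exist yet.
    return bool(client.set(key, server_id, nx=True, ex=ttl_seconds))
```

Because the existence check and the write happen in a single SET call, two servers racing to start the same conversation cannot both succeed; the loser sees a falsy result and can report the conversation as already starting.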
Step 2: Remote Runtime Provisioning
Create a remote runtime container to host the agent loop. The system constructs a Session, Agent, and RemoteRuntime configuration, then calls the remote runtime API to provision an isolated Docker container. The container is configured with standalone conversation management, JSON logging, and disabled frontend serving.
Key considerations:
- Runtime URL pattern supports both subdomain and path-based routing
- Environment variables configure the nested runtime (RUNTIME=local, SERVE_FRONTEND=0)
- Workspace is mounted at /workspace/project
- Event retrieval mode can be WEBHOOK_PUSH, POLLING, or NONE
Step 3: Runtime Connection and Token Refresh
Connect to the provisioned runtime and refresh authentication tokens. The system calls runtime.connect() to establish communication, then refreshes the user's identity provider (IDP) tokens. Token refresh follows one of two paths: a direct lookup by IDP user ID, or a fallback to the offline token via TokenManager.
Key considerations:
- Token refresh occurs after runtime initialization to use the latest credentials
- Supports both IDP-direct and offline token paths
- Session API key is extracted from runtime headers for subsequent API calls
Step 4: Nested Server Configuration
Configure the agent server running inside the container through a sequence of HTTP API calls. This sets up user settings, git provider tokens, custom secrets, and experiment configuration before creating the conversation itself.
Configuration sequence:
- POST /api/settings: Apply user preferences and LLM configuration
- POST /api/add-git-providers: Inject GitHub, GitLab, or other provider tokens
- POST /api/secrets: Add user-defined custom secrets (handles duplicates gracefully)
- POST /api/conversations/{sid}/exp-config: Set A/B experiment variant configuration
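The ordering of the configuration sequence can be sketched as below. This is an assumption-laden illustration: the `configure_nested_server` helper, the injected `post` callable, the payload shapes, and the choice to send one request per secret are all hypothetical. Only the four endpoint paths and their order come from the list above.

```python
from typing import Any, Callable


def configure_nested_server(
    post: Callable[[str, dict], Any],
    sid: str,
    settings: dict,
    provider_tokens: dict,
    custom_secrets: list[dict],
    exp_config: dict,
) -> list[str]:
    """Issue the configuration calls in order and return the paths used."""
    called: list[str] = []

    def call(path: str, payload: dict) -> None:
        post(path, payload)
        called.append(path)

    # 1. User preferences and LLM configuration.
    call("/api/settings", settings)
    # 2. Git provider tokens.
    call("/api/add-git-providers", provider_tokens)
    # 3. Custom secrets (one request per secret is an assumption).
    for secret in custom_secrets:
        call("/api/secrets", secret)
    # 4. Experiment variant configuration for this conversation.
    call(f"/api/conversations/{sid}/exp-config", exp_config)
    return called
```

Injecting `post` keeps the sketch independent of any particular HTTP client; in practice it would wrap a client that also attaches the session API key extracted in Step 3.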
Step 5: Conversation Creation and Readiness
Create the conversation within the nested runtime by posting to the conversations API, then poll for readiness. The system sends up to 5 polling requests to the events endpoint to confirm the conversation has initialized and is accepting events.
Key considerations:
- Conversation creation includes initial user message and replay JSON if provided
- Readiness polling checks /api/conversations/{sid}/events
- The agent loop begins executing once the conversation is ready
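The bounded readiness poll can be sketched as follows. The `wait_until_ready` helper, the `events_ok` callable, and the one-second delay between attempts are hypothetical; the limit of five attempts and the events path come from the step description.

```python
import time
from typing import Callable


def wait_until_ready(
    events_ok: Callable[[str], bool],
    sid: str,
    max_attempts: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the events endpoint until it responds, up to max_attempts times."""
    path = f"/api/conversations/{sid}/events"
    for attempt in range(1, max_attempts + 1):
        if events_ok(path):
            return True
        # Do not sleep after the final failed attempt.
        if attempt < max_attempts:
            sleep(delay_seconds)
    return False
```

Here `events_ok` would wrap an HTTP GET and return True on a successful response. A False result means the conversation did not become ready within the attempt budget, and the caller decides how to surface that.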
Step 6: Distributed State Management
In a multi-server cluster deployment, maintain distributed state via Redis. Each server refreshes its conversation and connection keys every 5 seconds with a 15-second TTL. A Redis pub/sub channel (session_msg) routes events, close requests, and LLM completion requests between servers.
Message types routed via Redis:
- event: Forward events to conversations on other servers
- close_session: Broadcast session closure across all nodes
- session_closing: Graceful shutdown notification
- llm_completion: Cross-server LLM request routing
- llm_completion_response: Return LLM results to requesting server
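Routing on the receiving side can be sketched as a small dispatcher. This assumes messages are JSON objects that carry their type in a `message_type` field; that field name, the JSON encoding, and the `dispatch` helper are hypothetical. The message type names and the refresh and TTL values come from the text above.

```python
import json
from typing import Callable

# Values taken from the step description.
REFRESH_INTERVAL_SECONDS = 5
KEY_TTL_SECONDS = 15

MESSAGE_TYPES = (
    "event",
    "close_session",
    "session_closing",
    "llm_completion",
    "llm_completion_response",
)


def dispatch(raw: str, handlers: dict[str, Callable[[dict], None]]) -> bool:
    """Decode one pub/sub payload and hand it to the matching handler.

    Returns False when no handler is registered for the message type.
    """
    message = json.loads(raw)
    handler = handlers.get(message.get("message_type"))
    if handler is None:
        return False
    handler(message)
    return True
```

Each server would subscribe to the session_msg channel and pass every payload through a dispatcher of this kind. Because the TTL is three times the refresh interval, a key survives a couple of delayed refreshes before it expires.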
Step 7: Session Closure and Cleanup
Close the conversation by posting to the runtime's pause endpoint. The system detects disconnected conversations (no active connections after a configurable delay) and conversations whose hosting server has failed (stale Redis keys). Redis state is cleaned up and the remote container is terminated.
Key considerations:
- Stale conversation detection uses Redis key TTL expiration
- Cross-server failure detection identifies orphaned conversations
- Database is queried for user_id to enable proper cleanup notifications
- Redis keys are cleaned up in the finally block of the startup sequence
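TTL-based stale detection can be sketched as a check for missing keys: once a hosting server stops refreshing, its conversation key expires and no longer exists. The `find_stale_conversations` helper and the injected `key_exists` callable are hypothetical; only the key pattern comes from Step 1.

```python
from typing import Callable, Iterable


def find_stale_conversations(
    known: Iterable[tuple[str, str]],
    key_exists: Callable[[str], bool],
) -> list[tuple[str, str]]:
    """Return (user_id, conversation_id) pairs whose Redis key has expired."""
    stale = []
    for user_id, conversation_id in known:
        key = f"ohcnv:{user_id}:{conversation_id}"
        # A missing key means nothing refreshed it within the TTL.
        if not key_exists(key):
            stale.append((user_id, conversation_id))
    return stale
```

With a real client, `key_exists` could wrap an existence check on the key. The returned pairs are the conversations that need cleanup and container termination.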