Workflow:OpenHands OpenHands Conversation Lifecycle Management

Knowledge Sources	OpenHands OpenHands Docs
Domains	Container_Orchestration, AI_Agents, Distributed_Systems
Last Updated	2026-02-11 21:00 GMT

Overview

End-to-end process for creating, managing, and closing agent conversations in remote sandboxed containers with distributed state coordination across a multi-server cluster.

Description

This workflow covers the full lifecycle of an OpenHands agent conversation in the SaaS environment. It begins when a user or integration triggers a new conversation. The system then provisions a remote runtime container, configures the nested agent server with settings and secrets, starts the agent loop, coordinates distributed state via Redis for multi-server routing, and handles graceful shutdown. Both user-initiated (GUI/CLI) and integration-initiated (GitHub, Slack, etc.) conversations are supported.

Usage

Execute this workflow whenever a new agent conversation needs to be created in the OpenHands Cloud platform, whether it is initiated by a user through the web interface or the CLI, or triggered by an external integration webhook.

Execution Steps

Step 1: Conversation Initiation

A request arrives to start a new agent conversation. The system checks Redis to determine if a conversation with this ID is already starting or running. If not, it acquires the conversation slot by setting a Redis key with the pattern ohcnv:{user_id}:{conversation_id} and launches a background task for the startup sequence.

Key considerations:

Redis SET with NX flag ensures only one server can start a given conversation
Returns AgentLoopInfo with status RUNNING, STARTING, or STOPPED
Enforces per-user conversation count limits
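
The slot acquisition described in this step can be sketched as follows. This is a minimal illustration, not the OpenHands implementation: InMemoryRedis is a hypothetical stand-in that mimics the nx behaviour of redis-py's set() so the example runs without a server, and try_acquire_slot, the stored value, and the 15-second expiry are assumptions made for the sketch.

```python
class InMemoryRedis:
    """Hypothetical stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self):
        self._data = {}

    def set(self, name, value, nx=False, ex=None):
        # Mirrors redis-py: with nx=True the call returns None if the key exists.
        if nx and name in self._data:
            return None
        self._data[name] = value  # expiry (ex) is ignored by this stand-in
        return True


def conversation_key(user_id, conversation_id):
    # Key pattern taken from the step description.
    return f"ohcnv:{user_id}:{conversation_id}"


def try_acquire_slot(client, user_id, conversation_id, ttl_seconds=15):
    """Return True only for the first caller that claims the conversation."""
    key = conversation_key(user_id, conversation_id)
    return bool(client.set(key, "1", nx=True, ex=ttl_seconds))


if __name__ == "__main__":
    client = InMemoryRedis()
    print(try_acquire_slot(client, "u1", "c1"))  # first server wins the slot
    print(try_acquire_slot(client, "u1", "c1"))  # second attempt is rejected
```

Because SET with NX is atomic on the Redis side, two servers racing on the same key cannot both receive a truthy result, which is what makes the check-and-claim safe without an extra lock.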

Step 2: Remote Runtime Provisioning

Create a remote runtime container to host the agent loop. The system constructs a Session, Agent, and RemoteRuntime configuration, then calls the remote runtime API to provision an isolated Docker container. The container is configured with standalone conversation management, JSON logging, and disabled frontend serving.

Key considerations:

Runtime URL pattern supports both subdomain and path-based routing
Environment variables configure the nested runtime (RUNTIME=local, SERVE_FRONTEND=0)
Workspace is mounted at /workspace/project
Event retrieval mode can be WEBHOOK_PUSH, POLLING, or NONE
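
As a rough illustration of the configuration listed above, the sketch below builds the environment for the nested runtime and a runtime URL in either routing style. Only the two environment variables and the three mode names come from this page; the enum values, the function names, and the exact URL shapes are assumptions for illustration.

```python
from enum import Enum


class EventRetrievalMode(Enum):
    # Mode names come from the list above; the string values are illustrative.
    WEBHOOK_PUSH = "webhook_push"
    POLLING = "polling"
    NONE = "none"


def nested_runtime_env():
    """Environment variables named in this step for the nested runtime."""
    return {"RUNTIME": "local", "SERVE_FRONTEND": "0"}


def runtime_url(base_host, runtime_id, use_subdomain):
    """Build a runtime URL; both URL shapes here are hypothetical."""
    if use_subdomain:
        return f"https://{runtime_id}.{base_host}"
    return f"https://{base_host}/{runtime_id}"


if __name__ == "__main__":
    print(nested_runtime_env())
    print(runtime_url("runtime.example.com", "abc123", use_subdomain=True))
    print(runtime_url("runtime.example.com", "abc123", use_subdomain=False))
```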

Step 3: Runtime Connection and Token Refresh

Connect to the provisioned runtime and refresh authentication tokens. The system calls runtime.connect() to establish communication, then refreshes the user's identity provider tokens. Token refresh handles two scenarios: direct IDP user ID lookup or offline token fallback via TokenManager.

Key considerations:

Token refresh occurs after runtime initialization to use the latest credentials
Supports both IDP-direct and offline token paths
Session API key is extracted from runtime headers for subsequent API calls
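
The two refresh paths and the session key extraction might look like the sketch below. The callables refresh_by_user_id and refresh_by_offline_token are hypothetical placeholders for whatever TokenManager exposes, and the header name X-Session-API-Key is an assumption rather than something stated on this page.

```python
def refresh_tokens(idp_user_id, offline_token, refresh_by_user_id, refresh_by_offline_token):
    """Prefer the direct IDP lookup and fall back to the offline token."""
    if idp_user_id:
        return refresh_by_user_id(idp_user_id)
    if offline_token:
        return refresh_by_offline_token(offline_token)
    raise ValueError("no IDP user ID or offline token available")


def session_api_key(headers, header_name="X-Session-API-Key"):
    """Read the session API key from runtime headers (header name is assumed)."""
    return headers.get(header_name)


if __name__ == "__main__":
    tokens = refresh_tokens(
        None,
        "offline-token",
        refresh_by_user_id=lambda uid: {"path": "idp", "id": uid},
        refresh_by_offline_token=lambda tok: {"path": "offline"},
    )
    print(tokens)
```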

Step 4: Nested Server Configuration

Configure the agent server running inside the container through a sequence of HTTP API calls. This sets up user settings, git provider tokens, custom secrets, and experiment configuration before creating the conversation itself.

Configuration sequence:

POST /api/settings: Apply user preferences and LLM configuration
POST /api/add-git-providers: Inject GitHub, GitLab, or other provider tokens
POST /api/secrets: Add user-defined custom secrets (handles duplicates gracefully)
POST /api/conversations/{sid}/exp-config: Set A/B experiment variant configuration
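
The ordering of these calls can be sketched with an injected post callable, which keeps the example independent of any particular HTTP client. The function name and the payload shapes are assumptions; only the four endpoint paths and their order come from the sequence above.

```python
def configure_nested_server(post, sid, settings, git_providers, secrets, exp_config):
    """Issue the four configuration calls in order; payload shapes are illustrative."""
    steps = [
        ("/api/settings", settings),
        ("/api/add-git-providers", git_providers),
        ("/api/secrets", secrets),
        (f"/api/conversations/{sid}/exp-config", exp_config),
    ]
    for path, payload in steps:
        post(path, payload)
    # Return the paths so callers can log or verify what was sent.
    return [path for path, _ in steps]


if __name__ == "__main__":
    sent = []
    configure_nested_server(
        lambda path, payload: sent.append((path, payload)),
        sid="abc",
        settings={"language": "en"},
        git_providers={"github": "token"},
        secrets={"MY_SECRET": "value"},
        exp_config={"variant": "control"},
    )
    for path, _ in sent:
        print("POST", path)
```

Passing the transport in as a parameter also makes the sequence easy to test with a recorder, as the usage example does.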

Step 5: Conversation Creation and Readiness

Create the conversation within the nested runtime by posting to the conversations API, then poll for readiness. The system sends up to 5 polling requests to the events endpoint to confirm the conversation has initialized and is accepting events.

Key considerations:

Conversation creation includes initial user message and replay JSON if provided
Readiness polling checks /api/conversations/{sid}/events
The agent loop begins executing once the conversation is ready
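
A bounded readiness poll of this kind could be written as below. The attempt limit of 5 and the events path come from this step; the function name, the one-second delay, and the convention that fetch_events returns True once the conversation responds are assumptions for the sketch.

```python
import time


def wait_until_ready(fetch_events, sid, max_attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Poll the events endpoint up to max_attempts times; True once it responds."""
    path = f"/api/conversations/{sid}/events"
    for attempt in range(max_attempts):
        if fetch_events(path):
            return True
        if attempt < max_attempts - 1:
            sleep(delay_seconds)  # no sleep after the final attempt
    return False


if __name__ == "__main__":
    calls = []

    def fake_fetch(path):
        # Pretend the conversation becomes ready on the third request.
        calls.append(path)
        return len(calls) >= 3

    print(wait_until_ready(fake_fetch, "abc", sleep=lambda _: None))
    print(len(calls))
```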

Step 6: Distributed State Management

In a multi-server cluster deployment, maintain distributed state via Redis. Each server refreshes its conversation and connection keys every 5 seconds with a 15-second TTL. A Redis pub/sub channel (session_msg) routes events, close requests, and LLM completion requests between servers.
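
To make the timing concrete: with a 15-second TTL and a 5-second refresh, a key survives two missed refreshes before it expires. The sketch below models this with a hypothetical TtlStore and an injected clock; it is an illustration of the timing only and does not talk to Redis.

```python
class TtlStore:
    """Hypothetical key store with expiry, driven by an injected clock."""

    def __init__(self, clock):
        self._clock = clock
        self._expires_at = {}

    def set(self, key, ttl_seconds):
        self._expires_at[key] = self._clock() + ttl_seconds

    def exists(self, key):
        # A key is live strictly before its expiry time.
        return self._expires_at.get(key, float("-inf")) > self._clock()


def heartbeat(store, keys, ttl_seconds=15):
    """Refresh every key; intended to be called every 5 seconds."""
    for key in keys:
        store.set(key, ttl_seconds)


if __name__ == "__main__":
    now = [0.0]
    store = TtlStore(clock=lambda: now[0])
    heartbeat(store, ["ohcnv:u1:c1"])
    now[0] = 10.0  # two refreshes missed, key still live
    print(store.exists("ohcnv:u1:c1"))
    now[0] = 15.0  # third refresh missed, key has expired
    print(store.exists("ohcnv:u1:c1"))
```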

Message types routed via Redis:

event: Forward events to conversations on other servers
close_session: Broadcast session closure across all nodes
session_closing: Graceful shutdown notification
llm_completion: Cross-server LLM request routing
llm_completion_response: Return LLM results to requesting server
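
Routing by message type can be sketched as a small dispatcher. The five type names come from the list above; the JSON envelope, the message_type field name, and the handler signature are assumptions for illustration.

```python
import json

# Type names taken from the list above.
MESSAGE_TYPES = (
    "event",
    "close_session",
    "session_closing",
    "llm_completion",
    "llm_completion_response",
)


def dispatch(raw, handlers):
    """Decode a pub/sub payload and call the matching handler, if any."""
    message = json.loads(raw)
    handler = handlers.get(message.get("message_type"))  # field name is assumed
    if handler is None:
        return False  # unknown or unhandled type is ignored
    handler(message)
    return True


if __name__ == "__main__":
    closed = []
    handlers = {"close_session": lambda msg: closed.append(msg["sid"])}
    print(dispatch(json.dumps({"message_type": "close_session", "sid": "abc"}), handlers))
    print(dispatch(json.dumps({"message_type": "unknown"}), handlers))
    print(closed)
```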

Step 7: Session Closure and Cleanup

Close the conversation by posting to the runtime's pause endpoint. The system detects disconnected conversations (no active connections after a configurable delay) and conversations whose hosting server has failed (stale Redis keys). Redis state is cleaned up and the remote container is terminated.

Key considerations:

Stale conversation detection uses Redis key TTL expiration
Cross-server failure detection identifies orphaned conversations
Database is queried for user_id to enable proper cleanup notifications
Redis keys are cleaned up in the finally block of the startup sequence
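
The two detection rules in this step can be expressed as pure functions over already-collected state. Both function names and their inputs are hypothetical, and the sketch assumes conversation IDs contain no colon so that the ID can be read from the end of an ohcnv:{user_id}:{conversation_id} key.

```python
def disconnected_conversations(last_connection_at, now, close_delay_seconds):
    """Conversation IDs with no active connection for at least the configured delay."""
    return sorted(
        sid
        for sid, seen in last_connection_at.items()
        if now - seen >= close_delay_seconds
    )


def orphaned_conversations(expected_sids, live_redis_keys):
    """Conversation IDs whose Redis key has expired, e.g. because the host server failed."""
    live_sids = {key.rsplit(":", 1)[-1] for key in live_redis_keys}
    return sorted(set(expected_sids) - live_sids)


if __name__ == "__main__":
    print(disconnected_conversations({"c1": 0.0, "c2": 95.0}, now=100.0, close_delay_seconds=30.0))
    print(orphaned_conversations(["c1", "c2", "c3"], ["ohcnv:u1:c1", "ohcnv:u2:c3"]))
```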
