Heuristic:OpenHands OpenHands Redis Distributed Locking

Knowledge Sources	OpenHands Clustered conversation manager
Domains	Distributed_Systems, Conversation_Management
Last Updated	2026-02-11 21:00 GMT

Overview

Use Redis `SET key value NX EX ttl` for lightweight distributed mutual exclusion without a dedicated lock library, so that only one server instance manages a given conversation at a time.

Description

In the clustered conversation manager, OpenHands uses Redis atomic set-if-not-exists operations as lightweight distributed locks. Instead of a formal distributed lock library (such as a Redlock implementation), the system calls `redis.set(key, 1, nx=True, ex=timeout)`, where `nx=True` means "set only if the key does not exist" and `ex=timeout` sets an automatic expiry in seconds. This gives first-writer-wins semantics for conversation assignment: the first server to claim a conversation wins, and every other server sees that the key already exists, so at most one server holds the claim while the key is live. The expiry ensures that the lock is released automatically if a server crashes without cleaning up.
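
As a minimal sketch of the claim step, assuming a client whose `set` accepts `nx` and `ex` keyword arguments the way redis-py's does: `InMemoryRedis` and `try_claim` are hypothetical names introduced only for illustration, and the stand-in ignores expiry so the example runs without a Redis server.

```python
import asyncio


class InMemoryRedis:
    """Hypothetical stand-in that mimics only set(name, value, nx=, ex=).

    It ignores `ex` (no expiry) and exists purely so the sketch is runnable.
    """

    def __init__(self):
        self._data = {}

    async def set(self, name, value, nx=False, ex=None):
        # With nx=True, refuse to overwrite an existing key and return None,
        # which is what redis-py returns when SET ... NX does not apply.
        if nx and name in self._data:
            return None
        self._data[name] = value
        return True


async def try_claim(redis, key, ttl_seconds):
    """Return True if this caller won the claim, False if someone else holds it."""
    created = await redis.set(key, 1, nx=True, ex=ttl_seconds)
    return bool(created)


async def demo():
    redis = InMemoryRedis()
    key = "ohcnv:user-1:conv-1"  # example key following the documented pattern
    # Three "servers" race to claim the same conversation.
    return await asyncio.gather(*(try_claim(redis, key, 15) for _ in range(3)))


results = asyncio.run(demo())
```

Only one of the three callers sees a truthy result; the others treat the conversation as owned elsewhere.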

Usage

Apply this pattern when you need distributed mutual exclusion across multiple OpenHands server instances. Key use cases include:

Claiming ownership of a conversation (only one server should run the agent loop)
Preventing duplicate webhook processing (idempotency)
Rate limiting across multiple server instances

The Insight (Rule of Thumb)

Action: Use `redis.set(key, value, nx=True, ex=ttl)` instead of explicit distributed locks for simple mutual exclusion.
Value: TTL of 15 seconds for conversation ownership (refreshed every 5 seconds); TTL of 60 seconds for webhook deduplication.
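Trade-off: Simpler than Redlock, but it offers no protection against clock drift, split-brain scenarios, or an owner that stalls for longer than the TTL while still believing it holds the key. Acceptable for this use case because conversation reassignment on edge failures is handled by recovery logic.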
Key Pattern: Use structured Redis keys like `ohcnv:{user_id}:{conversation_id}` for conversation ownership and `ohcnct:{user_id}:{conversation_id}:{connection_id}` for connection tracking.
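
The key layouts above could be built and parsed with small helpers like the following. This is only a sketch: the function names are hypothetical (the source shows just `_get_redis_conversation_key`), and the parser assumes that the user ID contains no `:` character.

```python
CONVERSATION_PREFIX = "ohcnv"
CONNECTION_PREFIX = "ohcnct"


def conversation_key(user_id, conversation_id):
    """Build the ownership key for one conversation."""
    return f"{CONVERSATION_PREFIX}:{user_id}:{conversation_id}"


def connection_key(user_id, conversation_id, connection_id):
    """Build the tracking key for one client connection."""
    return f"{CONNECTION_PREFIX}:{user_id}:{conversation_id}:{connection_id}"


def parse_conversation_key(key):
    """Split an ownership key back into (user_id, conversation_id).

    Assumption: user_id never contains ':'; the split relies on that.
    """
    prefix, user_id, conversation_id = key.split(":", 2)
    if prefix != CONVERSATION_PREFIX:
        raise ValueError(f"not a conversation key: {key!r}")
    return user_id, conversation_id


example_key = conversation_key("user-1", "conv-42")
```

Keeping the prefix and field order in one place makes it harder for the claim, refresh, and cleanup paths to disagree about the key format.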

Reasoning

The `NX+EX` pattern is preferred over formal distributed lock implementations because:

Simplicity: A single Redis command replaces complex lock acquisition/release logic.
Automatic cleanup: The `EX` expiry ensures locks are released even if the owning server crashes, preventing deadlocks.
No coordination overhead: Unlike Redlock (which requires multiple Redis instances), this works with a single Redis instance.
Sufficient for this use case: Conversation management can tolerate brief periods where a conversation is unowned (between expiry and re-claim), because the recovery logic detects and handles this.
Refresh pattern: The owner refreshes the key every 5 seconds against a 15-second TTL, providing a 10-second safety margin before the lock expires.
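
The interaction between the 5-second refresh and the 15-second TTL can be simulated without a server. The sketch below uses a hypothetical `FakeClockStore` driven by a manual clock, and it models a refresh as a plain `SET` with a fresh `EX`; the real manager may refresh its keys with a different command.

```python
TTL_SECONDS = 15      # mirrors _REDIS_ENTRY_TIMEOUT_SECONDS
REFRESH_SECONDS = 5   # mirrors _REDIS_UPDATE_INTERVAL_SECONDS


class FakeClockStore:
    """Hypothetical in-memory stand-in for SET with NX/EX and a manual clock."""

    def __init__(self):
        self.now = 0.0
        self._expiry = {}  # key -> absolute expiry time

    def advance(self, seconds):
        self.now += seconds

    def _is_live(self, key):
        return key in self._expiry and self._expiry[key] > self.now

    def set(self, key, value, nx=False, ex=None):
        if nx and self._is_live(key):
            return None  # someone else still holds the key
        self._expiry[key] = (self.now + ex) if ex is not None else float("inf")
        return True


store = FakeClockStore()
key = "ohcnv:user-1:conv-1"

# t=0: the owner claims the conversation.
owner_claim = store.set(key, 1, nx=True, ex=TTL_SECONDS)

# t=5, 10, 15: the owner refreshes on schedule, pushing expiry to t=30.
for _ in range(3):
    store.advance(REFRESH_SECONDS)
    store.set(key, 1, ex=TTL_SECONDS)

# The owner now crashes. Ten seconds later (t=25) the key is still live.
store.advance(10)
claim_before_expiry = store.set(key, 1, nx=True, ex=TTL_SECONDS)

# Five more seconds (t=30) and the TTL has elapsed, so another server can claim.
store.advance(5)
claim_after_expiry = store.set(key, 1, nx=True, ex=TTL_SECONDS)
```

In this simulation a second server is refused while the key is live and succeeds once 15 seconds have passed since the last refresh, which is the window during which the conversation stays assigned to a dead owner.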

Code evidence from `enterprise/server/clustered_conversation_manager.py:397-400`:

```python
# If we can set the key in redis then no other worker is running this conversation
redis = self._get_redis_client()
key = self._get_redis_conversation_key(user_id, sid)
created = await redis.set(key, 1, nx=True, ex=_REDIS_ENTRY_TIMEOUT_SECONDS)
```

Timing constants from `enterprise/server/clustered_conversation_manager.py:40-49`:

```python
# Time in seconds between cleanup operations for stale conversations
_CLEANUP_INTERVAL_SECONDS = 15

# Time in seconds before a Redis entry is considered expired if not refreshed
_REDIS_ENTRY_TIMEOUT_SECONDS = 15

# Time in seconds between updates to Redis entries
_REDIS_UPDATE_INTERVAL_SECONDS = 5

_REDIS_POLL_TIMEOUT = 0.15
```

Webhook deduplication from `enterprise/server/routes/integration/gitlab.py:95-104`:

```python
dedup_key = object_attributes.get('id')
if not dedup_key:
    dedup_json = json.dumps(payload_data, sort_keys=True)
    dedup_hash = hashlib.sha256(dedup_json.encode()).hexdigest()
    dedup_key = f'gitlab_msg: {dedup_hash}'
created = await redis.set(dedup_key, 1, nx=True, ex=60)
```
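
To illustrate how a handler might act on the result of that `SET`, here is a self-contained sketch. The key derivation follows the excerpt above; `InMemoryRedis`, `dedup_key_for`, `handle_webhook`, and the sample payloads are hypothetical, and the stand-in client ignores the 60-second expiry.

```python
import asyncio
import hashlib
import json


class InMemoryRedis:
    """Hypothetical stand-in that mimics only set(name, value, nx=, ex=)."""

    def __init__(self):
        self._data = {}

    async def set(self, name, value, nx=False, ex=None):
        if nx and name in self._data:
            return None  # key already present: this delivery is a repeat
        self._data[name] = value
        return True


def dedup_key_for(payload):
    """Use the payload's object ID if present, else a hash of the whole payload."""
    dedup_key = payload.get("object_attributes", {}).get("id")
    if not dedup_key:
        # sort_keys=True makes the hash independent of key order in the payload.
        dedup_json = json.dumps(payload, sort_keys=True)
        dedup_hash = hashlib.sha256(dedup_json.encode()).hexdigest()
        dedup_key = f"gitlab_msg: {dedup_hash}"
    return dedup_key


async def handle_webhook(redis, payload, processed):
    created = await redis.set(dedup_key_for(payload), 1, nx=True, ex=60)
    if not created:
        return "duplicate"  # another delivery already claimed this payload
    processed.append(payload)
    return "processed"


async def demo():
    redis, processed = InMemoryRedis(), []
    first = {"object_kind": "note", "object_attributes": {"note": "hi"}}
    # Same content, different key order, as a retried delivery might look.
    retry = {"object_attributes": {"note": "hi"}, "object_kind": "note"}
    outcomes = [
        await handle_webhook(redis, first, processed),
        await handle_webhook(redis, retry, processed),
    ]
    return outcomes, processed


outcomes, processed = asyncio.run(demo())
```

The second delivery is recognised as a repeat because both payloads serialise to the same sorted JSON and therefore produce the same key.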
