
Heuristic:Apache Dolphinscheduler Datasource Cache Expiry

From Leeroopedia




Knowledge Sources
Domains Optimization, Data_Integration
Last Updated 2026-02-10 10:00 GMT

Overview

The Guava cache for pooled datasource clients expires entries 24 hours after they are written and holds at most 100 entries, a TTL aligned with typical Kerberos ticket lifetimes.

Description

`DataSourceClientProvider` maintains a Guava `Cache` of `PooledDataSourceClient` instances, keyed by a unique datasource identifier composed of DB type, username, encoded password, and JDBC URL. The cache is configured with an `expireAfterWrite` TTL of 24 hours (configurable via the `kerberos.expire.time` property) and a maximum size of 100 entries. When an entry is evicted, whether by TTL expiry or by the size limit, a removal listener automatically closes the underlying HikariCP datasource pool to prevent connection leaks.
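As a minimal sketch of the keying scheme, the fields listed above can be joined into a single cache key; the exact format and delimiter used by `DataSourceClientProvider` may differ, and the names below are illustrative:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative sketch: compose a unique cache key from the fields the
// article lists (DB type, username, encoded password, JDBC URL).
// The real key format in DataSourceClientProvider may differ.
public class DatasourceCacheKey {
    static String uniqueKey(String dbType, String user, String password, String jdbcUrl) {
        // Encode the password so the key does not embed plaintext credentials
        String encodedPassword = Base64.getEncoder()
                .encodeToString(password.getBytes(StandardCharsets.UTF_8));
        return String.join("@", dbType, user, encodedPassword, jdbcUrl);
    }

    public static void main(String[] args) {
        System.out.println(uniqueKey("MYSQL", "ds_user", "secret",
                "jdbc:mysql://db-host:3306/scheduler"));
    }
}
```

Two datasources that differ in any one field (for example, the same JDBC URL under two users) therefore map to distinct cache entries and distinct connection pools.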

Usage

Apply this heuristic when managing many datasource connections or when using Kerberos authentication. The 24-hour expiry aligns with typical Kerberos ticket renewal cycles. If you have more than 100 unique datasources, entries are evicted (approximately least-recently-used) and their pools closed. Tune `kerberos.expire.time` for shorter or longer cache lifetimes.
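For example, assuming the property is set in DolphinScheduler's `common.properties` (the file location and shipped default may vary by version), a shorter cache lifetime could be configured as:

```
# kerberos.expire.time is read in hours; values below 24 force more
# frequent pool recreation but keep Kerberos tickets fresher
kerberos.expire.time=12
```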

The Insight (Rule of Thumb)

  • Action: Configure `kerberos.expire.time` (in hours) to adjust cache TTL. Default is 24 hours.
  • Value: TTL = 24 hours, max entries = 100.
  • Trade-off: Shorter TTL forces more frequent pool recreation (higher latency, fresh Kerberos tickets). Longer TTL risks using stale connections or expired Kerberos tickets. Max 100 entries means the 101st unique datasource evicts the oldest.

Reasoning

Kerberos tickets typically have a 24-hour lifetime. By aligning the datasource cache expiry with this window, DolphinScheduler ensures that cached connections always have valid authentication credentials. The removal listener pattern guarantees that evicted datasource clients properly close their HikariCP pools, preventing connection leaks and database connection exhaustion.
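The eviction-closes-the-pool pattern can be approximated with the Java standard library alone. The sketch below uses a `LinkedHashMap` in access order as a stand-in for Guava's size-bounded cache; the real implementation uses Guava's `RemovalListener`, and all names here are illustrative:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in sketch: a size-bounded LRU map that "closes" evicted entries,
// mimicking the removal-listener pattern described above. The real code
// uses Guava's Cache with a RemovalListener.
public class EvictingPoolCache {
    interface Pool { void close(); }

    private final Map<String, Pool> cache;
    final List<String> closed = new ArrayList<>();

    EvictingPoolCache(int maxEntries) {
        // Access-order LinkedHashMap gives LRU iteration order
        this.cache = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Pool> eldest) {
                if (size() > maxEntries) {
                    // Close the evicted pool so connections are not leaked
                    eldest.getValue().close();
                    closed.add(eldest.getKey());
                    return true;
                }
                return false;
            }
        };
    }

    void put(String key, Pool pool) { cache.put(key, pool); }
    boolean contains(String key) { return cache.containsKey(key); }

    public static void main(String[] args) {
        EvictingPoolCache c = new EvictingPoolCache(2);
        c.put("ds-1", () -> {});
        c.put("ds-2", () -> {});
        c.put("ds-3", () -> {}); // exceeds max size: evicts and closes ds-1
        System.out.println(c.closed);          // [ds-1]
        System.out.println(c.contains("ds-3")); // true
    }
}
```

The key design point, mirrored from the snippet quoted below, is that the close happens inside the eviction path itself, so no separate cleanup thread is needed and an evicted pool can never be silently leaked.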

Code evidence from `DataSourceClientProvider.java:47-59`:

// We use the cache here to avoid creating a new datasource client every time,
// One DataSourceClient corresponds to one unique datasource.
private static final Cache<String, PooledDataSourceClient> POOLED_DATASOURCE_CLIENT_CACHE =
        CacheBuilder.newBuilder()
                .expireAfterWrite(PropertyUtils.getLong(DataSourceConstants.KERBEROS_EXPIRE_TIME, 24L),
                        TimeUnit.HOURS)
                .removalListener((RemovalListener<String, PooledDataSourceClient>) notification -> {
                    try (PooledDataSourceClient closedClient = notification.getValue()) {
                        log.info("Datasource: {} is removed from cache due to expire", notification.getKey());
                    } catch (Exception e) {
                        log.error("Close datasource client error", e);
                    }
                })
                .maximumSize(100)
                .build();
