Environment:Dagster io Dagster GRPC Communication
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Networking |
| Last Updated | 2026-02-10 12:00 GMT |
Overview
gRPC inter-process communication environment with configurable timeouts, message sizes, and connection settings for Dagster's distributed architecture.
Description
Dagster uses gRPC (Google Remote Procedure Call) as its primary inter-process communication mechanism. The webserver, daemon, and code server processes communicate via gRPC to load definitions, execute runs, and evaluate sensors/schedules. This environment defines the tunable parameters for gRPC communication including timeout values, maximum message sizes, and shutdown behavior. These settings are critical for large deployments with many assets or complex definitions that may exceed default limits.
Usage
Use this environment configuration when tuning Dagster for production deployments, especially when encountering timeout errors during code location loading, large gRPC message failures, or slow sensor/schedule evaluations. The defaults work well for typical deployments but may need adjustment for large-scale repositories with thousands of assets.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| Network | TCP connectivity between processes | Default gRPC port or Unix socket |
| Python | grpcio >= 1.44.0 (or >= 1.66.2 for Python 3.13+) | Installed as core Dagster dependency |
Dependencies
Python Packages
- grpcio >= 1.44.0 (or >= 1.66.2 for Python 3.13+)
- grpcio-health-checking >= 1.44.0 (or >= 1.66.2 for Python 3.13+)
- protobuf >= 3.20.0, < 7
Credentials
The following environment variables control gRPC behavior:
Message Size Limits:
- DAGSTER_GRPC_MAX_RX_BYTES: Maximum receive message size (default: 100 MB = 100,000,000 bytes)
- DAGSTER_GRPC_MAX_SEND_BYTES: Maximum send message size (default: 100 MB = 100,000,000 bytes)
Timeout Configuration:
- DAGSTER_GRPC_TIMEOUT_SECONDS: Default gRPC call timeout (default: 60 seconds)
- DAGSTER_REPOSITORY_GRPC_TIMEOUT_SECONDS: Repository loading timeout (default: max(180, GRPC_TIMEOUT))
- DAGSTER_SCHEDULE_GRPC_TIMEOUT_SECONDS: Schedule evaluation timeout (default: same as GRPC_TIMEOUT)
- DAGSTER_SENSOR_GRPC_TIMEOUT_SECONDS: Sensor evaluation timeout (default: same as GRPC_TIMEOUT)
- DAGSTER_GRPC_SHUTDOWN_GRACE_PERIOD: Shutdown grace period (default: max of all timeouts)
Server Configuration:
- DAGSTER_GRPC_SOCKET: Unix socket path for gRPC communication
- DAGSTER_GRPC_PORT: TCP port for gRPC communication
- DAGSTER_CODE_SERVER_AUTO_RESTART_INTERVAL: Code server restart interval in seconds (default: 30)
- DAGSTER_CODE_SERVER_LOG_EXCEPTIONS: Enable code server exception logging
Quick Install
# No additional installation - gRPC is a core Dagster dependency
# Increase timeout for large repositories (e.g., 5 minutes)
export DAGSTER_GRPC_TIMEOUT_SECONDS=300
export DAGSTER_REPOSITORY_GRPC_TIMEOUT_SECONDS=600
# Increase message size for very large definitions (500 MB)
export DAGSTER_GRPC_MAX_RX_BYTES=500000000
export DAGSTER_GRPC_MAX_SEND_BYTES=500000000
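The helpers quoted under Code Evidence pass these values straight to int(), so a value such as 500MB would fail when it is read. A minimal pre-flight check is sketched below. The helper name check_grpc_env is hypothetical and not part of Dagster. Treating the schedule and sensor timeouts as integers, and rejecting non-positive values, are assumptions made for illustration.

```python
import os

# Variables documented on this page that are assumed to hold whole numbers.
INTEGER_VARS = (
    "DAGSTER_GRPC_TIMEOUT_SECONDS",
    "DAGSTER_REPOSITORY_GRPC_TIMEOUT_SECONDS",
    "DAGSTER_SCHEDULE_GRPC_TIMEOUT_SECONDS",
    "DAGSTER_SENSOR_GRPC_TIMEOUT_SECONDS",
    "DAGSTER_GRPC_MAX_RX_BYTES",
    "DAGSTER_GRPC_MAX_SEND_BYTES",
)


def check_grpc_env(environ=None):
    """Return a list of problems; an empty list means every set value parses.

    Hypothetical helper for illustration, not a Dagster API.
    """
    environ = os.environ if environ is None else environ
    problems = []
    for name in INTEGER_VARS:
        raw = environ.get(name)
        if not raw:
            # Unset or empty: the documented default applies.
            continue
        try:
            value = int(raw)
        except ValueError:
            problems.append(f"{name}={raw!r} is not an integer")
            continue
        if value <= 0:
            # Assumption: a zero or negative limit is never intended.
            problems.append(f"{name}={value} should be positive")
    return problems


if __name__ == "__main__":
    for problem in check_grpc_env():
        print(problem)
```

Running the script in the same shell as the export lines above prints nothing when every override is a positive integer.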
Code Evidence
Message size configuration from grpc/utils.py:83-98:
def max_rx_bytes() -> int:
    env_set = os.getenv("DAGSTER_GRPC_MAX_RX_BYTES")
    if env_set:
        return int(env_set)
    # default 100 MB
    return 100 * (10**6)


def max_send_bytes() -> int:
    env_set = os.getenv("DAGSTER_GRPC_MAX_SEND_BYTES")
    if env_set:
        return int(env_set)
    # default 100 MB
    return 100 * (10**6)
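In gRPC, message size limits are normally applied through the channel arguments grpc.max_receive_message_length and grpc.max_send_message_length. The sketch below shows how the two environment variables could be turned into such an options list. How Dagster itself wires these values is not shown on this page, and the helper names grpc_size_options and _bytes_from_env are hypothetical.

```python
import os

DEFAULT_MAX_BYTES = 100 * (10**6)  # 100 MB, matching the documented default


def _bytes_from_env(name, environ):
    """Read an integer byte limit, falling back to the 100 MB default."""
    raw = environ.get(name)
    return int(raw) if raw else DEFAULT_MAX_BYTES


def grpc_size_options(environ=None):
    """Build a gRPC options list from the two size variables.

    Hypothetical helper for illustration; not Dagster's own wiring.
    """
    environ = os.environ if environ is None else environ
    return [
        (
            "grpc.max_receive_message_length",
            _bytes_from_env("DAGSTER_GRPC_MAX_RX_BYTES", environ),
        ),
        (
            "grpc.max_send_message_length",
            _bytes_from_env("DAGSTER_GRPC_MAX_SEND_BYTES", environ),
        ),
    ]
```

A list in this shape can be passed as the options argument of grpc.insecure_channel or grpc.server. Both sides of a connection need a limit large enough for the payload, since a sender may accept a message that the receiver then rejects.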
Timeout cascade from grpc/utils.py:101-147:
def default_grpc_timeout() -> int:
    env_set = os.getenv("DAGSTER_GRPC_TIMEOUT_SECONDS")
    if env_set:
        return int(env_set)
    return _DEFAULT_GRPC_TIMEOUT_IF_NO_ENV_VAR_SET  # 60


def default_repository_grpc_timeout() -> int:
    env_set = os.getenv("DAGSTER_REPOSITORY_GRPC_TIMEOUT_SECONDS")
    if env_set:
        return int(env_set)
    return max(_DEFAULT_REPOSITORY_TIMEOUT_IF_NO_ENV_VAR_SET,
               default_grpc_timeout())  # max(180, 60) = 180
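The excerpt covers only the general and repository timeouts. The sketch below extends the same pattern to the schedule, sensor, and shutdown settings, using the defaults listed under Timeout Configuration. It is an illustration of the documented cascade, not Dagster's code: resolve_timeouts and _int_env are hypothetical names, and the exact set of timeouts that feeds the shutdown grace period is assumed to be the four shown.

```python
import os

DEFAULT_GRPC_TIMEOUT = 60  # seconds, documented default
DEFAULT_REPOSITORY_TIMEOUT = 180  # seconds, documented default


def _int_env(environ, name, fallback):
    """Return the integer value of a set variable, else the fallback."""
    raw = environ.get(name)
    return int(raw) if raw else fallback


def resolve_timeouts(environ=None):
    """Resolve every timeout the way this page describes the defaults.

    Hypothetical helper for illustration, not a Dagster API.
    """
    environ = os.environ if environ is None else environ
    general = _int_env(environ, "DAGSTER_GRPC_TIMEOUT_SECONDS", DEFAULT_GRPC_TIMEOUT)
    repository = _int_env(
        environ,
        "DAGSTER_REPOSITORY_GRPC_TIMEOUT_SECONDS",
        max(DEFAULT_REPOSITORY_TIMEOUT, general),
    )
    # Schedule and sensor timeouts fall back to the general timeout.
    schedule = _int_env(environ, "DAGSTER_SCHEDULE_GRPC_TIMEOUT_SECONDS", general)
    sensor = _int_env(environ, "DAGSTER_SENSOR_GRPC_TIMEOUT_SECONDS", general)
    # Assumption: the grace period default is the largest of the four above.
    shutdown = _int_env(
        environ,
        "DAGSTER_GRPC_SHUTDOWN_GRACE_PERIOD",
        max(general, repository, schedule, sensor),
    )
    return {
        "general": general,
        "repository": repository,
        "schedule": schedule,
        "sensor": sensor,
        "shutdown_grace_period": shutdown,
    }
```

One consequence of the cascade is that raising DAGSTER_GRPC_TIMEOUT_SECONDS above 180 also raises the repository, schedule, sensor, and shutdown defaults unless those are set explicitly.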
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| grpc._channel._InactiveRpcError: DEADLINE_EXCEEDED | gRPC timeout too short for operation | Increase DAGSTER_GRPC_TIMEOUT_SECONDS |
| Received message exceeds the maximum configured message size | Repository definition too large | Increase DAGSTER_GRPC_MAX_RX_BYTES |
| Connection refused on code server | Code server not started or wrong port | Check DAGSTER_GRPC_PORT or DAGSTER_GRPC_SOCKET |
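For the connection-refused case, a quick way to separate "nothing is listening" from a gRPC-level problem is to attempt a plain TCP connection to the code server port. The sketch below uses only the Python standard library. The helper name can_connect is hypothetical, and the check confirms TCP reachability only, not that the gRPC service behind the port is healthy.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical diagnostic helper; it does not speak gRPC.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If the call returns False for the host and port that DAGSTER_GRPC_PORT points at, the code server is likely not running or is bound to a different port. If it returns True and calls still fail, the timeout and message size settings are the next things to check.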
Compatibility Notes
- Repository loading: Has a higher default timeout (180s vs 60s for general calls) because loading large repositories with many assets can be slow.
- Shutdown grace period: Defaults to the maximum of all configured timeouts to ensure in-flight calls complete during graceful shutdown.
- Unix sockets vs TCP: Unix sockets (DAGSTER_GRPC_SOCKET) provide better performance for same-machine communication. TCP (DAGSTER_GRPC_PORT) is required for cross-machine setups.
- Code server auto-restart: The code server restarts every 30 seconds by default to pick up code changes. Adjust DAGSTER_CODE_SERVER_AUTO_RESTART_INTERVAL in production.