Environment:Dagster io Dagster GRPC Communication

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, Networking
Last Updated: 2026-02-10 12:00 GMT

Overview

This environment covers gRPC inter-process communication in Dagster's distributed architecture, with configurable timeouts, message sizes, and connection settings.

Description

Dagster uses gRPC (gRPC Remote Procedure Calls) as its primary inter-process communication mechanism. The webserver, daemon, and code server processes communicate via gRPC to load definitions, execute runs, and evaluate sensors and schedules. This environment defines the tunable parameters for that communication: timeout values, maximum message sizes, and shutdown behavior. These settings matter most in large deployments, where many assets or complex definitions can exceed the default limits.

Usage

Use this environment configuration when tuning Dagster for production deployments, especially when you encounter timeout errors during code location loading, failures caused by oversized gRPC messages, or slow sensor and schedule evaluations. The defaults suit typical deployments but may need adjustment for large repositories with thousands of assets.

System Requirements

Category | Requirement | Notes
Network | TCP connectivity between processes | Default gRPC port or Unix socket
Python | grpcio >= 1.44.0 (or >= 1.66.2 for Python 3.13+) | Installed as core Dagster dependency

Dependencies

Python Packages

  • grpcio >= 1.44.0 (or >= 1.66.2 for Python 3.13+)
  • grpcio-health-checking >= 1.44.0 (or >= 1.66.2 for Python 3.13+)
  • protobuf >= 3.20.0, < 7
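
The version split in the list above depends on the running interpreter, so it can be handy to check it programmatically. The following is a minimal sketch using only the standard library; the helper names (`minimum_grpcio_version`, `parse_version`, `grpcio_is_recent_enough`) are hypothetical and not part of Dagster, and the parser only looks at the leading numeric components of a version string.

```python
import re
import sys
from importlib import metadata


def minimum_grpcio_version(python_version=None):
    """Return the grpcio minimum from the dependency list for a Python version."""
    if python_version is None:
        python_version = tuple(sys.version_info[:2])
    # 1.66.2 applies to Python 3.13+, 1.44.0 otherwise (per the list above).
    return (1, 66, 2) if tuple(python_version) >= (3, 13) else (1, 44, 0)


def parse_version(text):
    """Parse leading numeric components, padded to three parts: '1.66' -> (1, 66, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    parts = [int(piece) for piece in match.group(0).split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def grpcio_is_recent_enough(installed=None, python_version=None):
    """Compare an installed grpcio version string against the documented minimum."""
    if installed is None:
        try:
            installed = metadata.version("grpcio")
        except metadata.PackageNotFoundError:
            return False  # grpcio is not installed at all
    return parse_version(installed) >= minimum_grpcio_version(python_version)
```

Calling `grpcio_is_recent_enough()` with no arguments inspects the current environment; passing explicit values makes the check easy to exercise in isolation.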

Credentials

The following environment variables control gRPC behavior:

Message Size Limits:

  • DAGSTER_GRPC_MAX_RX_BYTES: Maximum receive message size in bytes (default: 100 MB = 100,000,000 bytes)
  • DAGSTER_GRPC_MAX_SEND_BYTES: Maximum send message size in bytes (default: 100 MB = 100,000,000 bytes)
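
To illustrate how byte limits like these typically reach gRPC, the sketch below turns the two variables into channel options. The option keys `grpc.max_send_message_length` and `grpc.max_receive_message_length` are standard gRPC channel arguments; the helper name `grpc_message_size_options` is hypothetical, and whether Dagster builds exactly this list internally is an assumption.

```python
import os

# Default taken from the list above: 100 MB expressed as 100,000,000 bytes.
DEFAULT_MAX_BYTES = 100 * (10**6)


def grpc_message_size_options(environ=None):
    """Build gRPC channel options from the two message-size variables.

    Hypothetical helper, not Dagster's API. `environ` defaults to os.environ
    and can be replaced with a plain dict for experimentation.
    """
    env = os.environ if environ is None else environ
    # An unset or empty variable falls back to the default.
    rx = int(env.get("DAGSTER_GRPC_MAX_RX_BYTES") or DEFAULT_MAX_BYTES)
    send = int(env.get("DAGSTER_GRPC_MAX_SEND_BYTES") or DEFAULT_MAX_BYTES)
    return [
        ("grpc.max_send_message_length", send),
        ("grpc.max_receive_message_length", rx),
    ]
```

With grpcio installed, a list of this shape can be passed as the `options` argument of `grpc.insecure_channel(target, options=...)` or `grpc.server(..., options=...)`. Both ends of a connection enforce their own limits, so raising only one side may not be enough.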

Timeout Configuration:

  • DAGSTER_GRPC_TIMEOUT_SECONDS: Default gRPC call timeout (default: 60 seconds)
  • DAGSTER_REPOSITORY_GRPC_TIMEOUT_SECONDS: Repository loading timeout (default: the larger of 180 seconds and DAGSTER_GRPC_TIMEOUT_SECONDS)
  • DAGSTER_SCHEDULE_GRPC_TIMEOUT_SECONDS: Schedule evaluation timeout (default: same as DAGSTER_GRPC_TIMEOUT_SECONDS)
  • DAGSTER_SENSOR_GRPC_TIMEOUT_SECONDS: Sensor evaluation timeout (default: same as DAGSTER_GRPC_TIMEOUT_SECONDS)
  • DAGSTER_GRPC_SHUTDOWN_GRACE_PERIOD: Shutdown grace period (default: the maximum of the timeouts above)

Server Configuration:

  • DAGSTER_GRPC_SOCKET: Unix socket path for gRPC communication
  • DAGSTER_GRPC_PORT: TCP port for gRPC communication
  • DAGSTER_CODE_SERVER_AUTO_RESTART_INTERVAL: Code server restart interval in seconds (default: 30)
  • DAGSTER_CODE_SERVER_LOG_EXCEPTIONS: Enable code server exception logging

Quick Install

# No additional installation - gRPC is a core Dagster dependency

# Increase timeout for large repositories (e.g., 5 minutes)
export DAGSTER_GRPC_TIMEOUT_SECONDS=300
export DAGSTER_REPOSITORY_GRPC_TIMEOUT_SECONDS=600

# Increase message size for very large definitions (500 MB)
export DAGSTER_GRPC_MAX_RX_BYTES=500000000
export DAGSTER_GRPC_MAX_SEND_BYTES=500000000
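
Because the values exported above are parsed with `int(...)` (see Code Evidence), a typo such as `300s` would raise a `ValueError` when the setting is first read. The sketch below is a hypothetical preflight check, not part of Dagster; the rule that values must be positive is an added assumption.

```python
import os

# The four variables exported in the Quick Install snippet.
GRPC_INT_VARS = (
    "DAGSTER_GRPC_TIMEOUT_SECONDS",
    "DAGSTER_REPOSITORY_GRPC_TIMEOUT_SECONDS",
    "DAGSTER_GRPC_MAX_RX_BYTES",
    "DAGSTER_GRPC_MAX_SEND_BYTES",
)


def find_invalid_grpc_settings(environ=None):
    """Return {variable: problem} for values that would not parse cleanly.

    Hypothetical helper. Unset or empty variables are skipped because the
    defaults apply in that case.
    """
    env = os.environ if environ is None else environ
    problems = {}
    for name in GRPC_INT_VARS:
        raw = env.get(name)
        if not raw:
            continue
        try:
            value = int(raw)
        except ValueError:
            problems[name] = f"not an integer: {raw!r}"
            continue
        if value <= 0:  # assumption: zero or negative values are never intended
            problems[name] = f"must be positive, got {value}"
    return problems
```

Running `find_invalid_grpc_settings()` before starting the webserver or daemon surfaces malformed values early, rather than at the first gRPC call.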

Code Evidence

Message size configuration from grpc/utils.py:83-98:

import os

def max_rx_bytes() -> int:
    # An unset or empty variable falls through to the default.
    env_set = os.getenv("DAGSTER_GRPC_MAX_RX_BYTES")
    if env_set:
        return int(env_set)
    # default 100 MB
    return 100 * (10**6)

def max_send_bytes() -> int:
    env_set = os.getenv("DAGSTER_GRPC_MAX_SEND_BYTES")
    if env_set:
        return int(env_set)
    # default 100 MB
    return 100 * (10**6)

Timeout cascade from grpc/utils.py:101-147:

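import os

# Fallback values in seconds, used when the environment variables are unset.
_DEFAULT_GRPC_TIMEOUT_IF_NO_ENV_VAR_SET = 60
_DEFAULT_REPOSITORY_TIMEOUT_IF_NO_ENV_VAR_SET = 180

def default_grpc_timeout() -> int:
    env_set = os.getenv("DAGSTER_GRPC_TIMEOUT_SECONDS")
    if env_set:
        return int(env_set)
    return _DEFAULT_GRPC_TIMEOUT_IF_NO_ENV_VAR_SET  # 60

def default_repository_grpc_timeout() -> int:
    env_set = os.getenv("DAGSTER_REPOSITORY_GRPC_TIMEOUT_SECONDS")
    if env_set:
        return int(env_set)
    # Never lower than the general timeout: max(180, 60) = 180 by default.
    return max(_DEFAULT_REPOSITORY_TIMEOUT_IF_NO_ENV_VAR_SET,
               default_grpc_timeout())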
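
The excerpt covers only the general and repository timeouts. As a rough sketch of how the full cascade described under Timeout Configuration fits together, the function below resolves all five values at once. It is a reconstruction from the documented defaults, not Dagster's code; `resolve_timeouts` and `_int_env` are hypothetical names.

```python
import os


def _int_env(name, fallback, env):
    """Read an integer variable; unset or empty means use the fallback."""
    raw = env.get(name)
    return int(raw) if raw else fallback


def resolve_timeouts(environ=None):
    """Resolve every gRPC timeout (seconds) following the documented defaults."""
    env = os.environ if environ is None else environ
    general = _int_env("DAGSTER_GRPC_TIMEOUT_SECONDS", 60, env)
    # Repository loading never defaults below the general timeout.
    repository = _int_env(
        "DAGSTER_REPOSITORY_GRPC_TIMEOUT_SECONDS", max(180, general), env
    )
    # Schedule and sensor evaluations inherit the general timeout.
    schedule = _int_env("DAGSTER_SCHEDULE_GRPC_TIMEOUT_SECONDS", general, env)
    sensor = _int_env("DAGSTER_SENSOR_GRPC_TIMEOUT_SECONDS", general, env)
    # The grace period defaults to the longest timeout so in-flight calls can finish.
    grace = _int_env(
        "DAGSTER_GRPC_SHUTDOWN_GRACE_PERIOD",
        max(general, repository, schedule, sensor),
        env,
    )
    return {
        "general": general,
        "repository": repository,
        "schedule": schedule,
        "sensor": sensor,
        "shutdown_grace": grace,
    }
```

One consequence of the cascade: raising only DAGSTER_GRPC_TIMEOUT_SECONDS to 300 also lifts the repository, schedule, and sensor defaults to 300, so the more specific variables are needed only when one operation should differ from the rest.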

Common Errors

Error Message | Cause | Solution
grpc._channel._InactiveRpcError: DEADLINE_EXCEEDED | gRPC timeout too short for operation | Increase DAGSTER_GRPC_TIMEOUT_SECONDS
Received message exceeds the maximum configured message size | Repository definition too large | Increase DAGSTER_GRPC_MAX_RX_BYTES
Connection refused on code server | Code server not started or wrong port | Check DAGSTER_GRPC_PORT or DAGSTER_GRPC_SOCKET
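
For the "Connection refused" case, a quick way to separate "server not started" from "wrong port" is to test whether anything accepts TCP connections on the expected address. The sketch below uses only the standard library; `code_server_port_is_open` is a hypothetical helper, and it confirms only that a TCP listener exists, not that a healthy gRPC server is behind it.

```python
import socket


def code_server_port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        # create_connection raises OSError subclasses for refused
        # connections, unreachable hosts, and timeouts alike.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `code_server_port_is_open("localhost", 4000)` checks a code server assumed to listen on port 4000; the host and port here are placeholders for the values in your deployment.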

Compatibility Notes

  • Repository loading: Has a higher default timeout (180s vs 60s for general calls) because loading large repositories with many assets can be slow.
  • Shutdown grace period: Defaults to the maximum of all configured timeouts to ensure in-flight calls complete during graceful shutdown.
  • Unix sockets vs TCP: Unix sockets (DAGSTER_GRPC_SOCKET) provide better performance for same-machine communication. TCP (DAGSTER_GRPC_PORT) is required for cross-machine setups.
  • Code server auto-restart: The code server restarts every 30 seconds by default to pick up code changes. Set DAGSTER_CODE_SERVER_AUTO_RESTART_INTERVAL to change this interval in production.
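
The Unix socket versus TCP choice shows up in the target string a gRPC client connects to. The sketch below builds such a target from the two server variables, using gRPC's standard `host:port` and `unix:` address forms. The helper `grpc_target` is hypothetical, and both the precedence of the port over the socket and the use of these variables on the client side are assumptions made for illustration.

```python
import os


def grpc_target(host="localhost", environ=None):
    """Build a gRPC target string from DAGSTER_GRPC_PORT or DAGSTER_GRPC_SOCKET.

    Hypothetical helper. If both variables are set, the TCP port wins
    (an assumption, not documented behavior).
    """
    env = os.environ if environ is None else environ
    port = env.get("DAGSTER_GRPC_PORT")
    socket_path = env.get("DAGSTER_GRPC_SOCKET")
    if port:
        # TCP target: works across machines.
        return f"{host}:{int(port)}"
    if socket_path:
        # Unix domain socket target: same-machine communication only.
        return "unix:" + os.path.abspath(socket_path)
    raise ValueError("set DAGSTER_GRPC_PORT or DAGSTER_GRPC_SOCKET")
```

A target of either form can be handed to `grpc.insecure_channel(...)` when grpcio is installed; the surrounding client setup is omitted here.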
