Principle:Apache Airflow Task Communication

From Leeroopedia


Knowledge Sources
Domains Workflow_Orchestration, Data_Engineering
Last Updated 2026-02-08 00:00 GMT

Overview

A mechanism for passing data between tasks in an Airflow DAG via a shared metadata store.

Description

Task communication in Airflow is achieved through XCom (short for "cross-communication"), a system for exchanging small amounts of data between tasks. By default, XCom values are stored in the metadata database as JSON, and each value is keyed by the combination of dag_id, task_id, run_id, map_index, and a user-defined key. The TaskFlow API automatically pushes a task's return value as an XCom (under the default key return_value) and pulls it when the value is passed as an argument to a downstream task, which makes inter-task data flow implicit and Pythonic.

Usage

Use XCom when tasks need to share small amounts of metadata, configuration, or results. For large datasets, use external storage (S3, GCS) and pass only references via XCom. XCom is essential for dynamic workflows where downstream task behavior depends on upstream results.
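The "pass only references" advice can be illustrated with a minimal, Airflow-free sketch. It assumes a local temporary directory as a stand-in for S3 or GCS; the function names `extract_large_dataset` and `summarize` and the `uri` / `row_count` fields are hypothetical, chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path


def extract_large_dataset(out_dir: Path) -> dict:
    """Write the bulky payload to storage and return only a small reference."""
    rows = [{"id": i, "value": i * i} for i in range(10_000)]
    path = out_dir / "extract.json"  # stand-in for an object-store location
    path.write_text(json.dumps(rows))
    # This small dict is what a task would return, and so what would go into XCom.
    return {"uri": str(path), "row_count": len(rows)}


def summarize(ref: dict) -> int:
    """Downstream step: resolve the reference and load the data itself."""
    rows = json.loads(Path(ref["uri"]).read_text())
    return sum(row["value"] for row in rows)


with tempfile.TemporaryDirectory() as tmp:
    ref = extract_large_dataset(Path(tmp))
    ref_size = len(json.dumps(ref))                  # bytes that would travel via XCom
    payload_size = Path(ref["uri"]).stat().st_size   # bytes kept in external storage
    total = summarize(ref)
```

The reference stays small no matter how large the dataset grows, so the size of the XCom value no longer depends on the size of the data.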

Theoretical Basis

Message Passing Pattern:

  • Producer: Task pushes a value with a key (explicit or via return value)
  • Store: Metadata database persists the value as JSON
  • Consumer: Downstream task pulls the value by key
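The three roles above can be sketched as a toy in-memory model. This is not Airflow's implementation: `ToyXComStore` and `XComKey` are hypothetical names, a plain dict stands in for the metadata database, and the example assumes that `map_index` is -1 for a task that is not dynamically mapped.

```python
import json
from typing import Any, NamedTuple


class XComKey(NamedTuple):
    """Composite key; map_index is assumed to be -1 for unmapped tasks."""
    dag_id: str
    task_id: str
    run_id: str
    map_index: int
    key: str


class ToyXComStore:
    """Hypothetical in-memory stand-in for the metadata database."""

    def __init__(self) -> None:
        self._rows: dict[XComKey, str] = {}

    def push(self, xcom_key: XComKey, value: Any) -> None:
        # Store: persist the value as JSON text (TypeError if not serializable).
        self._rows[xcom_key] = json.dumps(value)

    def pull(self, xcom_key: XComKey) -> Any:
        # Look the value up by its full key and decode it.
        return json.loads(self._rows[xcom_key])


store = ToyXComStore()
key = XComKey("etl", "extract", "manual__2026-02-08", -1, "row_count")
store.push(key, {"rows": 1200})   # Producer
pulled = store.pull(key)          # Consumer
```

Because the value makes a round trip through JSON, the consumer receives a copy rather than the producer's original object.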

Automatic XCom (TaskFlow):

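# Pseudo-code: TaskFlow automatic XCom (both callables are @task-decorated, called inside a DAG definition)
result = upstream_task()           # Returns a reference (XComArg), not the value; the return value is auto-pushed at run time
downstream_task(data=result)       # The XCom is auto-pulled and passed as `data` when the downstream task runs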
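To make the implicit push and pull more concrete, here is a toy emulation in plain Python. It is only a sketch of the idea, not how Airflow implements TaskFlow: the `task` decorator, the `Ref` class, and the `XCOM` dict are hypothetical stand-ins, and the loop at the end plays the role of a scheduler.

```python
from typing import Any, Callable

XCOM: dict[tuple[str, str], Any] = {}   # toy store: (task_id, key) -> value


class Ref:
    """Placeholder returned at definition time, loosely analogous to an XComArg."""

    def __init__(self, task_id: str, fn: Callable[..., Any], kwargs: dict[str, Any]):
        self.task_id, self.fn, self.kwargs = task_id, fn, kwargs

    def run(self) -> None:
        # Auto-pull: replace upstream references with their stored return values.
        resolved = {
            name: XCOM[(arg.task_id, "return_value")] if isinstance(arg, Ref) else arg
            for name, arg in self.kwargs.items()
        }
        # Auto-push: store the return value under the default key.
        XCOM[(self.task_id, "return_value")] = self.fn(**resolved)


def task(fn: Callable[..., Any]) -> Callable[..., Ref]:
    # Calling the decorated function does not run it; it only records a reference.
    return lambda **kwargs: Ref(fn.__name__, fn, kwargs)


@task
def upstream_task():
    return {"rows": 3}


@task
def downstream_task(data):
    return data["rows"] * 2


result = upstream_task()               # a Ref, not {"rows": 3}
final = downstream_task(data=result)
for step in (result, final):           # a real scheduler would order these by dependency
    step.run()
```

The point of the sketch is the two-phase behavior: wiring tasks together happens when the workflow is defined, while values move through the store only when each task actually runs.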

Key constraints:

  • Values must be JSON-serializable
  • Default size limit depends on database backend (PostgreSQL JSONB is most flexible)
  • Custom XCom backends can override storage mechanism
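The JSON-serializability constraint can be checked before a value is returned from a task. The sketch below uses the standard-library `json` module as a stand-in; Airflow's own serializer may accept additional types, so treat this as a conservative check under that assumption. The field names are hypothetical.

```python
import json
from datetime import datetime, timezone

raw = {
    "finished_at": datetime(2026, 2, 8, tzinfo=timezone.utc),
    "tables": {"a", "b"},
}

# Plain json rejects datetime and set objects with a TypeError.
try:
    json.dumps(raw)
    serializable = True
except TypeError:
    serializable = False

# Convert to JSON-native types (str, list) before returning the value.
safe = {
    "finished_at": raw["finished_at"].isoformat(),
    "tables": sorted(raw["tables"]),
}
encoded = json.dumps(safe)
```

Converting at the producer keeps the stored value readable by any consumer, at the cost of the consumer having to parse strings such as timestamps back into richer types.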

Related Pages

Implemented By
