Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:ArroyoSystems Arroyo Planner Extensions

From Leeroopedia
Revision as of 14:27, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/ArroyoSystems_Arroyo_Planner_Extensions.md)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)


Overview

Planner Extensions defines the ArroyoExtension trait and supporting types that form the interface between DataFusion's logical plan framework and Arroyo's streaming dataflow graph construction. It also provides the type dispatch mechanism for converting DataFusion extension nodes to Arroyo extensions.

Description

The module defines:

  • ArroyoExtension trait: The core interface that all Arroyo logical plan extensions implement. Methods include:
    • node_name(): Optional named node for memoization in the graph builder
    • plan_node(): Converts the extension to a NodeWithIncomingEdges for the dataflow graph
    • output_schema(): Returns the Arroyo output schema
    • transparent(): Whether the node should be treated as transparent for graph construction (default false)
  • NodeWithIncomingEdges: Pairs a LogicalNode with its incoming LogicalEdge connections.
  • Type dispatch via TryFrom: Implements conversion from &dyn UserDefinedLogicalNode to &dyn ArroyoExtension by attempting downcast to each known extension type in sequence: TableSourceExtension, WatermarkNode, SinkExtension, KeyCalculationExtension, AggregateExtension, RemoteTableExtension, JoinExtension, WindowFunctionExtension, AsyncUDFExtension, ToDebeziumExtension, DebeziumUnrollingExtension, UpdatingAggregateExtension, LookupJoin, and ProjectionExtension.
  • AsyncUDFExtension: An extension for asynchronous user-defined function calls, supporting ordered, unordered, and in-order modes.
  • TimestampAppendExtension trait: A helper trait for extensions that need to append timestamp fields to their output schemas.

The module re-exports all extension submodules: aggregate, debezium, join, key_calculation, lookup, projection, remote_table, sink, table_source, updating_aggregate, watermark_node, and window_fn.

Usage

Every Arroyo-specific logical plan node implements ArroyoExtension. The plan graph builder uses the trait methods to construct the physical dataflow graph.

Code Reference

Source Location

crates/arroyo-planner/src/extension/mod.rs

Signature

pub(crate) trait ArroyoExtension: Debug {
    fn node_name(&self) -> Option<NamedNode>;
    fn plan_node(
        &self,
        planner: &Planner,
        index: usize,
        input_schemas: Vec<ArroyoSchemaRef>,
    ) -> Result<NodeWithIncomingEdges>;
    fn output_schema(&self) -> ArroyoSchema;
    fn transparent(&self) -> bool { false }
}

pub(crate) struct NodeWithIncomingEdges {
    pub node: LogicalNode,
    pub edges: Vec<LogicalEdge>,
}

impl<'a> TryFrom<&'a dyn UserDefinedLogicalNode> for &'a dyn ArroyoExtension {
    type Error = DataFusionError;
    fn try_from(node: &'a dyn UserDefinedLogicalNode) -> Result<Self, Self::Error>
}

Import

use crate::extension::{ArroyoExtension, NodeWithIncomingEdges};

I/O Contract

Inputs

Name Type Description
planner &Planner Planner context for physical plan generation
index usize Node index in the graph
input_schemas Vec<ArroyoSchemaRef> Schemas from upstream nodes

Outputs

Name Type Description
NodeWithIncomingEdges struct Graph node and its incoming edges
ArroyoSchema struct Output schema of the extension node

Usage Examples

// Converting a DataFusion extension node to an Arroyo extension
if let LogicalPlan::Extension(Extension { node }) = &plan {
    let arroyo_ext: &dyn ArroyoExtension = node.as_ref().try_into()?;
    let output = arroyo_ext.plan_node(&planner, idx, input_schemas)?;
    graph.add_node(output.node);
}

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment