Principle:ArroyoSystems Arroyo Connection Profile Management
Overview
The Connection Profile Management principle governs how Arroyo manages reusable connection configurations for external systems. A connection profile stores credentials and endpoint information (such as Kafka bootstrap servers, AWS credentials, or Redis connection strings) that can be shared across multiple connection tables. This follows the separation of concerns pattern, decoupling where to connect from what data to access.
Description
In a stream processing system that interacts with many external systems, connection configuration naturally divides into two layers:
- Profile-level configuration -- Credentials, endpoints, and authentication details that are shared across multiple tables using the same external system. For example, a single Kafka cluster's bootstrap servers and SASL credentials.
- Table-level configuration -- Settings specific to a particular data source or sink, such as the Kafka topic name, consumer group, or read offset.
Separating these two layers gives the system several benefits:
- Credential reuse -- A single set of credentials can be referenced by many connection tables, eliminating duplication and reducing the surface area for credential management errors.
- Centralized credential management -- When credentials rotate (e.g., API key renewal), only the profile needs to be updated rather than every individual table that uses those credentials.
- Testable configurations -- Connection profiles can be tested independently before being used in table definitions, providing early validation of connectivity and authentication.
- Consistent configuration -- All tables referencing the same profile are guaranteed to use the same connection settings, preventing configuration drift.
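The two layers and the credential-rotation benefit can be sketched in a few lines of Python. This is an illustrative model only, not Arroyo's implementation: the class names ConnectionProfile and ConnectionTable, the effective_config method, and the rule that table-level keys win on a conflict are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionProfile:
    """Profile-level settings shared by every table on one external system."""
    name: str
    connector: str
    config: dict


@dataclass
class ConnectionTable:
    """Table-level settings plus a reference to the shared profile."""
    name: str
    profile: ConnectionProfile
    config: dict = field(default_factory=dict)

    def effective_config(self) -> dict:
        # Resolved on every call, so a profile update reaches all tables.
        # Assumption of this sketch: table-level keys override profile keys.
        return {**self.profile.config, **self.config}


profile = ConnectionProfile(
    name="production-kafka",
    connector="kafka",
    config={"bootstrap_servers": "kafka-broker:9092", "password": "secret"},
)
orders = ConnectionTable("orders", profile, {"topic": "orders"})
payments = ConnectionTable("payments", profile, {"topic": "payments"})

# Rotating the credential touches only the profile, not the two tables.
profile.config["password"] = "rotated-secret"
```

Because both tables hold a reference to the profile rather than a copy of it, the rotated password is visible through orders.effective_config() and payments.effective_config() without editing either table.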
The profile lifecycle consists of:
- Creation -- A user provides a name, connector type, and configuration JSON. The system validates the configuration against the connector's schema and persists it.
- Testing -- Optionally, the user can test the profile by triggering a live connection attempt to the external system.
- Reference -- Connection tables reference a profile by ID, inheriting its configuration.
- Deletion -- Profiles can be deleted only if no connection tables reference them (enforced by foreign key constraints).
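The deletion rule can be illustrated with an in-memory SQLite database. The schema below is a hypothetical stand-in chosen for the example (the table and column names are assumptions, and Arroyo's actual schema may differ); it only shows how a foreign key constraint blocks deletion of a profile that is still referenced.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Hypothetical schema for illustration.
conn.execute("""
    CREATE TABLE connection_profiles (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE,
        connector TEXT NOT NULL,
        config TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE connection_tables (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        connection_profile_id INTEGER NOT NULL
            REFERENCES connection_profiles(id) ON DELETE RESTRICT
    )""")

# Creation: persist the profile's name, connector type, and configuration JSON.
cur = conn.execute(
    "INSERT INTO connection_profiles (name, connector, config) VALUES (?, ?, ?)",
    ("production-kafka", "kafka",
     json.dumps({"bootstrap_servers": "kafka-broker:9092"})),
)
profile_id = cur.lastrowid

# Reference: a connection table points at the profile by ID.
conn.execute(
    "INSERT INTO connection_tables (name, connection_profile_id) VALUES (?, ?)",
    ("orders", profile_id),
)


def delete_profile(db: sqlite3.Connection, pid: int) -> bool:
    """Return True if the profile was deleted, False if a table still uses it."""
    try:
        db.execute("DELETE FROM connection_profiles WHERE id = ?", (pid,))
        return True
    except sqlite3.IntegrityError:
        return False


blocked = delete_profile(conn, profile_id)   # still referenced by "orders"
conn.execute("DELETE FROM connection_tables WHERE connection_profile_id = ?",
             (profile_id,))
allowed = delete_profile(conn, profile_id)   # no references remain
```

Pushing the check into the database means the rule holds regardless of which client (UI, REST API, or SQL) issues the delete.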
Theoretical Basis
Connection profiles implement the Template pattern for connection configuration. The Template pattern defines a skeleton of configuration that is filled in by specific instances (connection tables). This is closely related to:
- Flyweight pattern -- Shared configuration state (the profile) is separated from instance-specific state (the table), reducing memory usage and configuration redundancy.
- Separation of Concerns -- Authentication/endpoint configuration is orthogonal to table-level configuration. Mixing them violates the single responsibility principle.
- Don't Repeat Yourself (DRY) -- Without profiles, every Kafka table connecting to the same cluster would need to independently specify bootstrap servers, SASL mechanism, username, and password.
The testability aspect follows the Fail-Fast principle -- by validating connectivity at the profile level before any tables are created, configuration errors are caught early in the workflow rather than at pipeline execution time.
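A minimal sketch of the fail-fast idea follows. It is stricter than the workflow described above, where testing is optional: here a profile is stored only if a connectivity probe succeeds. The names create_profile, ProfileError, and fake_probe are hypothetical, and the probe is a stand-in that does not contact any real system.

```python
from typing import Callable


class ProfileError(Exception):
    """Raised when a profile fails its connection test."""


def create_profile(store: dict, name: str, connector: str, config: dict,
                   probe: Callable[[dict], None]) -> None:
    """Persist a profile only after the connectivity probe succeeds."""
    try:
        probe(config)
    except Exception as exc:
        raise ProfileError(
            f"profile {name!r} failed its connection test: {exc}"
        ) from exc
    store[name] = {"connector": connector, "config": config}


def fake_probe(config: dict) -> None:
    # Stand-in for a live connection attempt; rejects an empty endpoint.
    if not config.get("bootstrap_servers"):
        raise ConnectionError("no bootstrap servers configured")


profiles: dict = {}
create_profile(profiles, "production-kafka", "kafka",
               {"bootstrap_servers": "kafka-broker:9092"}, fake_probe)
```

A misconfigured profile raises ProfileError at creation time, so no table can be defined against it and the error never reaches pipeline execution.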
Usage
Connection profiles are used in the following workflows:
- Web Console -- Users create connection profiles through the UI, providing connector-specific configuration. The UI uses connector metadata (from the Connector Registry) to render appropriate configuration forms.
- REST API -- The POST /v1/connection_profiles endpoint creates profiles, and POST /v1/connection_profiles/test validates them.
- SQL DDL -- In SQL CREATE TABLE statements, the connection_profile option references a previously created profile by name.
- Connection Tables -- When creating a connection table, the connection_profile_id field links the table to a profile.
Example: Creating and Using a Kafka Profile
# Create a Kafka connection profile
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  http://localhost:8000/v1/connection_profiles \
  -d '{
    "name": "production-kafka",
    "connector": "kafka",
    "config": {
      "bootstrap_servers": "kafka-broker:9092",
      "authentication": {
        "sasl_mechanism": "PLAIN",
        "username": "user",
        "password": "secret"
      }
    }
  }'
# Test the profile
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  http://localhost:8000/v1/connection_profiles/test \
  -d '{
    "name": "production-kafka",
    "connector": "kafka",
    "config": { ... }
  }'
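For callers that prefer a script to curl, the same two requests can be assembled with Python's standard library. This is a sketch under assumptions: the helper name profile_request and the placeholder token are hypothetical, the host and paths are taken from the curl example, and the requests are built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same host as the curl example


def profile_request(path: str, token: str, profile: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a profile definition."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(profile).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


profile = {
    "name": "production-kafka",
    "connector": "kafka",
    "config": {
        "bootstrap_servers": "kafka-broker:9092",
        "authentication": {
            "sasl_mechanism": "PLAIN",
            "username": "user",
            "password": "secret",
        },
    },
}

# Test first, then create; both calls carry the same profile body.
test_req = profile_request("/v1/connection_profiles/test", "example-token", profile)
create_req = profile_request("/v1/connection_profiles", "example-token", profile)
# To send a request against a running server: urllib.request.urlopen(create_req)
```

Building the test and create requests from one profile dictionary keeps the tested configuration and the stored configuration identical.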
Example: SQL Reference
CREATE TABLE orders (
    order_id BIGINT,
    customer_id BIGINT,
    amount DOUBLE
) WITH (
    connector = 'kafka',
    connection_profile = 'production-kafka',
    topic = 'orders',
    format = 'json'
);