Implementation:Apache Kafka Connect Distributed Script

From Leeroopedia
Revision as of 14:18, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Apache_Kafka_Connect_Distributed_Script.md)


Knowledge Sources
Domains Kafka_Connect, Distributed_Systems, CLI
Last Updated 2026-02-09 12:00 GMT

Overview

External tool documentation for launching Kafka Connect in distributed mode with the connect-distributed.sh shell script.

Description

The connect-distributed.sh script is the primary entry point for running Kafka Connect in distributed mode. It is a shell wrapper that sets Log4j2 logging defaults, JVM heap settings, and the process name, then delegates to kafka-run-class.sh to launch the org.apache.kafka.connect.cli.ConnectDistributed Java class. In distributed mode, workers that share the same group.id coordinate through the Kafka cluster and rebalance connector tasks among themselves, which provides scalability and fault tolerance.
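The wrapper logic described above can be approximated with a short dry-run sketch. It is not the script's actual source: the function name `connect_distributed_dry_run` is hypothetical, the default process name is assumed to be passed as `-name connectDistributed` through EXTRA_ARGS, and the sketch prints the resulting command instead of launching a JVM.

```shell
#!/bin/sh
# Illustrative sketch only: mimics the wrapper's default handling and prints
# the command it would run. The function name is hypothetical.
connect_distributed_dry_run() (
  base_dir="$1"   # directory that holds the scripts, e.g. bin
  shift

  # A worker properties file is required.
  if [ "$#" -lt 1 ]; then
    echo "USAGE: connect-distributed.sh [-daemon] connect-distributed.properties" >&2
    exit 1
  fi

  # Logging default, used only when the caller has not set KAFKA_LOG4J_OPTS.
  if [ -z "${KAFKA_LOG4J_OPTS:-}" ]; then
    KAFKA_LOG4J_OPTS="-Dlog4j2.configurationFile=$base_dir/../config/connect-log4j2.yaml"
  fi

  # Heap default, used only when the caller has not set KAFKA_HEAP_OPTS.
  if [ -z "${KAFKA_HEAP_OPTS:-}" ]; then
    KAFKA_HEAP_OPTS="-Xms256M -Xmx2G"
  fi

  # Process naming (assumed default), kept only when EXTRA_ARGS is unset.
  EXTRA_ARGS="${EXTRA_ARGS--name connectDistributed}"

  # -daemon is consumed by the wrapper and forwarded to kafka-run-class.sh.
  if [ "$1" = "-daemon" ]; then
    EXTRA_ARGS="-daemon $EXTRA_ARGS"
    shift
  fi

  echo "KAFKA_HEAP_OPTS=$KAFKA_HEAP_OPTS"
  echo "KAFKA_LOG4J_OPTS=$KAFKA_LOG4J_OPTS"
  echo "$base_dir/kafka-run-class.sh $EXTRA_ARGS org.apache.kafka.connect.cli.ConnectDistributed $*"
)

# Show what a daemon start would resolve to.
connect_distributed_dry_run bin -daemon config/connect-distributed.properties
```

The function body runs in a subshell, so the defaults it computes do not leak into the calling shell, which makes it safe to try in an interactive session.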

Usage

Use this script to start a Kafka Connect worker in distributed mode when multiple workers share connector and task assignments across a cluster. This is the recommended deployment model for production.

Code Reference

Source Location

Signature

#!/bin/bash
# Usage: connect-distributed.sh [-daemon] connect-distributed.properties

# Environment variables (defaults apply when the variable is not already set):
#   KAFKA_LOG4J_OPTS   - Log4j2 configuration (default: -Dlog4j2.configurationFile=.../config/connect-log4j2.yaml)
#   KAFKA_HEAP_OPTS    - JVM heap settings (default: -Xms256M -Xmx2G)
#   EXTRA_ARGS         - Additional arguments for kafka-run-class.sh (default: -name connectDistributed)

# Delegates to kafka-run-class.sh in the same directory as this script:
exec "$(dirname "$0")/kafka-run-class.sh" $EXTRA_ARGS org.apache.kafka.connect.cli.ConnectDistributed "$@"

Import

# No import required; invoke the script from the root of the Kafka installation:
bin/connect-distributed.sh config/connect-distributed.properties

I/O Contract

Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| properties_file | File path | Yes | Path to the Connect distributed worker configuration file (e.g., connect-distributed.properties) |
| -daemon | Flag | No | Run the Connect worker as a background daemon process |
| KAFKA_LOG4J_OPTS | Env var | No | Custom Log4j2 configuration; defaults to connect-log4j2.yaml |
| KAFKA_HEAP_OPTS | Env var | No | JVM heap settings; defaults to -Xms256M -Xmx2G |
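Since the only required input is the worker properties file, a minimal example of one may help. The sketch below writes such a file with a shell heredoc. The keys are standard Connect distributed-worker settings, but every value (broker address, group id, topic names, replication factor of 1) is a placeholder for a single-broker test setup, and the helper name `write_worker_config` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: writes a minimal distributed-worker config to the given path.
write_worker_config() {
  cat > "$1" <<'EOF'
# Kafka brokers used for coordination and internal storage (placeholder address)
bootstrap.servers=localhost:9092
# Workers that share the same group.id form one Connect cluster
group.id=connect-cluster
# Converters for record keys and values
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Internal topics that hold the cluster's shared state
offset.storage.topic=connect-offsets
config.storage.topic=connect-configs
status.storage.topic=connect-status
# Replication factor 1 is only suitable for a single-broker test setup
offset.storage.replication.factor=1
config.storage.replication.factor=1
status.storage.replication.factor=1
EOF
}

# Write the file to a scratch location.
write_worker_config "${TMPDIR:-/tmp}/connect-distributed.properties"
```

The resulting path can then be passed as the properties_file argument. For a real deployment, the replication factors and broker list would need to match the target cluster.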

Outputs

| Name | Type | Description |
| --- | --- | --- |
| Connect worker process | JVM process | A running Kafka Connect distributed worker that joins the Connect cluster |
| Log files | Files | Log output as configured by Log4j2 (connect-log4j2.yaml) |

Usage Examples

Start Connect in Distributed Mode

# Start a Kafka Connect distributed worker with default settings
bin/connect-distributed.sh config/connect-distributed.properties

Start as Daemon

# Start Connect distributed worker in the background
bin/connect-distributed.sh -daemon config/connect-distributed.properties

Custom Heap and Logging

# Override heap settings and logging configuration
export KAFKA_HEAP_OPTS="-Xms512M -Xmx4G"
export KAFKA_LOG4J_OPTS="-Dlog4j2.configurationFile=/path/to/custom-log4j2.yaml"
bin/connect-distributed.sh config/connect-distributed.properties
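Because `export` changes the variable for the rest of the shell session, it can be useful to scope an override to a single launch. The sketch below is one way to do that. The helper name `with_heap_opts` is hypothetical, and the demo uses a stand-in command rather than the real script so that it can run without a Kafka installation.

```shell
#!/bin/sh
# Hypothetical helper: runs a command with KAFKA_HEAP_OPTS set for that command only.
# The body runs in a subshell, so the calling shell's environment is untouched.
with_heap_opts() (
  KAFKA_HEAP_OPTS="$1"
  export KAFKA_HEAP_OPTS
  shift
  "$@"
)

# Demo with a stand-in command. In practice the command would be:
#   bin/connect-distributed.sh config/connect-distributed.properties
with_heap_opts "-Xms512M -Xmx4G" sh -c 'echo "worker would see: $KAFKA_HEAP_OPTS"'
```

The same effect is available inline by prefixing the assignment to the command, for example `KAFKA_HEAP_OPTS="-Xms512M -Xmx4G" bin/connect-distributed.sh config/connect-distributed.properties`.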
