Implementation:DataExpert io Data engineer handbook SparkSession Builder

From Leeroopedia


Overview

This page documents how the Data Engineer Handbook repository uses the SparkSession builder. The builder is an external PySpark API (not defined in this repository) that the project's PySpark jobs call to initialize their Spark session.

Type

Wrapper Doc (external PySpark API used by this repo)

Source

players_scd_job.py:L48-51

Signature

SparkSession.builder.master("local").appName("players_scd").getOrCreate() -> SparkSession

Import

from pyspark.sql import SparkSession

Inputs / Outputs

Direction | Name | Type | Description
Input | master URL | str | The Spark master URL (e.g., "local" for local mode)
Input | app name | str | The application name displayed in the Spark UI (e.g., "players_scd")
Output | SparkSession | SparkSession | A fully initialized SparkSession instance ready for DataFrame and SQL operations

Usage Example

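from pyspark.sql import SparkSession

# Build a session, or reuse the active one if it already exists.
spark = (SparkSession.builder
    .master("local")             # run Spark locally
    .appName("players_scd")      # name shown in the Spark UI
    .getOrCreate())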
