Implementation:Apache Spark SparkSession Builder

From Leeroopedia
Revision as of 14:25, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Apache_Spark_SparkSession_Builder.md)


Metadata

Field Value
Source Type Doc
Source Name Spark Quick Start
Source URL https://spark.apache.org/docs/latest/quick-start.html
Source Type Repo
Source Name Apache Spark
Source URL https://github.com/apache/spark
Domains Application_Development
Type Wrapper Doc

Overview

Wrapper documentation for the SparkSession.builder API used to create unified Spark application entry points.

Description

SparkSession.builder is the standard way to create or retrieve a SparkSession in Spark applications. It is available in Python, Scala, and Java with the same builder-pattern semantics (in Java the entry point is written SparkSession.builder()). The getOrCreate() method provides singleton-like behavior: it returns an existing session if one is already active in the JVM, or creates a new one with the specified configuration.

Key characteristics:

  • Fluent API: method chaining for readable configuration
  • Cross-language consistency: the same semantics in Python, Scala, and Java
  • Singleton guarantee: getOrCreate() reuses an existing session instead of creating a duplicate
  • Lazy materialization: no resources are allocated until getOrCreate() is called
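
The characteristics above can be illustrated with a toy model in plain Python. This is only a sketch of the pattern and not Spark's implementation: ToySession, ToyBuilder, and their snake_case methods are hypothetical names invented for this example. It shows chaining (each setter returns the builder), lazy materialization (nothing is created before get_or_create()), and reuse of an existing instance. Applying the new options to the reused instance loosely mirrors what Spark's API documentation describes for getOrCreate().

```python
class ToySession:
    """Stand-in for a session; it only remembers its configuration."""

    _active = None  # the single shared instance, once one exists

    def __init__(self, conf):
        self.conf = dict(conf)


class ToyBuilder:
    """Collects options; no session exists until get_or_create() runs."""

    def __init__(self):
        self._options = {}

    def config(self, key, value):
        self._options[key] = value
        return self  # returning self is what makes chaining possible

    def app_name(self, name):
        # Convenience setter expressed in terms of config().
        return self.config("app.name", name)

    def get_or_create(self):
        if ToySession._active is None:
            # First call: materialize the session with the collected options.
            ToySession._active = ToySession(self._options)
        else:
            # Later calls: reuse the instance and apply the new options to it.
            ToySession._active.conf.update(self._options)
        return ToySession._active
```

Two builders configured differently still yield the same object, which is the behavior the singleton bullet refers to.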

Usage

Call it at the start of a Spark application to initialize the session. It is typically the first Spark API call in a program, because DataFrame and SQL operations go through the resulting session.
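
As a rough sketch of that lifecycle, assuming a working PySpark installation: the session is obtained first, handed to the application logic, and stopped at the end. The helper name run_app and its job callback are hypothetical and not part of Spark.

```python
def run_app(app_name, job):
    """Create (or reuse) a session, run `job` with it, then stop the session."""
    # Imported inside the function so defining the helper does not require PySpark.
    from pyspark.sql import SparkSession

    # First Spark call of the program: obtain the session.
    spark = SparkSession.builder.appName(app_name).getOrCreate()
    try:
        return job(spark)
    finally:
        # Release the session's resources even if the job raised.
        spark.stop()
```

For example, run_app("MyApp", lambda spark: spark.range(5).count()) would build a five-row DataFrame and count its rows before the session is stopped.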

Code Reference

Source: docs/quick-start.md (L258-405)

Signature

Python:

SparkSession.builder.appName(name).master(url).config(key, value).getOrCreate()

Scala:

SparkSession.builder.appName(name).master(url).config(key, value).getOrCreate()

Imports

Python:

from pyspark.sql import SparkSession

Scala:

import org.apache.spark.sql.SparkSession

I/O

Inputs

Parameter Type Required Description
appName String No (recommended) Human-readable name for the Spark application; Spark generates a random name if none is set
master String No Cluster manager URL (e.g., local[*], spark://host:port, yarn)
config properties Map No Key-value configuration pairs (e.g., spark.executor.memory)
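
The three inputs can also be viewed as plain configuration keys: appName and master are shorthands for the spark.app.name and spark.master properties. The sketch below, assuming PySpark is installed when build_session is called, collects the inputs into a dict and feeds them to the builder through config(). The helper names session_options and build_session are hypothetical.

```python
def session_options(app_name=None, master=None, extra_conf=None):
    """Translate the three builder inputs into Spark configuration keys."""
    options = dict(extra_conf or {})
    if app_name is not None:
        options["spark.app.name"] = app_name  # what appName(name) sets
    if master is not None:
        options["spark.master"] = master  # what master(url) sets
    return options


def build_session(options):
    """Create or reuse a SparkSession from a dict of configuration keys."""
    # Imported here so session_options() stays usable without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder
    for key, value in options.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

A call such as build_session(session_options("MyApp", master="local[*]", extra_conf={"spark.executor.memory": "2g"})) should be equivalent to chaining appName, master, and config by hand. Keeping the options in a dict makes them easy to log or load from a file.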

Outputs

Output Type Description
SparkSession SparkSession instance The unified entry point for all Spark operations

Examples

Python

spark = SparkSession.builder.appName("MyApp").getOrCreate()

Scala

val spark = SparkSession.builder.appName("MyApp").getOrCreate()

Java

SparkSession spark = SparkSession.builder().appName("MyApp").getOrCreate();
