Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Environment:Spotify Luigi SQLAlchemy Database

From Leeroopedia


Knowledge Sources
Domains Infrastructure, Database
Last Updated 2026-02-10 07:00 GMT

Overview

SQLAlchemy ORM environment for Luigi's database task history persistence and generic database-agnostic data loading.

Description

This environment provides the SQLAlchemy dependency used by two Luigi components: DbTaskHistory (task execution history storage using SQLAlchemy ORM) and the sqla contrib module (generic database-agnostic `CopyToTable` using SQLAlchemy Core). DbTaskHistory uses declarative ORM models to store task events, parameters, and status in any SQLAlchemy-supported database. The sqla contrib module uses SQLAlchemy Core for database-agnostic table operations.

Usage

Use this environment when enabling task history recording in the central scheduler (requires `[task_history] db_connection` config) or when using the database-agnostic CopyToTable from `luigi.contrib.sqla`. Required for the Central_Scheduler_Deployment workflow when `record_task_history=True`.

System Requirements

Category Requirement Notes
OS Linux, macOS, Windows Cross-platform
Database Any SQLAlchemy-supported database SQLite, PostgreSQL, MySQL, etc.
Network Access to database server Unless using SQLite (file-based)

Dependencies

Python Packages

  • `sqlalchemy` (required; version varies by use case)
  • `luigi` (core)
  • Database-specific driver (e.g., `psycopg2` for PostgreSQL, `mysql-connector-python` for MySQL)

Credentials

Configuration in `luigi.cfg`:

  • `[task_history] db_connection`: SQLAlchemy connection string (e.g., `sqlite:///luigi-task-hist.db`, `postgresql://user:pass@host/db`)

The connection string format follows SQLAlchemy conventions and includes credentials inline.

Quick Install

# For task history with SQLite (simplest)
pip install luigi sqlalchemy

# For task history with PostgreSQL
pip install luigi sqlalchemy "psycopg2<3.0"

# For generic CopyToTable via sqla
pip install luigi sqlalchemy

Code Evidence

SQLAlchemy imports for DbTaskHistory from `luigi/db_task_history.py:47-52`:

import sqlalchemy
import sqlalchemy.ext.declarative
import sqlalchemy.orm
import sqlalchemy.orm.collections
from sqlalchemy.engine import reflection

Base = sqlalchemy.ext.declarative.declarative_base()

Database initialization from `luigi/db_task_history.py:80-83`:

connection_string = config.get('task_history', 'db_connection')
self.engine = sqlalchemy.create_engine(connection_string)
self.session_factory = sqlalchemy.orm.sessionmaker(
    bind=self.engine, expire_on_commit=False)
Base.metadata.create_all(self.engine)

SQLAlchemy usage in sqla contrib from `luigi/contrib/sqla.py:150-265`:

import sqlalchemy

# Engine and table operations
engine = sqlalchemy.create_engine(connection_string)
metadata = sqlalchemy.MetaData()
table = sqlalchemy.Table(...)

Common Errors

Error Message Cause Solution
`ImportError: No module named 'sqlalchemy'` SQLAlchemy not installed `pip install sqlalchemy`
`OperationalError: unable to open database file` SQLite path not writable Check file permissions on the db_connection path
`NoSuchModuleError: Can't load plugin` Database driver not installed Install the appropriate driver (e.g., psycopg2, mysql-connector-python)
`[task_history] db_connection not configured` Missing configuration Add `[task_history] db_connection` to luigi.cfg

Compatibility Notes

  • SQLAlchemy versions: The test suite pins `sqlalchemy < 1.4` for some test groups, but the production code is compatible with newer versions for basic usage.
  • Database backends: Any SQLAlchemy-supported backend works. SQLite is simplest for development; PostgreSQL or MySQL recommended for production.
  • Auto-creation: `Base.metadata.create_all(self.engine)` automatically creates the task history tables if they don't exist.

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment