Environment:Spotify Luigi SQLAlchemy Database
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Database |
| Last Updated | 2026-02-10 07:00 GMT |
Overview
SQLAlchemy ORM environment for Luigi's database task history persistence and generic database-agnostic data loading.
Description
This environment provides the SQLAlchemy dependency used by two Luigi components: DbTaskHistory (task execution history storage using SQLAlchemy ORM) and the sqla contrib module (generic database-agnostic `CopyToTable` using SQLAlchemy Core). DbTaskHistory uses declarative ORM models to store task events, parameters, and status in any SQLAlchemy-supported database. The sqla contrib module uses SQLAlchemy Core for database-agnostic table operations.
Usage
Use this environment when enabling task history recording in the central scheduler (requires `[task_history] db_connection` config) or when using the database-agnostic CopyToTable from `luigi.contrib.sqla`. Required for the Central_Scheduler_Deployment workflow when `record_task_history=True`.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, Windows | Cross-platform |
| Database | Any SQLAlchemy-supported database | SQLite, PostgreSQL, MySQL, etc. |
| Network | Access to database server | Unless using SQLite (file-based) |
Dependencies
Python Packages
- `sqlalchemy` (required; version varies by use case)
- `luigi` (core)
- Database-specific driver (e.g., `psycopg2` for PostgreSQL, `mysql-connector-python` for MySQL)
Credentials
Configuration in `luigi.cfg`:
- `[task_history] db_connection`: SQLAlchemy connection string (e.g., `sqlite:///luigi-task-hist.db`, `postgresql://user:pass@host/db`)
The connection string format follows SQLAlchemy conventions and includes credentials inline.
Quick Install
# For task history with SQLite (simplest)
pip install luigi sqlalchemy
# For task history with PostgreSQL
pip install luigi sqlalchemy "psycopg2<3.0"
# For generic CopyToTable via sqla
pip install luigi sqlalchemy
Code Evidence
SQLAlchemy imports for DbTaskHistory from `luigi/db_task_history.py:47-52`:
import sqlalchemy
import sqlalchemy.ext.declarative
import sqlalchemy.orm
import sqlalchemy.orm.collections
from sqlalchemy.engine import reflection
Base = sqlalchemy.ext.declarative.declarative_base()
Database initialization from `luigi/db_task_history.py:80-83`:
connection_string = config.get('task_history', 'db_connection')
self.engine = sqlalchemy.create_engine(connection_string)
self.session_factory = sqlalchemy.orm.sessionmaker(
bind=self.engine, expire_on_commit=False)
Base.metadata.create_all(self.engine)
SQLAlchemy usage in sqla contrib from `luigi/contrib/sqla.py:150-265`:
import sqlalchemy
# Engine and table operations
engine = sqlalchemy.create_engine(connection_string)
metadata = sqlalchemy.MetaData()
table = sqlalchemy.Table(...)
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `ImportError: No module named 'sqlalchemy'` | SQLAlchemy not installed | `pip install sqlalchemy` |
| `OperationalError: unable to open database file` | SQLite path not writable | Check file permissions on the db_connection path |
| `NoSuchModuleError: Can't load plugin` | Database driver not installed | Install the appropriate driver (e.g., psycopg2, mysql-connector-python) |
| `[task_history] db_connection not configured` | Missing configuration | Add `[task_history] db_connection` to luigi.cfg |
Compatibility Notes
- SQLAlchemy versions: The test suite pins `sqlalchemy < 1.4` for some test groups, but the production code is compatible with newer versions for basic usage.
- Database backends: Any SQLAlchemy-supported backend works. SQLite is simplest for development; PostgreSQL or MySQL recommended for production.
- Auto-creation: `Base.metadata.create_all(self.engine)` automatically creates the task history tables if they don't exist.