Implementation:Spotify Luigi Luigid Server

Overview

luigid is the concrete tool, provided by Luigi, for running a central scheduling service that orchestrates production pipelines.

Description

The luigid command launches the Luigi central scheduler as a standalone server process. It combines three subsystems:

  • Command-line entry point (luigi.cmdline.luigid): Parses arguments for port, address, PID file, log directory, state path, and background mode. Optionally overrides the scheduler's state_path via the configuration system before starting the server.
  • Tornado HTTP server (luigi.server.run): Creates a Scheduler instance, loads any persisted state from disk, binds a Tornado web application to the specified port (default 8082) or Unix socket, and starts the event loop. A periodic callback prunes stale tasks and workers every 60 seconds. Signal handlers (SIGINT, SIGTERM, SIGQUIT) ensure state is dumped to disk on shutdown.
  • Daemon process management (luigi.process.daemonize): When the --background flag is set, the process detaches from the terminal using the python-daemon library, redirects stdout/stderr to date-stamped log files under the specified log directory, writes a PID file for process management, and sets up rotating log handlers.
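The dump-on-shutdown behavior described for the server can be illustrated with a small standard-library sketch. Everything here is hypothetical: TinyState, make_shutdown_handler, and install_shutdown_handlers are illustrative names, not Luigi's Scheduler or its actual signal-handling code, and the pickle layout is an assumption made only for this example.

```python
import os
import pickle
import signal
import tempfile


class TinyState:
    """Hypothetical stand-in for scheduler state; not Luigi's Scheduler."""

    def __init__(self, state_path):
        self.state_path = state_path
        self.tasks = {}

    def dump(self):
        # Serialize the in-memory state to the configured path.
        with open(self.state_path, "wb") as handle:
            pickle.dump(self.tasks, handle)

    def load(self):
        # Start with empty state when no state file exists yet.
        if os.path.exists(self.state_path):
            with open(self.state_path, "rb") as handle:
                self.tasks = pickle.load(handle)


def make_shutdown_handler(state):
    """Build a signal handler that dumps state and then exits."""

    def handler(signum, frame):
        state.dump()
        raise SystemExit(0)

    return handler


def install_shutdown_handlers(state):
    """Register the handler for SIGINT, SIGTERM, and SIGQUIT where available."""
    handler = make_shutdown_handler(state)
    # SIGQUIT does not exist on every platform, so look signals up by name.
    for name in ("SIGINT", "SIGTERM", "SIGQUIT"):
        sig = getattr(signal, name, None)
        if sig is not None:
            signal.signal(sig, handler)
    return handler
```

A process would call install_shutdown_handlers(state) once at startup, after state.load(), so that a later termination signal writes the state file before the process exits.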

The server exposes all scheduler methods as HTTP endpoints under /api/{method_name}, where each method registered via the @rpc_method() decorator on the Scheduler class becomes a callable endpoint that accepts JSON-encoded parameters.
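As a rough illustration of the /api/{method_name} convention, the sketch below builds the URL for one RPC call. It assumes the JSON-encoded parameters travel in a data query parameter, as the curl verification example on this page does; build_rpc_url is a hypothetical helper, not part of Luigi.

```python
import json
from urllib.parse import urlencode


def build_rpc_url(method, params, host="localhost", port=8082):
    """Return the URL for one scheduler RPC call.

    Assumption: parameters are sent as a JSON string in the `data`
    query parameter, mirroring the curl example on this page.
    """
    # Compact separators avoid spaces in the encoded JSON payload.
    payload = json.dumps(params, separators=(",", ":"))
    query = urlencode({"data": payload})
    return "http://{}:{}/api/{}?{}".format(host, port, method, query)


if __name__ == "__main__":
    print(build_rpc_url("get_work", {"worker": "test"}))
```

The resulting string can be passed to any HTTP client; keeping URL construction separate from the request makes the encoding easy to check without a running server.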

Usage

Use luigid when:

  • You are deploying Luigi in a multi-worker or multi-machine environment and need a central scheduler.
  • You want persistent task state that survives process restarts.
  • You need the web-based visualizer for monitoring pipeline execution.
  • You need daemonized operation with PID files and log rotation for a production deployment.

Code Reference

Source Location

File                Lines     Role
luigi/cmdline.py    L12-37    luigid() entry point: argument parsing and dispatch
luigi/server.py     L345-393  run(): server startup, event loop, signal handling, shutdown
luigi/scheduler.py  L667-700  Scheduler.__init__(): state initialization, history backend selection
luigi/process.py    L77-126   daemonize(): background process management

Signature

# luigi/cmdline.py
def luigid(argv=sys.argv[1:]):
    """Central luigi server entry point."""

# luigi/server.py
def run(api_port=8082, address=None, unix_socket=None, scheduler=None):
    """Runs one instance of the API server."""

# luigi/scheduler.py
class Scheduler:
    def __init__(self, config=None, resources=None, task_history_impl=None, **kwargs):
        """
        Keyword Arguments:
        :param config: an object of class "scheduler" or None
        :param resources: a dict of str->int constraints
        :param task_history_impl: ignore config and use this object as the task history
        """

    def load(self): ...
    def dump(self): ...

# luigi/process.py
def daemonize(cmd, pidfile=None, logdir=None, api_port=8082, address=None, unix_socket=None):
    """Run cmd as a daemon process."""

Import

# Typically invoked via the command line:
#   luigid --port 8082 --background --pidfile /var/run/luigi/luigid.pid

# Programmatic usage:
import luigi.server
luigi.server.run(api_port=8082)

I/O Contract

Inputs

Parameter      Type  Description
--port         int   TCP port to listen on (default: 8082).
--address      str   Network interface to bind to (default: all interfaces).
--unix-socket  str   Path to a Unix domain socket (alternative to TCP).
--background   flag  Run as a background daemon process.
--pidfile      str   Path to write the PID file for process management.
--logdir       str   Directory for log files (default: /var/log/luigi).
--state-path   str   Override the pickle state file path (default from config: /var/lib/luigi-server/state.pickle).
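To show how these flags map onto parsed values, here is a minimal argparse sketch. It is a hypothetical parser that mirrors the table above, with defaults copied from the table; it is not Luigi's own argument-parsing code, and the program name luigid-sketch is made up for the example.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags; not Luigi's code."""
    parser = argparse.ArgumentParser(prog="luigid-sketch")
    parser.add_argument("--port", type=int, default=8082, help="TCP port to listen on")
    parser.add_argument("--address", default=None, help="network interface to bind to")
    parser.add_argument("--unix-socket", default=None, help="path to a Unix domain socket")
    parser.add_argument("--background", action="store_true", help="run as a daemon")
    parser.add_argument("--pidfile", default=None, help="path of the PID file")
    parser.add_argument("--logdir", default="/var/log/luigi", help="directory for log files")
    parser.add_argument("--state-path", default=None, help="override for the state pickle path")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--port", "9090", "--background"])
    # argparse converts dashes to underscores: --state-path becomes args.state_path.
    print(args.port, args.background, args.state_path)
```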

Outputs

Output              Description
HTTP API on /api/*  JSON-over-HTTP endpoints for all scheduler RPC methods.
Web UI on /         Redirects to the static visualizer at /static/visualiser/index.html.
State pickle file   Serialized scheduler state, persisted on shutdown and at periodic intervals.
PID file            Process identifier written when running in background mode.
Log files           Rotating log files in the configured log directory.
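A PID file is typically consumed by a small liveness check like the sketch below. It assumes the file holds the decimal process ID as text, which is an assumption about the file format rather than something this page states, and it relies on POSIX signal semantics. read_pid and is_running are hypothetical helpers, not part of Luigi.

```python
import os


def read_pid(pidfile):
    """Read a decimal PID from the file (assumed format: the PID as text)."""
    with open(pidfile) as handle:
        return int(handle.read().strip())


def is_running(pidfile):
    """Return True if the PID in pidfile names a live process (POSIX only)."""
    try:
        pid = read_pid(pidfile)
    except (OSError, ValueError):
        # Missing, unreadable, or malformed PID file.
        return False
    try:
        # Signal 0 performs the existence and permission checks
        # without delivering a signal to the target process.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user.
        return True
    return True
```

A stale PID file, left behind when the process died without cleanup, makes is_running return False unless the operating system has since reused that PID, so the check is a hint rather than a guarantee.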

Usage Examples

Starting the Scheduler in Foreground

# Start luigid on port 8082 (foreground)
luigid --port 8082

Starting the Scheduler as a Daemon

# Start luigid as a background daemon with PID file and log directory
luigid --background \
    --port 8082 \
    --pidfile /var/run/luigi/luigid.pid \
    --logdir /var/log/luigi \
    --state-path /var/lib/luigi-server/state.pickle

Starting the Scheduler Programmatically

import luigi.server
from luigi.scheduler import Scheduler

# Create a scheduler with the default configuration
sched = Scheduler()
# Load any persisted state from disk
sched.load()

# Start the server on a custom port
luigi.server.run(api_port=9090, scheduler=sched)

Verifying the Scheduler is Running

# Health check using HEAD request
curl -I http://localhost:8082/

# Check the scheduler API (the URL is quoted so the shell does not interpret "?")
curl 'http://localhost:8082/api/get_work?data=%7B%22worker%22%3A%22test%22%7D'
