Implementation:Spotify Luigi Visualiser Web UI

From Leeroopedia
Revision as of 16:48, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Spotify_Luigi_Visualiser_Web_UI.md)

Overview

Concrete tool for real-time visual monitoring of pipeline execution state, provided by Luigi's built-in web visualizer and REST API.

Description

Luigi ships with an integrated web-based monitoring interface served directly by the luigid central scheduler process. The system has two components: a server-side REST API implemented as a Tornado web application, and a client-side single-page application built with jQuery, Mustache.js templates, and D3.js/SVG for graph rendering.

When the scheduler starts, it binds a Tornado HTTP application that serves both the API endpoints (under /api/) and the static visualizer files (under /static/visualiser/). The root URL (/) redirects to static/visualiser/index.html.

Server-side components:

  • RPCHandler (luigi/server.py): A Tornado RequestHandler that dispatches GET/POST requests to /api/{method_name} by looking up the method name in the RPC_METHODS registry and calling the corresponding Scheduler method with JSON-decoded arguments. Supports CORS via a configurable cors section in Luigi configuration.
  • Task history handlers: AllRunHandler, SelectedRunHandler, RecentRunHandler, ByNameHandler, ByIdHandler, ByTaskIdHandler, and ByParamsHandler provide HTML-rendered views of task execution history backed by the DbTaskHistory database.
  • MetricsHandler: Exposes scheduler metrics at /metrics for integration with Prometheus or similar collectors.
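
The dispatch flow described for RPCHandler can be illustrated with a framework-free sketch. This is not Luigi's source code: rpc_method, ToyScheduler, and dispatch are hypothetical names, and the return value of re_enable_task is made up. Only the registry lookup, the JSON-decoded arguments, and the {"response": ...} envelope follow the description on this page.

```python
import json

# Hypothetical registry standing in for the RPC_METHODS lookup described above.
RPC_METHODS = {}


def rpc_method(func):
    """Register a scheduler method so it can be reached as /api/<name>."""
    RPC_METHODS[func.__name__] = func
    return func


class ToyScheduler:
    """Illustrative stand-in for the scheduler; not Luigi's Scheduler class."""

    def __init__(self):
        self._disabled = {"TaskA"}

    @rpc_method
    def re_enable_task(self, task_id):
        # Made-up behaviour and return value, for illustration only.
        self._disabled.discard(task_id)
        return {"task_id": task_id, "status": "PENDING"}

    def internal_helper(self):
        # Not registered, so it is not reachable through dispatch().
        return "not reachable over the API"


def dispatch(scheduler, method_name, data="{}"):
    """Return (http_status, body) for a request to /api/<method_name>."""
    if method_name not in RPC_METHODS:
        return 404, None  # unknown or unregistered method
    arguments = json.loads(data)  # the JSON-encoded arguments
    result = getattr(scheduler, method_name)(**arguments)
    return 200, {"response": result}
```

The point of the registry is that only explicitly registered methods are callable over HTTP; in this sketch a request for internal_helper is rejected with 404 even though the method exists on the object.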

Client-side components:

  • luigi.js: Defines the LuigiAPI class that wraps all REST API calls (task lists by status, dependency graphs, error traces, operational actions) with jQuery AJAX helpers.
  • visualiserApp.js: The main application controller. Manages the dashboard layout, task status info boxes with counts, DataTables filtering and search, task detail modals, and coordinates between the table view and graph view.
  • graph.js: Renders the task dependency DAG as an SVG visualization with color-coded nodes (red=FAILED, green=DONE, blue=RUNNING, yellow=PENDING, gray=DISABLED, purple=BATCH_RUNNING, black=UNKNOWN, magenta=TRUNCATED).
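
As a rough illustration of the per-status counts and color coding described above, the following Python sketch mirrors the idea outside the browser. The color names are the ones listed for graph.js on this page (the actual values in the JavaScript may differ), and color_for and count_by_status are hypothetical helper names, not part of Luigi.

```python
from collections import Counter

# Color names as listed above for graph.js; treat them as descriptive labels.
STATUS_COLORS = {
    "FAILED": "red",
    "DONE": "green",
    "RUNNING": "blue",
    "PENDING": "yellow",
    "DISABLED": "gray",
    "BATCH_RUNNING": "purple",
    "UNKNOWN": "black",
    "TRUNCATED": "magenta",
}


def color_for(status):
    """Look up a node color, falling back to the UNKNOWN color."""
    return STATUS_COLORS.get(status, STATUS_COLORS["UNKNOWN"])


def count_by_status(tasks):
    """Count tasks per status, as the dashboard info boxes do conceptually."""
    return Counter(info["status"] for info in tasks.values())
```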

Usage

To access the Luigi web visualizer:

  1. Start the central scheduler: luigid --port 8082
  2. Open a browser to http://localhost:8082/

The dashboard shows task counts by status, a filterable task table, and a dependency graph visualization.

REST API Endpoints

Core API (/api/{method})

All scheduler RPC methods are exposed under the /api/ prefix. Arguments are passed as a single JSON-encoded object in the data parameter, supplied either in the query string or in a form-encoded POST body. Responses are wrapped in {"response": ...}.

| Endpoint | Method | Parameters | Description |
| --- | --- | --- | --- |
| /api/task_list | GET/POST | status, upstream_status, search | List tasks filtered by status and optional upstream status. |
| /api/dep_graph | GET/POST | task_id, include_done | Get dependency graph rooted at a task. |
| /api/inverse_dep_graph | GET/POST | task_id, include_done | Get inverse (downstream) dependency graph. |
| /api/fetch_error | GET/POST | task_id | Retrieve error trace for a failed task. |
| /api/get_task_status_message | GET/POST | task_id | Get the current status message of a task. |
| /api/get_task_progress_percentage | GET/POST | task_id | Get the progress percentage of a running task. |
| /api/re_enable_task | GET/POST | task_id | Re-enable a disabled task. |
| /api/mark_as_done | GET/POST | task_id | Manually mark a task as done. |
| /api/forgive_failures | GET/POST | task_id | Reset failure count for a task. |
| /api/add_task | GET/POST | task_id, worker, status, ... | Register a task with the scheduler. |
| /api/get_work | GET/POST | worker | Request a task assignment for a worker. |
| /api/ping | GET/POST | worker | Worker heartbeat. |
| /api/prune | GET/POST | (none) | Trigger manual pruning of stale tasks and workers. |
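
Because every endpoint accepts both GET and POST, a client can choose where to place the data argument. The sketch below builds either kind of request without sending it. build_rpc_request and unwrap are hypothetical helper names, and the POST variant assumes the server reads data from a form-encoded body.

```python
import json
import urllib.parse
import urllib.request


def build_rpc_request(base_url, method, params=None, use_post=False):
    """Build (but do not send) a request for /api/<method>."""
    encoded = urllib.parse.urlencode({"data": json.dumps(params or {})})
    url = f"{base_url}/api/{method}"
    if use_post:
        # Supplying a body makes urllib issue a POST with a form-encoded payload.
        return urllib.request.Request(url, data=encoded.encode("utf-8"))
    return urllib.request.Request(f"{url}?{encoded}")


def unwrap(body):
    """Extract the payload from a {"response": ...} envelope."""
    return json.loads(body)["response"]
```

A request built this way can be passed to urllib.request.urlopen. POST keeps long argument lists out of the URL, which may matter for calls such as add_task that carry many fields.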

Task History Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| / | GET | Redirects to static/visualiser/index.html. |
| / | HEAD | Health check endpoint (returns 204). |
| /tasklist | GET | HTML page listing all tasks with execution history. |
| /tasklist/{task_name} | GET | HTML page with running-time visualization for a specific task. |
| /history | GET | HTML page showing tasks updated in the past 24 hours. |
| /history/by_name/{name} | GET | HTML page showing all runs of a named task. |
| /history/by_id/{id} | GET | HTML page showing details of a task by record ID. |
| /history/by_task_id/{task_id} | GET | HTML page showing details of a task by task ID. |
| /history/by_params/{name}?data={json} | GET | HTML page showing tasks matching name and parameters. |
| /metrics | GET | Prometheus-compatible metrics endpoint. |
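
The history URLs take the task name as a path segment and, for by_params, a JSON object in the data query parameter. A small sketch of building such links follows. The helper names are hypothetical, and the pages themselves are only useful when the task history database mentioned above is available.

```python
import json
import urllib.parse


def history_by_name_url(base_url, name):
    """URL of the page listing all runs of a named task."""
    return f"{base_url}/history/by_name/{urllib.parse.quote(name, safe='')}"


def history_by_params_url(base_url, name, params):
    """URL of the page listing runs matching a name and parameter values."""
    query = urllib.parse.urlencode({"data": json.dumps(params)})
    path = urllib.parse.quote(name, safe="")
    return f"{base_url}/history/by_params/{path}?{query}"
```

For example, history_by_params_url("http://localhost:8082", "MyTask", {"date": "2026-02-10"}) yields a link whose data parameter decodes back to the same dictionary.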

Key JavaScript Files

| File | Role |
| --- | --- |
| luigi/static/visualiser/js/luigi.js | LuigiAPI class: wraps all REST API calls with jQuery AJAX. Provides methods like getDependencyGraph(), getFailedTaskList(), reEnable(), markAsDone(), getErrorTrace(). |
| luigi/static/visualiser/js/visualiserApp.js | Main application controller: manages dashboard layout, task status filtering by category (PENDING, RUNNING, DONE, FAILED, DISABLED), DataTables integration, task detail rendering with Mustache templates. |
| luigi/static/visualiser/js/graph.js | DAG visualization engine: renders task dependency graphs as SVG with color-coded nodes, edge routing, depth-first layout, and legend. |
| luigi/static/visualiser/js/util.js | Shared utility functions. |
| luigi/static/visualiser/js/tipsy.js | Tooltip library for hover-over task information on graph nodes. |

CORS Configuration

The REST API supports Cross-Origin Resource Sharing (CORS) for browser-based integrations from other domains. Configure via luigi.cfg:

[cors]
enabled = true
allow_any_origin = false
allowed_origins = ["https://dashboard.example.com", "https://monitoring.example.com"]
max_age = 86400
allowed_methods = GET, OPTIONS
allowed_headers = Accept, Content-Type, Origin
allow_credentials = false
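
To make the effect of these settings concrete, here is a simplified sketch of an origin check driven by the same configuration text. It is not Luigi's CORS implementation: load_cors and origin_allowed are hypothetical helpers, the file is read with the standard configparser module rather than Luigi's own configuration layer, and only the enabled, allow_any_origin, and allowed_origins options are considered.

```python
import configparser
import json

CFG_TEXT = """
[cors]
enabled = true
allow_any_origin = false
allowed_origins = ["https://dashboard.example.com", "https://monitoring.example.com"]
max_age = 86400
allowed_methods = GET, OPTIONS
allowed_headers = Accept, Content-Type, Origin
allow_credentials = false
"""


def load_cors(text):
    """Parse the [cors] section into a plain dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["cors"]
    return {
        "enabled": section.getboolean("enabled", fallback=False),
        "allow_any_origin": section.getboolean("allow_any_origin", fallback=False),
        # allowed_origins is written as a JSON list in the config file.
        "allowed_origins": json.loads(section.get("allowed_origins", fallback="[]")),
    }


def origin_allowed(cors, origin):
    """Decide whether a request's Origin header would be accepted."""
    if not cors["enabled"] or not origin:
        return False
    return cors["allow_any_origin"] or origin in cors["allowed_origins"]
```

With the configuration above, origin_allowed accepts https://dashboard.example.com and rejects any origin that is not listed, since allow_any_origin is false.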

Usage Examples

Accessing the Web Dashboard

# Start the scheduler
luigid --port 8082

# Open the dashboard in a browser
# URL: http://localhost:8082/
# This redirects to http://localhost:8082/static/visualiser/index.html

Querying the API with curl

# List all RUNNING tasks
curl 'http://localhost:8082/api/task_list?data=%7B%22status%22%3A%22RUNNING%22%2C%22upstream_status%22%3A%22%22%2C%22search%22%3A%22%22%7D'

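# Get the dependency graph for a specific task
# (the task IDs in these examples are placeholders; substitute an ID the
# scheduler reports, for example a key from a task_list response)
curl 'http://localhost:8082/api/dep_graph?data=%7B%22task_id%22%3A%22MyTask(date%3D2026-02-10)%22%7D'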

# Fetch the error trace for a failed task
curl 'http://localhost:8082/api/fetch_error?data=%7B%22task_id%22%3A%22FailedTask(date%3D2026-02-10)%22%7D'

# Re-enable a disabled task
curl 'http://localhost:8082/api/re_enable_task?data=%7B%22task_id%22%3A%22DisabledTask(date%3D2026-02-10)%22%7D'

# Health check
curl -I http://localhost:8082/
# Returns HTTP 204 No Content
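
The percent-encoded data values in the curl commands above are tedious to write by hand. The following sketch generates them from a Python dictionary; encode_data is a hypothetical helper name. It serializes to compact JSON and escapes every reserved character.

```python
import json
import urllib.parse


def encode_data(params):
    """Return the percent-encoded JSON to place after 'data=' in the URL."""
    compact = json.dumps(params, separators=(",", ":"))
    return urllib.parse.quote(compact, safe="")


print(encode_data({"status": "RUNNING", "upstream_status": "", "search": ""}))
# %7B%22status%22%3A%22RUNNING%22%2C%22upstream_status%22%3A%22%22%2C%22search%22%3A%22%22%7D
```

With safe="" the helper also escapes parentheses as %28 and %29, so its output for the task-ID examples differs textually from the URLs above while decoding to the same JSON.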

Querying the API with Python

import json
import urllib.request
import urllib.parse

SCHEDULER_URL = "http://localhost:8082"

def query_scheduler(method, params=None):
    """Query the Luigi scheduler REST API and return the unwrapped response."""
    params = params or {}
    data = urllib.parse.urlencode({"data": json.dumps(params)})
    url = f"{SCHEDULER_URL}/api/{method}?{data}"
    # The context manager closes the connection even if decoding fails.
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))["response"]

# List all failed tasks
failed_tasks = query_scheduler("task_list", {
    "status": "FAILED",
    "upstream_status": "",
    "search": ""
})
for task_id, info in failed_tasks.items():
    # For very long lists the scheduler may return only a count
    # (e.g. {"num_tasks": N}) instead of per-task records; skip such entries.
    if not isinstance(info, dict):
        continue
    print(f"FAILED: {task_id} (last updated: {info['last_updated']})")

# Get dependency graph for a task
# (placeholder ID; use a key returned by task_list)
graph = query_scheduler("dep_graph", {
    "task_id": "MyTask(date=2026-02-10)",
    "include_done": True
})
for task_id, node in graph.items():
    print(f"  {task_id}: status={node['status']}, deps={node['deps']}")
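
Once a dependency graph has been fetched, it can be walked like any adjacency map. The sketch below finds failed tasks upstream of a root. It assumes the structure used in the example above (a dictionary keyed by task ID whose values carry status and deps), and the task names in sample are invented for illustration.

```python
def failed_upstream(graph, root):
    """Return the sorted IDs of FAILED tasks reachable from root via deps."""
    failed, seen, stack = [], set(), [root]
    while stack:
        task_id = stack.pop()
        # Skip nodes already visited and IDs missing from the response.
        if task_id in seen or task_id not in graph:
            continue
        seen.add(task_id)
        node = graph[task_id]
        if node["status"] == "FAILED":
            failed.append(task_id)
        stack.extend(node.get("deps", []))
    return sorted(failed)


# Invented sample data shaped like a dep_graph response.
sample = {
    "Report": {"status": "PENDING", "deps": ["Clean", "Load"]},
    "Clean": {"status": "DONE", "deps": ["Extract"]},
    "Load": {"status": "FAILED", "deps": ["Extract"]},
    "Extract": {"status": "DONE", "deps": []},
}
print(failed_upstream(sample, "Report"))  # ['Load']
```

The seen set keeps shared dependencies such as Extract from being processed twice, which matters for wide graphs where many tasks depend on the same inputs.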

Viewing Task Execution History

# View all tasks updated in the past 24 hours (HTML page)
# Open in browser: http://localhost:8082/history

# View all runs of a specific task family (HTML page)
# Open in browser: http://localhost:8082/history/by_name/MyTask

# View task execution time trends (HTML page)
# Open in browser: http://localhost:8082/tasklist/MyTask
