Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:TobikoData Sqlmesh Sqlglot Worker

From Leeroopedia


Knowledge Sources
Domains Web_UI, SQL_Parsing, Web_Workers
Last Updated 2026-02-07 20:00 GMT

Overview

Web Worker running Pyodide with sqlglot for SQL validation, formatting, and dialect detection in the browser.

Description

The Sqlglot_Worker is a Web Worker that loads Pyodide (Python in WebAssembly) and the sqlglot library to provide SQL parsing and transformation capabilities directly in the browser. It handles SQL validation, SQL formatting, dialect detection, and keyword/type extraction for syntax highlighting. The worker loads Python asynchronously from a CDN and communicates via a message-passing protocol.

Usage

Use this worker to validate SQL syntax, format SQL queries, detect available dialects, and extract dialect-specific keywords for editor syntax highlighting. All operations run off the main thread to prevent UI blocking.

Code Reference

Source Location

Message Protocol

// Validate SQL
Input:  { topic: 'validate', payload: string }
Output: { topic: 'validate', payload: boolean }

// Get dialect info
Input:  { topic: 'dialect', payload: string }
Output: { topic: 'dialect', payload: { keywords: string, types: string } }

// Get all dialects
Input:  { topic: 'dialects' }
Output: { topic: 'dialects', payload: Array<{ dialect_title: string, dialect_name: string }> }

// Format SQL
Input:  { topic: 'format', payload: { sql: string, dialect: string } }
Output: { topic: 'format', payload: string }

// Initialization complete
Input:  { topic: 'init' }
Output: { topic: 'init' }

Import

// Worker is typically loaded via URL
import { sqlglotWorker } from '~/workers'

// Or create directly
const sqlglotWorker = new Worker(
  new URL('./workers/sqlglot/sqlglot.ts', import.meta.url),
  { type: 'module' }
)

I/O Contract

validate

Direction Field Type Description
Input topic 'validate' Message type
Input payload string SQL query to validate
Output topic 'validate' Message type
Output payload boolean True if valid SQL, false otherwise

dialect

Direction Field Type Description
Input topic 'dialect' Message type
Input payload string Dialect name (e.g., 'postgres', 'snowflake')
Output topic 'dialect' Message type
Output payload.keywords string Space-separated SQL keywords
Output payload.types string Space-separated SQL data types

dialects

Direction Field Type Description
Input topic 'dialects' Message type
Output topic 'dialects' Message type
Output payload Array List of {dialect_title, dialect_name} objects

format

Direction Field Type Description
Input topic 'format' Message type
Input payload.sql string SQL query to format
Input payload.dialect string Target dialect
Output topic 'format' Message type
Output payload string Formatted SQL query

init

Direction Field Type Description
Input topic 'init' Message type
Output topic 'init' Message type (sent when ready)

Implementation Details

Pyodide Loading

  1. Dynamically imports Pyodide from jsDelivr CDN (v0.23.2)
  2. Loads Pyodide runtime with full Python stdlib
  3. Uses micropip to install sqlglot and typing-extensions packages
  4. Fetches and runs sqlglot.py Python wrapper script
  5. Posts 'init' message when ready

Error Handling

All operations wrapped in try/catch:

  • validate: Returns false on error
  • dialect: Returns empty strings on error
  • dialects: Returns empty array on error
  • format: Returns empty string on error

Python Integration

The worker runs a Python script (sqlglot.py) that:

  • Wraps sqlglot functionality
  • Serializes results to JSON
  • Handles dialect enumeration
  • Provides SQL validation logic

Async Initialization

The worker sets up message handlers only after Pyodide loads:

  1. Import Pyodide script
  2. Call loadPyodideAndPackages()
  3. Set up onmessage handler
  4. Process messages using loaded sqlglot module

Usage Examples

import { sqlglotWorker } from '~/workers'

// Wait for initialization
sqlglotWorker.postMessage({ topic: 'init' })

sqlglotWorker.onmessage = (e: MessageEvent) => {
  if (e.data.topic === 'init') {
    console.log('SQLGlot worker ready')

    // Now safe to use other features
    validateSQL()
    loadDialects()
  }
}

// Validate SQL query
function validateSQL() {
  sqlglotWorker.postMessage({
    topic: 'validate',
    payload: 'SELECT * FROM users WHERE id = 1'
  })

  sqlglotWorker.onmessage = (e: MessageEvent) => {
    if (e.data.topic === 'validate') {
      console.log('Valid SQL:', e.data.payload) // true or false
    }
  }
}

// Get all available dialects
function loadDialects() {
  sqlglotWorker.postMessage({ topic: 'dialects' })

  sqlglotWorker.onmessage = (e: MessageEvent) => {
    if (e.data.topic === 'dialects') {
      e.data.payload.forEach(dialect => {
        console.log(`${dialect.dialect_title}: ${dialect.dialect_name}`)
      })
    }
  }
}

// Get dialect-specific keywords for syntax highlighting
function getDialectInfo(dialectName: string) {
  sqlglotWorker.postMessage({
    topic: 'dialect',
    payload: dialectName
  })

  sqlglotWorker.onmessage = (e: MessageEvent) => {
    if (e.data.topic === 'dialect') {
      const { keywords, types } = e.data.payload
      console.log('Keywords:', keywords.split(' '))
      console.log('Types:', types.split(' '))
    }
  }
}

// Format SQL query
function formatSQL(sql: string, dialect: string) {
  sqlglotWorker.postMessage({
    topic: 'format',
    payload: { sql, dialect }
  })

  sqlglotWorker.onmessage = (e: MessageEvent) => {
    if (e.data.topic === 'format') {
      const formatted = e.data.payload
      console.log('Formatted SQL:\n', formatted)
    }
  }
}

// Complete example with dialect switching
class SQLEditor {
  worker: Worker = sqlglotWorker

  async init() {
    return new Promise((resolve) => {
      this.worker.postMessage({ topic: 'init' })
      this.worker.onmessage = (e) => {
        if (e.data.topic === 'init') resolve(true)
      }
    })
  }

  async validate(sql: string): Promise<boolean> {
    return new Promise((resolve) => {
      this.worker.postMessage({ topic: 'validate', payload: sql })
      this.worker.onmessage = (e) => {
        if (e.data.topic === 'validate') resolve(e.data.payload)
      }
    })
  }

  async format(sql: string, dialect: string): Promise<string> {
    return new Promise((resolve) => {
      this.worker.postMessage({
        topic: 'format',
        payload: { sql, dialect }
      })
      this.worker.onmessage = (e) => {
        if (e.data.topic === 'format') resolve(e.data.payload)
      }
    })
  }
}

Performance Considerations

  • Initial load: Pyodide + sqlglot take 5-10 seconds to load initially
  • Off main thread: All parsing/validation doesn't block UI
  • CDN caching: Pyodide cached by browser after first load
  • Memory usage: Python runtime uses ~50-100MB memory

Dependencies

  • Pyodide v0.23.2: Python runtime in WebAssembly
  • sqlglot: SQL parser and transpiler (installed via micropip)
  • typing-extensions: Python typing support (sqlglot dependency)

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment