Implementation:TobikoData Sqlmesh Sqlglot Worker
| Knowledge Sources | |
|---|---|
| Domains | Web_UI, SQL_Parsing, Web_Workers |
| Last Updated | 2026-02-07 20:00 GMT |
Overview
Web Worker running Pyodide with sqlglot for SQL validation, formatting, and dialect detection in the browser.
Description
The Sqlglot_Worker is a Web Worker that loads Pyodide (Python in WebAssembly) and the sqlglot library to provide SQL parsing and transformation capabilities directly in the browser. It handles SQL validation, SQL formatting, dialect detection, and keyword/type extraction for syntax highlighting. The worker loads Python asynchronously from a CDN and communicates via a message-passing protocol.
Usage
Use this worker to validate SQL syntax, format SQL queries, detect available dialects, and extract dialect-specific keywords for editor syntax highlighting. All operations run off the main thread to prevent UI blocking.
Code Reference
Source Location
- Repository: TobikoData_Sqlmesh
- File: web/client/src/workers/sqlglot/sqlglot.ts
Message Protocol
// Validate SQL
Input: { topic: 'validate', payload: string }
Output: { topic: 'validate', payload: boolean }
// Get dialect info
Input: { topic: 'dialect', payload: string }
Output: { topic: 'dialect', payload: { keywords: string, types: string } }
// Get all dialects
Input: { topic: 'dialects' }
Output: { topic: 'dialects', payload: Array<{ dialect_title: string, dialect_name: string }> }
// Format SQL
Input: { topic: 'format', payload: { sql: string, dialect: string } }
Output: { topic: 'format', payload: string }
// Initialization complete
Input: { topic: 'init' }
Output: { topic: 'init' }
Import
// Worker is typically loaded via URL
import { sqlglotWorker } from '~/workers'
// Or create directly
const sqlglotWorker = new Worker(
new URL('./workers/sqlglot/sqlglot.ts', import.meta.url),
{ type: 'module' }
)
I/O Contract
validate
| Direction | Field | Type | Description |
|---|---|---|---|
| Input | topic | 'validate' | Message type |
| Input | payload | string | SQL query to validate |
| Output | topic | 'validate' | Message type |
| Output | payload | boolean | True if valid SQL, false otherwise |
dialect
| Direction | Field | Type | Description |
|---|---|---|---|
| Input | topic | 'dialect' | Message type |
| Input | payload | string | Dialect name (e.g., 'postgres', 'snowflake') |
| Output | topic | 'dialect' | Message type |
| Output | payload.keywords | string | Space-separated SQL keywords |
| Output | payload.types | string | Space-separated SQL data types |
dialects
| Direction | Field | Type | Description |
|---|---|---|---|
| Input | topic | 'dialects' | Message type |
| Output | topic | 'dialects' | Message type |
| Output | payload | Array | List of {dialect_title, dialect_name} objects |
format
| Direction | Field | Type | Description |
|---|---|---|---|
| Input | topic | 'format' | Message type |
| Input | payload.sql | string | SQL query to format |
| Input | payload.dialect | string | Target dialect |
| Output | topic | 'format' | Message type |
| Output | payload | string | Formatted SQL query |
init
| Direction | Field | Type | Description |
|---|---|---|---|
| Input | topic | 'init' | Message type |
| Output | topic | 'init' | Message type (sent when ready) |
Implementation Details
Pyodide Loading
- Dynamically imports Pyodide from jsDelivr CDN (v0.23.2)
- Loads Pyodide runtime with full Python stdlib
- Uses micropip to install sqlglot and typing-extensions packages
- Fetches and runs sqlglot.py Python wrapper script
- Posts 'init' message when ready
Error Handling
All operations wrapped in try/catch:
- validate: Returns false on error
- dialect: Returns empty strings on error
- dialects: Returns empty array on error
- format: Returns empty string on error
Python Integration
The worker runs a Python script (sqlglot.py) that:
- Wraps sqlglot functionality
- Serializes results to JSON
- Handles dialect enumeration
- Provides SQL validation logic
Async Initialization
The worker sets up message handlers only after Pyodide loads:
- Import Pyodide script
- Call loadPyodideAndPackages()
- Set up onmessage handler
- Process messages using loaded sqlglot module
Usage Examples
import { sqlglotWorker } from '~/workers'
// Wait for initialization
sqlglotWorker.postMessage({ topic: 'init' })
sqlglotWorker.onmessage = (e: MessageEvent) => {
if (e.data.topic === 'init') {
console.log('SQLGlot worker ready')
// Now safe to use other features
validateSQL()
loadDialects()
}
}
// Validate SQL query
function validateSQL() {
sqlglotWorker.postMessage({
topic: 'validate',
payload: 'SELECT * FROM users WHERE id = 1'
})
sqlglotWorker.onmessage = (e: MessageEvent) => {
if (e.data.topic === 'validate') {
console.log('Valid SQL:', e.data.payload) // true or false
}
}
}
// Get all available dialects
function loadDialects() {
sqlglotWorker.postMessage({ topic: 'dialects' })
sqlglotWorker.onmessage = (e: MessageEvent) => {
if (e.data.topic === 'dialects') {
e.data.payload.forEach(dialect => {
console.log(`${dialect.dialect_title}: ${dialect.dialect_name}`)
})
}
}
}
// Get dialect-specific keywords for syntax highlighting
function getDialectInfo(dialectName: string) {
sqlglotWorker.postMessage({
topic: 'dialect',
payload: dialectName
})
sqlglotWorker.onmessage = (e: MessageEvent) => {
if (e.data.topic === 'dialect') {
const { keywords, types } = e.data.payload
console.log('Keywords:', keywords.split(' '))
console.log('Types:', types.split(' '))
}
}
}
// Format SQL query
function formatSQL(sql: string, dialect: string) {
sqlglotWorker.postMessage({
topic: 'format',
payload: { sql, dialect }
})
sqlglotWorker.onmessage = (e: MessageEvent) => {
if (e.data.topic === 'format') {
const formatted = e.data.payload
console.log('Formatted SQL:\n', formatted)
}
}
}
// Complete example with dialect switching
class SQLEditor {
worker: Worker = sqlglotWorker
async init() {
return new Promise((resolve) => {
this.worker.postMessage({ topic: 'init' })
this.worker.onmessage = (e) => {
if (e.data.topic === 'init') resolve(true)
}
})
}
async validate(sql: string): Promise<boolean> {
return new Promise((resolve) => {
this.worker.postMessage({ topic: 'validate', payload: sql })
this.worker.onmessage = (e) => {
if (e.data.topic === 'validate') resolve(e.data.payload)
}
})
}
async format(sql: string, dialect: string): Promise<string> {
return new Promise((resolve) => {
this.worker.postMessage({
topic: 'format',
payload: { sql, dialect }
})
this.worker.onmessage = (e) => {
if (e.data.topic === 'format') resolve(e.data.payload)
}
})
}
}
Performance Considerations
- Initial load: Pyodide + sqlglot take 5-10 seconds to load initially
- Off main thread: All parsing/validation doesn't block UI
- CDN caching: Pyodide cached by browser after first load
- Memory usage: Python runtime uses ~50-100MB memory
Dependencies
- Pyodide v0.23.2: Python runtime in WebAssembly
- sqlglot: SQL parser and transpiler (installed via micropip)
- typing-extensions: Python typing support (sqlglot dependency)