Implementation:Neuml Txtai SQL Parser
| Knowledge Sources | |
|---|---|
| Domains | SQL_Parsing, Query_Processing |
| Last Updated | 2026-02-09 17:00 GMT |
Overview
The SQL class translates txtai's extended SQL dialect into database-native queries, enabling semantic search operations to be expressed through familiar SQL syntax.
Description
The SQL class serves as the query translation layer within txtai's database subsystem. It parses SQL statements that may contain txtai-specific extensions (such as similarity search clauses) and converts them into structures the underlying database engine can execute. The parser handles tokenization of the SQL string, identification of txtai-specific functions like similar() and snippet(), and production of a parsed query dictionary that the database layer consumes.
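The tokenize-then-identify flow described above can be illustrated with a small standalone sketch. It is not txtai's implementation: `toy_tokenize`, `toy_extract_similar`, and the token pattern are hypothetical names invented for this example. They only show the general idea of splitting a query into tokens and pulling out the arguments of similar() calls.

```python
import re

# Hypothetical helpers for illustration only; not txtai's actual tokenizer or parser.
TOKEN_PATTERN = re.compile(
    r"'(?:[^']|'')*'"             # single-quoted string literal
    r"|[A-Za-z_][A-Za-z0-9_.]*"   # identifier or keyword
    r"|\d+(?:\.\d+)?"             # integer or decimal number
    r"|[(),*]"                    # punctuation
    r"|[<>=!]+"                   # comparison operators
)


def toy_tokenize(query):
    """Splits a SQL string into literals, identifiers, numbers and punctuation."""
    return TOKEN_PATTERN.findall(query)


def toy_extract_similar(tokens):
    """Collects the argument list of each similar(...) call in the token list."""
    calls, index = [], 0
    while index < len(tokens):
        is_call = index + 1 < len(tokens) and tokens[index + 1] == "("
        if tokens[index].lower() == "similar" and is_call:
            # Gather argument tokens up to the matching close parenthesis
            args, depth, index = [], 1, index + 2
            while index < len(tokens) and depth > 0:
                token = tokens[index]
                depth += 1 if token == "(" else -1 if token == ")" else 0
                if depth > 0 and token != ",":
                    args.append(token.strip("'"))
                index += 1
            calls.append(args)
        else:
            index += 1
    return calls


tokens = toy_tokenize(
    "SELECT id, text FROM txtai WHERE similar('machine learning', 0.5) LIMIT 10"
)
similar = toy_extract_similar(tokens)
print(similar)
```

A real parser also has to handle aliases, nested expressions, and column resolution, which this sketch deliberately leaves out.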
Usage
Use the SQL class when you need to programmatically parse or inspect txtai SQL queries before they reach the database engine. The class is primarily used internally by txtai's database implementations, but it can also help when building custom database backends or when validating and transforming user-supplied queries.
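As a rough illustration of validating user-supplied queries, the sketch below wraps any parser object that exposes issql(query) and is callable, as the SQL class is documented to be. The `validate_query` helper, its `max_length` policy, and the `StubParser` stand-in are assumptions made for this example so that it runs without txtai installed. They are not part of txtai.

```python
def validate_query(parser, query, max_length=1000):
    """
    Returns the parsed query, or raises ValueError if the input is rejected.

    `parser` is any object with issql(query) and __call__(query).
    The length limit is a hypothetical policy, not a txtai setting.
    """
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    if len(query) > max_length:
        raise ValueError("query is too long")
    if not parser.issql(query):
        raise ValueError("query is not recognized as SQL")
    return parser(query)


class StubParser:
    """Stand-in for a real parser; its checks are simplified assumptions."""

    def issql(self, query):
        return query.lower().strip().startswith("select ")

    def __call__(self, query):
        return {"raw": query}


parsed = validate_query(StubParser(), "SELECT id FROM txtai")
print(parsed)
```

With txtai installed, an instance of SQL could presumably be passed in place of `StubParser()`, since the helper relies only on the two methods listed in the signature.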
Code Reference
Source Location
- Repository: Neuml_Txtai
- File: src/python/txtai/database/sql/base.py
- Lines: 1-189
Signature
class SQL:
    def __init__(self, database=None, tolist=False):
        """
        Creates a new SQL parser.

        Args:
            database: database instance, used to resolve column types
            tolist: if True, converts results to list format
        """

    def __call__(self, query):
        """
        Parses a txtai SQL query and returns a parsed query dict.

        Args:
            query: SQL query string

        Returns:
            dict with parsed query structure
        """

    def issql(self, query):
        """Checks if query is a SQL statement."""

    def snippet(self, column):
        """Generates a snippet expression for the given column."""

    def tokenize(self, query):
        """Tokenizes a SQL query string into components."""

    def parse(self, query):
        """Parses tokenized SQL into a structured query dict."""
Import
from txtai.database.sql import SQL
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| database | object | No | Database instance used to resolve column types and metadata |
| tolist | bool | No | When True, converts result rows to list format (default: False) |
| query | str | Yes (for __call__) | SQL query string to parse, may contain txtai extensions such as similar() clauses |
Outputs
| Name | Type | Description |
|---|---|---|
| result | dict | Parsed query dictionary containing keys for select columns, where clauses, similarity parameters, order, limit, and other SQL components |
| issql | bool | Whether a given string is recognized as a SQL statement |
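Because the exact keys of the parsed dictionary are not listed here, a caller may prefer to inspect whatever keys are present rather than hard-code them. The following is a minimal sketch under that assumption. The `populated_clauses` helper and the `sample` dictionary are hypothetical, and real key names and values may differ.

```python
def populated_clauses(parsed):
    """Returns (name, value) pairs for clauses that are present and non-empty."""
    return [
        (name, value)
        for name, value in parsed.items()
        if value not in (None, "", [])
    ]


# Hypothetical parser output used only to exercise the helper
sample = {"select": "id, text", "where": None, "limit": "10"}

for name, value in populated_clauses(sample):
    print(f"{name}: {value}")
```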
Usage Examples
Basic Usage
from txtai.database.sql import SQL
# Create a SQL parser
parser = SQL()
# Check if a string is a SQL query
query = "SELECT id, text, score FROM txtai WHERE similar('machine learning') LIMIT 10"
if parser.issql(query):
    # Parse the query into a structured dict
    parsed = parser(query)
    print(parsed)
    # Output includes parsed select columns, similarity clause, and limit
Parsing with Database Context
from txtai.embeddings import Embeddings
from txtai.database.sql import SQL
# Build embeddings with content storage
embeddings = Embeddings(content=True)
embeddings.index([
    {"text": "Deep learning advances", "label": "ai"},
    {"text": "Natural language processing", "label": "nlp"}
])
# Create parser with database context for column resolution
parser = SQL(database=embeddings.database)
# Parse a query with filters
parsed = parser("SELECT text, score FROM txtai WHERE similar('neural networks') AND label = 'ai'")
print(parsed)