Implementation:Neuml Txtai Client Database
| Knowledge Sources | |
|---|---|
| Domains | Database, SQL |
| Last Updated | 2026-02-10 01:00 GMT |
Overview
Concrete tool, provided by txtai, for connecting to external databases via SQLAlchemy.
Description
Client is a database backend class that connects txtai to external relational databases through SQLAlchemy. It extends RDBMS directly and works with any database SQLAlchemy supports (PostgreSQL, MariaDB, MySQL, etc.), provided the database supports JSON columns. The class manages a SQLAlchemy engine, a raw database connection, and an ORM session.

Client overrides the insert methods to use SQLAlchemy ORM objects (Document, Object, Section, Batch, Score) rather than raw SQL, and creates tables via SQLAlchemy's Base.metadata.create_all(). The connect() method reads the database URL from the config's content field (or from the CLIENT_URL environment variable) and supports optional schema configuration for PostgreSQL. The jsoncolumn() method uses SQLAlchemy's cast() and dialect-specific JSON extraction to build dynamic column expressions.

A companion Cursor class adapts SQLAlchemy's result objects to the Python DB-API interface, providing execute, fetchall, fetchone, and description so that it can be used like a standard database cursor.
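The cursor-adapter idea can be illustrated with a minimal sketch. It is not txtai's Cursor class: to stay dependency-free it wraps a standard-library sqlite3 connection instead of a SQLAlchemy session, and the class name CursorSketch and the documents table are hypothetical. It only shows the shape of the interface (execute, fetchall, fetchone, description, iteration) layered over an object whose execute() returns a result.

```python
import sqlite3


class CursorSketch:
    """Illustrative cursor-style wrapper (hypothetical, not txtai's Cursor)."""

    def __init__(self, connection):
        self.connection = connection
        self.result = None

    def __iter__(self):
        # Iterate over the remaining rows of the most recent result
        return iter(self.result)

    def execute(self, statement, parameters=None):
        # Delegate to the wrapped connection and keep the result for later fetches
        self.result = self.connection.execute(statement, parameters or ())

    def fetchall(self):
        return self.result.fetchall() if self.result is not None else []

    def fetchone(self):
        return self.result.fetchone() if self.result is not None else None

    @property
    def description(self):
        # DB-API style description: one tuple per column, column name first
        return self.result.description if self.result is not None else None


# Assumption: an in-memory SQLite database stands in for the external database
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE documents (id TEXT, text TEXT)")
connection.execute("INSERT INTO documents VALUES (?, ?)", ("doc1", "Machine learning overview"))

cursor = CursorSketch(connection)
cursor.execute("SELECT id, text FROM documents WHERE id = ?", ("doc1",))
columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()
```

Keeping the most recent result on the wrapper lets calling code use one cursor-like object regardless of what the underlying execute() returns.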
Usage
Use Client when txtai needs to store and query content in an external relational database rather than a local embedded file. Set the content configuration parameter to a SQLAlchemy-compatible connection URL (e.g., postgresql://user:pass@host/db) or set it to "client" to read the URL from the CLIENT_URL environment variable.
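The URL resolution rule described above can be sketched as follows. The helper resolve_url is hypothetical and is not part of txtai; it only mirrors the documented behavior, where the literal value "client" defers to the CLIENT_URL environment variable and any other value is treated as the connection URL itself.

```python
import os


def resolve_url(config, environ=None):
    """Hypothetical helper: pick the connection URL for a txtai-style config."""
    # Allow an explicit mapping to be passed in so the logic is easy to test
    environ = os.environ if environ is None else environ

    content = config.get("content")

    # The literal string "client" defers to the CLIENT_URL environment variable
    if content == "client":
        return environ.get("CLIENT_URL")

    # Any other value is treated as the connection URL itself
    return content


direct = resolve_url({"content": "postgresql://user:pass@host/db"})
from_env = resolve_url({"content": "client"}, {"CLIENT_URL": "postgresql://user:pass@host/db"})
missing = resolve_url({"content": "client"}, {})
```

In this sketch a missing CLIENT_URL yields None; how the real class handles that case is not covered here.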
Code Reference
Source Location
- Repository: Neuml_Txtai
- File: src/python/txtai/database/client.py
- Lines: 1-227
Signature
class Client(RDBMS):
    def __init__(self, config)
    def save(self, path)
    def close(self)
    def reindexstart(self)
    def reindexend(self, name)
    def jsonprefix(self)
    def jsoncolumn(self, name)
    def createtables(self)
    def finalize(self)
    def insertdocument(self, uid, data, tags, entry)
    def insertobject(self, uid, data, tags, entry)
    def insertsection(self, index, uid, text, tags, entry)
    def createbatch(self)
    def insertbatch(self, indexids, ids, batch)
    def createscores(self)
    def insertscores(self, scores)
    def connect(self, path=None)
    def getcursor(self)
    def rows(self)
    def addfunctions(self)
    def sqldialect(self, database, sql, parameters=None)

class Cursor:
    def __init__(self, connection)
    def __iter__(self)
    def execute(self, statement, parameters=None)
    def fetchall(self)
    def fetchone(self)
    @property
    def description(self)
Import
from txtai.database.client import Client
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| config | dict | Yes | Database configuration. Must include a content key with a SQLAlchemy connection URL or the string "client" to use the CLIENT_URL environment variable. Optional schema key for PostgreSQL schema support. |
| path | str | No | Path parameter passed to save(); commits the current session and underlying connection. |
| documents | list[tuple] | Yes (for insert) | List of (uid, document, tags) tuples for document insertion via ORM objects. |
| ids | list | Yes (for delete/ids) | Document IDs for deletion or lookup operations. |
Outputs
| Name | Type | Description |
|---|---|---|
| query results | list[dict] | Query result dictionaries from the underlying RDBMS query method. |
| connection | sqlalchemy.orm.Session | The SQLAlchemy ORM session created by connect(). |
| cursor | Cursor | Custom Cursor wrapper providing DB-API compatible interface over SQLAlchemy results. |
Usage Examples
from txtai.database.client import Client

# Connect to a PostgreSQL database
config = {
    "content": "postgresql://user:password@localhost:5432/mydb",
    "schema": "txtai_data"
}
db = Client(config)

# Initialize and insert documents
db.initialize()
documents = [
    ("doc1", {"text": "Machine learning overview", "author": "Alice"}, None),
    ("doc2", {"text": "Deep learning techniques", "author": "Bob"}, None),
]
db.insert(documents)

# Save (commits session)
db.save("/tmp/unused_path")

# Query results
query = {
    "select": "s.id, text, score",
    "where": "s.id IN ('doc1', 'doc2')",
}
results = db.query(query, limit=10, parameters=None, indexids=False)

# Close all connections and dispose engine
db.close()