
Implementation:Neuml Txtai PgVector ANN

From Leeroopedia


Knowledge Sources
Domains: Vector_Search, ANN
Last Updated: 2026-02-10 01:00 GMT

Overview

PGVector is a concrete ANN backend, provided by txtai, for vector similarity search in PostgreSQL using the pgvector extension.

Description

PGVector is an ANN implementation that builds approximate nearest neighbor indexes backed by a PostgreSQL database with the pgvector extension. It stores embeddings in a database table and creates an HNSW index for efficient similarity search using inner product distance. The class supports full 32-bit float vectors (VECTOR), half-precision 16-bit vectors (HALFVEC), and binary bit vectors (BIT) with hamming distance scoring. It uses SQLAlchemy for database connectivity and session management.
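The scoring described above can be illustrated with a short sketch. pgvector's `<#>` operator returns the negative inner product, so a similarity is recovered by negating it. For bit vectors, a Hamming distance can be mapped to a 0 to 1 similarity by dividing by the number of bits. The function names and the Hamming normalization below are assumptions for illustration, not txtai's actual code.

```python
def inner_product_score(distance):
    """pgvector's <#> yields the negative inner product; negate it to get a similarity."""
    return -distance


def hamming(a, b):
    """Count differing positions between two equal-length bit strings such as '1010'."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def hamming_score(distance, bits):
    """Map a Hamming distance over `bits` bits to [0, 1] (assumed normalization)."""
    return min(max(0.0, 1.0 - distance / bits), 1.0)


if __name__ == "__main__":
    d = hamming("1010", "1001")
    print(d, hamming_score(d, 4), inner_product_score(-0.75))
```

Clamping the Hamming score keeps it in range even if the distance and bit count disagree.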

Usage

Use the PGVector backend when you need persistent, database-backed vector search with PostgreSQL. Select it by setting the ANN backend configuration to "pgvector". It requires the pgvector and sqlalchemy Python packages, which are installed via the txtai "ann" extra. The database URL is read from the url setting, falling back to the ANN_URL environment variable when url is not set.
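The URL fallback described above can be sketched as follows. The helper name `resolve_url` is hypothetical, and nesting the setting under a "pgvector" key is an assumption that mirrors the configuration shown in Usage Examples; it is not a copy of the library's lookup logic.

```python
import os


def resolve_url(config, env=None):
    """Return the configured URL, else the ANN_URL environment variable, else None."""
    env = os.environ if env is None else env
    settings = config.get("pgvector") or {}  # assumed location of backend settings
    return settings.get("url") or env.get("ANN_URL")


if __name__ == "__main__":
    # An explicit setting wins over the environment variable
    print(resolve_url({"pgvector": {"url": "postgresql://user:pass@localhost/dbname"}}))
    # With no url setting, the value comes from ANN_URL (None if unset)
    print(resolve_url({"backend": "pgvector"}))
```

Passing `env` explicitly makes the lookup easy to exercise without touching the real process environment.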

Code Reference

Source Location

  • Repository: Neuml_Txtai
  • File: src/python/txtai/ann/dense/pgvector.py
  • Lines: 1-324

Signature

class PGVector(ANN):
    """Builds an ANN index backed by a Postgres database."""

    def __init__(self, config)
    def load(self, path)
    def index(self, embeddings)
    def append(self, embeddings)
    def delete(self, ids)
    def search(self, queries, limit)
    def count(self)
    def save(self, path)
    def close(self)
    def initialize(self, recreate=False)
    def createindex(self)
    def connect(self)
    def schema(self)
    def settings(self)
    def sqldialect(self, sql, parameters=None)
    def defaulttable(self)
    def url(self)
    def column(self)
    def operation(self)
    def prepare(self, data)
    def query(self, query)
    def score(self, score)

Import

from txtai.ann import ANNFactory

I/O Contract

Inputs

Name | Type | Required | Description
config | dict | Yes | ANN configuration dictionary containing backend settings
config["backend"] | str | Yes | Must be set to "pgvector" to select this backend
url | str | No | PostgreSQL connection URL (falls back to ANN_URL environment variable)
table | str | No | Database table name (default: "vectors")
schema | str | No | Database schema name (optional)
m | int | No | HNSW M parameter controlling number of connections (default: 16)
efconstruction | int | No | HNSW ef_construction parameter (default: 200)
quantize | int | No | Scalar quantization bit width for BIT vectors
precision | str | No | Set to "half" for 16-bit HALFVEC storage
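To show how the m and efconstruction settings relate to pgvector, the sketch below builds the kind of `CREATE INDEX ... USING hnsw` statement that pgvector accepts. The operator class names (`vector_ip_ops`, `halfvec_ip_ops`, `bit_hamming_ops`) are pgvector's. The helper names, the `embedding` column name, and the exact statement txtai issues are assumptions.

```python
def hnsw_opclass(quantize=None, precision=None):
    """Pick a pgvector operator class for the configured storage type."""
    if quantize:
        return "bit_hamming_ops"   # BIT column, Hamming distance
    if precision == "half":
        return "halfvec_ip_ops"    # HALFVEC column, inner product
    return "vector_ip_ops"         # VECTOR column, inner product


def hnsw_index_sql(table="vectors", column="embedding", m=16, efconstruction=200, **storage):
    """Build an HNSW index statement; identifiers must come from trusted config."""
    opclass = hnsw_opclass(**storage)
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} {opclass}) "
        f"WITH (m = {int(m)}, ef_construction = {int(efconstruction)})"
    )


if __name__ == "__main__":
    print(hnsw_index_sql())
    print(hnsw_index_sql(table="embeddings", precision="half"))
```

Larger m and ef_construction values generally improve recall at the cost of index build time and size.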

Outputs

Name | Type | Description
search() returns | list | List of lists of (id, score) tuples, one list per query
count() returns | int | Number of rows in the vectors table
save() side-effect | commit | Commits the current database session and connection
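The shape of the search() output can be illustrated with a small sketch. The ids and scores below are made-up placeholder values, not real results, and `top_ids` is a hypothetical helper.

```python
# Hypothetical search() output for two queries with limit=2:
# one inner list per query, each holding (id, score) tuples.
results = [
    [(2, 0.61), (1, 0.18)],
    [(0, 0.42), (3, 0.09)],
]


def top_ids(results):
    """Drop the scores and keep only the ids, preserving per-query grouping."""
    return [[uid for uid, _ in hits] for hits in results]


if __name__ == "__main__":
    for query_index, hits in enumerate(results):
        for uid, score in hits:
            print(query_index, uid, score)
```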

Usage Examples

from txtai import Embeddings

# Create embeddings with PGVector backend
embeddings = Embeddings({
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    "backend": "pgvector",
    "pgvector": {
        "url": "postgresql://user:pass@localhost/dbname",
        "table": "embeddings",
        "m": 16,
        "efconstruction": 200
    }
})

# Index data
embeddings.index([
    "US tops 5 million confirmed virus cases",
    "Canada's last intact ice shelf has broken up",
    "Beijing urges strong action on climate change",
    "New York battles severe winter storm"
])

# Search
results = embeddings.search("climate change effects", 2)
print(results)

# Using PGVector with half-precision storage
embeddings = Embeddings({
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    "backend": "pgvector",
    "pgvector": {
        "url": "postgresql://user:pass@localhost/dbname",
        "precision": "half"
    }
})
