Implementation: Neuml Txtai Graph Base
| Knowledge Sources | Details |
|---|---|
| Domains | Graph_Networks, Knowledge_Graph |
| Last Updated | 2026-02-09 17:00 GMT |
Overview
Graph is the abstract base class for graph network backends in txtai, providing node/edge management, relationship inference, topic modeling via community detection, and subgraph filtering.
Description
The Graph class defines the interface and shared logic for all graph backends in txtai. It manages a graph network where nodes represent indexed documents and edges represent relationships inferred from vector similarity scores or manually provided relationship data. The class supports topic modeling through community detection algorithms (delegated to the Topics helper), with optional category labeling via a similarity function. Concrete subclasses (e.g., NetworkX, igraph) must implement the abstract methods for node/edge operations, graph algorithms (centrality, pagerank, shortest path), graph queries, community detection, and persistence.
The class handles the full graph lifecycle: inserting document nodes with text/object data and optional custom attributes, building edges via batch similarity search, upserting new nodes into an existing graph, filtering subgraphs, and managing topic/category assignments on nodes.
Usage
Use Graph (through a concrete subclass) when you need to build and query a knowledge graph from your txtai embeddings index. It enables relationship discovery between documents, topic modeling, graph-based search, and visualization. Configure the topics key in the graph config to enable community detection with optional category labeling.
Code Reference
Source Location
- Repository: Neuml_Txtai
- File: src/python/txtai/graph/base.py
- Lines: 1-769
Signature
class Graph:
    """
    Base class for Graph instances. This class builds graph networks. Supports topic modeling
    and relationship traversal.
    """

    def __init__(self, config):
        """
        Creates a new Graph.

        Args:
            config: graph configuration
        """

        # Graph configuration
        self.config = config if config is not None else {}

        # Graph backend
        self.backend = None

        # Topic modeling
        self.categories = None
        self.topics = None

        # Transform columns (read from the normalized config so a None input is handled)
        columns = self.config.get("columns", {})
        self.text = columns.get("text", "text")
        self.object = columns.get("object", "object")

        # Attributes to copy
        self.copyattributes = self.config.get("copyattributes", False)

        # Relationships are manually-provided edges
        self.relationships = columns.get("relationships", "relationships")
        self.relations = {}
Import
from txtai.graph import Graph
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| config | dict | Yes | Graph configuration dictionary with optional keys: columns (text, object, relationships mappings), copyattributes (bool or list), topics (topic modeling config with optional categories), batchsize (int, default 256), limit (int, default 15), minscore (float, default 0.1), approximate (bool, default True) |
Outputs
| Name | Type | Description |
|---|---|---|
| self.backend | object | Graph backend instance (type depends on concrete subclass, e.g., NetworkX Graph) |
| self.topics | dict or None | Mapping of topic name to list of node ids belonging to that topic |
| self.categories | list or None | List of category labels corresponding to each topic (same order as topics) |
| self.relations | dict | Temporary storage for manually-provided relationships before they are resolved to edges |
Key Methods
insert(self, documents, index=0)
Inserts graph nodes for a batch of documents. Each document (uid, data, tags) produces a node with id and data attributes. For dict documents, the text/object field is extracted, custom attributes are copied based on copyattributes, and relationship data is stored for later resolution. The index parameter is the starting node id.
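The node-building step can be sketched as a standalone snippet. This is a simplified illustration of the documented behavior, not the actual txtai implementation: it only handles the text column and skips object data, copyattributes, and relationship storage.

```python
# Simplified sketch of how insert() turns (uid, data, tags) tuples into nodes.
# Illustrative only - the real method also handles object data, attribute
# copying and relationship columns.
def build_nodes(documents, text_field="text", index=0):
    nodes = []
    for uid, data, _tags in documents:
        # For dict documents, extract the configured text column;
        # plain strings are used as-is
        text = data.get(text_field) if isinstance(data, dict) else data
        if text is not None:
            # Node id is the running index; data holds the indexed text
            nodes.append((index, {"id": uid, "data": text}))
        index += 1
    return nodes

nodes = build_nodes([
    ("doc1", {"text": "Deep learning for image recognition"}, None),
    ("doc2", "Neural networks in computer vision", None),
])
```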
delete(self, ids)
Removes nodes and their edges from the graph. Also removes deleted nodes from topic lists and cleans up empty topics.
index(self, search, ids, similarity)
Builds the full graph network. Resolves manually-provided relationship edges, infers edges for all nodes using the batch search function, and optionally runs topic modeling with community detection and category labeling.
upsert(self, search, ids, similarity=None)
Incrementally updates the graph for new/modified nodes. Resolves relationships, infers edges only for nodes with the data attribute (new nodes), and either infers topics from neighboring nodes or rebuilds topics entirely.
filter(self, nodes, graph=None)
Creates a subgraph containing only the specified nodes and their interconnecting edges. Copies node attributes, adds optional score attributes, and filters topics/categories to match the selected nodes. Returns a new graph instance.
addrelations(self, node, relations)
Stores manually-provided relationships for a node. Each relation can be a string id or a dict with id and optional attributes like weight.
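A relation entry may therefore be a bare node id or a dict carrying edge attributes. The normalization can be sketched as follows; the function name and return shape are illustrative, not the actual txtai internals:

```python
# Normalize manually-provided relations into (target id, attributes) pairs.
# Illustrative sketch - Graph.addrelations stores these for later resolution
# when edges are built during index()/upsert().
def normalize_relations(relations):
    normalized = []
    for relation in relations:
        if isinstance(relation, dict):
            # Dict form: {"id": target, "weight": 0.8, ...}
            attributes = {k: v for k, v in relation.items() if k != "id"}
            normalized.append((relation["id"], attributes))
        else:
            # String form: just a target node id, no extra attributes
            normalized.append((relation, {}))
    return normalized

pairs = normalize_relations(["doc2", {"id": "doc3", "weight": 0.8}])
```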
inferedges(self, nodes, search, attributes=None)
Iterates through nodes in configurable batch sizes, runs the search function on node data to find similar nodes, and adds edges where similarity exceeds minscore. Nodes with existing edges are skipped in approximate mode.
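The batching and minscore logic described above can be sketched with a stub search function. The stub stands in for the batch similarity search txtai supplies; the function name and result shape here are assumptions for illustration:

```python
# Sketch of edge inference: batch node texts through a search function and
# keep matches above minscore. Self-matches are skipped.
def infer_edges(nodes, search, batchsize=256, minscore=0.1):
    edges = []
    for start in range(0, len(nodes), batchsize):
        batch = nodes[start:start + batchsize]
        # search returns, per input text, a list of (node id, score) results
        for (uid, _text), results in zip(batch, search([t for _, t in batch])):
            for target, score in results:
                if target != uid and score >= minscore:
                    edges.append((uid, target, score))
    return edges

# Stub search: every text matches node 0 with score 0.5 and node 1 with 0.05
stub = lambda texts: [[(0, 0.5), (1, 0.05)] for _ in texts]
edges = infer_edges([(0, "a"), (1, "b")], stub)
```

Only node 1's match against node 0 survives: the self-matches are dropped and the 0.05 score falls below the default minscore.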
addtopics(self, similarity=None)
Runs community detection via the Topics helper class, optionally labels each community with a category using the similarity function, and adds topic, topicrank, and category attributes to each node.
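The category-labeling step can be sketched independently of community detection. The similarity callable below follows the convention of returning ranked (index, score) pairs, but both the harness and the toy word-overlap scorer are illustrative assumptions, not txtai's implementation:

```python
# Sketch of category labeling: score each topic's text against the candidate
# categories and keep the best-ranked label per topic.
def label_topics(topics, categories, similarity):
    labels = []
    for topic in topics:
        # similarity returns (category index, score) pairs, best first
        results = similarity(topic, categories)
        labels.append(categories[results[0][0]])
    return labels

# Toy similarity: rank categories by shared words with the topic text
def toy_similarity(text, categories):
    words = set(text.split())
    scores = [(i, len(words & set(c.split()))) for i, c in enumerate(categories)]
    return sorted(scores, key=lambda x: x[1], reverse=True)

labels = label_topics(["machine learning models"], ["machine learning", "health"], toy_similarity)
```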
cleartopics(self)
Removes all topic-related attributes (topic, topicrank, category) from every node and resets the topics and categories to None.
infertopics(self)
Assigns topics to new nodes (marked with the updated attribute) by analyzing their neighbors' topics and categories using majority voting via Counter.most_common().
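The majority vote over neighbor topics can be sketched in a few lines; this mirrors the documented Counter.most_common behavior but is not the exact txtai code:

```python
from collections import Counter

# Sketch of neighbor-based topic inference for a new node: take the most
# common topic among already-labeled neighbors, ignoring unlabeled ones.
def infer_topic(neighbor_topics):
    counts = Counter(t for t in neighbor_topics if t is not None)
    return counts.most_common(1)[0][0] if counts else None

topic = infer_topic(["nlp", "vision", "nlp", None])
```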
Abstract Methods (Must Be Implemented by Subclasses)
| Method | Description |
|---|---|
| create() | Creates the graph network backend |
| count() | Returns total number of nodes |
| scan(attribute, data) | Iterates over nodes matching optional criteria |
| node(node) | Gets node attributes by id |
| addnode(node, **attrs) | Adds a single node |
| addnodes(nodes) | Adds multiple nodes |
| removenode(node) | Removes a node and its edges |
| hasnode(node) | Checks if a node exists |
| attribute(node, field) | Gets a node attribute value |
| addattribute(node, field, value) | Sets a node attribute |
| removeattribute(node, field) | Removes a node attribute |
| edgecount() | Returns total number of edges |
| edges(node) | Gets edges for a node |
| addedge(source, target, **attrs) | Adds a single edge |
| addedges(edges) | Adds multiple edges |
| hasedge(source, target) | Checks if an edge exists |
| centrality() | Runs centrality algorithm |
| pagerank() | Runs PageRank algorithm |
| showpath(source, target) | Finds shortest path |
| isquery(queries) | Validates graph queries |
| parse(query) | Parses a graph query |
| search(query, limit, graph) | Executes a graph search |
| communities(config) | Runs community detection |
| load(path) | Loads graph from file |
| save(path) | Saves graph to file |
| loaddict(data) | Loads graph from dictionary |
| savedict() | Saves graph to dictionary |
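To make the contract above concrete, here is a minimal dict-backed toy covering a handful of the node/edge methods. It deliberately does not inherit from txtai's Graph and skips algorithms, queries, and persistence; it only illustrates the shape of the interface a backend must provide:

```python
class ToyGraph:
    """Dict-backed illustration of part of the backend contract (not a real backend)."""

    def __init__(self):
        # Node id -> attribute dict, (source, target) -> edge attribute dict
        self.nodes, self.edgeset = {}, {}

    def addnode(self, node, **attrs):
        self.nodes.setdefault(node, {}).update(attrs)

    def hasnode(self, node):
        return node in self.nodes

    def count(self):
        return len(self.nodes)

    def addedge(self, source, target, **attrs):
        self.edgeset[(source, target)] = attrs

    def hasedge(self, source, target):
        return (source, target) in self.edgeset

    def edgecount(self):
        return len(self.edgeset)

graph = ToyGraph()
graph.addnode(0, data="Deep learning")
graph.addnode(1, data="Neural networks")
graph.addedge(0, 1, weight=0.9)
```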
Usage Examples
Basic Usage
from txtai import Embeddings

# Create embeddings with graph support
embeddings = Embeddings({
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    "content": True,
    "graph": {
        "limit": 15,
        "minscore": 0.2,
        "batchsize": 256,
        "approximate": True,
        "topics": {
            "categories": ["science", "technology", "health", "business"]
        }
    }
})

# Index documents - graph nodes and edges are built automatically
documents = [
    ("doc1", {"text": "Deep learning for image recognition"}, None),
    ("doc2", {"text": "Neural networks in computer vision"}, None),
    ("doc3", {"text": "Transformers for NLP tasks"}, None),
    ("doc4", {"text": "Stock market prediction models"}, None),
]
embeddings.index(documents)

# Access the graph
graph = embeddings.graph

# Get node count
print(graph.count())

# Search the graph
results = graph.search("deep learning", limit=5)

# Get topics
if graph.topics:
    for topic, node_ids in graph.topics.items():
        print(f"Topic: {topic}, Nodes: {len(node_ids)}")

# Filter to a subgraph
subgraph = graph.filter([0, 1, 2])
print(subgraph.count())  # 3