Implementation:Explodinggradients Ragas KnowledgeGraph Class

From Leeroopedia


| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| explodinggradients/ragas | LLM Evaluation, Test Data Generation, Knowledge Graphs | 2026-02-10 |

Overview

Description

The KnowledgeGraph class, together with its supporting Node, Relationship, and NodeType types, provides the core data structure for representing documents as a structured graph in the Ragas test data generation pipeline. The KnowledgeGraph is a Python dataclass that holds lists of nodes and relationships. Nodes are Pydantic models with UUID-based identity, typed classification (DOCUMENT, CHUNK, UNKNOWN), and an extensible property dictionary. Relationships are Pydantic models connecting a source node to a target node with a type label, optional bidirectionality, and their own property dictionary.

Usage

The KnowledgeGraph is created automatically when using TestsetGenerator.generate_with_langchain_docs() or TestsetGenerator.generate_with_llamaindex_docs(). It can also be constructed manually for custom workflows. After construction, the graph is enriched via transforms and then consumed by query synthesizers to generate test questions.

Code Reference

Source Location

| Component | File | Lines |
|---|---|---|
| NodeType | src/ragas/testset/graph.py | L23-32 |
| Node | src/ragas/testset/graph.py | L35-89 |
| Relationship | src/ragas/testset/graph.py | L92-142 |
| KnowledgeGraph | src/ragas/testset/graph.py | L145-738 |

Signature

class NodeType(str, Enum):
    UNKNOWN = ""
    DOCUMENT = "document"
    CHUNK = "chunk"

class Node(BaseModel):
    id: uuid.UUID = Field(default_factory=uuid.uuid4)
    properties: dict = Field(default_factory=dict)
    type: NodeType = NodeType.UNKNOWN

class Relationship(BaseModel):
    id: uuid.UUID = Field(default_factory=uuid.uuid4)
    type: str
    source: Node
    target: Node
    bidirectional: bool = False
    properties: dict = Field(default_factory=dict)

@dataclass
class KnowledgeGraph:
    nodes: List[Node] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)

Import

from ragas.testset.graph import KnowledgeGraph, Node, Relationship, NodeType

Key Methods

| Method | Signature | Description |
|---|---|---|
| add | add(item: Union[Node, Relationship]) -> None | Adds a node or relationship to the graph. Raises ValueError for invalid types. |
| save | save(path: Union[str, Path]) -> None | Serializes the graph to a JSON file at the given path using UTF-8 encoding. |
| load | load(path: Union[str, Path]) -> KnowledgeGraph | Class method that deserializes a graph from a JSON file, reconstructing node references in relationships. |
| get_node_by_id | get_node_by_id(node_id: Union[UUID, str]) -> Optional[Node] | Retrieves a node by its UUID. |
| find_indirect_clusters | find_indirect_clusters(relationship_condition=lambda _: True, depth_limit=3) -> List[Set[Node]] | Finds clusters of indirectly connected nodes using the Leiden community detection algorithm. |
| find_n_indirect_clusters | find_n_indirect_clusters(n, relationship_condition=lambda _: True, depth_limit=3) -> List[Set[Node]] | Returns up to n indirect clusters using DFS-based path exploration with diversity optimization. |
| remove_node | remove_node(node: Node, inplace: bool = True) -> Optional[KnowledgeGraph] | Removes a node and its associated relationships from the graph. |
| find_two_nodes_single_rel | find_two_nodes_single_rel(relationship_condition=lambda _: True) -> List[Tuple[Node, Relationship, Node]] | Finds (NodeA, Relationship, NodeB) triples based on a relationship condition. |
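
Several of these methods take a relationship_condition callable that filters which relationships are considered. The following is a minimal, standard-library-only sketch of that filtering idea as it applies to building (NodeA, Relationship, NodeB) triples. SketchNode, SketchRelationship, and find_triples are hypothetical stand-ins written for this illustration, not the ragas classes, and the real find_two_nodes_single_rel may apply additional rules (for example around ordering or duplicates) that are not modeled here.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class SketchNode:
    """Hypothetical stand-in for a graph node (not the ragas Node)."""
    label: str
    id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass(frozen=True)
class SketchRelationship:
    """Hypothetical stand-in for a typed edge between two nodes."""
    type: str
    source: SketchNode
    target: SketchNode


def find_triples(
    relationships: List[SketchRelationship],
    condition: Callable[[SketchRelationship], bool] = lambda _: True,
) -> List[Tuple[SketchNode, SketchRelationship, SketchNode]]:
    """Return (source, relationship, target) for every edge that passes the filter."""
    return [(r.source, r, r.target) for r in relationships if condition(r)]


doc = SketchNode("doc")
chunk_a = SketchNode("chunk_a")
chunk_b = SketchNode("chunk_b")
rels = [
    SketchRelationship("child", doc, chunk_a),
    SketchRelationship("child", doc, chunk_b),
    SketchRelationship("similar", chunk_a, chunk_b),
]

# Keep only "child" edges; the default condition accepts every edge.
child_triples = find_triples(rels, lambda r: r.type == "child")
all_triples = find_triples(rels)
```

Passing a predicate instead of a fixed type string lets the caller filter on any relationship attribute, including entries in its property dictionary.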

I/O Contract

Node

| Parameter | Type | Default | Description |
|---|---|---|---|
| id | uuid.UUID | auto-generated | Unique identifier for the node |
| properties | dict | {} | Extensible key-value property store (keys are case-insensitive) |
| type | NodeType | NodeType.UNKNOWN | Classification of the node (DOCUMENT, CHUNK, UNKNOWN) |
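
The case-insensitive behaviour of property keys can be illustrated with a small sketch. It assumes that case-insensitivity is achieved by lowercasing keys on both write and read. SketchProperties is a hypothetical stand-in, and the add_property name is an assumption chosen to pair with the get_property accessor used in the examples on this page.

```python
from typing import Any, Dict, Optional


class SketchProperties:
    """Hypothetical property store that normalizes keys to lowercase."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def add_property(self, key: str, value: Any) -> None:
        # Normalize on write so "Page_Content" and "page_content" collide.
        normalized = key.lower()
        if normalized in self._data:
            raise ValueError(f"Property {key} already exists")
        self._data[normalized] = value

    def get_property(self, key: str) -> Optional[Any]:
        # Normalize on read; a missing key yields None instead of raising.
        return self._data.get(key.lower())


props = SketchProperties()
props.add_property("Page_Content", "Supervised learning uses labeled data.")
same_value = props.get_property("PAGE_CONTENT")
missing = props.get_property("summary")
```

Under this assumption, writing directly into the underlying dict with a mixed-case key would bypass the normalization, so going through the accessor methods is the safer habit.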

Relationship

| Parameter | Type | Default | Description |
|---|---|---|---|
| id | uuid.UUID | auto-generated | Unique identifier for the relationship |
| type | str | (required) | The type label of the relationship (e.g., "child", "similar") |
| source | Node | (required) | The source node |
| target | Node | (required) | The target node |
| bidirectional | bool | False | Whether the relationship is symmetric |
| properties | dict | {} | Extensible key-value property store |
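
One way a consumer of the graph might interpret the bidirectional flag is to treat a flagged relationship as traversable in both directions when building an adjacency map. The sketch below illustrates that interpretation only. It uses plain (source, target, bidirectional) tuples as hypothetical stand-ins for Relationship objects, and how ragas itself uses the flag internally is not covered by this page.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# (source, target, bidirectional): hypothetical stand-in for a Relationship
Edge = Tuple[str, str, bool]


def build_adjacency(edges: List[Edge]) -> Dict[str, Set[str]]:
    """Map each node to the neighbours reachable in one hop."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for source, target, bidirectional in edges:
        adjacency[source].add(target)
        if bidirectional:
            # A symmetric edge can also be followed from target back to source.
            adjacency[target].add(source)
    return dict(adjacency)


edges = [
    ("doc", "chunk_a", False),     # directed, e.g. a "child" edge
    ("chunk_a", "chunk_b", True),  # symmetric, e.g. a "similar" edge
]
adjacency = build_adjacency(edges)
```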

KnowledgeGraph.save / KnowledgeGraph.load

| Direction | Type | Description |
|---|---|---|
| Input (save) | Union[str, Path] | File system path for the output JSON file |
| Output (save) | JSON file | Serialized graph with nodes and relationships arrays |
| Input (load) | Union[str, Path] | File system path to an existing JSON file |
| Output (load) | KnowledgeGraph | Reconstructed graph with fully resolved node references |
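
The round trip described above, in which relationships are stored alongside nodes and node references are resolved again on load, can be sketched with the standard library alone. This is an illustration of the general technique, not the ragas implementation: SketchNode, SketchRelationship, save_graph, and load_graph are hypothetical names, and the exact JSON field layout (relationships storing source and target as node ID strings) is an assumption.

```python
import json
import tempfile
import uuid
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Tuple


@dataclass
class SketchNode:
    properties: dict = field(default_factory=dict)
    id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass
class SketchRelationship:
    type: str
    source: SketchNode
    target: SketchNode


def save_graph(nodes: List[SketchNode], rels: List[SketchRelationship], path: Path) -> None:
    """Write nodes in full; write relationships with node IDs instead of nested nodes."""
    data = {
        "nodes": [{"id": str(n.id), "properties": n.properties} for n in nodes],
        "relationships": [
            {"type": r.type, "source": str(r.source.id), "target": str(r.target.id)}
            for r in rels
        ],
    }
    Path(path).write_text(json.dumps(data, indent=2), encoding="utf-8")


def load_graph(path: Path) -> Tuple[List[SketchNode], List[SketchRelationship]]:
    """Rebuild nodes first, then resolve each stored ID back to a node object."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    nodes = [SketchNode(n["properties"], uuid.UUID(n["id"])) for n in data["nodes"]]
    by_id = {str(n.id): n for n in nodes}
    rels = [
        SketchRelationship(r["type"], by_id[r["source"]], by_id[r["target"]])
        for r in data["relationships"]
    ]
    return nodes, rels


doc = SketchNode({"page_content": "Machine learning is a subset of artificial intelligence."})
chunk = SketchNode({"page_content": "Supervised learning uses labeled data."})
rel = SketchRelationship("child", doc, chunk)

with tempfile.TemporaryDirectory() as tmp_dir:
    graph_path = Path(tmp_dir) / "graph.json"
    save_graph([doc, chunk], [rel], graph_path)
    loaded_nodes, loaded_rels = load_graph(graph_path)
```

Storing IDs rather than nested node objects keeps each node serialized once, and resolving them through a lookup table on load means every relationship points at the same node instances held in the node list.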

Usage Examples

Creating a Knowledge Graph Manually

from ragas.testset.graph import KnowledgeGraph, Node, Relationship, NodeType

# Create nodes
doc_node = Node(
    type=NodeType.DOCUMENT,
    properties={
        "page_content": "Machine learning is a subset of artificial intelligence.",
        "document_metadata": {"source": "ml_intro.pdf"},
    },
)

chunk_node = Node(
    type=NodeType.CHUNK,
    properties={
        "page_content": "Supervised learning uses labeled data.",
        "document_metadata": {"source": "ml_intro.pdf", "chunk_id": 0},
    },
)

# Create a relationship
rel = Relationship(
    type="child",
    source=doc_node,
    target=chunk_node,
)

# Build the graph
kg = KnowledgeGraph()
kg.add(doc_node)
kg.add(chunk_node)
kg.add(rel)

print(kg)
# KnowledgeGraph(nodes: 2, relationships: 1)

Saving and Loading a Knowledge Graph

from ragas.testset.graph import KnowledgeGraph

# `kg` is the graph built in the previous example.
# Save to disk
kg.save("my_knowledge_graph.json")

# Load from disk (load is a class method, so call it on KnowledgeGraph itself)
loaded_kg = KnowledgeGraph.load("my_knowledge_graph.json")
print(loaded_kg)
# KnowledgeGraph(nodes: 2, relationships: 1)

Retrieving a Node by ID

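# `kg` and `doc_node` come from the first example.
node = kg.get_node_by_id(doc_node.id)
if node is not None:  # the return type is Optional[Node]
    print(node.get_property("page_content"))
# Machine learning is a subset of artificial intelligence.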
