Principle:Infiniflow Ragflow Knowledge Base Creation

Knowledge Sources	RAGFlow, RAGFlow Docs
Domains	RAG, Knowledge_Management, Data_Engineering
Last Updated	2026-02-12 06:00 GMT

Overview

A data organization pattern that creates isolated containers for document collections with associated parsing and retrieval configurations.

Description

Knowledge Base Creation is the foundational step in a Retrieval-Augmented Generation system where a named container (dataset) is established to hold documents, their parsed chunks, and associated embeddings. Each knowledge base encapsulates its own parser configuration, language settings, permissions, and search index. This enables multi-tenant isolation and allows different document collections to use different parsing strategies (e.g., academic papers vs legal documents vs general text).

In RAGFlow, a knowledge base maps to a Knowledgebase ORM model in MySQL and a dedicated partition in the document store (Elasticsearch or Infinity). The creation process validates the name, deduplicates it within the tenant scope, assigns a UUID, and persists the record.
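
The four creation steps (validate, deduplicate, assign an ID, persist) can be illustrated with a minimal, self-contained sketch. This is not RAGFlow's code: the in-memory store, the KnowledgeBase dataclass, the 128-byte name limit, and the "(n)" suffix used to resolve name collisions are all assumptions made for illustration. A real deployment persists to MySQL through the ORM, and an implementation could equally reject a duplicate name rather than rename it.

```python
import uuid
from dataclasses import dataclass

MAX_NAME_BYTES = 128  # hypothetical limit, chosen only for this sketch


@dataclass
class KnowledgeBase:
    id: str
    tenant_id: str
    name: str
    parser_id: str = "naive"  # default parser bound to this knowledge base


class InMemoryKnowledgeBaseStore:
    """Stand-in for the relational table; keyed by knowledge base ID."""

    def __init__(self):
        self._records = {}

    def names_for_tenant(self, tenant_id):
        return {kb.name for kb in self._records.values() if kb.tenant_id == tenant_id}

    def save(self, kb):
        self._records[kb.id] = kb


def unique_name(name, taken):
    """Return name unchanged if free, else append the first free "(n)" suffix."""
    if name not in taken:
        return name
    n = 1
    while f"{name}({n})" in taken:
        n += 1
    return f"{name}({n})"


def create_knowledge_base(store, tenant_id, name, parser_id="naive"):
    # 1. Validate the name.
    if not isinstance(name, str) or not name.strip():
        raise ValueError("knowledge base name must be a non-empty string")
    name = name.strip()
    if len(name.encode("utf-8")) > MAX_NAME_BYTES:
        raise ValueError("knowledge base name is too long")

    # 2. Deduplicate within the tenant scope only.
    name = unique_name(name, store.names_for_tenant(tenant_id))

    # 3. Assign a UUID (hex form, 32 characters).
    kb = KnowledgeBase(id=uuid.uuid4().hex, tenant_id=tenant_id, name=name, parser_id=parser_id)

    # 4. Persist the record.
    store.save(kb)
    return kb
```

Because deduplication is scoped to the tenant, two tenants can each own a knowledge base called "Company Policies" without conflict, while a second one created by the same tenant receives a distinct name.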

Usage

Use this principle when initializing a new document collection for RAG. This is always the first step before uploading documents, configuring parsing, or performing retrieval. Each knowledge base should represent a logically coherent set of documents (e.g., "Company Policies", "Technical Documentation", "Legal Contracts").

Theoretical Basis

The knowledge base abstraction follows the namespace isolation pattern common in information retrieval systems:

Tenant isolation: Each user's knowledge bases are scoped to their tenant ID, preventing cross-tenant data leakage
Parser binding: Each knowledge base declares a default parser type (naive, paper, book, laws, etc.) that determines how uploaded documents will be chunked
Index partitioning: Document store indices are partitioned by tenant (ragflow_{tenant_id}), with dataset_id as a sub-partition key
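
As a rough sketch of how tenant-level indices and dataset-level sub-partitioning combine at query time, the snippet below builds a search scope from a tenant ID and a list of dataset IDs. The index name follows the ragflow_{tenant_id} pattern described above. The SearchScope type, the build_scope helper, and the use of a field literally named dataset_id are assumptions for illustration; the filter uses the shape of an Elasticsearch terms query, and the actual field name in a given deployment may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchScope:
    index: str    # tenant-level partition
    filter: dict  # dataset-level sub-partition, as a terms-query fragment


def index_name(tenant_id):
    """Tenant-level index name, following the ragflow_{tenant_id} pattern."""
    return f"ragflow_{tenant_id}"


def build_scope(tenant_id, dataset_ids):
    """Restrict a search to selected datasets inside one tenant's index."""
    if not dataset_ids:
        raise ValueError("at least one dataset ID is required")
    return SearchScope(
        index=index_name(tenant_id),
        # "dataset_id" is a hypothetical field name for this sketch.
        filter={"terms": {"dataset_id": sorted(dataset_ids)}},
    )
```

The tenant ID selects the index, so a query can never reach another tenant's data, and the dataset filter narrows the search to the chosen knowledge bases within that index.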

Related Pages

Implemented By

Implementation:Infiniflow_Ragflow_KnowledgebaseService_Create_With_Name
