Implementation:Langgenius Dify CreateEmptyDataset
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| Dify | RAG, Knowledge_Management, Frontend | 2026-02-12 00:00 GMT |
Overview
Description
createEmptyDataset is a frontend service function that creates a new, empty knowledge base (dataset) in Dify. It issues a POST request to the /datasets endpoint with only a name, and the backend provisions the full dataset resource with default configuration for indexing, permissions, embedding model, and retrieval model.
This function represents the simplest entry point into the Knowledge Base Management workflow. Once the dataset is created, documents can be uploaded, chunked, indexed, and queried within it.
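As a minimal sketch of the request described above, the snippet below builds the method, path, and body that createEmptyDataset conceptually sends. The CreateDatasetRequest type and buildCreateDatasetRequest function are hypothetical names introduced for illustration. The sketch assumes the post helper serializes the body as JSON; base URL, headers, and authentication are handled by that helper and are not shown here.

```typescript
// Hypothetical request shape for illustration only; Dify's real `post`
// helper (not shown in this page) adds the base URL, headers, and auth.
type CreateDatasetRequest = {
  method: 'POST'
  path: string
  body: string
}

// Build the request that createEmptyDataset conceptually issues:
// a POST to /datasets whose JSON body carries only the name.
function buildCreateDatasetRequest(name: string): CreateDatasetRequest {
  return {
    method: 'POST',
    path: '/datasets',
    // Assumption: the body is sent as JSON.
    body: JSON.stringify({ name }),
  }
}

const request = buildCreateDatasetRequest('Product Documentation')
console.log(request.method, request.path, request.body)
```

Everything else on the dataset (indexing, permissions, embedding model, retrieval model) is left for the backend to fill in with defaults, which is why the body contains a single field.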
Usage
- Call createEmptyDataset when the user initiates the "Create Knowledge Base" action from the UI.
- The returned DataSet object provides the id needed to add documents, configure settings, or attach the dataset to applications.
- Typically followed by a call to createFirstDocument or createDocument to populate the newly created dataset.
Code Reference
Source Location
web/service/datasets.ts, lines 84-86.
Signature
export const createEmptyDataset = ({ name }: { name: string }): Promise<DataSet> => {
  return post<DataSet>('/datasets', { body: { name } })
}
Import
import { createEmptyDataset } from '@/service/datasets'
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Human-readable name for the new dataset. |
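Because name is the only input, a caller may want to normalize it before sending the request. The following is a sketch of such a client-side guard, not part of Dify's code: normalizeDatasetName and ASSUMED_MAX_NAME_LENGTH are hypothetical, and the length limit is an illustrative placeholder, not a documented constraint.

```typescript
// Hypothetical client-side guard. The limit below is an illustrative
// placeholder, not a documented Dify constraint.
const ASSUMED_MAX_NAME_LENGTH = 40

function normalizeDatasetName(
  raw: string,
  maxLength: number = ASSUMED_MAX_NAME_LENGTH,
): string {
  // Strip surrounding whitespace so "  FAQ " and "FAQ" are treated alike.
  const name = raw.trim()
  if (name.length === 0)
    throw new Error('Dataset name must not be empty')
  if (name.length > maxLength)
    throw new Error(`Dataset name must be at most ${maxLength} characters`)
  return name
}

console.log(normalizeDatasetName('  Product Documentation  '))
```

The normalized value can then be passed as the name argument of createEmptyDataset.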
Outputs
Returns Promise<DataSet>. Key fields of the DataSet type:
| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier for the created dataset. |
| name | string | Name of the dataset as provided in the request. |
| indexing_status | DocumentIndexingStatus | Current indexing status of the dataset. |
| permission | DatasetPermission | Access control level: only_me, all_team_members, or partial_members. |
| doc_form | ChunkingMode | Default chunking mode: text_model, qa_model, or hierarchical_model. |
| runtime_mode | 'rag_pipeline' \| 'general' | Whether the dataset operates as a standard knowledge base or a RAG pipeline. |
| embedding_model | string | Name of the embedding model assigned to the dataset. |
| embedding_model_provider | string | Provider of the embedding model. |
| retrieval_model | RetrievalConfig | Default retrieval configuration including search method, top_k, and score threshold. |
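To make the returned fields concrete, the sketch below reads a few of them and produces a one-line summary. DataSetSummaryInput is a simplified local stand-in that covers only some of the fields listed above (the real DataSet type has more), and describeDataset and the sample values are hypothetical.

```typescript
// Simplified local stand-in for a subset of DataSet fields.
// This is an assumption for illustration, not Dify's actual type definition.
type DataSetSummaryInput = {
  id: string
  name: string
  permission: 'only_me' | 'all_team_members' | 'partial_members'
  doc_form: 'text_model' | 'qa_model' | 'hierarchical_model'
  runtime_mode: 'rag_pipeline' | 'general'
  embedding_model: string
  embedding_model_provider: string
}

// Produce a short human-readable description of a dataset.
function describeDataset(d: DataSetSummaryInput): string {
  // An empty embedding_model is treated as "not assigned".
  const model = d.embedding_model
    ? `${d.embedding_model_provider}/${d.embedding_model}`
    : 'no embedding model assigned'
  return `${d.name} (${d.id}): ${d.runtime_mode}, ${d.doc_form}, ${d.permission}, ${model}`
}

// Sample values are made up for the example.
const sample: DataSetSummaryInput = {
  id: 'ds-1',
  name: 'FAQ',
  permission: 'only_me',
  doc_form: 'text_model',
  runtime_mode: 'general',
  embedding_model: 'model-x',
  embedding_model_provider: 'provider-y',
}

console.log(describeDataset(sample))
```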
Usage Examples
Creating a new knowledge base
import { createEmptyDataset } from '@/service/datasets'
const dataset = await createEmptyDataset({ name: 'Product Documentation' })
console.log(dataset.id) // Use this ID for subsequent document uploads
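The call above assumes the request succeeds. One possible way to handle failure is sketched below: a small wrapper that turns a rejected promise into a result value. tryCreateDataset, CreateFn, CreateResult, and stubCreate are hypothetical; the stub stands in for createEmptyDataset so the snippet runs on its own, and the shape of real Dify error responses is not assumed.

```typescript
// Minimal shapes for the sketch; the real DataSet type has more fields.
type CreatedDataset = { id: string; name: string }
type CreateFn = (args: { name: string }) => Promise<CreatedDataset>

type CreateResult =
  | { ok: true; dataset: CreatedDataset }
  | { ok: false; message: string }

// Call the injected create function and convert a rejection into a value,
// so the caller can branch on `ok` instead of using try/catch everywhere.
async function tryCreateDataset(create: CreateFn, name: string): Promise<CreateResult> {
  try {
    const dataset = await create({ name })
    return { ok: true, dataset }
  }
  catch (err) {
    const message = err instanceof Error ? err.message : String(err)
    return { ok: false, message }
  }
}

// Stub standing in for createEmptyDataset; the id is made up.
const stubCreate: CreateFn = async ({ name }) => ({ id: 'dataset-123', name })

tryCreateDataset(stubCreate, 'Product Documentation').then((result) => {
  if (result.ok)
    console.log(result.dataset.id)
  else
    console.error(result.message)
})
```

In application code, createEmptyDataset itself could be passed as the create argument, since its signature matches CreateFn structurally under these assumptions.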
Creating a dataset and immediately adding a document
import { createEmptyDataset, createDocument } from '@/service/datasets'

// fileId and processRule are assumed to be defined by the caller:
// fileId from a prior file upload, processRule from the chunking settings.
const dataset = await createEmptyDataset({ name: 'FAQ Knowledge Base' })
const docResponse = await createDocument({
  datasetId: dataset.id,
  body: {
    data_source: {
      type: 'upload_file',
      info_list: {
        data_source_type: 'upload_file',
        file_info_list: { file_ids: [fileId] },
      },
    },
    doc_form: 'text_model',
    doc_language: 'English',
    process_rule: processRule,
    // Reuse the defaults the backend assigned when the dataset was created.
    retrieval_model: dataset.retrieval_model,
    embedding_model: dataset.embedding_model,
    embedding_model_provider: dataset.embedding_model_provider,
  },
})