

Principle:FlowiseAI Flowise Vector Store Provider Configuration

From Leeroopedia
Attribute Value
Sources packages/ui/src/api/documentstore.js
Domains Document_Store_Ingestion
Last Updated 2026-02-12 14:00 GMT

Overview

Vector_Store_Provider_Configuration is a technique for selecting and configuring the vector store, embedding, and record manager providers that back document storage and retrieval. This configuration step bridges the gap between processed document chunks and searchable vector embeddings.

Description

Before upserting chunks, users must configure three provider types that together form the embedding and storage pipeline:

  • Embedding provider -- Converts text chunks into dense vector representations. Examples include OpenAI Embeddings, HuggingFace Embeddings, Cohere Embeddings, and local embedding models. The embedding model determines the vector dimension and the semantic quality of similarity search.
  • Vector store provider -- Stores embeddings and supports similarity search queries. Examples include Pinecone, Chroma, Weaviate, Qdrant, Milvus, Supabase, and FAISS. Each provider has different capabilities for scaling, filtering, and persistence.
  • Record manager provider (optional) -- Tracks which documents have been indexed, enabling deduplication and incremental updates. The record manager prevents duplicate embeddings when documents are re-processed and supports cleanup modes for deleted documents.

Each provider is initialized as a node component with its own parameters, fetched dynamically from the server via dedicated API endpoints. This dynamic discovery ensures the UI always presents the current set of available providers without hardcoded lists.
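The dynamic form rendering this enables can be sketched as follows. The shapes below are illustrative assumptions, not Flowise's exact node schema: a provider node declares its parameters via inputParams, so one generic renderer can build a configuration form for any provider without provider-specific code.

```javascript
// Illustrative provider node (field names assumed for the sketch)
const pineconeNode = {
    label: 'Pinecone',
    name: 'pinecone',
    inputParams: [
        { label: 'Pinecone Index', name: 'pineconeIndex', type: 'string' },
        { label: 'Pinecone Namespace', name: 'pineconeNamespace', type: 'string', optional: true }
    ]
}

// Generic form description: one field per declared parameter,
// with no knowledge of which provider it belongs to
function describeForm(node) {
    return node.inputParams.map(
        (p) => `${p.label}${p.optional ? ' (optional)' : ''}: <${p.type}>`
    )
}

console.log(describeForm(pineconeNode))
// ['Pinecone Index: <string>', 'Pinecone Namespace (optional): <string>']
```

Because the renderer only reads the declared schema, adding a new provider on the server automatically yields a working configuration form in the UI.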

Usage

Apply vector store provider configuration when setting up the embedding and storage pipeline for a document store's vector search. Typical scenarios include:

  • Initial pipeline setup -- Selecting embedding model and vector store for a new document store's first upsert.
  • Provider migration -- Switching from one vector store provider to another (e.g., from Chroma to Pinecone for production scaling).
  • Embedding model upgrade -- Changing the embedding model to improve retrieval quality (requires re-embedding all chunks).
  • Incremental indexing setup -- Adding a record manager to enable efficient re-indexing of updated documents.

A minimal fetch of the three provider lists, using the client API module listed in Sources (the @/api/documentstore import path is assumed from the UI's alias conventions):

import documentStoreApi from '@/api/documentstore'

// Fetch all available providers in parallel
const [vectorStores, embeddings, recordManagers] = await Promise.all([
    documentStoreApi.getVectorStoreProviders(),
    documentStoreApi.getEmbeddingProviders(),
    documentStoreApi.getRecordManagerProviders()
])

// Each response wraps its provider list in a `data` array of node components
console.log('Vector Stores:', vectorStores.data.map((v) => v.label))
console.log('Embeddings:', embeddings.data.map((e) => e.label))
console.log('Record Managers:', recordManagers.data.map((r) => r.label))
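Once the lists are fetched, the user's selections combine into a single pipeline configuration. The sketch below is hypothetical (the field names are illustrative, not Flowise's actual payload); it shows the key property that any embedding provider can be paired with any vector store, with the record manager as an optional third leg:

```javascript
// Hypothetical pipeline selection (names illustrative, not the real API shape)
function buildPipelineConfig({ embedding, vectorStore, recordManager }) {
    const config = {
        embeddingName: embedding.name,
        embeddingConfig: embedding.config ?? {},
        vectorStoreName: vectorStore.name,
        vectorStoreConfig: vectorStore.config ?? {}
    }
    if (recordManager) {
        // Optional deduplication / incremental-update layer
        config.recordManagerName = recordManager.name
        config.recordManagerConfig = recordManager.config ?? {}
    }
    return config
}

// Pair an OpenAI embedding model with a Pinecone index, no record manager
const config = buildPipelineConfig({
    embedding: { name: 'openAIEmbeddings', config: { modelName: 'text-embedding-3-small' } },
    vectorStore: { name: 'pinecone', config: { pineconeIndex: 'docs' } }
})
console.log(config.vectorStoreName) // 'pinecone'
```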

Theoretical Basis

Vector store provider configuration follows a provider abstraction pattern that decouples the document pipeline from specific implementations:

  • Uniform component interface -- The system abstracts over multiple embedding and vector store implementations through a uniform component interface. Each provider declares its configuration schema via inputParams and inputAnchors, enabling the UI to render dynamic configuration forms. This abstraction means the document store workflow code does not need to know the specifics of Pinecone vs. Chroma vs. Weaviate.
  • Separation of embedding and storage concerns -- By treating embedding generation and vector storage as separate, independently configurable providers, the system enables flexible combinations: any embedding model can be paired with any vector store. This composability is critical because different use cases may require different tradeoffs (e.g., high-quality OpenAI embeddings with cost-effective local Chroma storage).
  • Record manager for incremental updates -- The optional record manager introduces a deduplication layer that tracks document hashes. When documents are re-processed, the record manager determines which chunks are new, modified, or unchanged, preventing duplicate embeddings and enabling efficient incremental re-indexing.
  • Provider lifecycle independence -- Each provider manages its own connection lifecycle, authentication, and resource allocation. The document store simply holds references to the configured providers, enabling them to be reconfigured or swapped without affecting stored chunks.
  • Credential management -- Providers that require authentication (API keys for OpenAI, connection strings for managed vector databases) integrate with the system's credential management, keeping secrets separate from configuration.
