Principle:Microsoft Semantic kernel Vector Store Collection Setup

Overview

The Vector Store Collection Setup principle describes how Semantic Kernel organizes vector data into typed collections accessed through a provider-agnostic abstraction layer. A vector store instance acts as a factory for collections, and each collection is parameterized with a key type and a record type, ensuring type safety from creation through querying.

This two-layer architecture (store, then collection) mirrors how relational databases separate the database connection from individual tables, with the added benefit of generic type parameters that enforce schema correctness at compile time.

Motivation

Applications that use vector stores need to manage multiple concerns:

  • Backend selection: Choosing between in-memory stores for development, Azure AI Search for production, Qdrant for self-hosted deployments, and so on
  • Collection lifecycle: Creating collections (tables/indexes) if they do not exist, or connecting to existing ones
  • Type safety: Ensuring that records written to a collection match the expected schema and that query results are correctly typed
  • Swappability: Allowing the backend to change without modifying application logic

The Vector Store Collection Setup principle addresses all of these concerns through a layered abstraction.

Core Concepts

The Vector Store as a Factory

A vector store instance (such as InMemoryVectorStore, AzureAISearchVectorStore, or QdrantVectorStore) does not directly hold data. Instead, it serves as a factory that produces typed collection instances. This factory pattern means:

  • The store encapsulates backend-specific connection details (endpoints, credentials, client options)
  • Collections are created through a uniform GetCollection method regardless of backend
  • Multiple collections with different record types can coexist within a single store
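
The factory role can be illustrated with a small toy model. This is a sketch in plain Python, not the Semantic Kernel API: ToyVectorStore, ToyCollection, the record classes, and the endpoint value are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCollection:
    """Client-side handle for one named collection (hypothetical)."""
    name: str
    record_type: type
    endpoint: str  # connection detail inherited from the store


@dataclass
class ToyVectorStore:
    """Holds connection details and hands out collection handles (hypothetical)."""
    endpoint: str
    api_key: str = field(repr=False)

    def get_collection(self, name: str, record_type: type) -> ToyCollection:
        # Callers supply only the collection name and record type;
        # the store contributes its own connection details.
        return ToyCollection(name, record_type, self.endpoint)


@dataclass
class Hotel:
    hotel_id: str
    description: str


@dataclass
class Review:
    review_id: int
    text: str


# Placeholder connection values, not real credentials.
store = ToyVectorStore(endpoint="https://example.invalid", api_key="not-a-real-key")

# Two collections with different record types coexist in one store.
hotels = store.get_collection("hotels", Hotel)
reviews = store.get_collection("reviews", Review)
print(hotels.name, hotels.record_type.__name__)
print(reviews.name, reviews.record_type.__name__)
```

In this sketch the application never passes the endpoint or key to a collection directly, which is the point of routing collection creation through the store.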

Typed Collection Access

The GetCollection<TKey, TRecord>(collectionName) method returns a VectorStoreCollection<TKey, TRecord> that is fully typed. This means:

  1. The key type (TKey) determines what kind of identifiers the collection accepts (e.g., string, Guid)
  2. The record type (TRecord) is typically a class whose properties are decorated with vector store attributes ([VectorStoreKey], [VectorStoreData], [VectorStoreVector]) that mark the key, data, and vector fields
  3. All operations on the collection (upsert, get, search, delete) are type-safe
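
As a rough analogy, the sketch below models a typed collection and attribute-style field roles in Python. It is an illustration under stated assumptions: ToyCollection, the roles helper, and the metadata keys ("role", "dimensions") are hypothetical and do not correspond to Semantic Kernel's attribute classes. Python checks generic parameters only through a static type checker, not at compile time.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, Generic, Optional, TypeVar

TKey = TypeVar("TKey")
TRecord = TypeVar("TRecord")


@dataclass
class Hotel:
    # Field metadata stands in for key/data/vector attributes (hypothetical).
    hotel_id: str = field(metadata={"role": "key"})
    description: str = field(metadata={"role": "data"})
    embedding: list = field(metadata={"role": "vector", "dimensions": 4})


class ToyCollection(Generic[TKey, TRecord]):
    """Collection parameterized by key type and record type (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._records: Dict[TKey, TRecord] = {}

    def upsert(self, key: TKey, record: TRecord) -> None:
        self._records[key] = record

    def get(self, key: TKey) -> Optional[TRecord]:
        return self._records.get(key)


def roles(record_type: type) -> Dict[str, str]:
    """Read the role of each field from the record type's metadata."""
    return {f.name: f.metadata["role"] for f in fields(record_type)}


hotels: ToyCollection[str, Hotel] = ToyCollection("hotels")
hotels.upsert("h1", Hotel("h1", "Seaside inn", [0.1, 0.2, 0.3, 0.4]))
found = hotels.get("h1")
print(roles(Hotel))
```

With these annotations, a static checker would be expected to flag a call such as hotels.upsert(42, "not a hotel"), which is the closest Python equivalent of the compile-time guarantee described above.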

Collection Lifecycle Management

Before data can be ingested, the collection must exist in the underlying store. The EnsureCollectionExistsAsync() method provides idempotent collection creation:

  • If the collection does not exist, it creates the collection (and any necessary indexes) based on the record type's attributes
  • If the collection already exists, it does nothing
  • This makes the method safe to call on every application startup without risking data loss
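
The idempotent behavior can be sketched with a toy backend. ToyBackend and ToyCollection below are hypothetical stand-ins rather than Semantic Kernel types. The sketch assumes only what the text states: create when missing, otherwise do nothing.

```python
class ToyBackend:
    """Stands in for a remote service: collection name -> records (hypothetical)."""

    def __init__(self) -> None:
        self.collections: dict = {}
        self.create_calls = 0

    def create(self, name: str) -> None:
        self.create_calls += 1
        self.collections[name] = {}


class ToyCollection:
    def __init__(self, backend: ToyBackend, name: str) -> None:
        self._backend = backend
        self.name = name

    def ensure_collection_exists(self) -> None:
        # Create only when missing; an existing collection is left untouched.
        if self.name not in self._backend.collections:
            self._backend.create(self.name)

    def upsert(self, key: str, record: dict) -> None:
        self._backend.collections[self.name][key] = record

    def count(self) -> int:
        return len(self._backend.collections[self.name])


backend = ToyBackend()
hotels = ToyCollection(backend, "hotels")

hotels.ensure_collection_exists()  # first startup: creates the collection
hotels.upsert("h1", {"description": "Seaside inn"})
hotels.ensure_collection_exists()  # later startup: no-op, data preserved

print(backend.create_calls, hotels.count())
```

The second call neither recreates the collection nor discards the stored record, which is what makes calling it at every startup safe in this model.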

Design Principles

Provider Abstraction

The collection setup API has the same shape across supported backends. Application code that creates a collection and ingests data can, in most cases, switch from InMemoryVectorStore to AzureAISearchVectorStore by changing only the store instantiation line, provided the target backend supports the chosen key type and record schema. The collection operations (GetCollection, EnsureCollectionExistsAsync, UpsertAsync, SearchAsync) remain the same.
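
A minimal sketch of this swappability, written as a toy Python model: the two store classes below (DictStore and JsonStore) are hypothetical and merely simulate backends with different internal storage. They are not Semantic Kernel connectors.

```python
import json
from typing import Protocol


class Collection(Protocol):
    def ensure_collection_exists(self) -> None: ...
    def upsert(self, key: str, record: dict) -> None: ...
    def get(self, key: str) -> dict: ...


class Store(Protocol):
    def get_collection(self, name: str) -> Collection: ...


class DictCollection:
    """Keeps records as Python objects."""

    def __init__(self) -> None:
        self._rows = None

    def ensure_collection_exists(self) -> None:
        if self._rows is None:
            self._rows = {}

    def upsert(self, key: str, record: dict) -> None:
        self._rows[key] = record

    def get(self, key: str) -> dict:
        return self._rows[key]


class JsonCollection(DictCollection):
    """Keeps records as JSON strings, simulating a different storage format."""

    def upsert(self, key: str, record: dict) -> None:
        self._rows[key] = json.dumps(record)

    def get(self, key: str) -> dict:
        return json.loads(self._rows[key])


class DictStore:
    def get_collection(self, name: str) -> Collection:
        return DictCollection()


class JsonStore:
    def get_collection(self, name: str) -> Collection:
        return JsonCollection()


def ingest_and_read(store: Store) -> dict:
    # Application logic: identical regardless of which store is passed in.
    hotels = store.get_collection("hotels")
    hotels.ensure_collection_exists()
    hotels.upsert("h1", {"description": "Seaside inn"})
    return hotels.get("h1")


result_a = ingest_and_read(DictStore())
result_b = ingest_and_read(JsonStore())
print(result_a == result_b)
```

Only the argument passed to ingest_and_read changes between the two runs; the function body is untouched.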

Generic Type Safety

By parameterizing collections with <TKey, TRecord>, the compiler prevents common errors:

  • Upserting a record of the wrong type into a collection
  • Using a key of the wrong type for retrieval
  • Accessing properties that do not exist on the record type in search results

Lazy Initialization

GetCollection does not create the collection in the backend; it only creates the client-side collection object. Actual backend resources are created when EnsureCollectionExistsAsync() is called. This lazy approach allows the application to configure multiple collections before incurring any network or storage costs.
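
The split between obtaining a handle and creating backend resources can be made visible by recording backend calls. The sketch below is a toy model with hypothetical names (CountingBackend, ToyVectorStore, ToyCollection). It assumes, as the text describes, that getting a collection involves no backend traffic.

```python
class CountingBackend:
    """Records every call so that backend traffic is observable (hypothetical)."""

    def __init__(self) -> None:
        self.calls: list = []
        self.existing: set = set()

    def exists(self, name: str) -> bool:
        self.calls.append(("exists", name))
        return name in self.existing

    def create(self, name: str) -> None:
        self.calls.append(("create", name))
        self.existing.add(name)


class ToyCollection:
    def __init__(self, backend: CountingBackend, name: str) -> None:
        self._backend = backend
        self.name = name

    def ensure_collection_exists(self) -> None:
        if not self._backend.exists(self.name):
            self._backend.create(self.name)


class ToyVectorStore:
    def __init__(self, backend: CountingBackend) -> None:
        self._backend = backend

    def get_collection(self, name: str) -> ToyCollection:
        # Only builds the client-side object; no backend call happens here.
        return ToyCollection(self._backend, name)


backend = CountingBackend()
store = ToyVectorStore(backend)

hotels = store.get_collection("hotels")
reviews = store.get_collection("reviews")
calls_before = len(backend.calls)  # still zero: nothing has reached the backend

hotels.ensure_collection_exists()
calls_after = list(backend.calls)
print(calls_before, calls_after)
```

In this model the "reviews" handle exists on the client side, yet the backend has never heard of it until its own ensure call is made.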

Typical Setup Flow

The standard setup flow follows these steps:

  1. Instantiate the vector store with backend-specific configuration
  2. Get a typed collection by specifying the key type, record type, and collection name
  3. Ensure the collection exists in the backend
  4. Proceed with data operations (upsert, search)

This flow is consistent across all backends and is the recommended pattern for both development and production scenarios.
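
The four steps can be walked through end to end with a toy in-memory model. Everything here is a sketch under stated assumptions: ToyVectorStore, ToyCollection, the Hotel record, the two-dimensional embeddings, and the brute-force cosine ranking are hypothetical and are not how any real backend implements search.

```python
import math
from dataclasses import dataclass


@dataclass
class Hotel:
    hotel_id: str
    description: str
    embedding: list


def cosine(a: list, b: list) -> float:
    # Assumes both vectors are non-zero; a zero vector would divide by zero.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


class ToyCollection:
    def __init__(self, tables: dict, name: str) -> None:
        self._tables = tables
        self.name = name

    def ensure_collection_exists(self) -> None:
        self._tables.setdefault(self.name, {})

    def upsert(self, record: Hotel) -> None:
        self._tables[self.name][record.hotel_id] = record

    def search(self, query: list, top: int) -> list:
        rows = self._tables[self.name].values()
        ranked = sorted(rows, key=lambda r: cosine(query, r.embedding), reverse=True)
        return ranked[:top]


class ToyVectorStore:
    def __init__(self) -> None:
        self._tables: dict = {}

    def get_collection(self, name: str) -> ToyCollection:
        return ToyCollection(self._tables, name)


store = ToyVectorStore()                    # 1. instantiate the store
hotels = store.get_collection("hotels")     # 2. get a collection handle
hotels.ensure_collection_exists()           # 3. ensure it exists
hotels.upsert(Hotel("h1", "Seaside inn", [1.0, 0.0]))      # 4. data operations
hotels.upsert(Hotel("h2", "Mountain lodge", [0.0, 1.0]))

best = hotels.search([0.9, 0.1], top=1)[0]
print(best.description)
```

The query vector points mostly along the first axis, so the record whose embedding lies on that axis ranks first.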

Relationship to Other Principles

Implementation:Microsoft_Semantic_kernel_InMemoryVectorStore
