Implementation:Microsoft Semantic kernel VectorSearchOptions Filter

From Leeroopedia

Overview

This page documents the VectorSearchOptions<TRecord> class and its Filter property, which restricts a vector similarity search to records whose metadata matches a predicate. Filters are written as C# lambda expressions over indexed data fields, so the predicate is type-checked at compile time before a connector translates it for the backend.

Source Reference

  • File: dotnet/samples/GettingStartedWithVectorStores/Step2_Vector_Search.cs (lines 65-71)
  • Type: API Doc

API Reference

VectorSearchOptions

| Property | Type | Description |
|---|---|---|
| Filter | Expression<Func<TRecord, bool>>? | Lambda expression predicate applied to indexed fields |

The options type is generic over the record type. A VectorSearchOptions<TRecord> instance is passed as the third argument to SearchAsync:

collection.SearchAsync(vector, top, new VectorSearchOptions<TRecord> { Filter = predicate });
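Because Filter is typed as Expression<Func<TRecord, bool>> rather than Func<TRecord, bool>, the compiler captures the lambda as an expression tree that can be inspected at runtime. The sketch below uses only System.Linq.Expressions to show what that means. The Glossary class here is a simplified stand-in without the vector store attributes, and FilterExpressionDemo is a hypothetical name used for illustration only.

```csharp
using System;
using System.Linq.Expressions;

// Simplified stand-in for the record type (assumption: attributes omitted).
public sealed class Glossary
{
    public string Key { get; set; } = "";
    public string Category { get; set; } = "";
    public string Term { get; set; } = "";
}

public static class FilterExpressionDemo
{
    public static void Main()
    {
        // Assigning a lambda to Expression<Func<...>> produces a data structure,
        // not a compiled delegate.
        Expression<Func<Glossary, bool>> filter = g => g.Category == "AI";

        // The body is a binary Equal node: a member access on the left,
        // a constant on the right.
        var body = (BinaryExpression)filter.Body;
        Console.WriteLine(body.NodeType);                              // Equal
        Console.WriteLine(((MemberExpression)body.Left).Member.Name);  // Category
        Console.WriteLine(((ConstantExpression)body.Right).Value);     // AI

        // Compiling the tree yields an ordinary delegate that can be invoked directly.
        Func<Glossary, bool> predicate = filter.Compile();
        Console.WriteLine(predicate(new Glossary { Category = "AI" }));  // True
        Console.WriteLine(predicate(new Glossary { Category = "ML" }));  // False
    }
}
```

A consumer of the tree can either compile it and evaluate it in process, or walk its nodes and emit a query in another language.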

Basic Filtering Example

The following example from the Semantic Kernel samples demonstrates filtering search results by a category field:

var searchVector = (await embeddingGenerator.GenerateAsync(query)).Vector;

await foreach (var result in collection.SearchAsync(searchVector, top: 5,
    new VectorSearchOptions<Glossary>
    {
        Filter = g => g.Category == "AI"
    }))
{
    Console.WriteLine($"{result.Record.Term}: {result.Score}");
}

This search:

  1. Embeds the query text into a vector
  2. Searches the collection for the top 5 most similar records
  3. Only considers records where Category equals "AI"
  4. Returns results ranked by similarity score within the filtered subset

Prerequisite: Indexed Fields

Filters only work on fields marked with IsIndexed = true in the data model:

internal sealed class Glossary
{
    [VectorStoreKey]
    public string Key { get; set; }

    [VectorStoreData(IsIndexed = true)]   // Can be used in filters
    public string Category { get; set; }

    [VectorStoreData]                      // Cannot be used in filters
    public string Term { get; set; }

    [VectorStoreData]                      // Cannot be used in filters
    public string Definition { get; set; }

    [VectorStoreVector(Dimensions: 1536)]
    public ReadOnlyMemory<float> DefinitionEmbedding { get; set; }
}

In this model, only Category can be used in filter expressions because it is the only field with IsIndexed = true.

Filter Expression Examples

Equality Filter

new VectorSearchOptions<Glossary>
{
    Filter = g => g.Category == "AI"
}

Returns only records where Category is exactly "AI".

Inequality Filter

new VectorSearchOptions<Glossary>
{
    Filter = g => g.Category != "Deprecated"
}

Excludes records where Category is "Deprecated".

Compound AND Filter

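// Assumes the record type also declares an indexed bool property IsPublished.
new VectorSearchOptions<Glossary>
{
    Filter = g => g.Category == "AI" && g.IsPublished == true
}

Returns records where both conditions are true. IsPublished is not part of the Glossary model shown above; the example assumes a record type that declares it. Both Category and IsPublished must be indexed.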

Compound OR Filter

new VectorSearchOptions<Glossary>
{
    Filter = g => g.Category == "AI" || g.Category == "ML"
}

Returns records matching either category.

Numeric Range Filter

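For data models with indexed numeric fields (this example assumes the record type declares an indexed int property Year, which the Glossary model above does not have):

new VectorSearchOptions<Glossary>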
{
    Filter = g => g.Year >= 2022 && g.Year <= 2024
}

Returns records within the specified year range.

Search Without Filtering (Comparison)

For reference, a search without filtering:

// No filter — searches all records in the collection
await foreach (var result in collection.SearchAsync(searchVector, top: 5))
{
    Console.WriteLine($"{result.Record.Term}: {result.Score}");
}

The same search with a filter:

// With filter — only searches records matching the predicate
await foreach (var result in collection.SearchAsync(searchVector, top: 5,
    new VectorSearchOptions<Glossary> { Filter = g => g.Category == "AI" }))
{
    Console.WriteLine($"{result.Record.Term}: {result.Score}");
}

The key difference is that the filtered search considers only a subset of the collection, potentially returning different (and more relevant) results.

Pre-Filter Behavior

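The filter is applied as a pre-filter: records that fail the predicate are excluded before similarity ranking, so results are drawn only from the matching records: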

| Scenario | Result |
|---|---|
| 100 records in collection, 20 match filter, top: 5 | Returns top 5 from the 20 matching records |
| 100 records in collection, 3 match filter, top: 5 | Returns only 3 results (fewer than top) |
| 100 records in collection, 0 match filter, top: 5 | Returns empty result set |

The top parameter specifies the maximum number of results to return from the filtered set, not from the full collection.
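The pre-filter semantics can be sketched with plain LINQ over an in-memory list: filter first, score only the survivors, then keep at most top results. This is a minimal illustration under stated assumptions, not how any connector is implemented. Item, PreFilterDemo, Search, and the two-dimensional embeddings are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record with a tiny embedding, for illustration only.
public sealed record Item(string Term, string Category, float[] Embedding);

public static class PreFilterDemo
{
    // Cosine similarity between two equal-length vectors.
    static double Cosine(float[] a, float[] b)
    {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.Sqrt(na) * Math.Sqrt(nb));
    }

    // Pre-filter: apply the predicate, rank only the matches, keep at most `top`.
    public static List<(Item Record, double Score)> Search(
        IEnumerable<Item> items, float[] query, int top, Func<Item, bool> filter) =>
        items.Where(filter)
             .Select(i => (Record: i, Score: Cosine(i.Embedding, query)))
             .OrderByDescending(r => r.Score)
             .Take(top)
             .ToList();

    public static void Main()
    {
        var items = new List<Item>
        {
            new("Embedding", "AI", new[] { 1f, 0f }),
            new("Token", "AI", new[] { 0.8f, 0.6f }),
            new("Vector", "AI", new[] { 0f, 1f }),
            new("Index", "Database", new[] { 1f, 0f }),
        };
        var query = new[] { 1f, 0f };

        // Only 3 records match, so asking for 5 yields 3.
        var results = Search(items, query, top: 5, i => i.Category == "AI");
        Console.WriteLine(results.Count);                                       // 3
        Console.WriteLine(string.Join(", ", results.Select(r => r.Record.Term))); // Embedding, Token, Vector

        // No record matches, so the result set is empty.
        Console.WriteLine(Search(items, query, top: 5, i => i.Category == "None").Count); // 0
    }
}
```

Note that "Index" has the same embedding as the query but never appears in the filtered results, because it is removed before ranking.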

Error Scenarios

| Scenario | Behavior |
|---|---|
| Filter references a non-indexed field | Backend-specific error (field not filterable) |
| Filter references a non-existent property | Compile-time error (C# compiler rejects the lambda) |
| Type mismatch in comparison | Compile-time error (e.g., comparing string to int) |
| Unsupported expression type | Runtime error from the connector's expression translator |

Backend Compatibility

The lambda expression is translated into backend-specific filter syntax by each connector:

| Backend | Filter Translation Target |
|---|---|
| InMemory | Direct C# predicate evaluation |
| Azure AI Search | OData $filter expression |
| Qdrant | JSON filter payload |
| Pinecone | Metadata filter object |
| Weaviate | GraphQL where clause |

Simple equality and comparison filters are supported by all backends. Complex expressions (e.g., string functions, nested logic) may have varying support.
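To make the translation step concrete, the following toy translator walks an expression tree and emits an OData-like string, throwing NotSupportedException for any node it does not recognize. It is a sketch under assumptions, not the code of any Semantic Kernel connector: ToyFilterTranslator, its output format, and the cut-down Glossary class (with a hypothetical Year property) are all invented for illustration.

```csharp
using System;
using System.Globalization;
using System.Linq.Expressions;

// Cut-down record type; Year is a hypothetical property.
public sealed class Glossary
{
    public string Category { get; set; } = "";
    public int Year { get; set; }
}

public static class ToyFilterTranslator
{
    public static string Translate<T>(Expression<Func<T, bool>> filter) => Visit(filter.Body);

    static string Visit(Expression node)
    {
        switch (node)
        {
            // Comparisons and logical operators: translate both sides recursively.
            case BinaryExpression b:
                return $"({Visit(b.Left)} {Operator(b.NodeType)} {Visit(b.Right)})";
            // Property access directly on the lambda parameter, e.g. g.Category.
            case MemberExpression m when m.Expression is ParameterExpression:
                return m.Member.Name;
            // String literal: quote it and escape embedded single quotes.
            case ConstantExpression c when c.Value is string s:
                return "'" + s.Replace("'", "''") + "'";
            // Integer literal.
            case ConstantExpression c when c.Value is int i:
                return i.ToString(CultureInfo.InvariantCulture);
            // Anything else (method calls, captured variables, ...) is rejected at runtime.
            default:
                throw new NotSupportedException($"Unsupported expression: {node.NodeType}");
        }
    }

    static string Operator(ExpressionType type) => type switch
    {
        ExpressionType.Equal => "eq",
        ExpressionType.NotEqual => "ne",
        ExpressionType.GreaterThanOrEqual => "ge",
        ExpressionType.LessThanOrEqual => "le",
        ExpressionType.AndAlso => "and",
        ExpressionType.OrElse => "or",
        _ => throw new NotSupportedException($"Unsupported operator: {type}")
    };

    public static void Main()
    {
        Console.WriteLine(Translate<Glossary>(g => g.Category == "AI" && g.Year >= 2022));
        // ((Category eq 'AI') and (Year ge 2022))

        try
        {
            // Compiles fine, but this toy translator has no rule for method calls.
            Translate<Glossary>(g => g.Category.StartsWith("A"));
        }
        catch (NotSupportedException ex)
        {
            Console.WriteLine(ex.Message); // Unsupported expression: Call
        }
    }
}
```

The second call shows why an expression can pass the C# compiler and still fail at runtime: the lambda is valid C#, but the translator has no mapping for it. This toy version also rejects captured local variables, which appear in the tree as member accesses on a closure object rather than as constants.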

Relationship to Principle

This implementation page corresponds to the Metadata Filtering principle, which explains the rationale for combining structured predicates with vector similarity search.

Principle:Microsoft_Semantic_kernel_Metadata_Filtering
