Implementation:Microsoft Semantic kernel Collection SearchAsync

Overview

This page documents the SearchAsync method on VectorStoreCollection<TKey, TRecord>, which performs vector similarity search against a collection of embedded records. The method accepts a query vector and returns a ranked, streaming sequence of results with similarity scores.

Source Reference

File: dotnet/samples/GettingStartedWithVectorStores/Step2_Vector_Search.cs (lines 40-49)
Type: API Doc

API Reference

Method Signature

IAsyncEnumerable<VectorSearchResult<TRecord>> SearchAsync(
    ReadOnlyMemory<float> vector,
    int top,
    VectorSearchOptions<TRecord>? options = null,
    CancellationToken cancellationToken = default);

Parameter	Type	Description
`vector`	`ReadOnlyMemory<float>`	The query embedding vector to search against
`top`	`int`	Maximum number of results to return
`options`	`VectorSearchOptions<TRecord>?`	Optional search configuration (filters, etc.)
`cancellationToken`	`CancellationToken`	Optional cancellation token

Return Type

IAsyncEnumerable<VectorSearchResult<TRecord>>

Each VectorSearchResult<TRecord> contains:

Property	Type	Description
`Record`	`TRecord`	The full record (with all data fields populated)
`Score`	`double?`	The similarity score (interpretation depends on distance metric)

Basic Search Example

The following example demonstrates a basic vector similarity search from the Semantic Kernel samples:

// Step 1: Embed the search query
var searchVector = (await embeddingGenerator.GenerateAsync(query)).Vector;

// Step 2: Search the collection
await foreach (var result in collection.SearchAsync(searchVector, top: 5))
{
    Console.WriteLine($"{result.Record.Term}: {result.Score}");
}

This pattern:

1. Converts the user's natural language query into a vector using the same embedding model used during ingestion
2. Calls SearchAsync with the query vector and a top parameter of 5
3. Iterates over results using await foreach (the streaming pattern for IAsyncEnumerable)
4. Accesses the full record via result.Record and the similarity score via result.Score
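
To make the pattern concrete without depending on any vector store, the following is a minimal sketch of what a top-k similarity search does conceptually: score every stored vector against the query, sort, and keep the best few. It is not how any Semantic Kernel connector is implemented (real backends usually rely on indexes rather than a full scan), and the names `ScoredItem`, `BruteForceSearch`, and `TopK` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration types; none of these names come from Semantic Kernel.
public sealed record ScoredItem<T>(T Record, double Score);

public static class BruteForceSearch
{
    // Scores every stored vector against the query with the supplied
    // similarity function, then keeps the `top` highest-scoring records.
    public static List<ScoredItem<T>> TopK<T>(
        IEnumerable<(T Record, float[] Vector)> stored,
        float[] query,
        int top,
        Func<float[], float[], double> similarity)
    {
        if (top < 1) throw new ArgumentOutOfRangeException(nameof(top));

        return stored
            .Select(s => new ScoredItem<T>(s.Record, similarity(s.Vector, query)))
            .OrderByDescending(r => r.Score) // assumes higher score = more similar
            .Take(top)
            .ToList();
    }
}
```

Passing the similarity function as a delegate keeps the sketch independent of any particular metric; for a distance metric the ordering would need to be ascending instead.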

Search with Score Thresholding

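You can filter results by score to enforce a minimum relevance level. The comparison below assumes a metric where a higher score means greater similarity (such as cosine similarity); for a distance metric, invert it. Because Score is a nullable double, the comparison evaluates to false when no score is returned, so such results are skipped: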

var searchVector = (await embeddingGenerator.GenerateAsync(query)).Vector;

await foreach (var result in collection.SearchAsync(searchVector, top: 10))
{
    // Only use results above a relevance threshold
    if (result.Score >= 0.7)
    {
        Console.WriteLine($"{result.Record.Term}: {result.Score}");
    }
}
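
The threshold check can be factored into a reusable streaming filter that also handles distance metrics, where lower scores are better. The following is a sketch under stated assumptions: `ScoreFilter` and `WhereScorePasses` are hypothetical names, and the helper works over any `IAsyncEnumerable<T>` by taking a score selector, so it does not depend on Semantic Kernel types.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class ScoreFilter
{
    // Streams only the items whose score passes the threshold.
    // Set higherIsBetter to true for similarity metrics (cosine, dot product)
    // and to false for distance metrics (Euclidean).
    // Items with a null score are dropped.
    public static async IAsyncEnumerable<T> WhereScorePasses<T>(
        IAsyncEnumerable<T> source,
        Func<T, double?> getScore,
        double threshold,
        bool higherIsBetter = true,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        await foreach (var item in source.WithCancellation(cancellationToken))
        {
            double? score = getScore(item);
            if (score is null) continue;

            bool passes = higherIsBetter
                ? score.Value >= threshold
                : score.Value <= threshold;

            if (passes) yield return item;
        }
    }
}
```

With the search call from the example above, usage would look like `ScoreFilter.WhereScorePasses(collection.SearchAsync(searchVector, top: 10), r => r.Score, 0.7)`, consumed with `await foreach`.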

Collecting Results into a List

When you need all results before processing (e.g., for sorting or aggregation):

var searchVector = (await embeddingGenerator.GenerateAsync(query)).Vector;

var results = new List<VectorSearchResult<Glossary>>();
await foreach (var result in collection.SearchAsync(searchVector, top: 5))
{
    results.Add(result);
}

// Now process all results
foreach (var result in results.OrderByDescending(r => r.Score))
{
    Console.WriteLine($"{result.Record.Term}: {result.Score}");
}

Search with Metadata Filtering

The optional VectorSearchOptions<TRecord> parameter supports metadata filtering through a lambda over the record type:

var searchVector = (await embeddingGenerator.GenerateAsync(query)).Vector;

await foreach (var result in collection.SearchAsync(searchVector, top: 5,
    new VectorSearchOptions<Glossary>
    {
        Filter = g => g.Category == "AI"
    }))
{
    Console.WriteLine($"{result.Record.Term}: {result.Score}");
}
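
The filter lambda is worth a closer look. Assuming Filter is typed as an expression tree (`Expression<Func<TRecord, bool>>`) rather than a plain delegate, a connector can inspect the lambda and translate it into its backend's query syntax instead of running it in process. The sketch below only illustrates that distinction with standard .NET types; `GlossaryEntry` and `FilterDemo` are hypothetical stand-ins, and compiling the expression locally is for demonstration, not what a connector would do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for the record type used in the examples above.
public sealed class GlossaryEntry
{
    public string Term { get; set; } = "";
    public string Category { get; set; } = "";
}

public static class FilterDemo
{
    // The same predicate shape as the Filter lambda, held as an expression tree.
    public static readonly Expression<Func<GlossaryEntry, bool>> AiOnly =
        g => g.Category == "AI";

    // Compiles the expression and applies it to local data. A real connector
    // would instead translate the tree into its backend's filter syntax.
    public static List<string> ApplyLocally(IEnumerable<GlossaryEntry> entries)
    {
        Func<GlossaryEntry, bool> predicate = AiOnly.Compile();
        return entries.Where(predicate).Select(e => e.Term).ToList();
    }
}
```

Because the predicate is data rather than compiled code, only constructs that a given backend can translate are likely to work, which is why simple equality and comparison checks are the safest starting point.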

For complete details on filtering, see Implementation: VectorSearchOptions Filter.

Understanding the `top` Parameter

The top parameter controls how many results are returned:

Value	Use Case
`top: 1`	Single best match (e.g., exact concept lookup)
`top: 3-5`	RAG context window (typical for LLM augmentation)
`top: 10-20`	Exploratory search or UI display
`top: 50+`	Comprehensive retrieval or re-ranking pipelines

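Increasing top improves recall (more of the relevant records are found) but lowers precision, because each additional result ranks below the ones before it. For RAG scenarios, a value of 3-5 is a common starting point rather than a fixed rule; tune it against the size of the context window and the quality of the retrieved results.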

Understanding Similarity Scores

The Score property's interpretation depends on the distance metric configured for the collection:

Distance Metric	Score Range	"More Similar" Direction
Cosine similarity	-1.0 to 1.0	Higher is more similar
Euclidean distance	0.0 to infinity	Lower is more similar
Dot product	Varies	Higher is more similar

The in-memory vector store uses cosine similarity by default, so scores closer to 1.0 indicate greater semantic similarity.
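
To see how the three metrics behave on the same inputs, the following sketch computes each one with plain .NET code. `VectorMetrics` is a hypothetical helper written for illustration; it is not the code any vector store uses, and backends may normalize or transform scores differently.

```csharp
using System;

public static class VectorMetrics
{
    // Sum of element-wise products; unbounded, higher means more similar.
    public static double DotProduct(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double sum = 0;
        for (int i = 0; i < a.Length; i++) sum += (double)a[i] * b[i];
        return sum;
    }

    // Dot product divided by the product of the vector lengths.
    // Returns NaN if either vector has zero length.
    public static double CosineSimilarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
        => DotProduct(a, b) / (Math.Sqrt(DotProduct(a, a)) * Math.Sqrt(DotProduct(b, b)));

    // Straight-line distance; 0 for identical vectors, lower means more similar.
    public static double EuclideanDistance(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = (double)a[i] - b[i];
            sum += d * d;
        }
        return Math.Sqrt(sum);
    }
}
```

For the opposing vectors [1, 0] and [-1, 0], cosine similarity is -1, the dot product is -1, and the Euclidean distance is 2. Cosine similarity can therefore be negative in general, although scores between text embeddings often fall in a narrower, mostly positive range.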

Streaming vs Buffered Consumption

The IAsyncEnumerable return type supports two consumption patterns:

Streaming (Recommended)

await foreach (var result in collection.SearchAsync(searchVector, top: 5))
{
    // Process each result as it arrives
    ProcessResult(result);
}

Buffered

var allResults = await collection.SearchAsync(searchVector, top: 5).ToListAsync();

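The streaming approach is more memory-efficient and allows early termination: breaking out of the await foreach loop stops enumeration. The buffered approach is useful when you need the complete result set for aggregation. Note that ToListAsync() is an extension method on IAsyncEnumerable that is not available in every .NET version by default; in many projects it comes from the System.Linq.Async package.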
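
Both consumption patterns can also be written with nothing beyond the base class library. The sketch below shows a buffering helper and an early-terminating lookup over any `IAsyncEnumerable<T>`; `AsyncStreamHelpers`, `BufferAsync`, and `FirstMatchAsync` are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncStreamHelpers
{
    // Buffered consumption: drains the whole stream into a list.
    public static async Task<List<T>> BufferAsync<T>(
        IAsyncEnumerable<T> source,
        CancellationToken cancellationToken = default)
    {
        var items = new List<T>();
        await foreach (var item in source.WithCancellation(cancellationToken))
        {
            items.Add(item);
        }
        return items;
    }

    // Streaming consumption with early termination: returns the first item
    // that satisfies the predicate. Returning from inside the loop disposes
    // the enumerator, so no further items are pulled from the source.
    public static async Task<(bool Found, T Item)> FirstMatchAsync<T>(
        IAsyncEnumerable<T> source,
        Func<T, bool> predicate,
        CancellationToken cancellationToken = default)
    {
        await foreach (var item in source.WithCancellation(cancellationToken))
        {
            if (predicate(item)) return (true, item);
        }
        return (false, default(T)!);
    }
}
```

Whether stopping early saves backend work depends on the connector: some fetch the full result page up front and only the local iteration is cut short.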

Error Scenarios

Scenario	Behavior
Collection does not exist	Backend-specific "not found" error
Query vector dimensions do not match stored vectors	Backend-specific dimension mismatch error
`top` is zero or negative	`ArgumentOutOfRangeException`
Empty collection	Returns empty `IAsyncEnumerable` (no results)
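
Because several of these failures surface as backend-specific errors, it can help to validate arguments before calling SearchAsync. The following is a minimal sketch of such a guard; `SearchGuards` and `Validate` are hypothetical names, and `expectedDimensions` is an assumption: the caller has to know the dimension the collection's vector property was defined with.

```csharp
using System;

public static class SearchGuards
{
    // Validates search arguments before any backend call is made, so that
    // mistakes produce a consistent exception regardless of the connector.
    public static void Validate(ReadOnlyMemory<float> vector, int top, int expectedDimensions)
    {
        if (top < 1)
        {
            throw new ArgumentOutOfRangeException(
                nameof(top), top, "top must be at least 1.");
        }

        if (vector.Length != expectedDimensions)
        {
            throw new ArgumentException(
                $"Query vector has {vector.Length} dimensions; expected {expectedDimensions}.",
                nameof(vector));
        }
    }
}
```

A call such as `SearchGuards.Validate(searchVector, top: 5, expectedDimensions: 1536)` would sit immediately before the search; the value 1536 is only an example and must match your embedding model.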

Relationship to Principle

This implementation page corresponds to the Vector Similarity Search principle, which explains the geometric foundations of vector search and the role of distance metrics.

Principle:Microsoft_Semantic_kernel_Vector_Similarity_Search
