Implementation:Microsoft Semantic Kernel VectorStoreTextSearch RAG
Overview
This page documents the RAG (Retrieval Augmented Generation) integration pattern using VectorStoreTextSearch and Handlebars prompt templates in Microsoft Semantic Kernel. This pattern wraps vector store search into a kernel plugin that can be invoked from within prompt templates, enabling the LLM to receive retrieved context alongside the user's question.
Source Reference
- File: dotnet/samples/Demos/VectorStoreRAG/RAGChatService.cs (lines 94-140)
- Type: Pattern Doc
API Reference
Creating the Search Plugin
KernelPlugin searchPlugin = vectorStoreTextSearch.CreateWithGetTextSearchResults(pluginName);
| Aspect | Detail |
|---|---|
| Object | VectorStoreTextSearch instance |
| Method | CreateWithGetTextSearchResults(string pluginName) |
| Returns | KernelPlugin — a plugin that performs vector search and returns text results |
| Parameter | pluginName — the name used to reference this plugin in templates (e.g., "SearchPlugin") |
The returned plugin exposes a GetTextSearchResults function that:
- Accepts a text query string
- Internally generates an embedding for the query
- Executes a vector similarity search against the configured collection
- Returns the text content of the matching records
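The same pipeline can be exercised directly on the VectorStoreTextSearch instance, without wrapping it in a plugin. The following is a minimal sketch assuming a configured vectorStoreTextSearch (the query string is illustrative); ITextSearch exposes GetTextSearchResultsAsync, which returns results with the Name/Value/Link fields that the templates below render:

```csharp
// Sketch: run the embed-then-search pipeline directly, outside any template.
// Assumes vectorStoreTextSearch is a configured VectorStoreTextSearch<Glossary>.
var results = await vectorStoreTextSearch.GetTextSearchResultsAsync(
    "What is an embedding?",
    new TextSearchOptions { Top = 3 }); // Top limits the number of results

await foreach (TextSearchResult result in results.Results)
{
    // Each result exposes the same Name/Value/Link fields the template renders.
    Console.WriteLine($"{result.Name}: {result.Value} ({result.Link})");
}
```

This is useful for verifying that the collection and embedding generator are wired correctly before involving a prompt template.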
Using the Plugin in a Handlebars Template
The search plugin is invoked within a Handlebars prompt template using the {{#with}} helper:
{{#with (SearchPlugin-GetTextSearchResults question)}}
{{#each this}}
Name: {{Name}}
Value: {{Value}}
Link: {{Link}}
-----------------
{{/each}}
{{/with}}
| Template Element | Description |
|---|---|
| {{#with (SearchPlugin-GetTextSearchResults question)}} | Invokes the search plugin with the question variable as the query |
| {{#each this}} | Iterates over each search result |
| {{Name}} | The name/title of the result record |
| {{Value}} | The text content of the result record |
| {{Link}} | An optional link or reference for the result |
Complete RAG Chat Implementation
The following example demonstrates the complete RAG pattern as used in the Semantic Kernel demo:
// Step 1: Create the VectorStoreTextSearch instance
var vectorStoreTextSearch = new VectorStoreTextSearch<Glossary>(
collection,
embeddingGenerator);
// Step 2: Create a search plugin from it
KernelPlugin searchPlugin = vectorStoreTextSearch.CreateWithGetTextSearchResults("SearchPlugin");
// Step 3: Add the plugin to the kernel
kernel.Plugins.Add(searchPlugin);
// Step 4: Define a Handlebars prompt template with RAG context
string promptTemplate = @"
Answer the question using only the provided context.
If the context does not contain enough information, say so.
Context:
{{#with (SearchPlugin-GetTextSearchResults question)}}
{{#each this}}
Name: {{Name}}
Value: {{Value}}
-----------------
{{/each}}
{{/with}}
Question: {{question}}
Answer:";
// Step 5: Create and invoke the prompt function
var promptFunction = kernel.CreateFunctionFromPrompt(
new PromptTemplateConfig
{
Template = promptTemplate,
TemplateFormat = "handlebars",
Name = "RAGChat"
},
new HandlebarsPromptTemplateFactory());
// Step 6: Invoke with the user's question
var result = await kernel.InvokeAsync(promptFunction, new KernelArguments
{
["question"] = "What is Semantic Kernel?"
});
Console.WriteLine(result.GetValue<string>());
Step-by-Step Breakdown
Step 1: VectorStoreTextSearch Creation
var vectorStoreTextSearch = new VectorStoreTextSearch<Glossary>(
collection,
embeddingGenerator);
This wraps:
- A typed VectorStoreCollection<TKey, TRecord> — the data source
- An IEmbeddingGenerator — for converting query text to vectors
Together, they form a self-contained search unit that handles the embed-then-search pipeline internally.
Step 2: Plugin Creation
KernelPlugin searchPlugin = vectorStoreTextSearch.CreateWithGetTextSearchResults("SearchPlugin");
This creates a KernelPlugin with a single function named GetTextSearchResults. The plugin name ("SearchPlugin") is used in templates as a prefix: SearchPlugin-GetTextSearchResults.
Step 3: Plugin Registration
kernel.Plugins.Add(searchPlugin);
Adding the plugin to the kernel makes it available for invocation within prompt templates and by the function calling system.
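Once registered, the function can also be invoked directly through the kernel, which is a convenient way to inspect retrieval output in isolation. A hedged sketch — the "query" parameter name is assumed to match the plugin function's default signature, and the question string is illustrative:

```csharp
// Sketch: invoke the registered search function directly to inspect results
// before wiring it into a template. The parameter name "query" is assumed
// to match the default signature of the generated plugin function.
var searchResults = await kernel.InvokeAsync(
    searchPlugin["GetTextSearchResults"],
    new KernelArguments { ["query"] = "What is Semantic Kernel?" });
```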
Step 4: Prompt Template Design
The prompt template is the heart of the RAG pattern. It has three sections:
- System instruction: Tells the LLM to use only the provided context
- Context block: The {{#with}} helper injects search results
- Question and answer section: Passes the user's question and expects a response
Steps 5-6: Execution
The prompt function is created with the Handlebars template format and invoked with the user's question. At execution time:
- The template engine encounters SearchPlugin-GetTextSearchResults
- It calls the plugin with the question variable's value
- The plugin generates an embedding and searches the vector store
- Results are rendered into the template
- The complete prompt (with context) is sent to the chat completion model
Template Variations
Minimal RAG Template
Use the following information to answer the question.
{{#with (SearchPlugin-GetTextSearchResults question)}}
{{#each this}}
{{Value}}
{{/each}}
{{/with}}
Question: {{question}}
RAG with Source Attribution
Answer the question and cite your sources using [Source: name] format.
Sources:
{{#with (SearchPlugin-GetTextSearchResults question)}}
{{#each this}}
[{{Name}}]: {{Value}}
{{/each}}
{{/with}}
Question: {{question}}
RAG with Fallback Instruction
Answer the question using ONLY the provided context.
If the context does not contain relevant information, respond with:
"I don't have enough information to answer that question."
Context:
{{#with (SearchPlugin-GetTextSearchResults question)}}
{{#each this}}
{{Value}}
{{/each}}
{{/with}}
Question: {{question}}
Configuring Search Behavior
The VectorStoreTextSearch instance can be configured to control search parameters:
var vectorStoreTextSearch = new VectorStoreTextSearch<Glossary>(
collection,
embeddingGenerator,
new VectorStoreTextSearchOptions
{
Top = 3, // Number of results to retrieve
});
| Parameter | Default | Description |
|---|---|---|
| Top | 3 | Number of search results to return and inject into the prompt |
Integration with Chat History
For multi-turn conversations, the RAG pattern can be combined with chat history:
var chatHistory = new ChatHistory();
chatHistory.AddSystemMessage("You are a helpful assistant that answers questions about our glossary.");
chatHistory.AddUserMessage(userQuestion);
// The RAG template is used for the system/context, while chat history provides conversation continuity
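One possible wiring, sketched below, runs each turn through the RAG prompt function from the complete example above and records the answer in the chat history so later turns keep conversational context. This is a hypothetical arrangement, not the demo's exact code; promptFunction and the question string are assumed from the earlier example:

```csharp
// Sketch (hypothetical wiring): conversation state lives in ChatHistory while
// the RAG prompt function answers each individual turn.
var chatHistory = new ChatHistory();
chatHistory.AddSystemMessage("You are a helpful assistant that answers questions about our glossary.");

string userQuestion = "What is a vector store?";
chatHistory.AddUserMessage(userQuestion);

// Answer the turn through the RAG prompt function defined earlier...
var ragResult = await kernel.InvokeAsync(promptFunction, new KernelArguments
{
    ["question"] = userQuestion
});

// ...and record the answer so subsequent turns retain conversational context.
chatHistory.AddAssistantMessage(ragResult.GetValue<string>() ?? string.Empty);
```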
Error Scenarios
| Scenario | Behavior |
|---|---|
| No search results found | Template renders empty context; LLM should respond per fallback instruction |
| Embedding service unavailable | Exception during plugin execution; handle with try-catch |
| Collection does not exist | Exception from the search plugin's internal search call |
| Plugin not registered with kernel | Template rendering error (unknown helper) |
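A defensive invocation covering the failure modes above might look like the following sketch. Exception types vary by vector store connector and embedding service, so this catches broadly; adjust the catch clauses to the specific exceptions your connectors throw:

```csharp
// Sketch: defensive invocation around the RAG prompt function.
try
{
    var result = await kernel.InvokeAsync(promptFunction, new KernelArguments
    {
        ["question"] = userQuestion
    });
    Console.WriteLine(result.GetValue<string>());
}
catch (KernelException ex)
{
    // Covers template rendering failures, e.g., the plugin was not registered.
    Console.Error.WriteLine($"Prompt execution failed: {ex.Message}");
}
catch (Exception ex)
{
    // Embedding-service or collection errors surface from the plugin's search call.
    Console.Error.WriteLine($"Search failed: {ex.Message}");
}
```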
Relationship to Principle
This implementation page corresponds to the RAG Chat Augmentation principle, which explains the motivation for retrieval augmented generation and the three-phase RAG pipeline.
Principle:Microsoft_Semantic_kernel_RAG_Chat_Augmentation
See Also
- Principle: RAG Chat Augmentation
- Implementation: Collection SearchAsync
- Implementation: IEmbeddingGenerator GenerateAsync
- Implementation: VectorSearchOptions Filter