Implementation:Microsoft Semantic Kernel VectorStoreTextSearch RAG
Overview
This page documents the RAG (Retrieval Augmented Generation) integration pattern using VectorStoreTextSearch and Handlebars prompt templates in Microsoft Semantic Kernel. This pattern wraps vector store search into a kernel plugin that can be invoked from within prompt templates, enabling the LLM to receive retrieved context alongside the user's question.
Source Reference
- File: dotnet/samples/Demos/VectorStoreRAG/RAGChatService.cs (lines 94-140)
- Type: Pattern Doc
API Reference
Creating the Search Plugin
KernelPlugin searchPlugin = vectorStoreTextSearch.CreateWithGetTextSearchResults(pluginName);
| Aspect | Detail |
|---|---|
| Object | VectorStoreTextSearch instance |
| Method | CreateWithGetTextSearchResults(string pluginName) |
| Returns | KernelPlugin — a plugin that performs vector search and returns text results |
| Parameter | pluginName — the name used to reference this plugin in templates (e.g., "SearchPlugin") |
The returned plugin exposes a GetTextSearchResults function that:
- Accepts a text query string
- Internally generates an embedding for the query
- Executes a vector similarity search against the configured collection
- Returns the text content of the matching records
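The same pipeline can be exercised directly on the VectorStoreTextSearch instance, without wrapping it in a plugin. The following is a minimal sketch assuming a configured vectorStoreTextSearch (the query string is illustrative); ITextSearch exposes GetTextSearchResultsAsync, which returns results with the Name/Value/Link fields that the templates below render:

```csharp
// Sketch: run the embed-then-search pipeline directly, outside any template.
// Assumes vectorStoreTextSearch is a configured VectorStoreTextSearch<Glossary>.
var results = await vectorStoreTextSearch.GetTextSearchResultsAsync(
    "What is an embedding?",
    new TextSearchOptions { Top = 3 }); // Top limits the number of results

await foreach (TextSearchResult result in results.Results)
{
    // Each result exposes the same Name/Value/Link fields the template renders.
    Console.WriteLine($"{result.Name}: {result.Value} ({result.Link})");
}
```

This is useful for verifying that the collection and embedding generator are wired correctly before involving a prompt template.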
Using the Plugin in a Handlebars Template
The search plugin is invoked within a Handlebars prompt template using the {{#with}} helper:
{{#with (SearchPlugin-GetTextSearchResults question)}}
{{#each this}}
Name: {{Name}}
Value: {{Value}}
Link: {{Link}}
-----------------
{{/each}}
{{/with}}
| Template Element | Description |
|---|---|
| {{#with (SearchPlugin-GetTextSearchResults question)}} | Invokes the search plugin with the question variable as the query |
| {{#each this}} | Iterates over each search result |
| {{Name}} | The name/title of the result record |
| {{Value}} | The text content of the result record |
| {{Link}} | An optional link or reference for the result |
Complete RAG Chat Implementation
The following example demonstrates the complete RAG pattern as used in the Semantic Kernel demo:
// Step 1: Create the VectorStoreTextSearch instance
var vectorStoreTextSearch = new VectorStoreTextSearch<Glossary>(
collection,
embeddingGenerator);
// Step 2: Create a search plugin from it
KernelPlugin searchPlugin = vectorStoreTextSearch.CreateWithGetTextSearchResults("SearchPlugin");
// Step 3: Add the plugin to the kernel
kernel.Plugins.Add(searchPlugin);
// Step 4: Define a Handlebars prompt template with RAG context
string promptTemplate = @"
Answer the question using only the provided context.
If the context does not contain enough information, say so.
Context:
{{#with (SearchPlugin-GetTextSearchResults question)}}
{{#each this}}
Name: {{Name}}
Value: {{Value}}
-----------------
{{/each}}
{{/with}}
Question: {{question}}
Answer:";
// Step 5: Create and invoke the prompt function
var promptFunction = kernel.CreateFunctionFromPrompt(
new PromptTemplateConfig
{
Template = promptTemplate,
TemplateFormat = "handlebars",
Name = "RAGChat"
},
new HandlebarsPromptTemplateFactory());
// Step 6: Invoke with the user's question
var result = await kernel.InvokeAsync(promptFunction, new KernelArguments
{
["question"] = "What is Semantic Kernel?"
});
Console.WriteLine(result.GetValue<string>());
Step-by-Step Breakdown
Step 1: VectorStoreTextSearch Creation
var vectorStoreTextSearch = new VectorStoreTextSearch<Glossary>(
collection,
embeddingGenerator);
This wraps:
- A typed VectorStoreCollection<TKey, TRecord> — the data source
- An IEmbeddingGenerator — for converting query text to vectors
Together, they form a self-contained search unit that handles the embed-then-search pipeline internally.
Step 2: Plugin Creation
KernelPlugin searchPlugin = vectorStoreTextSearch.CreateWithGetTextSearchResults("SearchPlugin");
This creates a KernelPlugin with a single function named GetTextSearchResults. The plugin name ("SearchPlugin") is used in templates as a prefix: SearchPlugin-GetTextSearchResults.
Step 3: Plugin Registration
kernel.Plugins.Add(searchPlugin);
Adding the plugin to the kernel makes it available for invocation within prompt templates and by the function calling system.
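Once registered, the function can also be invoked directly through the kernel, which is a convenient way to inspect retrieval output in isolation. A hedged sketch — the "query" parameter name is assumed to match the plugin function's default signature, and the question string is illustrative:

```csharp
// Sketch: invoke the registered search function directly to inspect results
// before wiring it into a template. The parameter name "query" is assumed
// to match the default signature of the generated plugin function.
var searchResults = await kernel.InvokeAsync(
    searchPlugin["GetTextSearchResults"],
    new KernelArguments { ["query"] = "What is Semantic Kernel?" });
```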
Step 4: Prompt Template Design
The prompt template is the heart of the RAG pattern. It has three sections:
- System instruction: Tells the LLM to use only the provided context
- Context block: The {{#with}} helper injects search results
- Question and answer section: Passes the user's question and expects a response
Steps 5-6: Execution
The prompt function is created with the Handlebars template format and invoked with the user's question. At execution time:
- The template engine encounters SearchPlugin-GetTextSearchResults
- It calls the plugin with the question variable's value
- The plugin generates an embedding and searches the vector store
- Results are rendered into the template
- The complete prompt (with context) is sent to the chat completion model
Template Variations
Minimal RAG Template
Use the following information to answer the question.
{{#with (SearchPlugin-GetTextSearchResults question)}}
{{#each this}}
{{Value}}
{{/each}}
{{/with}}
Question: {{question}}
RAG with Source Attribution
Answer the question and cite your sources using [Source: name] format.
Sources:
{{#with (SearchPlugin-GetTextSearchResults question)}}
{{#each this}}
[{{Name}}]: {{Value}}
{{/each}}
{{/with}}
Question: {{question}}
RAG with Fallback Instruction
Answer the question using ONLY the provided context.
If the context does not contain relevant information, respond with:
"I don't have enough information to answer that question."
Context:
{{#with (SearchPlugin-GetTextSearchResults question)}}
{{#each this}}
{{Value}}
{{/each}}
{{/with}}
Question: {{question}}
Configuring Search Behavior
The VectorStoreTextSearch instance can be configured to control search parameters:
var vectorStoreTextSearch = new VectorStoreTextSearch<Glossary>(
collection,
embeddingGenerator,
new VectorStoreTextSearchOptions
{
Top = 3, // Number of results to retrieve
});
| Parameter | Default | Description |
|---|---|---|
| Top | 3 | Number of search results to return and inject into the prompt |
Integration with Chat History
For multi-turn conversations, the RAG pattern can be combined with chat history:
var chatHistory = new ChatHistory();
chatHistory.AddSystemMessage("You are a helpful assistant that answers questions about our glossary.");
chatHistory.AddUserMessage(userQuestion);
// The RAG template is used for the system/context, while chat history provides conversation continuity
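One possible wiring, sketched below, runs each turn through the RAG prompt function from the complete example above and records the answer in the chat history so later turns keep conversational context. This is a hypothetical arrangement, not the demo's exact code; promptFunction and the question string are assumed from the earlier example:

```csharp
// Sketch (hypothetical wiring): conversation state lives in ChatHistory while
// the RAG prompt function answers each individual turn.
var chatHistory = new ChatHistory();
chatHistory.AddSystemMessage("You are a helpful assistant that answers questions about our glossary.");

string userQuestion = "What is a vector store?";
chatHistory.AddUserMessage(userQuestion);

// Answer the turn through the RAG prompt function defined earlier...
var ragResult = await kernel.InvokeAsync(promptFunction, new KernelArguments
{
    ["question"] = userQuestion
});

// ...and record the answer so subsequent turns retain conversational context.
chatHistory.AddAssistantMessage(ragResult.GetValue<string>() ?? string.Empty);
```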
Error Scenarios
| Scenario | Behavior |
|---|---|
| No search results found | Template renders empty context; LLM should respond per fallback instruction |
| Embedding service unavailable | Exception during plugin execution; handle with try-catch |
| Collection does not exist | Exception from the search plugin's internal search call |
| Plugin not registered with kernel | Template rendering error (unknown helper) |
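A defensive invocation covering the failure modes above might look like the following sketch. Exception types vary by vector store connector and embedding service, so this catches broadly; adjust the catch clauses to the specific exceptions your connectors throw:

```csharp
// Sketch: defensive invocation around the RAG prompt function.
try
{
    var result = await kernel.InvokeAsync(promptFunction, new KernelArguments
    {
        ["question"] = userQuestion
    });
    Console.WriteLine(result.GetValue<string>());
}
catch (KernelException ex)
{
    // Covers template rendering failures, e.g., the plugin was not registered.
    Console.Error.WriteLine($"Prompt execution failed: {ex.Message}");
}
catch (Exception ex)
{
    // Embedding-service or collection errors surface from the plugin's search call.
    Console.Error.WriteLine($"Search failed: {ex.Message}");
}
```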
Relationship to Principle
This implementation page corresponds to the RAG Chat Augmentation principle, which explains the motivation for retrieval augmented generation and the three-phase RAG pipeline.
Principle:Microsoft_Semantic_kernel_RAG_Chat_Augmentation
See Also
- Principle: RAG Chat Augmentation
- Implementation: Collection SearchAsync
- Implementation: IEmbeddingGenerator GenerateAsync
- Implementation: VectorSearchOptions Filter