Implementation: Microsoft Semantic Kernel InvokePromptStreamingAsync
| Knowledge Sources | |
|---|---|
| Domains | AI_Orchestration, Real_Time_Processing |
| Last Updated | 2026-02-11 19:00 GMT |
Overview
A concrete tool, provided by the Microsoft Semantic Kernel library, for streaming AI responses token by token.
Description
Kernel.InvokePromptStreamingAsync is an extension method that sends a prompt to the kernel's registered AI service and returns an IAsyncEnumerable<StreamingKernelContent> rather than a single complete result. Each element in the stream represents a chunk of the AI's response (typically one or a few tokens), allowing the caller to process and display content incrementally as it is generated.
The method follows the same execution pipeline as InvokePromptAsync (template rendering, service resolution, filter execution) but uses the streaming variant of the underlying AI service API. The returned async enumerable can be consumed with await foreach, which naturally handles the asynchronous arrival of each chunk. Cancellation is supported through both the CancellationToken parameter and the standard break statement within the loop.
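The early-exit path described above can be sketched as follows. This is a minimal illustration, assuming `kernel` is an already-built `Kernel` with a chat service registered; the prompt and the 500-character cutoff are arbitrary:

```csharp
using Microsoft.SemanticKernel;

// Assumes `kernel` is an already-configured Kernel instance.
int printed = 0;
await foreach (var chunk in kernel.InvokePromptStreamingAsync("List every chemical element."))
{
    Console.Write(chunk);
    printed += chunk.ToString().Length;
    if (printed > 500)
    {
        break; // stops enumeration; the enumerator is disposed and no further chunks arrive
    }
}
```

Breaking out of the loop disposes the async enumerator, so the caller stops paying for chunks it no longer needs.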
Usage
Use InvokePromptStreamingAsync when building chat interfaces, real-time content displays, or any scenario where the user should see the AI response as it is being generated. This provides a significantly better user experience for long responses compared to waiting for the complete result.
Code Reference
Source Location
- Repository: semantic-kernel
- File:
dotnet/src/SemanticKernel.Core/KernelExtensions.cs:L1461-1474
Signature
public static IAsyncEnumerable<StreamingKernelContent> InvokePromptStreamingAsync(
this Kernel kernel,
string promptTemplate,
KernelArguments? arguments = null,
string? templateFormat = null,
IPromptTemplateFactory? promptTemplateFactory = null,
CancellationToken cancellationToken = default)
Import
using Microsoft.SemanticKernel;
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| kernel | Kernel | Yes | The kernel instance (implicit via extension method). |
| promptTemplate | string | Yes | The prompt string to send to the AI service. Supports `{{$variable}}` template syntax. |
| arguments | KernelArguments? | No | Optional arguments for template variable substitution and execution settings. |
| templateFormat | string? | No | Optional template format identifier. Defaults to the Semantic Kernel format. |
| promptTemplateFactory | IPromptTemplateFactory? | No | Optional factory for the template renderer. |
| cancellationToken | CancellationToken | No | Optional cancellation token. Can also be passed via WithCancellation() on the async enumerable. |
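The `WithCancellation()` route mentioned in the table uses the standard .NET extension on `IAsyncEnumerable<T>` rather than the method parameter. A sketch, assuming `kernel` is already configured; the prompt and timeout are illustrative:

```csharp
using Microsoft.SemanticKernel;

// Assumes `kernel` is an already-configured Kernel instance.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5));

// WithCancellation() attaches the token to the enumerator instead of the method call.
await foreach (var chunk in kernel
    .InvokePromptStreamingAsync("Summarize the plot of Hamlet.")
    .WithCancellation(cts.Token))
{
    Console.Write(chunk);
}
```

Both routes end in the same place: the token is observed each time the loop awaits the next chunk.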
Outputs
| Name | Type | Description |
|---|---|---|
| return | IAsyncEnumerable<StreamingKernelContent> | An asynchronous stream of content chunks. Each StreamingKernelContent element contains a portion of the AI response (typically one or more tokens) along with optional metadata. Consume with await foreach. |
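To reach the metadata mentioned above, the abstract `StreamingKernelContent` chunks can be pattern-matched to a concrete type. When a chat completion service is registered, chunks are typically `StreamingChatMessageContent`; this is a hedged sketch assuming that setup:

```csharp
using Microsoft.SemanticKernel;

// Assumes `kernel` is configured with a chat completion service.
await foreach (StreamingKernelContent chunk in kernel.InvokePromptStreamingAsync("Hello!"))
{
    // Chat services generally yield StreamingChatMessageContent chunks.
    if (chunk is StreamingChatMessageContent message)
    {
        Console.Write(message.Content);   // the text portion of this chunk (may be null)
        _ = message.Metadata;             // service-specific details, e.g. finish reason
    }
}
```

Plain `Console.Write(chunk)` works too, because `StreamingKernelContent.ToString()` returns the chunk's text; the cast is only needed for metadata or role information.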
Usage Examples
Basic Streaming
using Microsoft.SemanticKernel;
Kernel kernel = Kernel.CreateBuilder()
.AddOpenAIChatClient(
modelId: TestConfiguration.OpenAI.ChatModelId,
apiKey: TestConfiguration.OpenAI.ApiKey)
.Build();
// Stream the response token by token
await foreach (var update in kernel.InvokePromptStreamingAsync("What color is the sky?"))
{
Console.Write(update);
}
Console.WriteLine();
Streaming with Template Variables
using Microsoft.SemanticKernel;
KernelArguments arguments = new() { { "topic", "sea" } };
await foreach (var update in kernel.InvokePromptStreamingAsync(
"What color is the {{$topic}}?", arguments))
{
Console.Write(update);
}
Console.WriteLine();
Streaming with Cancellation
using Microsoft.SemanticKernel;
var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
await foreach (var update in kernel.InvokePromptStreamingAsync(
"Write a long essay about the history of the universe.",
cancellationToken: cts.Token))
{
Console.Write(update);
}
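If the token fires while chunks are still arriving, the `await foreach` typically surfaces an `OperationCanceledException` rather than ending the loop quietly. A sketch of handling that case gracefully, assuming the same kernel setup as above:

```csharp
using Microsoft.SemanticKernel;

// Assumes `kernel` is an already-configured Kernel instance.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
try
{
    await foreach (var update in kernel.InvokePromptStreamingAsync(
        "Write a long essay about the history of the universe.",
        cancellationToken: cts.Token))
    {
        Console.Write(update);
    }
}
catch (OperationCanceledException)
{
    // The partial response printed so far is still usable.
    Console.WriteLine();
    Console.WriteLine("[Response truncated: cancellation requested]");
}
```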
Streaming to a StringBuilder
using Microsoft.SemanticKernel;
using System.Text;
var sb = new StringBuilder();
await foreach (var update in kernel.InvokePromptStreamingAsync("Tell me a joke."))
{
sb.Append(update);
Console.Write(update); // Display in real time
}
string fullResponse = sb.ToString(); // Also capture the complete response