Implementation: Microsoft Semantic Kernel InvokePromptStreamingAsync
| Knowledge Sources | |
|---|---|
| Domains | AI_Orchestration, Real_Time_Processing |
| Last Updated | 2026-02-11 19:00 GMT |
Overview
A concrete tool, provided by the Microsoft Semantic Kernel library, for streaming AI responses token by token.
Description
Kernel.InvokePromptStreamingAsync is an extension method that sends a prompt to the kernel's registered AI service and returns an IAsyncEnumerable<StreamingKernelContent> rather than a single complete result. Each element in the stream represents a chunk of the AI's response (typically one or a few tokens), allowing the caller to process and display content incrementally as it is generated.
The method follows the same execution pipeline as InvokePromptAsync (template rendering, service resolution, filter execution) but uses the streaming variant of the underlying AI service API. The returned async enumerable can be consumed with await foreach, which naturally handles the asynchronous arrival of each chunk. Cancellation is supported through both the CancellationToken parameter and the standard break statement within the loop.
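The early-exit path described above can be sketched as follows. This is a minimal illustration, assuming `kernel` is an already-built `Kernel` with a chat service registered; the prompt and the 500-character cutoff are arbitrary:

```csharp
using Microsoft.SemanticKernel;

// Assumes `kernel` is an already-configured Kernel instance.
int printed = 0;
await foreach (var chunk in kernel.InvokePromptStreamingAsync("List every chemical element."))
{
    Console.Write(chunk);
    printed += chunk.ToString().Length;
    if (printed > 500)
    {
        break; // stops enumeration; the enumerator is disposed and no further chunks arrive
    }
}
```

Breaking out of the loop disposes the async enumerator, so the caller stops paying for chunks it no longer needs.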
Usage
Use InvokePromptStreamingAsync when building chat interfaces, real-time content displays, or any scenario where the user should see the AI response as it is being generated. This provides a significantly better user experience for long responses compared to waiting for the complete result.
Code Reference
Source Location
- Repository: semantic-kernel
- File:
dotnet/src/SemanticKernel.Core/KernelExtensions.cs:L1461-1474
Signature
public static IAsyncEnumerable<StreamingKernelContent> InvokePromptStreamingAsync(
this Kernel kernel,
string promptTemplate,
KernelArguments? arguments = null,
string? templateFormat = null,
IPromptTemplateFactory? promptTemplateFactory = null,
CancellationToken cancellationToken = default)
Import
using Microsoft.SemanticKernel;
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| kernel | Kernel | Yes | The kernel instance (implicit via extension method). |
| promptTemplate | string | Yes | The prompt string to send to the AI service. Supports `{{$variable}}` template syntax. |
| arguments | KernelArguments? | No | Optional arguments for template variable substitution and execution settings. |
| templateFormat | string? | No | Optional template format identifier. Defaults to the Semantic Kernel format. |
| promptTemplateFactory | IPromptTemplateFactory? | No | Optional factory for the template renderer. |
| cancellationToken | CancellationToken | No | Optional cancellation token. Can also be passed via WithCancellation() on the async enumerable. |
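The `WithCancellation()` route mentioned in the table uses the standard .NET extension on `IAsyncEnumerable<T>` rather than the method parameter. A sketch, assuming `kernel` is already configured; the prompt and timeout are illustrative:

```csharp
using Microsoft.SemanticKernel;

// Assumes `kernel` is an already-configured Kernel instance.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5));

// WithCancellation() attaches the token to the enumerator instead of the method call.
await foreach (var chunk in kernel
    .InvokePromptStreamingAsync("Summarize the plot of Hamlet.")
    .WithCancellation(cts.Token))
{
    Console.Write(chunk);
}
```

Both routes end in the same place: the token is observed each time the loop awaits the next chunk.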
Outputs
| Name | Type | Description |
|---|---|---|
| return | IAsyncEnumerable<StreamingKernelContent> | An asynchronous stream of content chunks. Each StreamingKernelContent element contains a portion of the AI response (typically one or more tokens) along with optional metadata. Consume with await foreach. |
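To reach the metadata mentioned above, the abstract `StreamingKernelContent` chunks can be pattern-matched to a concrete type. When a chat completion service is registered, chunks are typically `StreamingChatMessageContent`; this is a hedged sketch assuming that setup:

```csharp
using Microsoft.SemanticKernel;

// Assumes `kernel` is configured with a chat completion service.
await foreach (StreamingKernelContent chunk in kernel.InvokePromptStreamingAsync("Hello!"))
{
    // Chat services generally yield StreamingChatMessageContent chunks.
    if (chunk is StreamingChatMessageContent message)
    {
        Console.Write(message.Content);   // the text portion of this chunk (may be null)
        _ = message.Metadata;             // service-specific details, e.g. finish reason
    }
}
```

Plain `Console.Write(chunk)` works too, because `StreamingKernelContent.ToString()` returns the chunk's text; the cast is only needed for metadata or role information.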
Usage Examples
Basic Streaming
using Microsoft.SemanticKernel;
Kernel kernel = Kernel.CreateBuilder()
.AddOpenAIChatClient(
modelId: TestConfiguration.OpenAI.ChatModelId,
apiKey: TestConfiguration.OpenAI.ApiKey)
.Build();
// Stream the response token by token
await foreach (var update in kernel.InvokePromptStreamingAsync("What color is the sky?"))
{
Console.Write(update);
}
Console.WriteLine();
Streaming with Template Variables
using Microsoft.SemanticKernel;
KernelArguments arguments = new() { { "topic", "sea" } };
await foreach (var update in kernel.InvokePromptStreamingAsync(
"What color is the {{$topic}}?", arguments))
{
Console.Write(update);
}
Console.WriteLine();
Streaming with Cancellation
using Microsoft.SemanticKernel;
var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
await foreach (var update in kernel.InvokePromptStreamingAsync(
"Write a long essay about the history of the universe.",
cancellationToken: cts.Token))
{
Console.Write(update);
}
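If the token fires while chunks are still arriving, the `await foreach` typically surfaces an `OperationCanceledException` rather than ending the loop quietly. A sketch of handling that case gracefully, assuming the same kernel setup as above:

```csharp
using Microsoft.SemanticKernel;

// Assumes `kernel` is an already-configured Kernel instance.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
try
{
    await foreach (var update in kernel.InvokePromptStreamingAsync(
        "Write a long essay about the history of the universe.",
        cancellationToken: cts.Token))
    {
        Console.Write(update);
    }
}
catch (OperationCanceledException)
{
    // The partial response printed so far is still usable.
    Console.WriteLine();
    Console.WriteLine("[Response truncated: cancellation requested]");
}
```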
Streaming to a StringBuilder
using Microsoft.SemanticKernel;
using System.Text;
var sb = new StringBuilder();
await foreach (var update in kernel.InvokePromptStreamingAsync("Tell me a joke."))
{
sb.Append(update);
Console.Write(update); // Display in real time
}
string fullResponse = sb.ToString(); // Also capture the complete response