
Implementation:Microsoft Semantic Kernel InvokePromptStreamingAsync

From Leeroopedia
Knowledge Sources
Domains AI_Orchestration, Real_Time_Processing
Last Updated 2026-02-11 19:00 GMT

Overview

A concrete tool, provided by the Microsoft Semantic Kernel library, for streaming AI responses token by token.

Description

Kernel.InvokePromptStreamingAsync is an extension method that sends a prompt to the kernel's registered AI service and returns an IAsyncEnumerable<StreamingKernelContent> rather than a single complete result. Each element in the stream represents a chunk of the AI's response (typically one or a few tokens), allowing the caller to process and display content incrementally as it is generated.

The method follows the same execution pipeline as InvokePromptAsync (template rendering, service resolution, filter execution) but uses the streaming variant of the underlying AI service API. The returned async enumerable can be consumed with await foreach, which naturally handles the asynchronous arrival of each chunk. Cancellation is supported through the CancellationToken parameter, and consumption can also be stopped early with a standard break statement inside the loop.
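Breaking out of the loop is the simplest way to truncate a response without a cancellation token; exiting the loop disposes the async enumerator, which ends the underlying streaming request. A minimal sketch (the prompt and chunk limit are illustrative, and kernel is assumed to be configured as in the examples below):

```csharp
using Microsoft.SemanticKernel;

// Illustrative: stop consuming the stream early with `break`.
// Exiting the loop disposes the enumerator, which ends the request.
int chunkCount = 0;

await foreach (var update in kernel.InvokePromptStreamingAsync("List 100 animals."))
{
    Console.Write(update);
    if (++chunkCount >= 20)
    {
        break;  // Stop after 20 chunks; no exception is thrown
    }
}
```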

Usage

Use InvokePromptStreamingAsync when building chat interfaces, real-time content displays, or any scenario where the user should see the AI response as it is being generated. This provides a significantly better user experience for long responses compared to waiting for the complete result.

Code Reference

Source Location

  • Repository: semantic-kernel
  • File: dotnet/src/SemanticKernel.Core/KernelExtensions.cs:L1461-1474

Signature

public static IAsyncEnumerable<StreamingKernelContent> InvokePromptStreamingAsync(
    this Kernel kernel,
    string promptTemplate,
    KernelArguments? arguments = null,
    string? templateFormat = null,
    IPromptTemplateFactory? promptTemplateFactory = null,
    CancellationToken cancellationToken = default)

Import

using Microsoft.SemanticKernel;

I/O Contract

Inputs

Name Type Required Description
kernel Kernel Yes The kernel instance (implicit via extension method).
promptTemplate string Yes The prompt string to send to the AI service. Supports {{$variable}} template syntax.
arguments KernelArguments? No Optional arguments for template variable substitution and execution settings.
templateFormat string? No Optional template format identifier. Defaults to the Semantic Kernel format.
promptTemplateFactory IPromptTemplateFactory? No Optional factory for the template renderer.
cancellationToken CancellationToken No Optional cancellation token. Can also be passed via WithCancellation() on the async enumerable.
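As the cancellationToken row notes, the token can alternatively be attached to the returned enumerable with WithCancellation(). A short sketch (the prompt and timeout are illustrative, and kernel is assumed to be configured as in the examples below):

```csharp
using Microsoft.SemanticKernel;

var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5));

// Attach the token to the async enumerable instead of passing it
// as a method argument; both routes cancel the stream.
await foreach (var update in kernel
    .InvokePromptStreamingAsync("Summarize the plot of Hamlet.")
    .WithCancellation(cts.Token))
{
    Console.Write(update);
}
```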

Outputs

Name Type Description
return IAsyncEnumerable<StreamingKernelContent> An asynchronous stream of content chunks. Each StreamingKernelContent element contains a portion of the AI response (typically one or more tokens) along with optional metadata. Consume with await foreach.
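When the resolved service is a chat completion service, each chunk is typically a StreamingChatMessageContent, which exposes the text directly. A hedged sketch (the pattern match is an assumption about the runtime type, which depends on the registered service; the prompt is illustrative and kernel is assumed configured as in the examples below):

```csharp
using Microsoft.SemanticKernel;

await foreach (StreamingKernelContent update in kernel.InvokePromptStreamingAsync("Hi"))
{
    // With a chat service registered, chunks are typically
    // StreamingChatMessageContent instances carrying the text.
    if (update is StreamingChatMessageContent chatChunk)
    {
        Console.Write(chatChunk.Content);
    }
    else
    {
        Console.Write(update.ToString());  // Fallback: ToString() yields the chunk text
    }
}
```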

Usage Examples

Basic Streaming

using Microsoft.SemanticKernel;

// Build a kernel with an OpenAI chat service; supply your own
// model id and API key (read here from an environment variable)
Kernel kernel = Kernel.CreateBuilder()
    .AddOpenAIChatClient(
        modelId: "gpt-4o-mini",
        apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY")!)
    .Build();

// Stream the response token by token
await foreach (var update in kernel.InvokePromptStreamingAsync("What color is the sky?"))
{
    Console.Write(update);
}
Console.WriteLine();

Streaming with Template Variables

using Microsoft.SemanticKernel;

KernelArguments arguments = new() { { "topic", "sea" } };

await foreach (var update in kernel.InvokePromptStreamingAsync(
    "What color is the {{$topic}}?", arguments))
{
    Console.Write(update);
}
Console.WriteLine();
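KernelArguments can also carry execution settings alongside template variables, which is how sampling parameters reach the streaming call. A sketch assuming the OpenAI connector (OpenAIPromptExecutionSettings and its property values here are illustrative):

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// KernelArguments accepts PromptExecutionSettings in its constructor,
// alongside ordinary template variables.
KernelArguments arguments = new(new OpenAIPromptExecutionSettings
{
    Temperature = 0.2,  // Illustrative values
    MaxTokens = 100,
})
{ { "topic", "sea" } };

await foreach (var update in kernel.InvokePromptStreamingAsync(
    "What color is the {{$topic}}?", arguments))
{
    Console.Write(update);
}
Console.WriteLine();
```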

Streaming with Cancellation

using Microsoft.SemanticKernel;

var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));

try
{
    await foreach (var update in kernel.InvokePromptStreamingAsync(
        "Write a long essay about the history of the universe.",
        cancellationToken: cts.Token))
    {
        Console.Write(update);
    }
}
catch (OperationCanceledException)
{
    // The stream throws once the token fires; handle it to end gracefully
    Console.WriteLine("\n[Response truncated after 10 seconds]");
}

Streaming to a StringBuilder

using Microsoft.SemanticKernel;
using System.Text;

var sb = new StringBuilder();

await foreach (var update in kernel.InvokePromptStreamingAsync("Tell me a joke."))
{
    sb.Append(update);
    Console.Write(update);  // Display in real time
}

string fullResponse = sb.ToString();  // Also capture the complete response
