Principle:Microsoft Semantic kernel Conversation Thread Management

From Leeroopedia

Overview

Conversation Thread Management is the principle of maintaining stateful conversation context across multiple agent interactions in Microsoft Semantic Kernel. An AgentThread represents the accumulated history of messages exchanged between a user and one or more agents, enabling coherent multi-turn conversations where the agent remembers prior context.

This principle belongs to Workflow 3: Agent Conversation and Orchestration and addresses the fundamental challenge of maintaining conversational state in an otherwise stateless request-response architecture.

Description

The Statefulness Problem

Language models are inherently stateless -- each API call to a chat completion service is independent, with no memory of previous interactions. To create the illusion of a continuous conversation, the entire history of prior messages must be sent with each new request. This creates several challenges:

  • Context accumulation -- As conversations grow, the message history must be trimmed or summarized to stay within the model's token limit.
  • Multi-turn coherence -- The agent must see all prior messages to maintain context, avoid repeating itself, and build on previous responses.
  • Cross-invocation state -- When an agent is invoked multiple times, the thread must persist between calls and be passed back on each subsequent invocation.
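
The context accumulation challenge can be illustrated with a minimal sketch that trims a history to a token budget. Everything here is an assumption made for illustration: the Message record and HistoryTrimmer class are hypothetical (not Semantic Kernel types), and the four-characters-per-token estimate is a crude stand-in for a real tokenizer.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical message type for illustration; not a Semantic Kernel type.
public sealed record Message(string Role, string Content);

public static class HistoryTrimmer
{
    // Crude size estimate (assumption): roughly 4 characters per token, rounded up.
    public static int EstimateTokens(Message m) => (m.Content.Length + 3) / 4;

    // Keeps every system message, then the most recent other messages that fit the budget.
    public static List<Message> Trim(IReadOnlyList<Message> history, int maxTokens)
    {
        var keep = new bool[history.Count];
        int budget = maxTokens;

        // System instructions are always kept and charged against the budget first.
        for (int i = 0; i < history.Count; i++)
        {
            if (history[i].Role == "system")
            {
                keep[i] = true;
                budget -= EstimateTokens(history[i]);
            }
        }

        // Walk backwards from the newest message and stop at the first one that does not fit.
        for (int i = history.Count - 1; i >= 0; i--)
        {
            if (keep[i]) continue;
            int cost = EstimateTokens(history[i]);
            if (cost > budget) break;
            budget -= cost;
            keep[i] = true;
        }

        // Preserve the original ordering of the kept messages.
        return history.Where((_, i) => keep[i]).ToList();
    }
}
```

Dropping the oldest turns first is only one possible policy; summarizing older turns is another, at the cost of an extra model call.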

Threads as Conversation State

Semantic Kernel addresses these challenges through the AgentThread abstraction:

  • An AgentThread is a container for the ordered sequence of messages that constitute a conversation.
  • The ChatHistoryAgentThread implementation stores messages as a ChatHistory object -- a list of ChatMessageContent items, each with an AuthorRole (User, Assistant, System, Tool) and content.
  • Threads are external to the agent -- the agent does not hold conversation state internally. Instead, the thread is passed into each invocation, and the updated thread is returned with the response.
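
As a rough sketch of the container described above, the following builds a ChatHistory by hand and wraps it in a ChatHistoryAgentThread. It assumes the Microsoft.SemanticKernel and Microsoft.SemanticKernel.Agents.Core packages are referenced; the message texts are illustrative only.

```csharp
using System;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Agents;
using Microsoft.SemanticKernel.ChatCompletion;

// An ordered list of messages, each tagged with the role of its author.
ChatHistory history = new();
history.AddSystemMessage("You are a concise assistant.");
history.AddUserMessage("Tell me a joke.");
history.AddAssistantMessage("Why did the chicken cross the road? To get to the other side!");

// Print the conversation in order, one line per message.
foreach (ChatMessageContent m in history)
{
    Console.WriteLine($"{m.Role}: {m.Content}");
}

// The thread is a plain object held by the caller, not by the agent.
ChatHistoryAgentThread thread = new(history);
```

No model call is made here; the thread is only data until it is passed to an agent invocation.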

Thread Lifecycle

A conversation thread follows this lifecycle:

  1. Creation -- A thread is created either explicitly by the developer or implicitly by the first InvokeAsync call (when no thread is provided).
  2. Growth -- With each invocation, the user's message and the agent's response are appended to the thread.
  3. Reuse -- The thread is captured from each response and passed into the next invocation to maintain continuity.
  4. Disposal -- When the conversation is complete, the thread can be discarded or persisted for future reference.
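
The first three lifecycle steps can be sketched without any model call. The types below (Msg, ConversationThread, CountingAgent) are hypothetical stand-ins rather than Semantic Kernel types; they only mimic the pattern of a stateless agent that receives a thread and returns it with the response.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical stand-ins for illustration; these are not Semantic Kernel types.
public sealed record Msg(string Role, string Content);

public sealed class ConversationThread
{
    public List<Msg> Messages { get; } = new();
}

public static class CountingAgent
{
    // The agent holds no state: everything it "remembers" comes from the thread argument.
    public static (Msg Reply, ConversationThread Thread) Invoke(
        string userText, ConversationThread? thread = null)
    {
        // 1. Creation: a thread is created implicitly when the caller supplies none.
        thread ??= new ConversationThread();

        // 2. Growth: the user's message is appended to the thread.
        thread.Messages.Add(new Msg("user", userText));

        // The reply depends only on what the thread contains.
        int turn = thread.Messages.FindAll(m => m.Role == "user").Count;
        var reply = new Msg("assistant", $"This is turn {turn}.");

        // 2. Growth: the agent's response is appended as well.
        thread.Messages.Add(reply);

        // 3. Reuse: the thread is handed back so the caller can pass it in again.
        return (reply, thread);
    }
}
```

Calling Invoke twice with the returned thread yields "This is turn 1." and then "This is turn 2.", while calling it without a thread always starts again at turn 1.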

Stateless vs. Stateful Invocation

Semantic Kernel supports both modes:

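  • Stateless (no thread) -- No thread is passed, so the invocation does not build on any earlier exchange. The agent sees only the current message and its system instructions. A thread is still created implicitly for the call, but it has no effect on later calls unless the caller captures and reuses it. Suitable for one-shot queries.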
  • Stateful (with thread) -- The agent sees the full conversation history. Each response returns the updated thread, which the caller captures for the next invocation.

The choice between stateless and stateful invocation is made at the call site, not at the agent definition, giving developers full flexibility.

Pre-Seeded Threads

Developers can create threads with pre-existing messages, enabling scenarios such as:

  • Context injection -- Providing relevant background information as prior messages.
  • Conversation resumption -- Restoring a conversation from a database or file.
  • Few-shot prompting -- Including example user-assistant exchanges that guide the agent's behavior.
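
For the conversation resumption scenario, one possible approach is to store the messages as JSON and load them back before seeding a thread. This is a sketch under assumptions: StoredMessage and ThreadStore are hypothetical names, and the role/content shape is an illustrative storage format rather than one prescribed by Semantic Kernel.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical storage shape: one entry per message, with the role stored as a string.
public sealed record StoredMessage(string Role, string Content);

public static class ThreadStore
{
    // Serializes the messages to a JSON array that can be written to a file or database.
    public static string Save(IEnumerable<StoredMessage> messages) =>
        JsonSerializer.Serialize(messages);

    // Restores the messages in their original order; an empty list if the JSON is "null".
    public static List<StoredMessage> Load(string json) =>
        JsonSerializer.Deserialize<List<StoredMessage>>(json) ?? new List<StoredMessage>();
}
```

Each restored entry could then be mapped to a ChatMessageContent with the matching AuthorRole and passed to a ChatHistoryAgentThread, as in the Pre-Seeded Thread example.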

Usage

Conversation Thread Management is used whenever:

  • An application requires multi-turn conversations with an agent.
  • Prior context must be available to the agent for coherent responses.
  • Multiple agents need to share conversation context within an orchestration.
  • Conversations need to be persisted, restored, or branched.
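
Branching can be sketched by copying the messages of one history into another, so that the two can diverge independently. The snippet assumes the Microsoft.SemanticKernel and Microsoft.SemanticKernel.Agents.Core packages are referenced; the variable names and message texts are illustrative, and no agent is invoked.

```csharp
using System;
using Microsoft.SemanticKernel.Agents;
using Microsoft.SemanticKernel.ChatCompletion;

ChatHistory original = new();
original.AddUserMessage("Tell me a joke.");
original.AddAssistantMessage("Why did the chicken cross the road? To get to the other side!");

// Copy the messages into a second history; later additions affect only one branch.
ChatHistory branch = new(original);
branch.AddUserMessage("Now explain why that is funny.");

Console.WriteLine(original.Count); // 2
Console.WriteLine(branch.Count);   // 3

// Each history can back its own thread for subsequent invocations.
AgentThread mainThread = new ChatHistoryAgentThread(original);
AgentThread branchThread = new ChatHistoryAgentThread(branch);
```

The copy duplicates the list, not the message objects, so this sketch assumes messages are not mutated after being added.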

Stateless Invocation (No Thread)

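// Assumes `agent` is an already-configured agent (for example, a ChatCompletionAgent).
// No thread is passed, so this call does not build on any earlier exchange.
ChatMessageContent message = new(AuthorRole.User, "Fortune favors the bold.");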
await foreach (AgentResponseItem<ChatMessageContent> response in agent.InvokeAsync(message))
{
    Console.WriteLine(response.Message.Content);
}

Stateful Invocation (With Thread Capture)

// Starts as null; the first InvokeAsync call creates a thread, which is captured below.
AgentThread? thread = null;

// First message
ChatMessageContent message1 = new(AuthorRole.User, "Tell me a joke.");
await foreach (var response in agent.InvokeAsync(message1, thread))
{
    thread = response.Thread; // Capture thread
    Console.WriteLine(response.Message.Content);
}

// Second message -- the agent sees the first exchange through the captured thread
ChatMessageContent message2 = new(AuthorRole.User, "Now tell me another one.");
await foreach (var response in agent.InvokeAsync(message2, thread))
{
    thread = response.Thread; // Update thread
    Console.WriteLine(response.Message.Content);
}

Pre-Seeded Thread

AgentThread thread = new ChatHistoryAgentThread([
    new ChatMessageContent(AuthorRole.User, "Tell me a joke."),
    new ChatMessageContent(AuthorRole.Assistant, "Why did the chicken cross the road? To get to the other side!"),
]);

// Agent sees the prior exchange and can build on it
ChatMessageContent message = new(AuthorRole.User, "That was terrible. Try harder.");
await foreach (var response in agent.InvokeAsync(message, thread))
{
    thread = response.Thread;
    Console.WriteLine(response.Message.Content);
}

Theoretical Basis

Conversation Thread Management is grounded in the concept of dialogue state tracking from conversational AI research. In task-oriented dialogue systems, maintaining an accurate representation of the conversation state is essential for the system to produce appropriate responses and take correct actions.

The pattern also draws from the Memento Pattern in software design, where the internal state of an object (the conversation) is externalized so that it can be saved, restored, and passed between components without violating encapsulation.

The separation of thread state from agent state reflects the stateless service pattern common in distributed systems, where services remain stateless for scalability while state is managed externally (in databases, caches, or -- in this case -- thread objects passed by the caller).
