Principle: Guardrails AI Stream Invocation
| Knowledge Sources | |
|---|---|
| Domains | Streaming, LLM_Integration |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
An invocation principle for executing Guards in streaming mode where LLM output is validated incrementally as chunks arrive.
Description
Stream Invocation transforms a standard Guard call into a generator-based streaming pipeline. When stream=True is passed to Guard.__call__, the execution path switches from Runner to StreamRunner, which: (1) calls the LLM with streaming enabled, (2) accumulates chunks, (3) validates completed segments using the validator's chunking strategy, and (4) yields ValidationOutcome objects for each validated segment.
Key constraint: Re-asking is not supported in streaming mode (num_reasks must be 0), because the stream produces incremental results that cannot be "taken back."
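A minimal sketch of how such a constraint could be enforced at call time. This is a hypothetical pre-flight check written for illustration, not the library's actual validation code:

```python
def check_stream_args(stream: bool, num_reasks: int) -> None:
    # Hypothetical pre-flight check mirroring the constraint: streamed
    # chunks have already been yielded to the caller, so a failed
    # validation cannot trigger a re-ask that rewrites earlier output.
    if stream and num_reasks != 0:
        raise ValueError("num_reasks must be 0 when stream=True")
```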
Usage
Use this when you need real-time validation of LLM output as it streams, such as in chat interfaces where you want to display validated text progressively. Pass stream=True as a keyword argument to the Guard call.
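A sketch of the caller-side consumption pattern. `guard_stream` and `ValidationOutcome` below are simplified stand-ins (the real call shape is `guard(..., stream=True)` returning a generator of outcomes); the point is that each yielded item carries an already-validated fragment that a chat UI can display immediately:

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

@dataclass
class ValidationOutcome:
    # Stand-in for the library's ValidationOutcome; only the field
    # relevant to progressive display is modeled here.
    validated_output: str

def guard_stream(chunks: Iterable[str]) -> Iterator[ValidationOutcome]:
    # Stand-in for Guard.__call__ with stream=True: yields one
    # outcome per validated segment as the LLM stream arrives.
    for chunk in chunks:
        yield ValidationOutcome(validated_output=chunk)

# Chat-UI style consumption: render validated text as it arrives.
display: list[str] = []
for outcome in guard_stream(["Hello, ", "world!"]):
    display.append(outcome.validated_output)
print("".join(display))
```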
Theoretical Basis
The streaming execution model:
# Pseudocode for stream invocation
stream = llm_call(messages, stream=True)
buffer = ""
for chunk in stream:
    buffer += chunk.text
    # chunking_function returns [completed_segment, remainder],
    # or [] if no segment has completed yet
    segments = chunking_function(buffer)
    if segments:
        validated = validate(segments[0])
        buffer = segments[1]  # keep the unvalidated remainder
        yield ValidationOutcome(validated_output=validated)
# any text still in buffer is validated and yielded when the stream ends
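The pseudocode above can be made concrete as a runnable sketch. The sentence-based chunker and the pass-through validator are illustrative assumptions (real Guards apply the validator's own chunking strategy and registered validators); the buffering and yielding structure is the part being demonstrated:

```python
from typing import Callable, Iterator, List

def sentence_chunker(buffer: str) -> List[str]:
    # Naive chunking strategy (illustrative): split off the first complete
    # sentence; return [] if no sentence has finished yet.
    for i, ch in enumerate(buffer):
        if ch in ".!?":
            return [buffer[: i + 1], buffer[i + 1 :]]
    return []

def validate(segment: str) -> str:
    # Stand-in validator: a real Guard would run its validators here.
    return segment.strip()

def stream_validate(stream: Iterator[str],
                    chunking_function: Callable[[str], List[str]]) -> Iterator[str]:
    buffer = ""
    for chunk in stream:
        buffer += chunk
        segments = chunking_function(buffer)
        while segments:           # one chunk may complete several segments
            yield validate(segments[0])
            buffer = segments[1]  # keep the unvalidated remainder
            segments = chunking_function(buffer)
    if buffer.strip():
        yield validate(buffer)    # flush trailing text at end of stream

chunks = ["Hello wor", "ld. How are", " you? Bye"]
print(list(stream_validate(iter(chunks), sentence_chunker)))
# → ['Hello world.', 'How are you?', 'Bye']
```

Note that segment boundaries are independent of chunk boundaries: "Hello world." is only yielded once the second chunk delivers the terminating period.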