Workflow:Truera Trulens RAG With Guardrails
| Knowledge Sources | |
|---|---|
| Domains | LLM_Ops, Safety, RAG |
| Last Updated | 2026-02-14 08:00 GMT |
Overview
End-to-end process for adding runtime guardrails to a RAG application using TruLens feedback-driven context filtering, input blocking, and output blocking decorators.
Description
This workflow covers how to add production safety mechanisms to a RAG pipeline using TruLens guardrails. Guardrails evaluate feedback functions at runtime (synchronously, not deferred) to filter low-relevance contexts before they reach the LLM, block unsafe or off-topic user inputs, and block harmful or low-quality outputs. Unlike standard feedback evaluation which runs asynchronously after execution, guardrails run inline and actively modify application behavior based on evaluation scores.
Usage
Execute this workflow when you need to deploy a RAG application with runtime quality and safety controls. This includes scenarios where you want to prevent hallucination by filtering irrelevant retrieved contexts, block prompt injection or off-topic queries at the input stage, or prevent toxic or harmful responses from reaching the user. Guardrails are complementary to standard deferred evaluation and are designed for production deployments.
Execution Steps
Step 1: Initialize TruLens Session and Provider
Create a TruSession and configure a feedback provider. The provider will power the guardrail evaluations, so choose a model that balances quality with latency since guardrails run synchronously at request time.
Key considerations:
- Guardrail evaluations add latency to each request (they run synchronously)
- Choose a fast provider model to minimize impact on response time
- The session must be initialized before guardrail decorators are used
Step 2: Define Guardrail Feedback Functions
Create feedback functions specifically for guardrail use. Guardrail feedback functions must return only a score (not reasons), as the score is compared against a threshold to make a pass/fail decision.
Key considerations:
- Use the non-CoT variants (e.g., context_relevance instead of context_relevance_with_cot_reasons)
- Each guardrail function is paired with a threshold that determines the pass/fail boundary
- Multiple guardrails can be applied to different methods independently
Step 3: Apply Context Filter Guardrail
Decorate the retrieval method with @context_filter to automatically evaluate each retrieved context chunk against the query. Chunks scoring below the threshold are removed before being passed to the generation step, reducing hallucination from irrelevant contexts.
What happens:
- Each retrieved context chunk is evaluated individually against the query
- Chunks with scores below the threshold are filtered out
- The filtered list is returned to the caller
- If all chunks are filtered, an empty list is returned
Key considerations:
- Set keyword_for_prompt to the parameter name containing the query
- The threshold determines how aggressively contexts are filtered (higher = stricter)
- The retrieval method must return a list of strings
Step 4: Apply Input Blocking Guardrail
Decorate the application entry point with @block_input to evaluate incoming queries before processing. Queries that fail the safety check (score below threshold) are blocked and a default message is returned instead.
What happens:
- The feedback function evaluates the input query
- If the score is below the threshold, execution is blocked
- A configurable response is returned to the caller
- If the score is above the threshold, execution proceeds normally
Key considerations:
- Use safety-oriented feedback functions (toxicity, off-topic detection)
- Set keyword_for_prompt to the parameter containing the user input
- The return_value parameter controls what is returned when input is blocked
Step 5: Apply Output Blocking Guardrail
Decorate the generation method with @block_output to evaluate the response after generation but before returning to the user. Responses that fail the quality or safety check are replaced with a safe default.
What happens:
- The method executes normally and produces output
- The output is evaluated by the feedback function
- If the score is below the threshold, the output is replaced with a safe default
- If the score is above the threshold, the original output is returned
Key considerations:
- Use quality or safety feedback functions (toxicity, coherence)
- Set keyword_for_response to "return" to evaluate the method's return value
- Consider the additional latency from post-generation evaluation
Step 6: Wrap and Record With Evaluation
Wrap the guardrail-protected application with the appropriate TruLens wrapper (TruChain, TruApp, etc.) and add deferred feedback functions for comprehensive evaluation. This combines runtime guardrails with asynchronous evaluation for both safety and quality monitoring.
Key considerations:
- Guardrails and deferred evaluation are complementary
- Guardrails handle runtime safety; deferred evaluation provides quality metrics
- The dashboard shows both guardrail actions and evaluation scores
Step 7: Monitor Guardrail Actions
Use the TruLens dashboard to monitor guardrail activation patterns. Track how often inputs are blocked, contexts are filtered, and outputs are replaced. This data helps tune thresholds and identify patterns in problematic queries or responses.
Key considerations:
- Review blocked inputs to check for false positives (legitimate queries blocked)
- Monitor filtered context rates to tune retrieval quality
- Adjust thresholds based on production data