Principle:BerriAI Litellm Message Redaction

Knowledge Sources	Domains	Last Updated
[[1]]	Observability, Privacy, Compliance	2026-02-15

Overview

Message redaction is the practice of stripping sensitive content from LLM request and response payloads before they reach logging backends, ensuring privacy compliance without disabling observability.

Description

When LLM requests and responses are logged for observability, they often contain sensitive data: personal information in user prompts, proprietary content in system messages, or confidential data in model outputs. Many organizations operate under regulatory or compliance regimes (for example GDPR, HIPAA, or SOC 2) that restrict storing such content in third-party logging platforms.

Message redaction solves this by providing a privacy layer between the logging payload construction and the callback dispatch. It replaces message content with a placeholder string while preserving all non-content metadata (cost, latency, tokens, model, status) so that operational monitoring continues to function.

Key aspects of the pattern:

Selective redaction -- Only message/prompt input and response content are replaced. Token counts, cost, latency, model identifiers, and operational metadata remain intact.
Multi-level control -- Redaction can be activated globally (via a module-level flag), per-request (via dynamic parameters or HTTP headers), or per-logger (via the logger's turn_off_message_logging constructor flag).
Priority resolution -- When multiple controls conflict, a well-defined priority chain resolves the outcome: an explicit disable header wins outright, then the per-request dynamic parameter, then an enable header, and finally the global setting.
Deep copy safety -- Redaction operates on deep copies of response objects to avoid corrupting the actual response returned to the caller.

Usage

Apply message redaction when:

Regulatory requirements prohibit logging of user prompts or model responses.
Operating in a multi-tenant environment where some teams require message logging and others do not.
Sending telemetry to third-party platforms where storing conversation content is unacceptable.
Building a custom logger that should never see raw message content.

Theoretical Basis

Redaction Strategy

The redaction strategy follows the Decorator pattern -- it wraps the logging dispatch to transparently modify the payload before it reaches consumers.

function redact_if_needed(model_call_details, result):
    if should_redact(model_call_details):
        return perform_redaction(model_call_details, result)
    else:
        return result  # pass through unchanged
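
The wrapping idea can be sketched in Python as a higher-order function around a logging callback. This is a minimal illustration, not LiteLLM's actual code: the name with_redaction, the callback signature, and the use of plain dictionaries in place of response objects are all assumptions made for the example. The sketch also copies both arguments for simplicity.

```python
import copy
from typing import Any, Callable, Dict

REDACTED = "redacted-by-litellm"  # placeholder string used on this page

Payload = Dict[str, Any]
Callback = Callable[[Payload, Payload], None]


def with_redaction(callback: Callback, should_redact: Callable[[Payload], bool]) -> Callback:
    """Hypothetical wrapper: the callback sees redacted copies when required."""

    def wrapped(model_call_details: Payload, result: Payload) -> None:
        if should_redact(model_call_details):
            # Copy first so the objects held by the caller are never modified.
            model_call_details = copy.deepcopy(model_call_details)
            result = copy.deepcopy(result)
            model_call_details["messages"] = [{"role": "user", "content": REDACTED}]
            for choice in result.get("choices", []):
                choice["message"]["content"] = REDACTED
        # Non-content fields (model, usage, ...) are passed through untouched.
        callback(model_call_details, result)

    return wrapped
```

The callback itself needs no knowledge of redaction; the decision is made entirely in the wrapper, which is what makes the pattern transparent to consumers.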

Priority Resolution (Pseudocode)

function should_redact(model_call_details):
    dynamic_param = get_dynamic_param("turn_off_message_logging")

    headers = get_request_headers(model_call_details)

    # Priority 1: Explicit disable via header
    if headers["litellm-disable-message-redaction"]:
        return false

    header_enables_redaction = (
        headers["litellm-enable-message-redaction"] or
        headers["x-litellm-enable-message-redaction"]
    )

    # Priority 2: Dynamic parameter (per-request)
    if dynamic_param is not None:
        return dynamic_param  # boolean

    # Priority 3: Header explicitly enables redaction
    if header_enables_redaction:
        return true

    # Priority 4: Global setting
    return global_turn_off_message_logging
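
The priority chain above can be written as a small runnable Python function. This is a sketch under stated assumptions: the function signature is invented for the example, headers are modelled as a plain mapping, and a header is treated as set only when its value is the string "true" (case-insensitive), which the pseudocode does not specify.

```python
from typing import Mapping, Optional


def _truthy(value: Optional[str]) -> bool:
    # Assumption: header values arrive as strings such as "true" or "True".
    return value is not None and str(value).strip().lower() == "true"


def should_redact(
    headers: Mapping[str, str],
    dynamic_param: Optional[bool],
    global_setting: bool,
) -> bool:
    # Priority 1: an explicit disable header always wins.
    if _truthy(headers.get("litellm-disable-message-redaction")):
        return False

    # Priority 2: the per-request dynamic parameter, if one was supplied.
    if dynamic_param is not None:
        return dynamic_param

    # Priority 3: either enable header turns redaction on.
    if _truthy(headers.get("litellm-enable-message-redaction")) or _truthy(
        headers.get("x-litellm-enable-message-redaction")
    ):
        return True

    # Priority 4: fall back to the global flag.
    return global_setting
```

Using headers.get(...) rather than indexing avoids a KeyError when a header is absent, which is the common case.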

What Gets Redacted

INPUT REDACTION:
    model_call_details["messages"] = [{"role": "user", "content": "redacted-by-litellm"}]
    model_call_details["prompt"] = ""
    model_call_details["input"] = ""

OUTPUT REDACTION (ModelResponse):
    for each choice:
        choice.message.content = "redacted-by-litellm"
        choice.message.reasoning_content = "redacted-by-litellm"
        choice.message.thinking_blocks = None

OUTPUT REDACTION (EmbeddingResponse):
    response.data = []  # embedding vectors removed entirely

OUTPUT REDACTION (other types):
    result = {"text": "redacted-by-litellm"}
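
The rules above can be exercised with plain dictionaries standing in for ModelResponse and EmbeddingResponse. The sketch below is illustrative only: the function names and the key-based type detection are assumptions, and it only touches reasoning_content and thinking_blocks when they are present.

```python
import copy
from typing import Any, Dict

REDACTED = "redacted-by-litellm"


def redact_input(model_call_details: Dict[str, Any]) -> None:
    """Replace prompt-like fields in place, mirroring the input listing above."""
    model_call_details["messages"] = [{"role": "user", "content": REDACTED}]
    model_call_details["prompt"] = ""
    model_call_details["input"] = ""


def redact_output(result: Any) -> Dict[str, Any]:
    """Return a redacted deep copy; the original result is left untouched."""
    if isinstance(result, dict) and "choices" in result:  # chat-style response
        redacted = copy.deepcopy(result)
        for choice in redacted["choices"]:
            message = choice["message"]
            message["content"] = REDACTED
            if "reasoning_content" in message:  # assumption: only if present
                message["reasoning_content"] = REDACTED
            if "thinking_blocks" in message:
                message["thinking_blocks"] = None
        return redacted
    if isinstance(result, dict) and "data" in result:  # embedding-style response
        redacted = copy.deepcopy(result)
        redacted["data"] = []  # vectors removed entirely
        return redacted
    return {"text": REDACTED}  # any other result type
```

Fields such as usage or model are never referenced, so they survive redaction unchanged.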

Per-Logger Redaction

Independently of global and request-level redaction, each CustomLogger instance can have its own message_logging flag. If it is set to False, the framework calls a separate redaction path specifically for that logger instance, even if global redaction is not enabled. This allows selective privacy control per integration.
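
A minimal sketch of per-logger dispatch follows. SketchLogger is a hypothetical stand-in for a CustomLogger subclass, and dispatch and redact are invented names; only the message_logging attribute is taken from the description above.

```python
import copy
from typing import Any, Dict, List

REDACTED = "redacted-by-litellm"


class SketchLogger:
    """Hypothetical stand-in for a logger with a message_logging flag."""

    def __init__(self, name: str, message_logging: bool = True) -> None:
        self.name = name
        self.message_logging = message_logging
        self.received: List[Dict[str, Any]] = []

    def log_success(self, result: Dict[str, Any]) -> None:
        self.received.append(result)


def redact(result: Dict[str, Any]) -> Dict[str, Any]:
    redacted = copy.deepcopy(result)  # never mutate the shared result
    for choice in redacted.get("choices", []):
        choice["message"]["content"] = REDACTED
    return redacted


def dispatch(loggers: List[SketchLogger], result: Dict[str, Any], global_redaction: bool = False) -> None:
    for logger in loggers:
        # A logger opts out individually even when global redaction is off.
        if global_redaction or not logger.message_logging:
            logger.log_success(redact(result))
        else:
            logger.log_success(result)
```

Because each opted-out logger receives its own redacted copy, a logger that is allowed to see content still gets the original payload in the same dispatch pass.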

Streaming Considerations

For streaming responses, the complete streaming response (assembled from chunks) is also redacted in the model_call_details dictionary. This ensures that the aggregated response stored in complete_streaming_response does not leak content when redaction is active.
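
The streaming case can be illustrated as follows. This is a sketch, not the library's code: finalize_stream is a hypothetical helper, chunks are modelled as plain strings, and the chunk_count field is invented to stand in for non-content metadata. Only the complete_streaming_response key comes from the description above.

```python
from typing import Any, Dict, List

REDACTED = "redacted-by-litellm"


def finalize_stream(model_call_details: Dict[str, Any], chunks: List[str], redact: bool) -> str:
    """Assemble streamed chunks, store a log-safe aggregate, return the real text."""
    text = "".join(chunks)
    # Only the copy destined for logging is redacted.
    stored_text = REDACTED if redact else text
    model_call_details["complete_streaming_response"] = {
        "choices": [{"message": {"role": "assistant", "content": stored_text}}],
        "chunk_count": len(chunks),  # hypothetical metadata, kept either way
    }
    return text  # the caller still receives the unredacted content
```

The point of the sketch is that the aggregate stored for logging and the text delivered to the caller are separate values, so redacting one does not affect the other.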

Related Pages

Implementation:BerriAI_Litellm_Redact_Messages
