

Principle:Liu00222 Open Prompt Injection Conditional Probability Computation

From Leeroopedia
Knowledge Sources
Domains NLP, Language_Modeling, Probability
Last Updated 2026-02-14 15:00 GMT

Overview

A technique for computing the conditional log-probability of a target text sequence given a conditioning prefix using an autoregressive language model.

Description

Conditional Probability Computation extracts the log-probability that a language model assigns to generating a target text sequence given a conditioning prefix. This is the fundamental building block of causal influence analysis: by comparing the conditional probability of a suffix under different prefixes, we can determine whether an intervening segment is a natural continuation or an injection. The computation uses teacher forcing (feeding the actual tokens and extracting their predicted probabilities) rather than free generation.

Usage

Use this principle as a utility within causal influence analysis. It is called twice for each influence score computation, scoring the same suffix under two conditions: once given just the clean prefix, and once given the prefix plus the suspected injection.
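The two-call pattern can be sketched as follows. `influence_score` and the `cond_logprob` scorer interface are illustrative names, not from the source; the scorer stands in for an actual LM-backed conditional log-probability function.

```python
from typing import Callable


def influence_score(
    cond_logprob: Callable[[str, str], float],
    prefix: str,
    injection: str,
    suffix: str,
) -> float:
    """Difference in average log P(suffix | condition) between two conditions.

    `cond_logprob(condition, target)` is assumed to return the average
    conditional log-probability of `target` given `condition` under an
    autoregressive LM (hypothetical interface).
    """
    with_injection = cond_logprob(prefix + injection, suffix)
    without_injection = cond_logprob(prefix, suffix)
    # A large score means the suspected segment substantially changes
    # how probable the suffix is, relative to the clean prefix alone.
    return with_injection - without_injection


# Usage with a toy scorer (a real scorer would query a language model):
toy_scorer = lambda condition, target: -0.1 * len(condition)
score = influence_score(toy_scorer, "clean prefix. ", "IGNORE ALL RULES. ", "ok")
```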

Theoretical Basis

For an autoregressive model over vocabulary V, with the target tokenized as w_1, …, w_T:

\[
\log P(\text{target} \mid \text{condition}) = \sum_{t=1}^{T} \log P(w_t \mid w_{<t}, \text{condition})
\]

The average log-probability normalizes by sequence length:

\[
\overline{\log P} = \frac{1}{T} \sum_{t=1}^{T} \log P(w_t \mid w_{<t}, \text{condition})
\]

In practice, the function concatenates condition and target, runs a forward pass, and extracts log-probabilities only for the target token positions using `log_softmax` over the logits.
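A minimal NumPy sketch of that extraction step, under stated assumptions: function names are illustrative, and the `logits` array stands in for the output of a real model's forward pass over the concatenated condition + target token ids.

```python
import numpy as np


def log_softmax(logits):
    # Numerically stable log-softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def target_avg_logprob(logits, token_ids, n_condition):
    """Average log P of the target tokens under teacher forcing.

    logits: (seq_len, vocab) array of model outputs for the concatenated
            condition + target token sequence (assumed input).
    token_ids: the concatenated token ids, length seq_len.
    n_condition: number of condition tokens at the front.

    In a causal LM, the logits at position t predict the token at t + 1,
    so the target token at absolute position p is scored by row p - 1.
    """
    logp = log_softmax(np.asarray(logits, dtype=float))
    target_positions = range(n_condition, len(token_ids))
    scores = [logp[p - 1, token_ids[p]] for p in target_positions]
    return float(np.mean(scores))
```

The one-position offset between logits and labels is the usual causal-LM shift: the condition's last logit row scores the first target token, and the final logit row goes unused.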

Related Pages

Implemented By
