Implementation: Hugging Face PEFT PromptTuningConfig
Overview
PromptTuningConfig is the configuration dataclass for prompt tuning in the Hugging Face PEFT library. It stores all parameters needed to set up a prompt tuning adapter, including the initialization strategy, optional initialization text, and tokenizer settings. This class inherits from PromptLearningConfig, which provides shared fields common to all prompt-learning methods (prompt tuning, prefix tuning, and P-tuning).
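As a quick illustration (a minimal sketch; the printed values reflect current PEFT behavior as I understand it), instantiating the config sets `peft_type` automatically, and the inherited `is_prompt_learning` flag distinguishes prompt-learning adapters from weight-based ones such as LoRA:

```python
from peft import PromptTuningConfig, TaskType

# Only the prompt-learning fields are passed explicitly;
# peft_type is filled in by the dataclass's __post_init__.
cfg = PromptTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=8)

print(cfg.peft_type)           # PeftType.PROMPT_TUNING
print(cfg.is_prompt_learning)  # True, via PromptLearningConfig
```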
Signature
```python
from peft import PromptTuningConfig, PromptTuningInit

@dataclass
class PromptTuningConfig(PromptLearningConfig):
    prompt_tuning_init: Union[PromptTuningInit, str] = PromptTuningInit.RANDOM
    prompt_tuning_init_text: Optional[str] = None
    tokenizer_name_or_path: Optional[str] = None
    tokenizer_kwargs: Optional[dict] = None
```
Parameters
Own Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `prompt_tuning_init` | `Union[PromptTuningInit, str]` | `PromptTuningInit.RANDOM` | The initialization strategy for the prompt embedding. Accepted values: `TEXT` (initialize from a provided text string), `SAMPLE_VOCAB` (initialize by randomly sampling from the model's vocabulary embeddings), `RANDOM` (initialize with random continuous soft tokens; these may fall outside the embedding manifold). |
| `prompt_tuning_init_text` | `Optional[str]` | `None` | The text string used to initialize the prompt embedding. Only used, and required, when `prompt_tuning_init` is `TEXT`. |
| `tokenizer_name_or_path` | `Optional[str]` | `None` | The name or path of the tokenizer used to tokenize the initialization text. Only used, and required, when `prompt_tuning_init` is `TEXT`. |
| `tokenizer_kwargs` | `Optional[dict]` | `None` | Additional keyword arguments passed to `AutoTokenizer.from_pretrained`. Only valid when `prompt_tuning_init` is `TEXT`. |
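The mechanics behind `TEXT` initialization can be sketched as follows (an illustrative approximation of what the prompt-embedding module does internally, not the verbatim PEFT source): the init text is tokenized, the token ids are repeated or truncated to match the number of virtual tokens, and the base model's word embeddings for those ids seed the soft prompt.

```python
import math
import torch

def init_prompt_from_text(word_embeddings, tokenizer, init_text, num_virtual_tokens):
    # Hypothetical helper mirroring the behavior described above.
    token_ids = tokenizer(init_text)["input_ids"]
    # Repeat the ids if the text is too short, then truncate so that
    # exactly num_virtual_tokens ids remain.
    repeats = math.ceil(num_virtual_tokens / len(token_ids))
    token_ids = (token_ids * repeats)[:num_virtual_tokens]
    # Look up the base model's word embeddings to seed the soft prompt.
    with torch.no_grad():
        init = word_embeddings(torch.LongTensor(token_ids)).clone()
    return torch.nn.Parameter(init.float())
```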
Inherited Parameters (from PromptLearningConfig)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `num_virtual_tokens` | `int` | `None` | The number of virtual (soft) tokens to prepend to the input sequence. |
| `token_dim` | `int` | `None` | The hidden embedding dimension of the base transformer model. |
| `num_transformer_submodules` | `Optional[int]` | `None` | The number of transformer submodules in the base model. |
| `num_attention_heads` | `Optional[int]` | `None` | The number of attention heads in the base model. |
| `num_layers` | `Optional[int]` | `None` | The number of layers in the base transformer model. |
| `task_type` | `str` | `None` (from `PeftConfig`) | The task type (e.g., `SEQ_CLS`, `CAUSAL_LM`, `SEQ_2_SEQ_LM`, `TOKEN_CLS`). |
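In practice you rarely set the model-derived fields yourself: when the config is attached to a model, PEFT fills `token_dim`, `num_layers`, `num_attention_heads`, and `num_transformer_submodules` from the base model's configuration. A small sketch (the model name is just an example):

```python
from peft import PromptTuningConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # example base model
config = PromptTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=20)

model = get_peft_model(base_model, config)
# Fields left at None are now populated from the base model config.
print(model.peft_config["default"].token_dim)   # 768 for gpt2
print(model.peft_config["default"].num_layers)  # 12 for gpt2
```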
Validation Rules
The `__post_init__` method enforces the following constraints (see the sketch after this list):

- When `prompt_tuning_init` is `TEXT`, `tokenizer_name_or_path` must be provided.
- When `prompt_tuning_init` is `TEXT`, `prompt_tuning_init_text` must be provided.
- `tokenizer_kwargs` is only valid when `prompt_tuning_init` is `TEXT`.
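A condensed sketch of that validation logic (paraphrased from the rules above, not the verbatim PEFT source):

```python
def __post_init__(self):
    # The dataclass tags itself with its adapter type.
    self.peft_type = PeftType.PROMPT_TUNING
    if self.prompt_tuning_init == PromptTuningInit.TEXT:
        if not self.tokenizer_name_or_path:
            raise ValueError("tokenizer_name_or_path is required for TEXT init")
        if self.prompt_tuning_init_text is None:
            raise ValueError("prompt_tuning_init_text is required for TEXT init")
    elif self.tokenizer_kwargs:
        raise ValueError("tokenizer_kwargs is only valid for TEXT init")
```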
Usage Example
```python
from peft import PromptTuningConfig, PromptTuningInit, get_peft_model, TaskType
from transformers import AutoModelForSequenceClassification

# Random initialization (the default strategy)
config_random = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,
)

# Text-based initialization: the init text is tokenized and its
# word embeddings seed the virtual tokens
config_text = PromptTuningConfig(
    task_type=TaskType.SEQ_CLS,
    num_virtual_tokens=20,
    prompt_tuning_init=PromptTuningInit.TEXT,
    prompt_tuning_init_text="Classify if the sentiment of this review is positive or negative:",
    tokenizer_name_or_path="bert-base-uncased",
)

# Vocabulary sampling initialization
config_vocab = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,
    prompt_tuning_init=PromptTuningInit.SAMPLE_VOCAB,
)

# Attach the text-initialized config to a matching base model
base_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
model = get_peft_model(base_model, config_text)
```
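After wrapping, the soft prompt is the main trainable component (for classification tasks PEFT typically also keeps the classifier head trainable). The standard `PeftModel` helper `print_trainable_parameters` shows the resulting parameter counts:

```python
# Inspect how few parameters actually require gradients: the 20
# virtual-token embeddings (20 x 768 = 15,360 values here), plus the
# sequence-classification head that PEFT keeps trainable for SEQ_CLS.
model.print_trainable_parameters()
```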