Implementation: Turboderp_org_Exllamav2 ExLlamaV2SelectFilter
| Knowledge Sources | |
|---|---|
| Domains | Filtering, Constrained_Generation |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Token filter that constrains generation to produce exactly one of a predefined set of string options, with optional case-insensitive matching, using trie-based token traversal.
Description
ExLlamaV2SelectFilter is a subclass of ExLlamaV2Filter that forces the model to output one of the specified options verbatim. At each generation step, it computes which tokens could still lead to a valid option and which tokens would complete an option, enabling multiple-choice style constrained generation.
Key components:
- __init__(model, tokenizer, options, case_insensitive=False) -- Accepts a list of allowed output strings and an optional case_insensitive flag. If case-insensitive, all options are lowercased at init time.
- clone(c=None) -- Creates a copy preserving options, offset, prefix, case_insensitive, and sequence_str_cmp state.
- begin(prefix_str) -- Resets sequence_str, sequence_str_cmp, offset to 0, and stores the prefix (which is prepended to each option during matching).
- feed(token) -- Decodes the token to its string piece, appends it to sequence_str, and updates sequence_str_cmp (the comparison string). In case-insensitive mode, only the portion beyond the prefix is lowercased, preserving exact prefix matching.
- next() -- For each option, concatenates the prefix with the option string and checks whether the already-generated text (sequence_str_cmp) matches the option prefix up to the current offset. For matching options, traverses the tokenizer's character trie (case-insensitive trie if applicable) to find:
- pass_tokens -- Token IDs that advance along a valid option path.
- end_tokens -- Token IDs that would exactly complete a remaining option.
Returns (pass_tokens, end_tokens).
The case-insensitive mode has a special path: when the prefix is cased but the continuation should be case-insensitive, the filter individually verifies each leaf token against the cased option prefix to avoid false matches.
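The prefix-preserving lowercasing behavior described above can be sketched in a few lines. This is an illustrative reimplementation of the idea, not the library's actual internals; the function name `update_cmp` is hypothetical.

```python
# Hypothetical sketch of how the comparison string could be maintained in
# case-insensitive mode: the prefix is compared exactly, while any generated
# text beyond it is lowercased before comparison against the options.
def update_cmp(sequence_str: str, piece: str, prefix: str,
               case_insensitive: bool) -> str:
    """Append a decoded token piece and rebuild the comparison string."""
    sequence_str += piece
    if not case_insensitive:
        return sequence_str
    # Lowercase only the portion beyond the (exactly matched) prefix
    head = sequence_str[:len(prefix)]
    tail = sequence_str[len(prefix):].lower()
    return head + tail

# The prefix "Answer: " keeps its casing; the generated tail is folded
print(update_cmp("Answer: Y", "ES", "Answer: ", True))   # Answer: yes
print(update_cmp("Answer: Y", "ES", "Answer: ", False))  # Answer: YES
```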
Usage
Use ExLlamaV2SelectFilter when the model must output exactly one of a known set of choices, such as sentiment labels ("positive", "negative", "neutral"), yes/no answers, or classification categories. It guarantees the output will be one of the specified strings.
Code Reference
Source Location
- Repository: Turboderp_org_Exllamav2
- File: exllamav2/generator/filters/select.py
- Lines: L1-130 (131 with trailing newline)
Signature
    class ExLlamaV2SelectFilter(ExLlamaV2Filter):

        options: list[str]
        offset: int
        prefix: str
        case_insensitive: bool
        sequence_str_cmp: str

        def __init__(self,
                     model: ExLlamaV2,
                     tokenizer: ExLlamaV2Tokenizer,
                     options: list[str],
                     case_insensitive: bool = False):
            ...

        def clone(self, c=None) -> ExLlamaV2SelectFilter:
            ...

        def begin(self, prefix_str: str = "") -> None:
            ...

        def feed(self, token: int) -> None:
            ...

        def next(self) -> tuple[set[int], set[int]]:
            ...
Import
from exllamav2.generator.filters import ExLlamaV2SelectFilter
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | ExLlamaV2 | Yes | The loaded ExLlamaV2 model instance |
| tokenizer | ExLlamaV2Tokenizer | Yes | The tokenizer associated with the model |
| options | list[str] | Yes | List of allowed output strings the model may generate |
| case_insensitive | bool | No (default False) | If True, match options regardless of letter case |
| prefix_str | str | No (begin, default "") | Prefix string prepended to each option during matching |
| token | int | Yes (feed) | Token ID selected by the sampler |
Outputs
| Name | Type | Description |
|---|---|---|
| pass_tokens | set[int] | From next(): set of token IDs that advance toward one or more valid options |
| end_tokens | set[int] | From next(): set of token IDs that would exactly complete a valid option (triggers EOS) |
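The pass_tokens/end_tokens contract can be illustrated with a toy vocabulary. This sketch checks each token piece directly against the options rather than walking a character trie, so it mirrors the contract but not the library's trie-based implementation; the vocabulary and function name are made up for the example.

```python
# Toy illustration of the (pass_tokens, end_tokens) contract: a token passes
# if appending its string piece keeps the generated text on a valid path
# toward some option, and it is an end token if it exactly completes one.
def next_tokens(options: list[str], generated: str,
                vocab: dict[int, str]) -> tuple[set[int], set[int]]:
    pass_tokens, end_tokens = set(), set()
    for tok_id, piece in vocab.items():
        candidate = generated + piece
        for opt in options:
            if opt.startswith(candidate):
                pass_tokens.add(tok_id)     # still on a valid option path
                if candidate == opt:
                    end_tokens.add(tok_id)  # exactly completes an option
    return pass_tokens, end_tokens

vocab = {0: "pos", 1: "neg", 2: "itive", 3: "ative", 4: "neutral"}
options = ["positive", "negative", "neutral"]
print(next_tokens(options, "", vocab))     # ({0, 1, 4}, {4})
print(next_tokens(options, "pos", vocab))  # ({2}, {2})
```

Note that a single token (here `"neutral"`, id 4) can appear in both sets when it simultaneously advances and completes an option.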
Usage Examples
Sentiment Classification
    from exllamav2.generator.filters import ExLlamaV2SelectFilter
    from exllamav2.generator import ExLlamaV2DynamicJob

    # Constrain output to one of three sentiment labels
    select_filter = ExLlamaV2SelectFilter(
        model, tokenizer,
        options=["positive", "negative", "neutral"]
    )

    job = ExLlamaV2DynamicJob(
        input_ids=input_ids,
        gen_settings=gen_settings,
        max_new_tokens=10,
        filters=[select_filter],
    )
    generator.enqueue(job)
Case-Insensitive Yes/No
    from exllamav2.generator.filters import ExLlamaV2SelectFilter

    # Accept "yes", "Yes", "YES", etc.
    yn_filter = ExLlamaV2SelectFilter(
        model, tokenizer,
        options=["yes", "no"],
        case_insensitive=True
    )