
Implementation:Protectai Llm guard TokenLimit

From Leeroopedia
Knowledge Sources
Domains NLP, Input_Validation, Resource_Management
Last Updated 2026-02-14 12:00 GMT

Overview

A concrete tool from the LLM Guard library for enforcing token limits on input prompts using tiktoken encoding.

Description

The TokenLimit class is an input scanner that uses the tiktoken library to count tokens and enforce a maximum limit. Prompts exceeding the limit are truncated to their first chunk of at most limit tokens. It supports both encoding-based (e.g., cl100k_base) and model-based (e.g., gpt-4) tokenization.
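The count-then-truncate behavior described above can be sketched with a stand-in tokenizer. This is an illustration only: the real scanner uses tiktoken, and the helper names below (toy_encode, toy_decode, token_limit_scan) are assumptions for the sketch, not part of llm_guard.

```python
# Illustrative sketch of TokenLimit's logic with a toy whitespace
# tokenizer in place of tiktoken. Not the library implementation.

def toy_encode(text: str) -> list[str]:
    """Stand-in for tiktoken encoding: one token per whitespace-split word."""
    return text.split()

def toy_decode(tokens: list[str]) -> str:
    """Stand-in for tiktoken decoding."""
    return " ".join(tokens)

def token_limit_scan(prompt: str, limit: int = 4096) -> tuple[str, bool, float]:
    tokens = toy_encode(prompt)
    if len(tokens) <= limit:
        return prompt, True, -1.0  # within limit: prompt unchanged, low risk
    # Over the limit: keep only the first chunk of `limit` tokens.
    first_chunk = tokens[:limit]
    return toy_decode(first_chunk), False, 1.0

prompt, is_valid, score = token_limit_scan("one two three four five", limit=3)
# prompt: "one two three", is_valid: False, score: 1.0
```

The returned triple mirrors the scan() contract documented below: the (possibly truncated) prompt, a validity flag, and a risk score of -1.0 or 1.0.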

Usage

Import this scanner to prevent oversized prompts from reaching the LLM. Place it early in the scanner pipeline to avoid unnecessary processing of prompts that will be rejected.
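The "cheap checks first" ordering can be realized as a sequential pipeline that short-circuits on the first failing scanner. This is a sketch under stated assumptions: the scanner classes here are illustrative stubs following the scan() contract, not llm_guard implementations.

```python
# Sketch of an input-scanner pipeline that runs inexpensive checks first
# and stops at the first failure, so costly scanners never see prompts
# that would be rejected anyway. Scanner classes are illustrative stubs.
from typing import Protocol

class Scanner(Protocol):
    def scan(self, prompt: str) -> tuple[str, bool, float]: ...

class TokenLimitStub:
    """Cheap check: truncate prompts longer than `limit` words."""
    def __init__(self, limit: int = 4096) -> None:
        self.limit = limit

    def scan(self, prompt: str) -> tuple[str, bool, float]:
        words = prompt.split()
        if len(words) <= self.limit:
            return prompt, True, -1.0
        return " ".join(words[: self.limit]), False, 1.0

class ExpensiveScannerStub:
    """Placeholder for a costly model-based scanner run only if earlier checks pass."""
    def scan(self, prompt: str) -> tuple[str, bool, float]:
        return prompt, True, -1.0

def run_pipeline(scanners: list[Scanner], prompt: str) -> tuple[str, bool]:
    for scanner in scanners:
        prompt, is_valid, _score = scanner.scan(prompt)
        if not is_valid:
            return prompt, False  # short-circuit: skip remaining scanners
    return prompt, True

pipeline = [TokenLimitStub(limit=5), ExpensiveScannerStub()]
sanitized, ok = run_pipeline(pipeline, "short prompt")
# ok: True
```

Placing the token-limit check first means the expensive scanner is invoked only for prompts that already fit the budget.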

Code Reference

Source Location

  • Repository: llm-guard
  • File: llm_guard/input_scanners/token_limit.py
  • Lines: L15-80

Signature

class TokenLimit(Scanner):
    def __init__(
        self,
        *,
        limit: int = 4096,
        encoding_name: str = "cl100k_base",
        model_name: str | None = None,
    ) -> None:
        """
        Args:
            limit: Maximum number of tokens allowed. Default: 4096.
            encoding_name: tiktoken encoding name. Default: "cl100k_base".
            model_name: Specific model for tiktoken encoding. Default: None.
        """

    def scan(self, prompt: str) -> tuple[str, bool, float]:
        """
        Check token count and truncate if over limit.

        Returns:
            - Original prompt (if within limit) or truncated first chunk
            - True if within limit, False if truncated
            - -1.0 if within limit, 1.0 if truncated
        """

Import

from llm_guard.input_scanners import TokenLimit

I/O Contract

Inputs

Name           Type  Required  Description
limit          int   No        Maximum token count (default: 4096)
encoding_name  str   No        tiktoken encoding (default: "cl100k_base")
model_name     str   No        Model-specific encoding (default: None)

Outputs

Name        Type   Description
prompt      str    Original or truncated prompt
is_valid    bool   True if within limit, False if truncated
risk_score  float  -1.0 if within limit, 1.0 if truncated

Usage Examples

Basic Token Limit

from llm_guard.input_scanners import TokenLimit

scanner = TokenLimit(limit=4096)

# Short prompt passes
_, is_valid, _ = scanner.scan("Hello, how are you?")
# is_valid: True

# Long prompt gets truncated
long_prompt = "word " * 10000
truncated, is_valid, score = scanner.scan(long_prompt)
# is_valid: False, score: 1.0

Model-Specific Encoding

from llm_guard.input_scanners import TokenLimit

scanner = TokenLimit(limit=2048, model_name="gpt-4")

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
