Implementation:Protectai Llm guard Output URLReachability

From Leeroopedia
Revision as of 13:44, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Protectai_Llm_guard_Output_URLReachability.md)
Knowledge Sources
Domains URL_Validation, Output_Quality
Last Updated 2026-02-14 12:00 GMT

Overview

URLReachability is an output scanner that validates whether URLs found in LLM responses are reachable via HTTP requests.

Description

The URLReachability output scanner has its own standalone implementation rather than acting as a thin wrapper. It extracts every URL from the LLM output and sends an HTTP request to each one, then checks the response status code against success_status_codes (by default 200, 201, and 202). The timeout parameter sets how many seconds to wait for each URL before treating it as unreachable. The is_reachable method applies the same check to a single URL. If any URL in the output is unreachable, the output is flagged as invalid. Note that the source file name contains a typo (url_reachabitlity.py), although the class itself is correctly named URLReachability.
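The extract-then-check flow described above can be sketched roughly as follows. This is not the library's source: `extract_urls`, `scan_sketch`, the URL regex, and the injected `fetch_status` callable are all hypothetical names chosen for illustration, and the status lookup is injected so the example runs without network access.

```python
import re
from typing import Callable, Optional

# Hypothetical, simplified URL pattern; the library's own extraction may differ.
URL_PATTERN = re.compile(r"https?://[^\s<>)]+")

DEFAULT_SUCCESS_CODES = [200, 201, 202]


def extract_urls(text: str) -> list[str]:
    """Return every http(s) URL in the text, trimming trailing punctuation."""
    return [match.rstrip(".,;:!?") for match in URL_PATTERN.findall(text)]


def scan_sketch(
    output: str,
    fetch_status: Callable[[str], Optional[int]],
    success_status_codes: Optional[list[int]] = None,
) -> tuple[str, bool, list[str]]:
    """Flag the output as invalid if any extracted URL is unreachable.

    fetch_status returns an HTTP status code, or None when the request
    timed out or failed to connect.
    """
    codes = DEFAULT_SUCCESS_CODES if success_status_codes is None else success_status_codes
    unreachable = [url for url in extract_urls(output) if fetch_status(url) not in codes]
    return output, not unreachable, unreachable


# A dictionary lookup stands in for real HTTP requests (assumption for the demo).
fake_statuses = {"https://ok.example": 200, "https://gone.example/page": 404}
text = "See https://ok.example and https://gone.example/page."
_, is_valid, unreachable = scan_sketch(text, fake_statuses.get)
print(is_valid, unreachable)
```

In this sketch a single failing URL is enough to mark the whole output invalid, which mirrors the behaviour the description attributes to the scanner.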

Usage

Use this scanner when your LLM generates responses containing URLs that users might click on. This ensures that recommended links, references, and resources actually exist and are accessible. This is particularly important for documentation bots, research assistants, and customer support tools that provide links to resources, knowledge base articles, or product pages.

Code Reference

Source Location

Signature

class URLReachability(Scanner):
    def __init__(
        self,
        *,
        success_status_codes: list[int] | None = None,
        timeout: int = 5,
    ) -> None: ...

    def scan(self, prompt: str, output: str) -> tuple[str, bool, float]: ...

    def is_reachable(self, url: str) -> bool: ...

Import

from llm_guard.output_scanners import URLReachability

I/O Contract

Inputs

Name | Type | Required | Description
--- | --- | --- | ---
prompt | str | Yes | The input prompt
output | str | Yes | The LLM output to scan for unreachable URLs

Constructor Parameters

Name | Type | Required | Default | Description
--- | --- | --- | --- | ---
success_status_codes | list[int] or None | No | None | HTTP status codes considered successful (None falls back to [200, 201, 202])
timeout | int | No | 5 | Timeout in seconds for each HTTP request
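To illustrate how the two constructor parameters interact, here is a minimal standard-library sketch of a single reachability check. It is an assumption-laden stand-in, not the scanner's code: `is_reachable_sketch` and its `opener` parameter are hypothetical, and `urllib` is used only to keep the example dependency-free.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def is_reachable_sketch(
    url: str,
    success_status_codes: Optional[list[int]] = None,
    timeout: int = 5,
    opener: Callable = urllib.request.urlopen,  # injectable for offline testing
) -> bool:
    """Return True if the URL answers with one of the accepted status codes."""
    # None falls back to the documented defaults.
    codes = [200, 201, 202] if success_status_codes is None else success_status_codes
    try:
        with opener(url, timeout=timeout) as response:
            return response.status in codes
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status code is still available.
        return err.code in codes
    except (OSError, ValueError):
        # Timeouts, DNS and connection failures, or a malformed URL.
        return False


# A string without a scheme is rejected before any network traffic happens.
print(is_reachable_sketch("not-a-url"))
```

Note that `urllib.request.urlopen` follows redirects on its own, so in this sketch the status that gets compared is the one from the final response.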

Outputs

Name | Type | Description
--- | --- | ---
sanitized_output | str | The output (unmodified)
is_valid | bool | Whether all URLs in the output are reachable
risk_score | float | Risk score (-1.0 to 1.0)
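One way to consume this three-part result is to release the text only when is_valid is true. The helper below is a hypothetical sketch (`release_or_fallback` is not part of the library), and the tuples are hand-written stand-ins for `scan` results with illustrative risk scores inside the documented range.

```python
from typing import Tuple

# (sanitized_output, is_valid, risk_score), matching the outputs table.
ScanResult = Tuple[str, bool, float]


def release_or_fallback(result: ScanResult, fallback: str) -> str:
    """Return the scanned output when it is valid, otherwise a fallback message."""
    sanitized_output, is_valid, _risk_score = result
    return sanitized_output if is_valid else fallback


# Hand-written tuples standing in for scanner.scan(prompt, output) results.
valid_result = ("See https://www.example.com", True, -1.0)
invalid_result = ("See https://broken.invalid/page", False, 1.0)

fallback_message = "Sorry, I could not verify the links in this answer."
print(release_or_fallback(valid_result, fallback_message))
print(release_or_fallback(invalid_result, fallback_message))
```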

Usage Examples

Basic Usage

from llm_guard.output_scanners import URLReachability

scanner = URLReachability(timeout=5)

prompt = "Give me some useful links"
output = "Check out https://www.example.com and https://www.python.org for more information."

sanitized_output, is_valid, risk_score = scanner.scan(prompt, output)

if is_valid:
    print("All URLs are reachable")
else:
    print(f"Some URLs are unreachable (risk: {risk_score})")

Custom Status Codes

from llm_guard.output_scanners import URLReachability

# Also count redirect status codes as success
# (only matters if the HTTP client does not follow redirects automatically)
scanner = URLReachability(
    success_status_codes=[200, 201, 202, 301, 302],
    timeout=10,
)

prompt = "Where can I find the documentation?"
output = "The documentation is at https://docs.example.com/latest"

sanitized_output, is_valid, risk_score = scanner.scan(prompt, output)
print(f"URLs reachable: {is_valid}")

Individual URL Check

from llm_guard.output_scanners import URLReachability

scanner = URLReachability()

# Check a single URL
url = "https://www.example.com"
reachable = scanner.is_reachable(url)
print(f"{url} is {'reachable' if reachable else 'unreachable'}")
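Because each check can block for up to the configured timeout, checking many URLs one after another may be slow. The sketch below runs a per-URL check in a thread pool. `check_many` is a hypothetical helper, and it assumes the check callable (for example `scanner.is_reachable`) is safe to call from several threads; a fake check is used here so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def check_many(
    urls: Iterable[str],
    check: Callable[[str], bool],
    max_workers: int = 8,
) -> dict[str, bool]:
    """Run check(url) for each distinct URL in parallel and map URL -> result."""
    unique_urls = list(dict.fromkeys(urls))  # drop duplicates, keep first-seen order
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(check, unique_urls))
    return dict(zip(unique_urls, results))


# Fake check for the demo: only the first URL counts as reachable.
def fake_check(url: str) -> bool:
    return url == "https://www.example.com"


report = check_many(
    ["https://www.example.com", "https://broken.invalid", "https://www.example.com"],
    fake_check,
)
for url, reachable in report.items():
    print(f"{url} is {'reachable' if reachable else 'unreachable'}")
```

With a real scanner, `fake_check` would be replaced by `scanner.is_reachable`.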
