Implementation:PrefectHQ Prefect Fetch HTML Task

From Leeroopedia


Metadata
Sources: Prefect
Domains: Web_Scraping
Last Updated: 2026-02-09 00:00 GMT

Overview

A concrete Prefect task that downloads HTML content from a URL using requests, with automatic retries.

Description

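The fetch_html task wraps a requests.get call in @task(retries=3, retry_delay_seconds=2) for resilient HTML downloading. It requests the given URL with a 10-second timeout, calls raise_for_status() so that HTTP 4xx/5xx responses raise an exception, and returns the response body as text. Any exception raised inside the task (connection error, timeout, or HTTP error status) causes Prefect to retry it up to 3 times, waiting 2 seconds between attempts.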
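
To make the retry settings concrete, the following is a plain-Python sketch of the behavior that retries=3 and retry_delay_seconds=2 ask for: one initial attempt plus up to three more, with a fixed pause between them. It is an illustration only and not Prefect's implementation; call_with_retries, its sleep parameter, flaky_fetch, and the example URL are hypothetical names introduced here.

```python
import time
from typing import Any, Callable


def call_with_retries(
    fn: Callable[..., Any],
    *args: Any,
    retries: int = 3,
    retry_delay_seconds: float = 2.0,
    sleep: Callable[[float], Any] = time.sleep,
) -> Any:
    """Call fn(*args); on any exception, retry up to `retries` more times."""
    max_attempts = retries + 1  # the first call is not counted as a retry
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args)
        except Exception:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the last error
            sleep(retry_delay_seconds)  # fixed pause before the next attempt


# Hypothetical stand-in for a fetch that fails twice, then succeeds.
calls = {"count": 0}


def flaky_fetch(url: str) -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError(f"simulated failure for {url}")
    return "<html>ok</html>"


# Record the requested pauses instead of actually sleeping.
pauses = []
html = call_with_retries(flaky_fetch, "https://example.com", sleep=pauses.append)
print(html, calls["count"], pauses)
```

Passing sleep as a parameter is only a convenience for this sketch: it lets the pauses be recorded rather than waited out. With the decorator, Prefect handles the waiting and the bookkeeping itself.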

Code Reference

@task(retries=3, retry_delay_seconds=2)
def fetch_html(url: str) -> str:
    """Download page HTML (with retries)."""
    print(f"Fetching {url} …")
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text
  • Import: from prefect import task; import requests

I/O Contract

Inputs

  • url (str, required) — URL to download

Outputs

  • str — Raw HTML text of the page

Usage Example

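from prefect import flow, task
import requests

@task(retries=3, retry_delay_seconds=2)
def fetch_html(url: str) -> str:
    """Download page HTML (with retries)."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # HTTP 4xx/5xx raises, which triggers a retry
    return response.text

# parse_article is not defined on this page. This stand-in is a hypothetical
# placeholder so the example runs; swap in the real parsing task.
@task
def parse_article(html: str) -> str | None:
    return html.strip() or None

@flow(log_prints=True)
def scrape(urls: list[str] | None = None) -> None:
    # Nothing to do when no URLs are supplied.
    if urls:
        for url in urls:
            html = fetch_html(url)
            content = parse_article(html)
            print(content if content else "No article content found.")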
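
The flow above hands the fetched HTML to a parsing step that this page does not show. As a rough sketch of what such a step might do, the following uses only the standard-library html.parser to collect the text inside article elements. ArticleTextExtractor and extract_article_text are hypothetical names, and looking for an article tag is an assumption about page structure, not something the source specifies.

```python
from html.parser import HTMLParser
from typing import List, Optional


class ArticleTextExtractor(HTMLParser):
    """Collect non-empty text chunks found inside <article> elements."""

    def __init__(self) -> None:
        super().__init__()
        self._depth = 0  # > 0 while the parser is inside an <article>
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._depth > 0 and text:
            self._chunks.append(text)

    def text(self) -> str:
        return "\n".join(self._chunks)


def extract_article_text(html: str) -> Optional[str]:
    """Return the article text, or None when no article content is found."""
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text() or None


sample = (
    "<html><body><nav>Menu</nav>"
    "<article><h1>Title</h1><p>Hello <b>world</b></p></article>"
    "</body></html>"
)
print(extract_article_text(sample))
```

Returning None for pages without article content matches the fallback branch in the flow, which prints "No article content found." in that case. The sketch keeps any text nested in the article, including script or style content, so a real parser would likely need to filter further.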
