Implementation: PrefectHQ Prefect Fetch HTML Task
| Metadata | |
|---|---|
| Sources | Prefect |
| Domains | Web_Scraping |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete task for downloading HTML content from URLs using requests with Prefect retries.
Description
The fetch_html task wraps a requests.get call with @task(retries=3, retry_delay_seconds=2) for resilient HTML downloading. It fetches the raw HTML text from a given URL with a 10-second timeout.
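Prefect's `@task(retries=3, retry_delay_seconds=2)` semantics can be illustrated with a plain-Python loop, with no Prefect dependency. This is only a sketch of the retry behavior; `call_with_retries` and `flaky_fetch` are illustrative names, not part of Prefect's API:

```python
import time

def call_with_retries(fn, retries=3, retry_delay_seconds=2, _sleep=time.sleep):
    """Call fn(); on any exception, retry up to `retries` more times,
    sleeping `retry_delay_seconds` between attempts."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # retries exhausted: propagate the last error
            _sleep(retry_delay_seconds)

# A flaky function that fails twice, then succeeds on the third attempt.
attempts = {"n": 0}

def flaky_fetch():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient network error")
    return "<html>ok</html>"

result = call_with_retries(flaky_fetch, retries=3, retry_delay_seconds=0)
```

With `retries=3`, the task body may run up to four times in total (one initial attempt plus three retries) before the failure is surfaced.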
Code Reference
- Repository: https://github.com/PrefectHQ/prefect
- File: examples/simple_web_scraper.py (L43-52)
- Signature:
  @task(retries=3, retry_delay_seconds=2)
  def fetch_html(url: str) -> str:
      """Download page HTML (with retries)."""
      print(f"Fetching {url} …")
      response = requests.get(url, timeout=10)
      response.raise_for_status()
      return response.text
- Import: from prefect import task; import requests
I/O Contract
Inputs
- url (str, required) — URL to download
Outputs
- str — Raw HTML text of the page
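The error path is part of this contract: a non-2xx response never yields a string. `raise_for_status()` raises an exception, the task run fails, and Prefect schedules a retry. A minimal stand-in sketches the two outcomes (`FakeResponse` is hypothetical, not part of requests; the real `requests.Response.raise_for_status` raises `requests.HTTPError`):

```python
class FakeResponse:
    """Minimal stand-in for requests.Response (illustrative only)."""
    def __init__(self, status_code: int, text: str = ""):
        self.status_code = status_code
        self.text = text

    def raise_for_status(self):
        # Mimics requests: raise on 4xx/5xx, return None otherwise.
        if self.status_code >= 400:
            raise RuntimeError(f"HTTP {self.status_code}")

# Success path: the task returns response.text as a str.
ok = FakeResponse(200, "<html>ok</html>")
ok.raise_for_status()

# Failure path: the exception propagates, failing the task run.
err = FakeResponse(404)
try:
    err.raise_for_status()
    raised = False
except RuntimeError:
    raised = True
```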
Usage Example
from prefect import flow, task
import requests

@task(retries=3, retry_delay_seconds=2)
def fetch_html(url: str) -> str:
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text

@flow(log_prints=True)
def scrape(urls: list[str] | None = None) -> None:
    if urls:
        for url in urls:
            html = fetch_html(url)
            # parse_article is defined elsewhere in examples/simple_web_scraper.py
            content = parse_article(html)
            print(content if content else "No article content found.")