Workflow:Puppeteer Puppeteer PDF Generation
| Knowledge Sources | |
|---|---|
| Domains | Browser_Automation, Document_Generation, Content_Export |
| Last Updated | 2026-02-11 23:30 GMT |
Overview
End-to-end process for generating PDF documents from web pages using Puppeteer's headless Chrome rendering engine.
Description
This workflow covers the standard procedure for converting web pages into PDF documents. It relies on Chrome's built-in print-to-PDF capability, which Puppeteer reaches through either the Chrome DevTools Protocol (CDP) or WebDriver BiDi. The process involves launching a headless browser, navigating to the target page with wait conditions that ensure the content has fully loaded, configuring PDF output parameters (page size, margins, headers/footers, background printing), and generating the final PDF file.
Usage
Execute this workflow when you need to generate PDF documents from rendered web content. Common use cases include creating printable reports from web dashboards, archiving web pages as PDF documents, generating invoices or receipts from web templates, and converting HTML content to PDF for distribution.
Execution Steps
Step 1: Launch Browser
Start a headless browser instance using Puppeteer. For PDF generation, headless mode is required since the print-to-PDF functionality depends on the headless rendering pipeline. Optionally specify the WebDriver BiDi protocol for cross-browser PDF generation.
Key considerations:
- PDF generation requires headless mode (the default)
- WebDriver BiDi protocol can be used for Firefox PDF generation
- Specify protocol: 'webDriverBiDi' in launch options for BiDi mode
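The considerations above could translate into launch options along the following lines. This is a minimal sketch, not the definitive configuration: `buildLaunchOptions`, `PdfEngine`, and `PdfLaunchOptions` are hypothetical names introduced for illustration, and the `browser: 'firefox'` option assumes a recent Puppeteer release (older releases selected the browser differently), so check it against the installed version.

```typescript
// Hypothetical helper: builds an options object intended for puppeteer.launch().
// The option names (headless, browser, protocol) are assumed to match the
// launch options of recent Puppeteer releases.
type PdfEngine = 'chrome' | 'firefox';

interface PdfLaunchOptions {
  headless: boolean;
  browser?: PdfEngine;
  protocol?: 'cdp' | 'webDriverBiDi';
}

function buildLaunchOptions(engine: PdfEngine): PdfLaunchOptions {
  if (engine === 'firefox') {
    // Firefox PDF generation goes through WebDriver BiDi.
    return { headless: true, browser: 'firefox', protocol: 'webDriverBiDi' };
  }
  // Headless is already the default; it is spelled out here because
  // this workflow relies on it. No protocol is set, so the default applies.
  return { headless: true };
}
```

With Puppeteer installed, the result would be passed along as `await puppeteer.launch(buildLaunchOptions('firefox'))`.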
Step 2: Create New Page
Open a new page (tab) in the browser. This creates the rendering context where the target content will be loaded and converted to PDF.
Key considerations:
- The page viewport does not affect PDF output dimensions
- PDF dimensions are controlled separately by the pdf() options
Step 3: Navigate To Page
Navigate the page to the target URL and wait for the content to fully load. For PDF generation, it is critical that all content (images, fonts, dynamic data) has rendered before the PDF is generated. The networkidle2 wait condition is recommended: it considers navigation finished once there have been no more than 2 network connections for at least 500 ms.
Key considerations:
- Use waitUntil: 'networkidle2' to ensure dynamic content is loaded
- For single-page applications, networkidle0 may be more appropriate
- Custom wait logic (waitForSelector, waitForFunction) can be added for complex pages
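The wait strategy described above can be sketched as a small helper. `navigateForPdf`, `NavigablePage`, and the `spa`, `readySelector`, and `timeoutMs` parameters are hypothetical names; the interface lists only the two page methods the sketch calls, and a real Puppeteer `Page` is expected to satisfy it.

```typescript
type WaitCondition = 'load' | 'domcontentloaded' | 'networkidle0' | 'networkidle2';

// Minimal structural type covering only the methods used below (an assumption,
// not Puppeteer's full Page type).
interface NavigablePage {
  goto(url: string, options?: { waitUntil?: WaitCondition; timeout?: number }): Promise<unknown>;
  waitForSelector(selector: string, options?: { timeout?: number }): Promise<unknown>;
}

async function navigateForPdf(
  page: NavigablePage,
  url: string,
  opts: { spa?: boolean; readySelector?: string; timeoutMs?: number } = {},
): Promise<void> {
  const timeout = opts.timeoutMs ?? 30000;
  // networkidle2 tolerates up to 2 open connections; networkidle0 waits for none.
  const waitUntil: WaitCondition = opts.spa ? 'networkidle0' : 'networkidle2';
  await page.goto(url, { waitUntil, timeout });
  // Optional extra wait for an element that signals the content is ready.
  if (opts.readySelector !== undefined) {
    await page.waitForSelector(opts.readySelector, { timeout });
  }
}
```

A call such as `await navigateForPdf(page, url, { spa: true, readySelector: '#chart' })` would wait for the network to go fully idle and then for the hypothetical `#chart` element to appear.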
Step 4: Configure And Generate PDF
Call the page's pdf() method with the desired output configuration. Options include paper format (letter, A4, legal, etc.), page margins, header and footer templates, whether to print CSS backgrounds, page ranges, and scale factor. The output can be written to a file path, and the call also resolves with the PDF data as a binary buffer.
Key considerations:
- Standard paper formats: letter, legal, tabloid, ledger, A0-A6
- Custom dimensions can be specified with width and height in CSS units
- Header and footer templates accept HTML and use special CSS classes (date, title, url, pageNumber, totalPages) to inject values; they are rendered only when displayHeaderFooter is true
- printBackground: true is needed to include CSS background colors and images
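- The path option writes the PDF directly to disk; pdf() resolves with the PDF bytes whether or not path is set (a Buffer in older Puppeteer releases, a Uint8Array in recent ones)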
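As an illustration of the options listed above, the following sketch assembles a configuration object meant for `page.pdf()`. `buildReportPdfOptions` and `ReportPdfOptions` are hypothetical names, and the margin and font-size values are arbitrary choices; the option keys are assumed to match Puppeteer's PDF options.

```typescript
// Hypothetical subset of the pdf() options used in this sketch.
interface ReportPdfOptions {
  path?: string;
  format: 'A4' | 'Letter';
  printBackground: boolean;
  displayHeaderFooter: boolean;
  headerTemplate: string;
  footerTemplate: string;
  margin: { top: string; right: string; bottom: string; left: string };
}

function buildReportPdfOptions(outputPath?: string): ReportPdfOptions {
  // An explicit font size is set because template text can otherwise
  // render too small to read.
  const style = 'font-size:9px; width:100%; text-align:center;';
  const options: ReportPdfOptions = {
    format: 'A4',
    printBackground: true, // include CSS background colors and images
    displayHeaderFooter: true, // needed for the templates to be rendered
    headerTemplate: '<div style="' + style + '"><span class="title"></span></div>',
    footerTemplate:
      '<div style="' + style + '">Page <span class="pageNumber"></span>' +
      ' of <span class="totalPages"></span></div>',
    // Top and bottom margins leave room for the header and footer.
    margin: { top: '20mm', right: '15mm', bottom: '20mm', left: '15mm' },
  };
  if (outputPath !== undefined) {
    options.path = outputPath; // also write the PDF to disk
  }
  return options;
}
```

The result would be used as `await page.pdf(buildReportPdfOptions('report.pdf'))`; calling the helper without an argument leaves `path` unset so that only the returned bytes are produced.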
Step 5: Close Browser
Terminate the browser instance to release resources. Ensure the PDF file has been fully written before closing.
Key considerations:
- The pdf() call is asynchronous and must be awaited before closing
- Use try/finally for reliable cleanup
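Putting the five steps together, one possible shape for the whole workflow is sketched below. `renderPdf`, `PdfCapablePage`, and `PdfCapableBrowser` are hypothetical names; the interfaces cover only the calls the sketch makes, and the `launch` callback is assumed to wrap something like `puppeteer.launch()`. Injecting the launcher keeps the sketch independent of any particular Puppeteer version.

```typescript
// Minimal structural types (assumptions, not Puppeteer's full types).
interface PdfCapablePage {
  goto(url: string, options?: { waitUntil?: 'networkidle0' | 'networkidle2' }): Promise<unknown>;
  pdf(options?: {
    path?: string;
    format?: 'A4' | 'Letter';
    printBackground?: boolean;
  }): Promise<Uint8Array>;
}

interface PdfCapableBrowser {
  newPage(): Promise<PdfCapablePage>;
  close(): Promise<void>;
}

async function renderPdf(
  launch: () => Promise<PdfCapableBrowser>,
  url: string,
  outputPath: string,
): Promise<Uint8Array> {
  const browser = await launch(); // Step 1
  try {
    const page = await browser.newPage(); // Step 2
    await page.goto(url, { waitUntil: 'networkidle2' }); // Step 3
    // Step 4: awaiting pdf() means generation has finished before the
    // finally block closes the browser.
    return await page.pdf({ path: outputPath, format: 'A4', printBackground: true });
  } finally {
    // Step 5: runs on success and on failure, so the browser process is released.
    await browser.close();
  }
}
```

With Puppeteer available, a call might look like `await renderPdf(() => puppeteer.launch(), 'https://example.com', 'page.pdf')`. Because `close()` sits in the `finally` block, a navigation or rendering error still propagates to the caller, but only after the browser has been shut down.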