Running Puppeteer or Playwright for PDF generation in production comes with operational baggage: Chromium version management, memory consumption, concurrency handling, and more — none of which are related to actually generating PDFs.
This guide covers how to migrate from Puppeteer or Playwright to a managed API, with concrete before/after code in Node.js, Python, and PHP.
The Puppeteer / Playwright Operations Tax
Common problems when running Puppeteer or Playwright in production:
Memory Consumption
Chromium uses 200–500MB per process. Generating multiple PDFs concurrently can spike memory usage and cause OOM (Out of Memory) crashes.
Chromium Management
Puppeteer and Playwright version upgrades change the bundled Chromium version. CI/CD pipelines break when Chromium downloads fail, or when OS compatibility issues arise.
Concurrency Implementation
Calling puppeteer.launch() or playwright.chromium.launch() per request is slow. Building a browser pool is complex — you need to handle tab leaks, zombie processes, and graceful shutdowns.
Cold Starts
In serverless environments (AWS Lambda, etc.), Chromium startup takes seconds, making it hard to meet latency requirements. A managed API takes Chromium startup off your function's request path, so the function only makes an HTTP call. For serverless-specific setup, see the serverless PDF API guide.
For technical background on these issues, see wkhtmltopdf vs Chromium.
Before / After: Code Comparison
Before: Puppeteer (Self-managed)
const puppeteer = require('puppeteer');
// Browser pool management required
let browser;
async function getBrowser() {
if (!browser || !browser.isConnected()) {
browser = await puppeteer.launch({
headless: true, // Older Puppeteer releases used 'new' to opt in to the new headless mode
args: [
'--no-sandbox',
'--disable-setuid-sandbox',
'--disable-dev-shm-usage', // Memory workaround
'--disable-gpu',
],
});
}
return browser;
}
async function generatePdf(html) {
const browser = await getBrowser();
const page = await browser.newPage();
try {
await page.setContent(html, { waitUntil: 'networkidle0' });
const pdf = await page.pdf({
format: 'A4',
margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' },
printBackground: true,
});
return pdf;
} finally {
await page.close(); // Prevent tab leaks
}
}
// You also need:
// - Browser pool management
// - Timeout handling
// - Zombie process detection and restart
// - Memory monitoring
// - Chromium version management
After: Managed API
async function generatePdf(html) {
const response = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/generate', {
method: 'POST',
headers: {
'X-API-Key': process.env.FUNBREW_PDF_API_KEY,
'Content-Type': 'application/json',
},
body: JSON.stringify({
html,
options: {
format: 'A4',
margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' },
print_background: true,
},
}),
});
if (!response.ok) throw new Error(`API error: ${response.status}`);
return Buffer.from(await response.arrayBuffer());
}
// All you need: an API key
// Browser management, memory, concurrency — all handled by the API
The code is dramatically simpler: all Chromium-related operational code is gone.
Migrating from Playwright
If you're using Playwright instead of Puppeteer, the migration path is identical. Playwright offers more features than Puppeteer, but for PDF generation the operational burden is just as high.
Before: Playwright (Node.js)
const { chromium } = require('playwright');
// Playwright also requires browser lifecycle management
let browser;
async function getBrowser() {
if (!browser) {
browser = await chromium.launch({
args: ['--no-sandbox', '--disable-setuid-sandbox'],
});
}
return browser;
}
async function generatePdfWithPlaywright(html) {
const browser = await getBrowser();
const context = await browser.newContext();
const page = await context.newPage();
try {
await page.setContent(html, { waitUntil: 'networkidle' });
const pdf = await page.pdf({
format: 'A4',
margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' },
printBackground: true,
});
return pdf;
} finally {
await context.close();
}
}
// Installation alone pulls hundreds of MB:
// npx playwright install chromium
Before: Playwright (Python)
import asyncio
from playwright.async_api import async_playwright

async def generate_pdf_playwright(html: str) -> bytes:
    async with async_playwright() as p:
        # Chromium launch takes 1–3 seconds
        browser = await p.chromium.launch()
        try:
            context = await browser.new_context()
            page = await context.new_page()
            await page.set_content(html, wait_until="networkidle")
            return await page.pdf(
                format="A4",
                margin={"top": "20mm", "bottom": "20mm",
                        "left": "15mm", "right": "15mm"},
                print_background=True,
            )
        finally:
            # Always release the Chromium process, even if rendering fails
            await browser.close()

# pip install playwright
# playwright install chromium  # Downloads hundreds of MB
After: Managed API (Same code whether you're migrating from Puppeteer or Playwright)
The API call is identical regardless of which browser automation library you're coming from.
# After: FUNBREW PDF API (works for both Puppeteer and Playwright migrations)
import os
import requests

def generate_pdf(html: str) -> bytes:
    resp = requests.post(
        "https://pdf.funbrew.cloud/api/v1/pdf/generate",
        headers={
            "X-API-Key": os.environ["FUNBREW_PDF_API_KEY"],
            "Content-Type": "application/json",
        },
        json={
            "html": html,
            "options": {
                "format": "A4",
                "margin": {"top": "20mm", "bottom": "20mm",
                           "left": "15mm", "right": "15mm"},
                "print_background": True,
            },
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.content
The waitUntil: 'networkidle' behaviour Playwright uses is handled automatically on the API server side — no equivalent parameter needed.
Cost Breakdown: Self-Hosted vs Managed API
"Won't using an API increase costs?" is a fair question. Let's work through the numbers.
Self-Hosted Costs
Running Puppeteer or Playwright in production involves multiple cost buckets.
Server costs
Chromium consumes 200–500MB per process, so memory is driven by concurrency rather than monthly volume: even at 1,000 PDF jobs per month, running a few jobs at once calls for at least 2GB RAM.
Example: AWS t3.medium (2 vCPU / 4GB RAM)
- On-demand: ~$33/month
- Additional instances at peak: variable
- Data transfer: variable
Engineering and maintenance costs
Initial implementation (browser pool + timeouts + retries): 8–16 hours
Puppeteer/Playwright/Chromium version bumps: 2–4 hours per update (3–4× per year)
Incident response (OOM, zombie processes): 2–8 hours per incident (0–2× per month)
Estimated annual maintenance in a typical year: 20–50 engineer-hours (a year with frequent incidents can run well above this)
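To see how the 20–50 hour figure relates to the per-item estimates, the ranges above can be multiplied out. This is a minimal sketch: the constant and function names are illustrative, and the extremes assume every item lands on the same bound at once, which is unlikely in practice.

```python
# Figures copied from the list above; each is a (low, high) range.
VERSION_BUMP_HOURS = (2, 4)       # hours per update
VERSION_BUMPS_PER_YEAR = (3, 4)   # updates per year
INCIDENT_HOURS = (2, 8)           # hours per incident
INCIDENTS_PER_MONTH = (0, 2)      # incidents per month


def annual_range(hours, frequency, periods_per_year=1):
    """Multiply two (low, high) ranges and scale the result to one year."""
    low = hours[0] * frequency[0] * periods_per_year
    high = hours[1] * frequency[1] * periods_per_year
    return low, high


bumps = annual_range(VERSION_BUMP_HOURS, VERSION_BUMPS_PER_YEAR)
incidents = annual_range(INCIDENT_HOURS, INCIDENTS_PER_MONTH, periods_per_year=12)
total = (bumps[0] + incidents[0], bumps[1] + incidents[1])

print(f"Version bumps: {bumps[0]}-{bumps[1]} h/year")          # 6-16
print(f"Incidents:     {incidents[0]}-{incidents[1]} h/year")  # 0-192
print(f"Total:         {total[0]}-{total[1]} h/year")          # 6-208
```

The 20–50 hour estimate sits inside the 6–208 hour span; incident frequency is the term that moves the total most.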
CI/CD costs
Chromium downloads slow down your build pipeline. Docker images balloon to 1GB+, increasing push/pull times and registry storage.
Managed API Costs
FUNBREW PDF free tier: 30 PDFs/month at no cost
Paid plans: usage-based pricing
Maintenance cost: near zero (only handle API client changes)
Where the Break-Even Is
Low volume (< 100 PDFs/month):
Self-hosted: $10–30/month server + engineer time
API: free or very low cost
→ API wins clearly
Medium volume (100–10,000 PDFs/month):
Self-hosted: $30–100/month server + maintenance hours
API: usage-based pricing
→ When maintenance hours are factored in, API is usually cheaper
High volume (10,000+ PDFs/month):
Self-hosted: dedicated infrastructure, dedicated ops time
API: scales on the API side
→ Depends on your specifics, but a self-hosted setup has to compete with a service whose only job is PDF rendering
The often-missed cost: engineer opportunity cost. Every hour spent on Chromium management is an hour not spent on your product.
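The comparison above can be turned into a rough calculator. This is a sketch under stated assumptions: the per-PDF price and the engineer hourly rate are hypothetical inputs you supply (this guide only says paid plans are usage-based), and only the 30 PDFs/month free tier comes from the text.

```python
FREE_PDFS_PER_MONTH = 30  # free tier stated in this guide


def self_hosted_monthly(server_usd, maintenance_hours_per_year, hourly_rate_usd):
    """Server bill plus maintenance time spread evenly over 12 months."""
    return server_usd + maintenance_hours_per_year * hourly_rate_usd / 12


def api_monthly(pdfs_per_month, price_per_pdf_usd):
    """Usage-based bill: only PDFs beyond the free tier are charged."""
    billable = max(0, pdfs_per_month - FREE_PDFS_PER_MONTH)
    return billable * price_per_pdf_usd


def break_even_volume(server_usd, maintenance_hours_per_year,
                      hourly_rate_usd, price_per_pdf_usd):
    """Monthly volume at which the API bill equals the self-hosted cost."""
    fixed = self_hosted_monthly(server_usd, maintenance_hours_per_year,
                                hourly_rate_usd)
    return FREE_PDFS_PER_MONTH + fixed / price_per_pdf_usd


# Hypothetical inputs, for illustration only: $33 server, 24 maintenance
# hours per year at $50/hour, and a made-up price of $0.50 per PDF.
print(self_hosted_monthly(33, 24, 50))
print(break_even_volume(33, 24, 50, 0.5))
```

Plug in your real plan pricing and loaded engineering rate; the model treats self-hosted cost as fixed, so it understates that side once volume forces extra instances.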
Python Migration Examples (Expanded)
Python: httpx (Async)
In addition to the synchronous requests approach, here's an async implementation using httpx, suitable for FastAPI, Django async views, or any async Python application.
# httpx async implementation (FastAPI, Django async views, etc.)
import os
import httpx

API_URL = "https://pdf.funbrew.cloud/api/v1/pdf/generate"
API_KEY = os.environ["FUNBREW_PDF_API_KEY"]

async def generate_pdf_async(html: str) -> bytes:
    async with httpx.AsyncClient(timeout=30.0) as client:
        response = await client.post(
            API_URL,
            headers={
                "X-API-Key": API_KEY,
                "Content-Type": "application/json",
            },
            json={
                "html": html,
                "options": {
                    "format": "A4",
                    "margin": {
                        "top": "20mm", "bottom": "20mm",
                        "left": "15mm", "right": "15mm",
                    },
                    "print_background": True,
                },
            },
        )
        response.raise_for_status()
        return response.content

# FastAPI usage example
from fastapi import Body, FastAPI
from fastapi.responses import Response

app = FastAPI()

@app.post("/invoice/pdf")
async def create_invoice_pdf(html: str = Body(..., embed=True)):
    # Body(..., embed=True) reads {"html": "..."} from the JSON request body;
    # a bare `html: str` parameter would be parsed as a query parameter.
    pdf_bytes = await generate_pdf_async(html)
    return Response(
        content=pdf_bytes,
        media_type="application/pdf",
        headers={"Content-Disposition": "attachment; filename=invoice.pdf"},
    )
# Django view (synchronous)
import os
import requests
from django.http import HttpResponse

def generate_pdf_view(request):
    html = render_invoice_html(request)  # Render your template to HTML
    resp = requests.post(
        "https://pdf.funbrew.cloud/api/v1/pdf/generate",
        headers={
            "X-API-Key": os.environ["FUNBREW_PDF_API_KEY"],
            "Content-Type": "application/json",
        },
        json={
            "html": html,
            "options": {"format": "A4", "print_background": True},
        },
        timeout=30,
    )
    resp.raise_for_status()
    return HttpResponse(
        resp.content,
        content_type="application/pdf",
        headers={"Content-Disposition": 'attachment; filename="invoice.pdf"'},
    )
PHP Migration Examples (Expanded)
PHP: Guzzle
Beyond the curl example, here's a Guzzle-based implementation that fits cleanly into Laravel, Symfony, and other PHP frameworks.
<?php
// Guzzle implementation (recommended for Laravel / Symfony)
namespace App\Services;

use GuzzleHttp\Client;
use GuzzleHttp\Exception\GuzzleException;
use GuzzleHttp\Exception\RequestException;

class PdfGenerator
{
    private Client $client;
    private string $apiKey;

    public function __construct()
    {
        $this->client = new Client([
            'base_uri' => 'https://pdf.funbrew.cloud',
            'timeout' => 30.0,
        ]);
        // In Laravel, env() returns null outside config files once the config is
        // cached (php artisan config:cache); read the key via config() in that case.
        $this->apiKey = env('FUNBREW_PDF_API_KEY');
    }

    public function generate(string $html, array $options = []): string
    {
        try {
            $response = $this->client->post('/api/v1/pdf/generate', [
                'headers' => [
                    'X-API-Key' => $this->apiKey,
                    'Content-Type' => 'application/json',
                ],
                'json' => [
                    'html' => $html,
                    'options' => array_merge([
                        'format' => 'A4',
                        'margin' => [
                            'top' => '20mm',
                            'bottom' => '20mm',
                            'left' => '15mm',
                            'right' => '15mm',
                        ],
                        'print_background' => true,
                    ], $options),
                ],
            ]);
            return (string) $response->getBody();
        } catch (RequestException $e) {
            // HTTP-level failure (4xx / 5xx responses)
            $statusCode = $e->getResponse()?->getStatusCode() ?? 0;
            throw new \RuntimeException("PDF generation failed: HTTP {$statusCode}", $statusCode, $e);
        } catch (GuzzleException $e) {
            // Transport-level failure such as a connection error or timeout
            throw new \RuntimeException('PDF generation failed: ' . $e->getMessage(), 0, $e);
        }
    }
}
<?php
// Laravel controller usage example
use App\Services\PdfGenerator;
use Illuminate\Http\Response;
class InvoiceController extends Controller
{
public function __construct(private PdfGenerator $pdfGenerator) {}
public function download(Invoice $invoice): Response
{
$html = view('invoices.pdf', compact('invoice'))->render();
$pdf = $this->pdfGenerator->generate($html);
return response($pdf, 200, [
'Content-Type' => 'application/pdf',
'Content-Disposition' => "attachment; filename=\"invoice-{$invoice->id}.pdf\"",
]);
}
}
PHP: Before / After (spatie/browsershot)
The most common PHP starting point is spatie/browsershot, which shells out to Node.js and Puppeteer under the hood.
<?php
// Before: spatie/browsershot (delegates to Node.js + Puppeteer internally)
use Spatie\Browsershot\Browsershot;
function generatePdf(string $html): string
{
return Browsershot::html($html)
->format('A4')
->margins(20, 15, 20, 15)
->showBackground()
->pdf();
// Requires Node.js + Puppeteer installed on the server.
// Spawns a Chromium process on every call.
}
<?php
// After: FUNBREW PDF API (PHP curl; no Node.js or Puppeteer needed)
function generatePdf(string $html): string
{
    $apiKey = getenv('FUNBREW_PDF_API_KEY');
    $ch = curl_init('https://pdf.funbrew.cloud/api/v1/pdf/generate');
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST => true,
        CURLOPT_TIMEOUT => 30,
        CURLOPT_HTTPHEADER => [
            'X-API-Key: ' . $apiKey,
            'Content-Type: application/json',
        ],
        CURLOPT_POSTFIELDS => json_encode([
            'html' => $html,
            'options' => [
                'format' => 'A4',
                'margin' => [
                    'top' => '20mm', 'bottom' => '20mm',
                    'left' => '15mm', 'right' => '15mm',
                ],
                'print_background' => true, // Matches showBackground() in the Browsershot version
            ],
        ]),
    ]);
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $error = curl_error($ch);
    curl_close($ch);
    if ($body === false) {
        // Transport failure (DNS, connection, timeout): no HTTP status is available
        throw new \RuntimeException("PDF API request failed: $error");
    }
    if ($status !== 200) {
        throw new \RuntimeException("PDF API error: HTTP $status");
    }
    return $body; // PDF binary
}
// Dependency: the curl extension only (bundled with most PHP installations). Node.js and Puppeteer removed.
Performance Comparison
Here are representative numbers comparing self-hosted Puppeteer/Playwright against a managed API. Actual results vary by environment.
Response Times
| Scenario | Puppeteer (Cold start) | Puppeteer (Warm) | Managed API |
|---|---|---|---|
| Simple HTML (~10KB) | 3,000–5,000ms | 500–800ms | 300–600ms |
| HTML with images (~100KB) | 4,000–8,000ms | 800–1,500ms | 500–1,200ms |
| Complex CSS + images | 6,000–15,000ms | 1,500–3,000ms | 800–2,000ms |
| Serverless (Lambda) | 8,000–15,000ms | — | 300–600ms |
"Cold start" includes Chromium launch time. Even a warm browser pool is typically slower than API responses, because the API infrastructure stays hot and scales horizontally.
Memory Usage
Puppeteer (browser pool, 3 processes):
Idle: 600MB–1.5GB
During PDF gen: 1GB–2GB+
Peak (10 concurrent): 3GB+
Managed API:
Idle: <10MB (HTTP client only)
During PDF gen: <50MB (handling HTTP response)
Peak (100 concurrent): similar — no scaling needed on your server
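The self-hosted figures above follow from the 200–500MB per-process range quoted earlier. A minimal sketch of that arithmetic is below; the function names and the 512MB reserved for the OS and application are assumptions for illustration, not measurements.

```python
CHROMIUM_MB_PER_PROCESS = (200, 500)  # (low, high) range quoted in this guide


def pool_memory_mb(processes: int) -> tuple[int, int]:
    """Estimated (low, high) memory for a pool of Chromium processes."""
    low, high = CHROMIUM_MB_PER_PROCESS
    return processes * low, processes * high


def max_processes(ram_mb: int, reserved_mb: int) -> tuple[int, int]:
    """(worst case, best case) process count that fits after reserving headroom."""
    low, high = CHROMIUM_MB_PER_PROCESS
    usable = max(0, ram_mb - reserved_mb)
    return usable // high, usable // low


print(pool_memory_mb(3))         # (600, 1500)
print(max_processes(2048, 512))  # (3, 7)
```

A three-process pool lands on the 600MB–1.5GB idle range listed above, and a 2GB server with some headroom fits only a handful of processes in the worst case.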
Docker Image Size
# Before: Image with Puppeteer
FROM node:20
RUN apt-get update && apt-get install -y chromium # +300MB
RUN npm install puppeteer # +200MB
# Result: 1GB+
# After: API client only
FROM node:20-alpine
# No chromium needed
# Result: ~150MB
Smaller images mean faster CI builds, faster deploys, and lower container registry storage costs.
Concurrency
Puppeteer (browser pool on a 2GB server):
Max concurrent: 3–5 jobs
Beyond that: queue or OOM
Managed API:
Max concurrent: effectively unlimited (plan-dependent rate limits)
Scaling: handled by the API side
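Because rate limits depend on the plan, it can help to cap how many requests your own service has in flight at once. The sketch below shows one way to do that with an asyncio semaphore; `run_bounded` and `_demo_worker` are hypothetical names, and the worker stands in for a real API call such as the `generate_pdf_async` function shown earlier.

```python
import asyncio


async def run_bounded(jobs, worker, limit):
    """Run worker(job) for every job with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        async with semaphore:  # Waits here once `limit` calls are active
            return await worker(job)

    # gather() returns results in the same order as the input jobs
    return await asyncio.gather(*(guarded(job) for job in jobs))


async def _demo_worker(html: str) -> bytes:
    """Stand-in for an API call; sleeps briefly instead of using the network."""
    await asyncio.sleep(0.01)
    return html.encode()


results = asyncio.run(run_bounded(["<p>a</p>", "<p>b</p>"], _demo_worker, limit=1))
print(len(results))  # 2
```

Pick the limit from your plan's documented rate limit rather than from server memory, since rendering no longer happens on your machine.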
Migration Steps
Step 1: Verify API Output
Paste your HTML into the Playground and compare the output with Puppeteer's. Use the free plan (30/month) to test actual API responses.
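Before comparing layouts by eye, a cheap automated check can confirm that both outputs are actually PDFs. The sketch below is an assumption about how you might gate the comparison: it only inspects the PDF header and end-of-file marker and reports the size ratio, so it does not replace a visual diff. The function names are illustrative.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Cheap structural check: PDF header at the start, EOF marker near the end."""
    return data.startswith(b"%PDF-") and b"%%EOF" in data[-1024:]


def compare_outputs(old_pdf: bytes, new_pdf: bytes) -> dict:
    """Summarise two renders of the same HTML (old = Puppeteer, new = API)."""
    return {
        "old_valid": looks_like_pdf(old_pdf),
        "new_valid": looks_like_pdf(new_pdf),
        # A large size swing often points at missing images or fonts
        "size_ratio": len(new_pdf) / len(old_pdf) if old_pdf else None,
    }
```

This catches the case where an error page or JSON error body is saved to disk as if it were the PDF.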
Step 2: Replace the PDF Generation Function
Swap the Puppeteer or Playwright call with an API call. For language-specific examples, see the quickstart guide.
Step 3: Migrate Templates
If you're building HTML with string concatenation or template literals, consider migrating to the template engine for better maintainability.
// Before: String concatenation
const html = `<h1>Invoice</h1><p>${customerName}</p>...`;
const pdf = await generatePdfWithPuppeteer(html);
// After: Template + API
const response = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/from-template', {
method: 'POST',
headers: {
'X-API-Key': process.env.FUNBREW_PDF_API_KEY,
'Content-Type': 'application/json',
},
body: JSON.stringify({
template: 'invoice',
variables: { customer_name: customerName, total: '1,650.00' },
}),
});
Step 4: Remove Puppeteer/Playwright Dependencies
Once the migration is complete, clean up.
# For Puppeteer
npm uninstall puppeteer
# For Playwright
npm uninstall playwright @playwright/test
# Remove chromium install lines from Dockerfile
# Remove Chromium cache config from CI/CD
Migration Checklist
Use this checklist to keep your migration on track.
Pre-migration and staging
[ ] Paste HTML into the Playground and compare output quality
[ ] Sign up for the free plan (30/month) and obtain an API key
[ ] Set FUNBREW_PDF_API_KEY in your environment
[ ] Replace the PDF generation function with the API call
[ ] Visual diff the PDF output in a staging environment
CSS review
[ ] Audit @media print vs @media screen usage
[ ] Verify page-break-before/after behaviour
[ ] Convert external fonts (Google Fonts, etc.) to inline CSS or Base64 if needed
[ ] Confirm background colours and images render with print settings
Production cutover
[ ] Add error handling (retry on 429, alert on 5xx)
[ ] Set a 30-second client timeout
[ ] Run: npm uninstall puppeteer (or playwright)
[ ] Remove Chromium install lines from your Dockerfile
[ ] Remove Chromium cache steps from CI/CD
[ ] Revisit server memory allocation (Chromium is no longer needed)
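The error-handling item in the checklist (retry on 429, alert on 5xx) can be expressed as a small decision function. This is a sketch with hypothetical names and action labels; wire the returned action into whatever retry loop and alerting you already use.

```python
def next_action(status: int) -> str:
    """Map an HTTP status to an action, following the checklist above."""
    if 200 <= status < 300:
        return "ok"
    if status == 429:
        return "retry"   # Rate-limited: back off, then try again
    if status >= 500:
        return "alert"   # Server-side failure: notify on-call
    return "fail"        # Other 4xx: the request itself needs fixing


def backoff_seconds(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff for the given 1-based attempt, capped at `cap`."""
    return min(cap, base * 2 ** (attempt - 1))


print([backoff_seconds(n) for n in (1, 2, 3)])  # [1.0, 2.0, 4.0]
```

The 30-second cap mirrors the client timeout in the checklist; both values are starting points rather than limits documented by the API.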
Common Migration Issues and Fixes
Issue 1: External resources (images, fonts) not rendering
Cause: The API server cannot reach localhost or private-network URLs.
Fix: Embed images as Base64 data URIs, or serve them from a public URL.
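const fs = require('fs');
const path = require('path');
// Map extensions to MIME types ('jpg' and 'svg' do not match their MIME subtype)
const MIME_TYPES = {
  png: 'image/png',
  jpg: 'image/jpeg',
  jpeg: 'image/jpeg',
  gif: 'image/gif',
  webp: 'image/webp',
  svg: 'image/svg+xml',
};
// Convert a local image to a Base64 data URI
function embedLocalImage(imagePath) {
  const ext = path.extname(imagePath).slice(1).toLowerCase(); // 'png', 'jpg', etc.
  const mime = MIME_TYPES[ext];
  if (!mime) throw new Error(`Unsupported image type: ${ext}`);
  const data = fs.readFileSync(imagePath).toString('base64');
  return `data:${mime};base64,${data}`;
}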
const html = `
<img src="${embedLocalImage('./logo.png')}" alt="Logo" />
<p>Logo is embedded as Base64 — no external request needed.</p>
`;
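To find problem URLs before sending HTML to the API, you can scan the markup for resources that point at localhost, private addresses, or relative paths. The following is a minimal sketch using only the Python standard library; the class and function names are hypothetical, hostnames are not resolved through DNS, and it assumes relative paths cannot be resolved because the HTML is sent as a bare string.

```python
import ipaddress
from html.parser import HTMLParser
from urllib.parse import urlparse


class _ResourceCollector(HTMLParser):
    """Collects `src` attributes on any tag and `href` on <link> tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (name == "src" or (name == "href" and tag == "link")):
                self.urls.append(value)


def is_unreachable(url: str) -> bool:
    """True if a remote renderer probably cannot fetch this URL."""
    parsed = urlparse(url)
    if parsed.scheme == "data":
        return False  # Already embedded in the document
    if parsed.scheme in ("http", "https") or (not parsed.scheme and parsed.netloc):
        host = parsed.hostname or ""
        if host == "localhost" or host.endswith(".local"):
            return True
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            return False  # Ordinary hostname; DNS is not checked here
        return ip.is_private or ip.is_loopback or ip.is_link_local
    return True  # Relative path, file://, and similar


def find_unreachable_urls(html: str) -> list:
    collector = _ResourceCollector()
    collector.feed(html)
    return [url for url in collector.urls if is_unreachable(url)]
```

Each URL this returns is a candidate for Base64 embedding or for moving to a public host.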
Issue 2: Fonts not applied
Cause: External font requests (Google Fonts, etc.) time out or are blocked by network policy.
Fix: Embed the font as Base64 in your CSS. For Japanese text, FUNBREW PDF includes Japanese fonts pre-installed — no extra setup needed.
<style>
/* Option A: Google Fonts URL (works when the API can reach the internet) */
@import url('https://fonts.googleapis.com/css2?family=Inter&display=swap');
/* Option B: Base64-embedded font (always works) */
@font-face {
font-family: 'MyFont';
src: url('data:font/woff2;base64,d09GMgAB...') format('woff2');
}
/* List the family you actually loaded: 'Inter' for Option A, 'MyFont' for Option B */
body { font-family: 'Inter', 'MyFont', sans-serif; }
</style>
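Generating the Base64 `@font-face` rule by hand is error-prone, so it is usually scripted. Here is a minimal sketch that builds the rule from raw WOFF2 bytes; `font_face_css` is a hypothetical helper, and the font path in the usage note is a placeholder.

```python
import base64


def font_face_css(family: str, woff2_bytes: bytes) -> str:
    """Build an @font-face rule with the font embedded as a Base64 data URI."""
    encoded = base64.b64encode(woff2_bytes).decode("ascii")
    return (
        "@font-face {\n"
        f"  font-family: '{family}';\n"
        f"  src: url('data:font/woff2;base64,{encoded}') format('woff2');\n"
        "}\n"
    )
```

In practice you would read the bytes from disk, for example `font_face_css("MyFont", pathlib.Path("fonts/MyFont.woff2").read_bytes())`, and place the result inside the `<style>` block of the HTML you send.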
Issue 3: @media screen styles not applied
Cause: If you called page.emulateMediaType('screen') in Puppeteer, note that the API defaults to the print media type.
Fix: Update your CSS to target @media print, or remove the media query so styles apply unconditionally.
/* Before: screen-only (ignored by the API's print media type) */
@media screen {
.invoice { padding: 40px; background: #fff; }
}
/* After: target print explicitly */
@media print {
.invoice { padding: 40px; background: #fff; }
}
/* Or drop the media query to apply in both contexts */
.invoice { padding: 40px; background: #fff; }
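To locate the rules that need this change, you can scan a stylesheet for media queries that mention `screen` but not `print`. This is a rough regex-based sketch with a hypothetical function name; it does not parse CSS properly, so queries inside comments or strings will also match.

```python
import re

# Captures the query text between "@media" and the opening brace
MEDIA_QUERY = re.compile(r"@media\s+([^{]+)\{")


def screen_only_queries(css: str) -> list:
    """Return media queries that target screen without also targeting print."""
    hits = []
    for match in MEDIA_QUERY.finditer(css):
        query = match.group(1).strip().lower()
        if "screen" in query and "print" not in query:
            hits.append(query)
    return hits
```

Each query it reports marks a block whose styles are skipped under the print media type.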
Issue 4: Timeout errors on complex pages
Cause: Large images or complex CSS take longer to render.
Fix: Increase the client timeout to at least 30 seconds and add exponential-backoff retry logic.
async function generatePdfWithRetry(html, options = {}, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const controller = new AbortController();
    const timeoutId = setTimeout(() => controller.abort(), 30_000); // 30 s, covers headers and body
    try {
      const response = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/generate', {
        method: 'POST',
        headers: {
          'X-API-Key': process.env.FUNBREW_PDF_API_KEY,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({ html, options }),
        signal: controller.signal,
      });
      if (response.ok) return Buffer.from(await response.arrayBuffer());
      // Retry only rate limits (429) and server errors (5xx); other 4xx responses will not succeed on retry
      const retryable = response.status === 429 || response.status >= 500;
      if (!retryable || attempt === maxRetries) {
        throw Object.assign(new Error(`API error: ${response.status}`), { retryable: false });
      }
    } catch (err) {
      // Timeouts and network failures are retried; final or non-retryable errors are rethrown
      if (err.retryable === false || attempt === maxRetries) throw err;
    } finally {
      clearTimeout(timeoutId); // Always clear the timer, including on failure
    }
    // Exponential backoff: 2 s, 4 s, ...
    await new Promise(r => setTimeout(r, 2 ** attempt * 1000));
  }
}
Issue 5: Playwright waitForSelector equivalents
Cause: You used page.waitForSelector() to wait for dynamic content before generating the PDF.
Fix: The cleanest solution is to render the HTML completely on the server side before sending it to the API. The API does not run JavaScript from your application — it renders the HTML you provide.
// Before: Playwright waiting for dynamic content
await page.waitForSelector('.invoice-total');
const pdf = await page.pdf();
// After: Render complete HTML server-side, then send to API
const html = await renderInvoiceServerSide(invoiceData); // SSR produces complete HTML
const pdf = await generatePdf(html); // No waiting needed
Issue 6: page.evaluate() used to manipulate the DOM before export
Cause: page.evaluate() was used to inject values into the DOM just before calling page.pdf().
Fix: Move that logic into your HTML template. Generate the final HTML (with values substituted) before sending it to the API.
// Before: DOM manipulation via evaluate()
await page.evaluate((total) => {
document.querySelector('#total').textContent = total;
}, '$1,650.00');
const pdf = await page.pdf();
// After: Embed values in the HTML template
const total = '$1,650.00'; // In real code, HTML-escape any user-supplied value
const html = `
<div id="total">${total}</div>
`;
const pdf = await generatePdf(html);
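The same idea in Python, with escaping added so that user-supplied values cannot inject markup. This is a minimal sketch using the standard library's `string.Template`; the template text and function name are illustrative, and a full template engine would work equally well.

```python
from html import escape
from string import Template

# $total is the only placeholder; everything else is literal markup
INVOICE_TEMPLATE = Template('<div id="total">$total</div>')


def render_invoice(total: str) -> str:
    """Substitute the value into the template, escaping HTML special characters."""
    return INVOICE_TEMPLATE.substitute(total=escape(total))


print(render_invoice("$1,650.00"))  # <div id="total">$1,650.00</div>
```

The rendered string is what you pass as `html` to the API call; no DOM manipulation step remains.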
What Improves After Migration
| Aspect | Puppeteer/Playwright (Self-managed) | Managed API |
|---|---|---|
| Memory usage | 200–500MB per process | Near zero (HTTP calls only) |
| Chromium launch | 1–3 seconds per cold browser | None on your side (the service keeps browsers running) |
| Concurrency | Manual implementation | Auto-scaling by API |
| Chromium management | Version pinning, compatibility | Not needed |
| Serverless support | Lambda layers, workarounds | Works out of the box |
| Docker image size | 1GB+ (with Chromium) | Lightweight (no Chromium) |
| Serverless request (Lambda, cold) | 8–15 seconds | 300–600ms API response |
CSS Compatibility
FUNBREW PDF's quality engine is Chromium-based, so it has the same CSS support as Puppeteer and Playwright. In most cases, your HTML migrates as-is. For detailed CSS tips and workarounds during migration, see HTML to PDF CSS tips.
One thing to note: if you used page.emulateMediaType('screen') in Puppeteer, the API defaults to the print media type. Check your CSS @media print and @media screen rules.
Security Improvements
Self-hosting Puppeteer or Playwright requires managing SSRF risks and Chromium vulnerabilities yourself. A managed API handles these on the service side. For details, see the PDF API Security Guide.
Conclusion
Migrating from Puppeteer or Playwright to a managed API maintains PDF quality while dramatically reducing operational overhead.
- Less code: No browser pool, timeout handling, or process management
- Lighter infrastructure: No Chromium in your Docker image
- Scalability: Concurrency handled by the API
- Stability: No more OOM crashes or zombie processes
- Cost: Server memory and maintenance savings can offset API costs, depending on volume
Start by testing your HTML in the Playground to verify output quality. The free plan lets you test before committing. See the API documentation for full endpoint details.
After migration, use the PDF API production checklist to harden your deployment, and the PDF API error handling guide to build resilient retry logic.
Related
- HTML to PDF API Comparison 2026 — Detailed Puppeteer vs alternatives comparison
- PDF API Quickstart — Node.js/Python/PHP code examples
- PDF Template Engine Guide — Better HTML management with templates
- wkhtmltopdf vs Chromium — Technical engine comparison
- PDF API Security Guide — Security best practices
- PDF API Production Checklist — Post-migration production hardening
- PDF API Error Handling Guide — Retry logic and error handling patterns
- Serverless PDF API Guide — Lambda and Cloudflare Workers deployment
- Docker/Kubernetes PDF API Guide — Container-based deployment
- HTML to PDF CSS Tips — CSS compatibility and migration workarounds