
Running Puppeteer or Playwright for PDF generation in production comes with operational baggage: Chromium version management, memory consumption, concurrency handling, and more — none of which are related to actually generating PDFs.

This guide covers how to migrate from Puppeteer or Playwright to a managed API, with concrete before/after code in Node.js, Python, and PHP.

The Puppeteer / Playwright Operations Tax

Common problems when running Puppeteer or Playwright in production:

Memory Consumption

Chromium uses 200–500MB per process. Generating multiple PDFs concurrently can spike memory usage and cause OOM (Out of Memory) crashes.

Chromium Management

Puppeteer and Playwright version upgrades change the bundled Chromium version. CI/CD pipelines break when Chromium downloads fail, or when OS compatibility issues arise.

Concurrency Implementation

Calling puppeteer.launch() or playwright.chromium.launch() per request is slow. Building a browser pool is complex — you need to handle tab leaks, zombie processes, and graceful shutdowns.
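
One piece of that pool logic, capping how many render jobs run at once, can be sketched briefly. The following is a minimal illustration in Python using asyncio.Semaphore. The names run_bounded and render_job are hypothetical, render_job only sleeps instead of driving a browser, and the limit of 3 is an assumption rather than a recommendation.

```python
import asyncio

MAX_CONCURRENT = 3  # assumed limit; tune to the memory you actually have


async def run_bounded(jobs, limit=MAX_CONCURRENT):
    """Run coroutine factories with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with semaphore:  # waits here when `limit` jobs are already running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def render_job():
    """Hypothetical stand-in for a browser render; sleeps instead of rendering."""
    await asyncio.sleep(0.01)
    return b"%PDF-"
```

A real pool still has to deal with crashed browsers, leaked tabs, and shutdown, which is where most of the complexity lives.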

Cold Starts

In serverless environments (AWS Lambda, etc.), Chromium startup takes seconds, making it hard to meet latency requirements. A managed API removes Chromium startup from your function, so the only cold start left is the function's own. For serverless-specific setup, see the serverless PDF API guide.

For technical background on these issues, see wkhtmltopdf vs Chromium.

Before / After: Code Comparison

Before: Puppeteer (Self-managed)

const puppeteer = require('puppeteer');

// Browser pool management required
let browser;

async function getBrowser() {
  if (!browser || !browser.isConnected()) {
    browser = await puppeteer.launch({
      headless: 'new',
      args: [
        '--no-sandbox',
        '--disable-setuid-sandbox',
        '--disable-dev-shm-usage',  // Memory workaround
        '--disable-gpu',
      ],
    });
  }
  return browser;
}

async function generatePdf(html) {
  const browser = await getBrowser();
  const page = await browser.newPage();

  try {
    await page.setContent(html, { waitUntil: 'networkidle0' });
    const pdf = await page.pdf({
      format: 'A4',
      margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' },
      printBackground: true,
    });
    return pdf;
  } finally {
    await page.close();  // Prevent tab leaks
  }
}

// You also need:
// - Browser pool management
// - Timeout handling
// - Zombie process detection and restart
// - Memory monitoring
// - Chromium version management

After: Managed API

async function generatePdf(html) {
  const response = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/generate', {
    method: 'POST',
    headers: {
      'X-API-Key': process.env.FUNBREW_PDF_API_KEY,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      html,
      options: {
        format: 'A4',
        margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' },
        print_background: true,
      },
    }),
  });

  if (!response.ok) throw new Error(`API error: ${response.status}`);
  return Buffer.from(await response.arrayBuffer());
}

// All you need: an API key
// Browser management, memory, concurrency — all handled by the API

The code is dramatically simpler, and all Chromium-related operational code is gone.

Migrating from Playwright

If you're using Playwright instead of Puppeteer, the migration path is identical. Playwright offers more features than Puppeteer, but for PDF generation the operational burden is just as high.

Before: Playwright (Node.js)

const { chromium } = require('playwright');

// Playwright also requires browser lifecycle management
let browser;

async function getBrowser() {
  if (!browser) {
    browser = await chromium.launch({
      args: ['--no-sandbox', '--disable-setuid-sandbox'],
    });
  }
  return browser;
}

async function generatePdfWithPlaywright(html) {
  const browser = await getBrowser();
  const context = await browser.newContext();
  const page = await context.newPage();

  try {
    await page.setContent(html, { waitUntil: 'networkidle' });
    const pdf = await page.pdf({
      format: 'A4',
      margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' },
      printBackground: true,
    });
    return pdf;
  } finally {
    await context.close();
  }
}

// Installation alone pulls hundreds of MB:
// npx playwright install chromium

Before: Playwright (Python)

import asyncio
from playwright.async_api import async_playwright

async def generate_pdf_playwright(html: str) -> bytes:
    async with async_playwright() as p:
        # Chromium launch takes 1–3 seconds
        browser = await p.chromium.launch()
        context = await browser.new_context()
        page = await context.new_page()

        await page.set_content(html, wait_until="networkidle")
        pdf = await page.pdf(
            format="A4",
            margin={"top": "20mm", "bottom": "20mm",
                    "left": "15mm", "right": "15mm"},
            print_background=True,
        )
        await browser.close()
        return pdf

# pip install playwright
# playwright install chromium  # Downloads hundreds of MB

After: Managed API (Same code whether you're migrating from Puppeteer or Playwright)

The API call is identical regardless of which browser automation library you're coming from.

# After: FUNBREW PDF API (works for both Puppeteer and Playwright migrations)
import os
import requests

def generate_pdf(html: str) -> bytes:
    resp = requests.post(
        "https://pdf.funbrew.cloud/api/v1/pdf/generate",
        headers={
            "X-API-Key": os.environ["FUNBREW_PDF_API_KEY"],
            "Content-Type": "application/json",
        },
        json={
            "html": html,
            "options": {
                "format": "A4",
                "margin": {"top": "20mm", "bottom": "20mm",
                           "left": "15mm", "right": "15mm"},
                "print_background": True,
            },
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.content

The waitUntil: 'networkidle' behaviour Playwright uses is handled automatically on the API server side — no equivalent parameter needed.
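
When porting existing call sites, the option names need translating: the examples above use printBackground for Puppeteer and Playwright but print_background for the API. A minimal sketch of a key converter follows. It assumes every API option is simply the snake_case form of its Puppeteer counterpart, which the examples here only confirm for print_background, so check other options against the API documentation. The function names are hypothetical.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a camelCase key such as 'printBackground' to 'print_background'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def convert_options(puppeteer_options: dict) -> dict:
    """Recursively rename option keys; values are passed through unchanged."""
    converted = {}
    for key, value in puppeteer_options.items():
        if isinstance(value, dict):
            value = convert_options(value)
        converted[to_snake_case(key)] = value
    return converted
```

The result can be passed as the "options" field of the request body shown above.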

Cost Breakdown: Self-Hosted vs Managed API

"Won't using an API increase costs?" is a fair question. Let's work through the numbers.

Self-Hosted Costs

Running Puppeteer or Playwright in production involves multiple cost buckets.

Server costs

Chromium consumes 200–500MB per process. Handling 1,000 PDF jobs per month with any concurrency requires at least 2GB RAM.

Example: AWS t3.medium (2 vCPU / 4GB RAM)
- On-demand: ~$33/month
- Additional instances at peak: variable
- Data transfer: variable

Engineering and maintenance costs

Initial implementation (browser pool + timeouts + retries): 8–16 hours
Puppeteer/Playwright/Chromium version bumps: 2–4 hours per update (3–4× per year)
Incident response (OOM, zombie processes): 2–8 hours per incident (0–2× per month)
Estimated annual maintenance: 20–50 engineer-hours

CI/CD costs

Chromium downloads slow down your build pipeline. Docker images balloon to 1GB+, increasing push/pull times and registry storage.

Managed API Costs

FUNBREW PDF free tier: 30 PDFs/month at no cost
Paid plans: usage-based pricing
Maintenance cost: near zero (only handle API client changes)

Where the Break-Even Is

Low volume (< 100 PDFs/month):
  Self-hosted: $10–30/month server + engineer time
  API: free or very low cost
  → API wins clearly

Medium volume (100–10,000 PDFs/month):
  Self-hosted: $30–100/month server + maintenance hours
  API: usage-based pricing
  → When maintenance hours are factored in, API is usually cheaper

High volume (10,000+ PDFs/month):
  Self-hosted: dedicated infrastructure, dedicated ops time
  API: scales on the API side
  → Depends on your specifics, but hard to beat a team that does only this

The often-missed cost: engineer opportunity cost. Every hour spent on Chromium management is an hour not spent on your product.
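
To make the comparison concrete, the arithmetic can be sketched as below. The $33 server bill, the 20 maintenance hours per year, and the 30 free PDFs come from the figures above. The hourly rate and the per-PDF price are purely hypothetical placeholders, not FUNBREW pricing, so substitute your own numbers.

```python
def self_hosted_monthly_cost(server_usd, maintenance_hours_per_year, hourly_rate_usd):
    """Server bill plus maintenance time, spread evenly over 12 months."""
    return server_usd + maintenance_hours_per_year * hourly_rate_usd / 12


def api_monthly_cost(pdfs_per_month, free_pdfs, price_per_pdf_usd):
    """Usage-based bill after subtracting the free allowance."""
    billable = max(0, pdfs_per_month - free_pdfs)
    return billable * price_per_pdf_usd


# Placeholder inputs: $60/hour and $0.02/PDF are assumptions for illustration only.
self_hosted = self_hosted_monthly_cost(33, 20, 60)
api = api_monthly_cost(1000, 30, 0.02)
```

With these placeholder inputs the self-hosted side comes to $133 per month and the API side to roughly $19, and most of the self-hosted figure is engineer time rather than the server.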

Python Migration Examples (Expanded)

Python: httpx (Async)

In addition to the synchronous requests approach, here's an async implementation using httpx, suitable for FastAPI, Django async views, or any async Python application.

# httpx async implementation (FastAPI, Django Channels, etc.)
import os
import httpx

API_URL = "https://pdf.funbrew.cloud/api/v1/pdf/generate"
API_KEY = os.environ["FUNBREW_PDF_API_KEY"]

async def generate_pdf_async(html: str) -> bytes:
    async with httpx.AsyncClient(timeout=30.0) as client:
        response = await client.post(
            API_URL,
            headers={
                "X-API-Key": API_KEY,
                "Content-Type": "application/json",
            },
            json={
                "html": html,
                "options": {
                    "format": "A4",
                    "margin": {
                        "top": "20mm", "bottom": "20mm",
                        "left": "15mm", "right": "15mm",
                    },
                    "print_background": True,
                },
            },
        )
        response.raise_for_status()
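        return response.content

# FastAPI usage example
from fastapi import Body, FastAPI
from fastapi.responses import Response

app = FastAPI()

# embed=True reads the HTML from a JSON body like {"html": "..."};
# a bare `html: str` parameter would be parsed from the query string instead.
@app.post("/invoice/pdf")
async def create_invoice_pdf(html: str = Body(..., embed=True)):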
    pdf_bytes = await generate_pdf_async(html)
    return Response(
        content=pdf_bytes,
        media_type="application/pdf",
        headers={"Content-Disposition": "attachment; filename=invoice.pdf"},
    )

# Django view (synchronous)
import os
import requests
from django.http import HttpResponse

def generate_pdf_view(request):
    html = render_invoice_html(request)  # Render your template to HTML

    resp = requests.post(
        "https://pdf.funbrew.cloud/api/v1/pdf/generate",
        headers={
            "X-API-Key": os.environ["FUNBREW_PDF_API_KEY"],
            "Content-Type": "application/json",
        },
        json={
            "html": html,
            "options": {"format": "A4", "print_background": True},
        },
        timeout=30,
    )
    resp.raise_for_status()

    return HttpResponse(
        resp.content,
        content_type="application/pdf",
        headers={"Content-Disposition": 'attachment; filename="invoice.pdf"'},
    )

PHP Migration Examples (Expanded)

PHP: Guzzle

Here's a Guzzle-based implementation that fits cleanly into Laravel, Symfony, and other PHP frameworks. A plain curl version appears in the spatie/browsershot comparison further down.

<?php
// Guzzle implementation (recommended for Laravel / Symfony)
use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;

class PdfGenerator
{
    private Client $client;
    private string $apiKey;

    public function __construct()
    {
        $this->client = new Client([
            'base_uri' => 'https://pdf.funbrew.cloud',
            'timeout'  => 30.0,
        ]);
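        // env() is a Laravel helper and returns null once config is cached; prefer config() there
        $this->apiKey = (string) env('FUNBREW_PDF_API_KEY');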
    }

    public function generate(string $html, array $options = []): string
    {
        try {
            $response = $this->client->post('/api/v1/pdf/generate', [
                'headers' => [
                    'X-API-Key'    => $this->apiKey,
                    'Content-Type' => 'application/json',
                ],
                'json' => [
                    'html'    => $html,
                    'options' => array_merge([
                        'format' => 'A4',
                        'margin' => [
                            'top'    => '20mm',
                            'bottom' => '20mm',
                            'left'   => '15mm',
                            'right'  => '15mm',
                        ],
                        'print_background' => true,
                    ], $options),
                ],
            ]);

            return (string) $response->getBody();
        } catch (RequestException $e) {
            $statusCode = $e->getResponse()?->getStatusCode() ?? 0;
            throw new \RuntimeException("PDF generation failed: HTTP {$statusCode}", $statusCode, $e);
        }
    }
}

<?php
// Laravel controller usage example
use App\Models\Invoice; // assumes the default Laravel model namespace
use App\Services\PdfGenerator;
use Illuminate\Http\Response;

class InvoiceController extends Controller
{
    public function __construct(private PdfGenerator $pdfGenerator) {}

    public function download(Invoice $invoice): Response
    {
        $html = view('invoices.pdf', compact('invoice'))->render();
        $pdf  = $this->pdfGenerator->generate($html);

        return response($pdf, 200, [
            'Content-Type'        => 'application/pdf',
            'Content-Disposition' => "attachment; filename=\"invoice-{$invoice->id}.pdf\"",
        ]);
    }
}

PHP: Before / After (spatie/browsershot)

The most common PHP starting point is spatie/browsershot, which shells out to Node.js and Puppeteer under the hood.

<?php
// Before: spatie/browsershot (delegates to Node.js + Puppeteer internally)
use Spatie\Browsershot\Browsershot;

function generatePdf(string $html): string
{
    return Browsershot::html($html)
        ->format('A4')
        ->margins(20, 15, 20, 15)
        ->showBackground()
        ->pdf();
    // Requires Node.js + Puppeteer installed on the server.
    // Spawns a Chromium process on every call.
}

<?php
// After: FUNBREW PDF API (PHP curl — no Node.js or Puppeteer needed)
function generatePdf(string $html): string
{
    $apiKey = getenv('FUNBREW_PDF_API_KEY');

    $ch = curl_init('https://pdf.funbrew.cloud/api/v1/pdf/generate');
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_TIMEOUT        => 30,
        CURLOPT_HTTPHEADER     => [
            'X-API-Key: ' . $apiKey,
            'Content-Type: application/json',
        ],
        CURLOPT_POSTFIELDS => json_encode([
            'html'    => $html,
            'options' => [
                'format' => 'A4',
                'margin' => [
                    'top' => '20mm', 'bottom' => '20mm',
                    'left' => '15mm', 'right' => '15mm',
                ],
                'print_background' => true, // matches showBackground() in the Browsershot version
            ],
        ]),
    ]);

    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    if ($body === false || $status !== 200) { // curl_exec() returns false on transport errors
        throw new \RuntimeException("PDF API error: HTTP $status");
    }

    return $body; // PDF binary
}

// Dependency: the curl extension only (bundled with most PHP installs). Node.js and Puppeteer removed.

Performance Comparison

Here are representative numbers comparing self-hosted Puppeteer/Playwright against a managed API. Actual results vary by environment.
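
Because results vary by environment, it can be worth measuring your own setup before and after the switch. The helper below is a minimal sketch: measure_latency_ms is a hypothetical name, and the generate argument stands for whatever zero-argument callable produces one PDF in your code.

```python
import statistics
import time


def measure_latency_ms(generate, runs=5):
    """Call `generate()` repeatedly and summarize wall-clock latency in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        generate()
        samples.append((time.perf_counter() - start) * 1000)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }
```

Running it once against the self-hosted function and once against the API call, with the same HTML, gives numbers that are directly comparable for your workload.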

Response Times

| Scenario | Puppeteer (Cold start) | Puppeteer (Warm) | Managed API |
|---|---|---|---|
| Simple HTML (~10KB) | 3,000–5,000ms | 500–800ms | 300–600ms |
| HTML with images (~100KB) | 4,000–8,000ms | 800–1,500ms | 500–1,200ms |
| Complex CSS + images | 6,000–15,000ms | 1,500–3,000ms | 800–2,000ms |
| Serverless (Lambda) | 8,000–15,000ms | | 300–600ms |

"Cold start" includes Chromium launch time. Even a warm browser pool is typically slower than API responses, because the API infrastructure stays hot and scales horizontally.

Memory Usage

Puppeteer (browser pool, 3 processes):
  Idle:           600MB–1.5GB
  During PDF gen: 1GB–2GB+
  Peak (10 concurrent): 3GB+

Managed API:
  Idle:           <10MB (HTTP client only)
  During PDF gen: <50MB (handling HTTP response)
  Peak (100 concurrent): similar — no scaling needed on your server

Docker Image Size

# Before: Image with Puppeteer
FROM node:20
RUN apt-get update && apt-get install -y chromium   # +300MB
RUN npm install puppeteer          # +200MB
# Result: 1GB+

# After: API client only
FROM node:20-alpine
# No chromium needed
# Result: ~150MB

Smaller images mean faster CI builds, faster deploys, and lower container registry storage costs.

Concurrency

Puppeteer (browser pool on a 2GB server):
  Max concurrent:  3–5 jobs
  Beyond that:     queue or OOM

Managed API:
  Max concurrent:  effectively unlimited (plan-dependent rate limits)
  Scaling:         handled by the API side
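
One way to arrive at a figure like 3–5 jobs is simple division of usable RAM by per-process memory. The sketch below assumes 512MB is held back for the OS and your application, which is an illustrative guess, and uses the 200–500MB per-process range quoted earlier. The function name is hypothetical.

```python
def max_concurrent_jobs(total_ram_mb, per_process_mb, reserved_mb=512):
    """How many Chromium processes fit after reserving RAM for the OS and app."""
    usable = max(0, total_ram_mb - reserved_mb)
    return usable // per_process_mb


worst_case = max_concurrent_jobs(2048, 500)  # heavy pages, 500MB per process
typical = max_concurrent_jobs(2048, 300)     # lighter pages, 300MB per process
```

On a 2GB server this gives 3 jobs at 500MB per process and 5 at 300MB, which lines up with the range above.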

Migration Steps

Step 1: Verify API Output

Paste your HTML into the Playground and compare the output with Puppeteer's. Use the free plan (30/month) to test actual API responses.
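
When scripting this check, it helps to confirm that the response really is a PDF before comparing it, since an error body saved with a .pdf extension is an easy mistake to miss. The sketch below relies only on the PDF file format itself: files begin with a %PDF- header and end with an %%EOF marker. The function name is hypothetical.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Cheap sanity check: '%PDF-' header at the start, '%%EOF' near the end."""
    return data.startswith(b"%PDF-") and b"%%EOF" in data[-1024:]
```

This does not validate the document or its layout; it only catches responses that are not PDFs at all.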

Step 2: Replace the PDF Generation Function

Swap the Puppeteer or Playwright call with an API call. For language-specific examples, see the quickstart guide.

Step 3: Migrate Templates

If you're building HTML with string concatenation or template literals, consider migrating to the template engine for better maintainability.

// Before: String concatenation
const html = `<h1>Invoice</h1><p>${customerName}</p>...`;
const pdf = await generatePdfWithPuppeteer(html);

// After: Template + API
const response = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/from-template', {
  method: 'POST',
  headers: {
    'X-API-Key': process.env.FUNBREW_PDF_API_KEY,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    template: 'invoice',
    variables: { customer_name: customerName, total: '1,650.00' },
  }),
});

Step 4: Remove Puppeteer/Playwright Dependencies

Once the migration is complete, clean up.

# For Puppeteer
npm uninstall puppeteer

# For Playwright
npm uninstall playwright @playwright/test

# Remove chromium install lines from Dockerfile
# Remove Chromium cache config from CI/CD

Migration Checklist

Use this checklist to keep your migration on track.

Pre-migration
  [ ] Paste HTML into the Playground and compare output quality
  [ ] Sign up for the free plan (30/month) and obtain an API key
  [ ] Set FUNBREW_PDF_API_KEY in your environment
  [ ] Replace the PDF generation function with the API call
  [ ] Visual diff the PDF output in a staging environment

CSS review
  [ ] Audit @media print vs @media screen usage
  [ ] Verify page-break-before/after behaviour
  [ ] Convert external fonts (Google Fonts, etc.) to inline CSS or Base64 if needed
  [ ] Confirm background colours and images render with print settings

Production cutover
  [ ] Add error handling (retry on 429, alert on 5xx)
  [ ] Set a 30-second client timeout
  [ ] Run: npm uninstall puppeteer (or playwright)
  [ ] Remove Chromium install lines from your Dockerfile
  [ ] Remove Chromium cache steps from CI/CD
  [ ] Revisit server memory allocation (Chromium is no longer needed)
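
For the environment-variable item, a fail-fast check at startup avoids discovering a missing key only when the first PDF request fails. This is a minimal sketch; load_api_key and MissingApiKeyError are hypothetical names, and only the variable name FUNBREW_PDF_API_KEY comes from the examples in this guide.

```python
import os


class MissingApiKeyError(RuntimeError):
    """Raised when the API key is absent or blank."""


def load_api_key(env=os.environ, name="FUNBREW_PDF_API_KEY"):
    """Return the API key, raising early instead of sending unauthenticated requests."""
    key = env.get(name, "").strip()
    if not key:
        raise MissingApiKeyError(f"{name} is not set")
    return key
```

Calling load_api_key() once during application startup surfaces a misconfigured deployment immediately.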

Common Migration Issues and Fixes

Issue 1: External resources (images, fonts) not rendering

Cause: The API server cannot reach localhost or private-network URLs.

Fix: Embed images as Base64 data URIs, or serve them from a public URL.

const fs = require('fs');
const path = require('path');

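// Convert a local image to a Base64 data URI
function embedLocalImage(imagePath) {
  const ext  = path.extname(imagePath).slice(1).toLowerCase(); // 'png', 'jpg', etc.
  // Extensions do not always match MIME subtypes ('jpg' is image/jpeg)
  const mime = { jpg: 'jpeg', svg: 'svg+xml' }[ext] ?? ext;
  const data = fs.readFileSync(imagePath).toString('base64');
  return `data:image/${mime};base64,${data}`;
}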

const html = `
  <img src="${embedLocalImage('./logo.png')}" alt="Logo" />
  <p>Logo is embedded as Base64 — no external request needed.</p>
`;
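
Before sending HTML to the API, it can help to scan it for URLs that point at localhost or a private network. The following is a rough sketch using only the Python standard library. The function names are hypothetical, the regular expression is a simplification rather than a full HTML parser, and relative paths are not checked.

```python
import ipaddress
import re
from urllib.parse import urlparse

# Simplified attribute matcher; a real HTML parser would be more robust.
SRC_PATTERN = re.compile(r"""(?:src|href)\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def is_unreachable_from_api(url: str) -> bool:
    """True for localhost or private/loopback IP hosts; False for data: and relative URLs."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a regular domain name; assumed to be public
    return ip.is_private or ip.is_loopback


def find_unreachable_urls(html: str) -> list[str]:
    """List every src/href value that a remote server could not fetch."""
    return [url for url in SRC_PATTERN.findall(html) if is_unreachable_from_api(url)]
```

Anything this returns is a candidate for Base64 embedding or for hosting at a public URL.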

Issue 2: Fonts not applied

Cause: External font requests (Google Fonts, etc.) time out or are blocked by network policy.

Fix: Embed the font as Base64 in your CSS. For Japanese text, FUNBREW PDF includes Japanese fonts pre-installed — no extra setup needed.

<style>
  /* Option A: Google Fonts URL (works when the API can reach the internet) */
  @import url('https://fonts.googleapis.com/css2?family=Inter&display=swap');

  /* Option B: Base64-embedded font (always works) */
  @font-face {
    font-family: 'MyFont';
    src: url('data:font/woff2;base64,d09GMgAB...') format('woff2');
  }

  /* 'MyFont' is listed so the embedded font is used if Inter cannot be fetched */
  body { font-family: 'Inter', 'MyFont', sans-serif; }
</style>
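
Producing the Base64 rule by hand is tedious, so it is usually generated. The helper below is a small sketch that builds an @font-face rule in the same shape as Option B. The function name is hypothetical, and it assumes the font file is already in WOFF2 format.

```python
import base64


def font_face_css(family: str, woff2_bytes: bytes) -> str:
    """Build an @font-face rule with the font inlined as a Base64 data URI."""
    encoded = base64.b64encode(woff2_bytes).decode("ascii")
    return (
        "@font-face {\n"
        f"  font-family: '{family}';\n"
        f"  src: url('data:font/woff2;base64,{encoded}') format('woff2');\n"
        "}"
    )
```

Typical usage would read the bytes from a local font file and place the returned rule inside the style element of the HTML sent to the API.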

Issue 3: @media screen styles not applied

Cause: If you called page.emulateMediaType('screen') in Puppeteer, note that the API defaults to the print media type.

Fix: Update your CSS to target @media print, or remove the media query so styles apply unconditionally.

/* Before: screen-only (ignored by the API's print media type) */
@media screen {
  .invoice { padding: 40px; background: #fff; }
}

/* After: target print explicitly */
@media print {
  .invoice { padding: 40px; background: #fff; }
}

/* Or drop the media query to apply in both contexts */
.invoice { padding: 40px; background: #fff; }

Issue 4: Timeout errors on complex pages

Cause: Large images or complex CSS take longer to render.

Fix: Increase the client timeout to at least 30 seconds and add exponential-backoff retry logic.

async function generatePdfWithRetry(html, options = {}, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const controller = new AbortController();
    const timeoutId  = setTimeout(() => controller.abort(), 30_000); // 30 s

    try {
      const response = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/generate', {
        method: 'POST',
        headers: {
          'X-API-Key': process.env.FUNBREW_PDF_API_KEY,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({ html, options }),
        signal: controller.signal,
      });

      if (response.status === 429 && attempt < maxRetries) {
        // Rate-limited: back off and retry
        await new Promise(r => setTimeout(r, 2 ** attempt * 1000));
        continue;
      }

      if (!response.ok) throw new Error(`API error: ${response.status}`);
      return Buffer.from(await response.arrayBuffer());
    } catch (err) {
      if (attempt === maxRetries) throw err;
      // Exponential backoff before the next attempt
      await new Promise(r => setTimeout(r, 2 ** attempt * 1000));
    } finally {
      clearTimeout(timeoutId); // Always clear the timer, even when fetch throws
    }
  }
}
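
The JavaScript loop above retries after any failure, including client errors that will fail the same way every time. A variant that retries only on 429 and 5xx responses can be sketched as follows. This is an illustration, not the API's prescribed policy: send is a hypothetical callable returning a (status, body) pair, and the sleep function is injectable so the logic can be exercised without real delays.

```python
import time


class ApiError(Exception):
    """Raised when the API returns a status that will not be retried."""

    def __init__(self, status):
        super().__init__(f"API error: {status}")
        self.status = status


def is_retryable(status: int) -> bool:
    """429 and 5xx may succeed later; other 4xx responses will not."""
    return status == 429 or 500 <= status < 600


def generate_with_retry(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call `send()` until it returns 200, doubling the delay between attempts."""
    for attempt in range(1, max_retries + 1):
        status, body = send()
        if status == 200:
            return body
        if not is_retryable(status) or attempt == max_retries:
            raise ApiError(status)
        sleep(base_delay * 2 ** attempt)  # delay doubles on each retry
```

In real code, send would wrap the requests or httpx call shown earlier and return the response's status code and content.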

Issue 5: Playwright waitForSelector equivalents

Cause: You used page.waitForSelector() to wait for dynamic content before generating the PDF.

Fix: The cleanest solution is to render the HTML completely on the server side before sending it to the API. The API does not run JavaScript from your application — it renders the HTML you provide.

// Before: Playwright waiting for dynamic content
await page.waitForSelector('.invoice-total');
const pdf = await page.pdf();

// After: Render complete HTML server-side, then send to API
const html = await renderInvoiceServerSide(invoiceData); // SSR produces complete HTML
const pdf = await generatePdf(html); // No waiting needed

Issue 6: page.evaluate() used to manipulate the DOM before export

Cause: page.evaluate() was used to inject values into the DOM just before calling page.pdf().

Fix: Move that logic into your HTML template. Generate the final HTML (with values substituted) before sending it to the API.

// Before: DOM manipulation via evaluate()
await page.evaluate((total) => {
  document.querySelector('#total').textContent = total;
}, '$1,650.00');
const pdf = await page.pdf();

// After: Embed values in the HTML template
const html = `
  <div id="total">${total}</div>
`;
const pdf = await generatePdf(html);
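
One caveat when moving values from the DOM into a template: textContent never interprets markup, but a value interpolated into an HTML string does, so it should be escaped first. A minimal Python sketch using only the standard library follows. The template and function names are hypothetical.

```python
from html import escape
from string import Template

INVOICE_TEMPLATE = Template('<div id="total">$total</div>')


def render_invoice(total: str) -> str:
    """Substitute values into the template, escaping them for HTML first."""
    return INVOICE_TEMPLATE.substitute(total=escape(total))
```

A full template engine with auto-escaping does the same job for larger documents.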

What Improves After Migration

| Aspect | Puppeteer/Playwright (Self-managed) | Managed API |
|---|---|---|
| Memory usage | 200–500MB per process | Near zero (HTTP calls only) |
| Cold start | 1–3 seconds | None (always running) |
| Concurrency | Manual implementation | Auto-scaling by API |
| Chromium management | Version pinning, compatibility | Not needed |
| Serverless support | Lambda layers, workarounds | Works out of the box |
| Docker image size | 1GB+ (with Chromium) | Lightweight (no Chromium) |
| Lambda cold start | 8–15 seconds | 300–600ms |

CSS Compatibility

FUNBREW PDF's quality engine is Chromium-based, so it has the same CSS support as Puppeteer and Playwright. In most cases, your HTML migrates as-is. For detailed CSS tips and workarounds during migration, see HTML to PDF CSS tips.

One thing to note: if you used page.emulateMediaType('screen') in Puppeteer, the API defaults to the print media type. Check your CSS @media print and @media screen rules.

Security Improvements

Self-hosting Puppeteer or Playwright requires managing SSRF risks and Chromium vulnerabilities yourself. A managed API handles these on the service side. For details, see the PDF API Security Guide.

Conclusion

Migrating from Puppeteer or Playwright to a managed API maintains PDF quality while dramatically reducing operational overhead.

  • Less code: No browser pool, timeout handling, or process management
  • Lighter infrastructure: No Chromium in your Docker image
  • Scalability: Concurrency handled by the API
  • Stability: No more OOM crashes or zombie processes
  • Cost: Server memory savings often offset API costs entirely

Start by testing your HTML in the Playground to verify output quality. The free plan lets you test before committing. See the API documentation for full endpoint details.

After migration, use the PDF API production checklist to harden your deployment, and the PDF API error handling guide to build resilient retry logic.
