May 9, 2026

Async PDF Generation Status: Polling vs Webhook

Tags: PDF API, webhook, async, polling, API integration

Synchronous PDF generation — POST HTML, wait for the response, receive PDF bytes — works well for simple documents. For large batches, complex templates, or high-concurrency workloads, asynchronous generation is the better model: submit a job, then track its status separately.

There are two strategies for tracking async job status: polling (you check the API repeatedly until the job is done) and webhooks (the API notifies your endpoint when the job completes). Each has different trade-offs.

This article covers both strategies with working code. For full webhook setup including Slack integration, see the PDF API Webhook Integration Guide. For batch processing patterns, see the PDF API Batch Processing Guide.

Why Async PDF Generation

Synchronous generation is fine when:

  • You are generating a single PDF per request
  • The generation time is under 5 seconds
  • The client (browser or mobile app) can wait for the response

Async generation becomes necessary when:

  • Templates are complex and rendering takes 10+ seconds
  • You are generating hundreds or thousands of PDFs at once
  • You want to decouple PDF generation from the HTTP request lifecycle
  • You need to retry failed jobs without re-running the full request

For batch jobs, the PDF API batch processing guide covers concurrency control and queue patterns.

Strategy 1: Polling

How Polling Works

1. POST /pdf/generate  →  { job_id: "pdf_abc123", status: "queued" }
2. GET /pdf/status/pdf_abc123  →  { status: "processing" }
3. GET /pdf/status/pdf_abc123  →  { status: "processing" }
4. GET /pdf/status/pdf_abc123  →  { status: "done", url: "https://..." }

Your code checks the status endpoint at regular intervals until the job completes or fails.

Basic Polling (Node.js)

async function pollUntilDone(jobId, intervalMs = 2000, maxAttempts = 30) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetch(
      `https://pdf.funbrew.cloud/api/v1/pdf/status/${jobId}`,
      {
        headers: { 'X-API-Key': process.env.FUNBREW_PDF_API_KEY },
      }
    );

    if (!response.ok) throw new Error(`Status check failed: ${response.status}`);

    const { status, url, error } = await response.json();

    if (status === 'done') return url;
    if (status === 'failed') throw new Error(`PDF generation failed: ${error}`);

    // Still processing — wait before next check
    if (attempt < maxAttempts) {
      await new Promise(r => setTimeout(r, intervalMs));
    }
  }

  throw new Error(`PDF job ${jobId} did not complete after ${maxAttempts} attempts`);
}

// Usage
async function generateAndWait(html) {
  // Step 1: Submit the job
  const submitResp = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/generate-async', {
    method: 'POST',
    headers: {
      'X-API-Key': process.env.FUNBREW_PDF_API_KEY,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ html, options: { format: 'A4' } }),
  });
  if (!submitResp.ok) throw new Error(`Job submission failed: ${submitResp.status}`);
  const { job_id } = await submitResp.json();

  // Step 2: Poll until done
  const pdfUrl = await pollUntilDone(job_id);
  return pdfUrl;
}

Exponential Backoff Polling

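A fixed interval wastes requests on long-running jobs and adds latency to quick ones. Exponential backoff starts with a short wait and doubles it after each check, up to a cap, so fast jobs are detected quickly and slow jobs are checked less often.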

async function pollWithBackoff(jobId, {
  initialIntervalMs = 500,
  maxIntervalMs = 10000,
  maxAttempts = 20,
} = {}) {
  let intervalMs = initialIntervalMs;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetch(
      `https://pdf.funbrew.cloud/api/v1/pdf/status/${jobId}`,
      { headers: { 'X-API-Key': process.env.FUNBREW_PDF_API_KEY } }
    );

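    if (!response.ok) throw new Error(`Status check failed: ${response.status}`);

    const { status, url, error } = await response.json();

    if (status === 'done') return url;
    if (status === 'failed') throw new Error(`PDF generation failed: ${error}`);

    // Skip the wait after the final attempt
    if (attempt === maxAttempts) break;

    // Exponential backoff with jitter
    const jitter = Math.random() * 200;
    await new Promise(r => setTimeout(r, intervalMs + jitter));

    // Double the interval up to the cap
    intervalMs = Math.min(intervalMs * 2, maxIntervalMs);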
  }

  throw new Error(`Polling timed out after ${maxAttempts} attempts`);
}

The jitter (random offset) prevents multiple clients from all checking at the same moment — important when polling from multiple workers.
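
To make the schedule concrete, here is a minimal Python sketch that lists the base wait applied after each status check, using the same defaults as pollWithBackoff (500 ms initial interval, 10 s cap, 20 attempts). The function name backoff_schedule is illustrative only, and jitter is left out so the output is deterministic.

```python
def backoff_schedule(initial_ms=500, max_ms=10_000, attempts=20):
    """Return the base wait (in ms) that follows each status check.

    Applies the doubling-with-cap rule; random jitter is omitted.
    """
    waits = []
    interval = initial_ms
    for _ in range(attempts):
        waits.append(interval)
        interval = min(interval * 2, max_ms)
    return waits


schedule = backoff_schedule()
print(schedule[:7])            # [500, 1000, 2000, 4000, 8000, 10000, 10000]
print(sum(schedule) / 1000)    # 165.5
```

The cap is reached after five doublings, and the sum of all twenty intervals (165.5 seconds) is an upper bound on the total time spent waiting, excluding jitter and request time.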

Polling in Python

import os
import time
import requests

def poll_until_done(
    job_id: str,
    initial_interval: float = 0.5,
    max_interval: float = 10.0,
    max_attempts: int = 20,
) -> str:
    interval = initial_interval
    api_key = os.environ["FUNBREW_PDF_API_KEY"]

    for attempt in range(1, max_attempts + 1):
        resp = requests.get(
            f"https://pdf.funbrew.cloud/api/v1/pdf/status/{job_id}",
            headers={"X-API-Key": api_key},
            timeout=10,
        )
        resp.raise_for_status()
        data = resp.json()

        if data["status"] == "done":
            return data["url"]
        if data["status"] == "failed":
            raise RuntimeError(f"PDF job failed: {data.get('error')}")

        if attempt < max_attempts:
            time.sleep(interval)
            interval = min(interval * 2, max_interval)

    raise TimeoutError(f"Job {job_id} did not complete in {max_attempts} attempts")

Polling Trade-offs

Pros of polling:

  • Simple to implement — no public endpoint needed
  • Works in any environment (serverless, CLI scripts, cron jobs)
  • Easy to debug — you can see every status check in logs
  • No webhook infrastructure required

Cons of polling:

  • Wasted requests when the job takes longer than expected
  • Added latency equal to one polling interval after job completion
  • At high concurrency, many simultaneous pollers increase API load
  • Keeping a process alive during polling ties up server resources

Polling is the right choice for low-to-medium concurrency use cases, CLI tools, scripts, and any environment where receiving incoming HTTP requests is not possible.
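
The first two costs listed above can be estimated with simple arithmetic. The sketch below is a simplified model, not a measurement: it assumes the first check happens immediately after submission, that each request takes negligible time, and that the interval is fixed. The name simulate_polling is hypothetical.

```python
def simulate_polling(job_seconds, interval_seconds):
    """Return (status checks made, delay between completion and detection).

    Checks happen at t = 0, interval, 2 * interval, ... and the job counts
    as detected at the first check at or after job_seconds.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    checks = 0
    t = 0
    while True:
        checks += 1
        if t >= job_seconds:
            return checks, t - job_seconds
        t += interval_seconds


checks, delay = simulate_polling(job_seconds=7, interval_seconds=2)
print(checks, delay)    # 5 1
print(checks * 1000)    # 5000
```

Under these assumptions, a 7-second job polled every 2 seconds costs five status requests and is noticed one second late. A thousand such jobs running concurrently would issue about 5,000 status requests.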

Strategy 2: Webhooks

How Webhooks Work

1. Configure a webhook URL in the FUNBREW PDF dashboard
2. POST /pdf/generate  →  { job_id: "pdf_abc123" }  (returns immediately)
3. [API generates PDF in background]
4. API calls your endpoint: POST /webhooks/pdf  →  { event: "pdf.generated", data: {...} }

Your code receives a push notification when the job completes, rather than checking repeatedly.

Webhook Endpoint (Node.js / Express)

const express = require('express');
const crypto  = require('crypto');
const app     = express();

// Keep the raw request bytes: the signature must be checked against the
// exact payload received, not a re-serialized copy of the parsed JSON.
app.use(express.json({
  verify: (req, res, buf) => { req.rawBody = buf; },
}));

app.post('/webhooks/pdf', (req, res) => {
  // Step 1: Verify the signature
  const signature = Buffer.from(req.headers['x-funbrew-signature'] || '', 'hex');
  const expected  = crypto
    .createHmac('sha256', process.env.WEBHOOK_SECRET)
    .update(req.rawBody || '')
    .digest();

  // timingSafeEqual throws if the lengths differ, so compare lengths first
  if (signature.length !== expected.length ||
      !crypto.timingSafeEqual(signature, expected)) {
    return res.status(401).json({ error: 'Invalid signature' });
  }

  // Step 2: Acknowledge receipt immediately (before processing)
  res.status(200).json({ received: true });

  // Step 3: Handle the event asynchronously
  const { event, data } = req.body;
  handlePdfEvent(event, data).catch(err => console.error('Webhook handler error:', err));
});

async function handlePdfEvent(event, data) {
  switch (event) {
    case 'pdf.generated':
      console.log(`PDF ready: ${data.filename} (${data.file_size} bytes)`);
      // Download, store, or notify the user
      await notifyUser(data.job_id, data.url);
      break;

    case 'pdf.failed':
      console.error(`PDF failed: ${data.job_id} — ${data.error}`);
      await recordFailure(data.job_id, data.error);
      break;

    case 'batch.completed':
      console.log(`Batch done: ${data.succeeded}/${data.total} succeeded`);
      break;
  }
}

app.listen(3000);

Webhook Endpoint (Python / FastAPI)

import hmac
import hashlib
import os
from fastapi import FastAPI, Request, HTTPException, BackgroundTasks

app = FastAPI()

@app.post("/webhooks/pdf")
async def handle_webhook(request: Request, background_tasks: BackgroundTasks):
    body = await request.body()
    signature = request.headers.get("x-funbrew-signature", "")

    # Verify signature
    expected = hmac.new(
        os.environ["WEBHOOK_SECRET"].encode(),
        body,
        hashlib.sha256,
    ).hexdigest()

    if not hmac.compare_digest(signature, expected):
        raise HTTPException(status_code=401, detail="Invalid signature")

    # Queue the work in the background, then acknowledge immediately
    payload = await request.json()
    background_tasks.add_task(process_event, payload["event"], payload["data"])

    return {"received": True}

async def process_event(event: str, data: dict):
    if event == "pdf.generated":
        print(f"PDF ready: {data['filename']}")
        await notify_user(data["job_id"], data["url"])
    elif event == "pdf.failed":
        print(f"PDF failed: {data['job_id']} — {data['error']}")

Signature Verification

Always verify the webhook signature before processing the payload. Without verification, anyone who knows your endpoint URL can send fake events.

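// Correct: use timing-safe comparison on equal-length buffers
// (timingSafeEqual throws if the lengths differ)
const sigBuf = Buffer.from(signature || '', 'hex');
const expBuf = Buffer.from(expected, 'hex');
const isValid = sigBuf.length === expBuf.length &&
  crypto.timingSafeEqual(sigBuf, expBuf);

// Wrong: use === (vulnerable to timing attacks)
// const isValid = signature === expected;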

Using timingSafeEqual (Node.js) or hmac.compare_digest (Python) prevents timing attacks where an attacker could infer the expected signature by measuring response time.
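
For local testing it helps to produce a signed payload yourself. The following sketch assumes, as the handlers in this article do, that the signature header carries a hex-encoded HMAC-SHA256 of the request body. The helper names sign and verify and the secret value are placeholders. It also shows why verification should run over the raw bytes received rather than a re-serialized copy of the parsed JSON.

```python
import hashlib
import hmac
import json


def sign(secret: str, raw_body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the exact bytes sent on the wire."""
    return hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()


def verify(secret: str, raw_body: bytes, signature: str) -> bool:
    """Constant-time comparison of the received and expected signatures."""
    return hmac.compare_digest(sign(secret, raw_body), signature)


secret = "test-secret"  # placeholder, for local testing only
raw = b'{"event": "pdf.generated", "data": {"job_id": "pdf_abc123"}}'
signature = sign(secret, raw)

print(verify(secret, raw, signature))            # True

# Parsing and re-serializing can change whitespace or key order, which
# changes the bytes and therefore the HMAC.
reserialized = json.dumps(json.loads(raw), separators=(",", ":")).encode()
print(verify(secret, reserialized, signature))   # False
```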

Handling Webhook Retries

The FUNBREW PDF API retries webhook delivery if your endpoint returns a non-2xx status code or times out. Your endpoint must handle duplicate deliveries gracefully.

// Idempotent handler: check if the job was already processed
async function handlePdfEvent(event, data) {
  if (event !== 'pdf.generated') return;

  // Check if already processed
  const existing = await db.query(
    'SELECT id FROM processed_pdfs WHERE job_id = $1',
    [data.job_id]
  );
  if (existing.rows.length > 0) {
    console.log(`Job ${data.job_id} already processed — skipping`);
    return;
  }

  // Process and mark as done
  await downloadAndStorePdf(data.url, data.filename);
  await db.query(
    'INSERT INTO processed_pdfs (job_id, processed_at) VALUES ($1, NOW())',
    [data.job_id]
  );
}

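Store processed job IDs and skip re-processing if you see the same job_id twice. This is the standard idempotency pattern for webhook consumers. Because the check and the insert in the handler above are separate statements, two deliveries arriving at the same moment can both pass the check; a unique constraint on job_id closes that gap.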
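
One way to make the duplicate check atomic is to let the database enforce it. The sketch below uses Python's built-in sqlite3 module with an in-memory database purely for illustration. The claim_job helper is hypothetical, and in PostgreSQL the equivalent statement would be INSERT ... ON CONFLICT (job_id) DO NOTHING.

```python
import sqlite3


def claim_job(conn: sqlite3.Connection, job_id: str) -> bool:
    """Return True only for the first caller to record this job_id.

    The PRIMARY KEY on job_id turns the insert itself into the duplicate
    check, so there is no gap between checking and writing.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO processed_pdfs (job_id) VALUES (?)",
        (job_id,),
    )
    conn.commit()
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_pdfs (job_id TEXT PRIMARY KEY)")

print(claim_job(conn, "pdf_abc123"))   # True  (first delivery: process it)
print(claim_job(conn, "pdf_abc123"))   # False (retry: skip)
```

Claiming before processing has a cost: if the handler crashes after the claim, the job stays marked as processed. A status column, or deleting the row on failure, is one way to handle that case.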

Webhook Trade-offs

Pros of webhooks:

  • Real-time notification — no polling delay
  • No wasted requests on checking status
  • Scales well: 1,000 concurrent jobs generate 1,000 delivery calls, not N × 1,000 polling calls
  • Your server is free while the PDF generates

Cons of webhooks:

  • Requires a publicly accessible HTTP endpoint
  • Not suitable for CLI scripts or one-off jobs
  • More complex to set up (endpoint, signature verification, retry handling)
  • Local development requires a tunnel (ngrok, etc.)
  • You must handle idempotency (duplicate deliveries)

Webhooks are the right choice for production applications where you are processing many PDFs and want real-time status updates with minimal API load.

Choosing Between Polling and Webhooks

| Situation                                  | Recommendation    |
|--------------------------------------------|-------------------|
| Script or CLI tool                         | Polling           |
| Serverless function triggered per request  | Polling           |
| Background worker processing many jobs     | Webhooks          |
| Real-time user notification required       | Webhooks          |
| Local development or testing               | Polling (simpler) |
| High concurrency (100+ concurrent jobs)    | Webhooks          |
| Public endpoint not available              | Polling           |

For most production web applications, webhooks are preferable. For scripting and tooling, polling is simpler. Both strategies compose well: use polling in development and webhooks in production.

Hybrid: Webhooks with Polling Fallback

For critical jobs where you cannot miss a completion event, combine both:

async function generatePdfReliably(html) {
  const { job_id } = await submitJob(html);

  // Primary: wait for webhook (handled separately)
  // Fallback: poll in case webhook delivery fails
  const WEBHOOK_TIMEOUT_MS = 60_000; // 60 seconds
  const startTime = Date.now();

  while (Date.now() - startTime < WEBHOOK_TIMEOUT_MS) {
    const { status, url } = await checkStatus(job_id);

    if (status === 'done') return url;
    if (status === 'failed') throw new Error('PDF generation failed');

    await new Promise(r => setTimeout(r, 5_000)); // 5-second poll interval
  }

  throw new Error(`Job ${job_id} did not complete within timeout`);
}

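In this pattern, the webhook handler updates a shared store (database or cache) when it fires. The polling loop reads from that store, so checkStatus (not shown) should consult it before, or instead of, calling the status endpoint. Whichever signal arrives first wins, and the other is a no-op.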
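
The shared-store idea can be sketched in Python as follows. Everything here is illustrative: JobStore is an in-memory stand-in for a real database or cache, and poll_api is a hypothetical callable that returns the PDF URL once the job is done and None otherwise.

```python
import time
from typing import Callable, Dict, Optional


class JobStore:
    """In-memory stand-in for the shared database or cache."""

    def __init__(self) -> None:
        self._urls: Dict[str, str] = {}

    def mark_done(self, job_id: str, url: str) -> None:
        # First writer wins; a later duplicate is a no-op.
        self._urls.setdefault(job_id, url)

    def get(self, job_id: str) -> Optional[str]:
        return self._urls.get(job_id)


def wait_for_pdf(
    store: JobStore,
    job_id: str,
    poll_api: Callable[[str], Optional[str]],
    timeout_s: float = 60.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the PDF URL from the store, falling back to the status API."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        url = store.get(job_id)            # written by the webhook handler
        if url is None:
            url = poll_api(job_id)         # fallback: ask the API directly
            if url is not None:
                store.mark_done(job_id, url)
        if url is not None:
            return url
        sleep(interval_s)
    raise TimeoutError(f"Job {job_id} did not complete within {timeout_s}s")


store = JobStore()
store.mark_done("pdf_abc123", "https://example.com/a.pdf")  # webhook fired
print(wait_for_pdf(store, "pdf_abc123", poll_api=lambda job_id: None))
```

The webhook handler would call mark_done, and the waiting code only reaches the status API when the store has no entry yet. The sleep parameter is injectable so the loop can be tested without real delays.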

Error Handling and Retry

Both polling and webhook handlers should implement retry logic for transient failures.

async function downloadPdfWithRetry(url, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch(url);
      if (!response.ok) throw new Error(`HTTP ${response.status}`);
      return Buffer.from(await response.arrayBuffer());
    } catch (err) {
      if (attempt === maxRetries) throw err;
      // Exponential backoff: 1s, 2s, 4s
      await new Promise(r => setTimeout(r, 1000 * 2 ** (attempt - 1)));
    }
  }
}

For comprehensive error handling patterns including rate-limit handling and 5xx recovery, see the PDF API error handling guide.
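
The JavaScript helper above retries on every error. A variant that retries only failures likely to be transient might look like the Python sketch below. The set of retryable status codes is an assumption, not something the API documents here, and fetch is a hypothetical callable returning a (status_code, body) pair so the logic can be exercised without network access.

```python
import time
from typing import Callable, Tuple

# Assumed set of transient statuses: rate limiting and server-side errors.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def fetch_with_retry(
    fetch: Callable[[], Tuple[int, bytes]],
    max_retries: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch() until it succeeds, retrying only transient failures."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(1, max_retries + 1):
        status, body = fetch()
        if 200 <= status < 300:
            return body
        # Give up on non-transient errors or when attempts are exhausted.
        if status not in RETRYABLE_STATUSES or attempt == max_retries:
            raise RuntimeError(f"Download failed with HTTP {status}")
        sleep(base_delay_s * 2 ** (attempt - 1))  # waits double: 1s, then 2s
    raise AssertionError("unreachable")


responses = iter([(503, b""), (200, b"%PDF-1.7")])
print(fetch_with_retry(lambda: next(responses), sleep=lambda s: None))
```

A 404 or 401 is raised on the first attempt under this scheme, since repeating the same request is unlikely to change the outcome.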

Summary

|                            | Polling                | Webhooks |
|----------------------------|------------------------|----------|
| Setup complexity           | Low                    | Medium   |
| Real-time notification     | No (interval delay)    | Yes      |
| Requires public endpoint   | No                     | Yes      |
| Wasted API calls           | Yes (checking status)  | No       |
| Good for scripts/CLI       | Yes                    | No       |
| Good for production apps   | Depends on concurrency | Yes      |
| Idempotency handling       | Not needed             | Required |

Start with polling for simplicity. Move to webhooks when you have multiple concurrent jobs, need real-time status in a production app, or want to minimize unnecessary API calls.
