May 9, 2026

Async PDF Generation Status: Polling vs Webhook

PDF API, webhook, async, polling, API integration

Synchronous PDF generation — POST HTML, wait for the response, receive PDF bytes — works well for simple documents. For large batches, complex templates, or high-concurrency workloads, asynchronous generation is the better model: submit a job, then track its status separately.

There are three strategies for tracking async job status: polling (you check the API repeatedly until the job is done), webhooks (the API notifies your endpoint when the job completes), and — added in the May 31, 2026 release — SSE (Server-Sent Events) (the server streams real-time progress directly to your browser). Each has different trade-offs.

This article covers polling and webhooks with working code. For the SSE implementation in detail, see the PDF Real-Time Progress Streaming Guide. For full webhook setup including Slack integration, see the PDF API Webhook Integration Guide. For batch processing patterns, see the PDF API Batch Processing Guide.

Why Async PDF Generation

Synchronous generation is fine when:

  • You are generating a single PDF per request
  • The generation time is under 5 seconds
  • The client (browser or mobile app) can wait for the response

Async generation becomes necessary when:

  • Templates are complex and rendering takes 10+ seconds
  • You are generating hundreds or thousands of PDFs at once
  • You want to decouple PDF generation from the HTTP request lifecycle
  • You need to retry failed jobs without re-running the full request

For batch jobs, the PDF API batch processing guide covers concurrency control and queue patterns.

Strategy 1: Polling

How Polling Works

1. POST /pdf/generate-async  →  { job_id: "pdf_abc123", status: "queued" }
2. GET /pdf/status/pdf_abc123  →  { status: "processing" }
3. GET /pdf/status/pdf_abc123  →  { status: "processing" }
4. GET /pdf/status/pdf_abc123  →  { status: "done", url: "https://..." }

Your code checks the status endpoint at regular intervals until the job completes or fails.

Basic Polling (Node.js)

async function pollUntilDone(jobId, intervalMs = 2000, maxAttempts = 30) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetch(
      `https://pdf.funbrew.cloud/api/v1/pdf/status/${jobId}`,
      {
        headers: { 'X-API-Key': process.env.FUNBREW_PDF_API_KEY },
      }
    );

    if (!response.ok) throw new Error(`Status check failed: ${response.status}`);

    const { status, url, error } = await response.json();

    if (status === 'done') return url;
    if (status === 'failed') throw new Error(`PDF generation failed: ${error}`);

    // Still processing — wait before next check
    if (attempt < maxAttempts) {
      await new Promise(r => setTimeout(r, intervalMs));
    }
  }

  throw new Error(`PDF job ${jobId} did not complete after ${maxAttempts} attempts`);
}

// Usage
async function generateAndWait(html) {
  // Step 1: Submit the job
  const submitResp = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/generate-async', {
    method: 'POST',
    headers: {
      'X-API-Key': process.env.FUNBREW_PDF_API_KEY,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ html, options: { format: 'A4' } }),
  });
  const { job_id } = await submitResp.json();

  // Step 2: Poll until done
  const pdfUrl = await pollUntilDone(job_id);
  return pdfUrl;
}

Exponential Backoff Polling

A fixed interval wastes requests for long-running jobs and adds latency for quick ones. Exponential backoff adjusts the wait time dynamically.

async function pollWithBackoff(jobId, {
  initialIntervalMs = 500,
  maxIntervalMs = 10000,
  maxAttempts = 20,
} = {}) {
  let intervalMs = initialIntervalMs;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetch(
      `https://pdf.funbrew.cloud/api/v1/pdf/status/${jobId}`,
      { headers: { 'X-API-Key': process.env.FUNBREW_PDF_API_KEY } }
    );

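    // Surface HTTP errors instead of parsing an error body as a job status
    if (!response.ok) throw new Error(`Status check failed: ${response.status}`);

    const { status, url, error } = await response.json();

    if (status === 'done') return url;
    if (status === 'failed') throw new Error(`PDF generation failed: ${error}`);

    // Do not sleep after the final attempt; fall through to the timeout error
    if (attempt === maxAttempts) break;

    // Exponential backoff with jitter
    const jitter = Math.random() * 200;
    await new Promise(r => setTimeout(r, intervalMs + jitter));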

    // Double the interval up to the cap
    intervalMs = Math.min(intervalMs * 2, maxIntervalMs);
  }

  throw new Error(`Polling timed out after ${maxAttempts} attempts`);
}

The jitter (random offset) prevents multiple clients from all checking at the same moment — important when polling from multiple workers.
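To get a feel for what these defaults cost, the wait schedule can be computed ahead of time. The sketch below is illustrative Python rather than part of the FUNBREW PDF API: `backoff_delays` is a hypothetical helper, and it assumes the loop does not sleep after its final status check.

```python
import random


def backoff_delays(initial=0.5, cap=10.0, attempts=20, jitter=0.0, rng=random.random):
    """Return the waits (in seconds) between consecutive status checks."""
    delays = []
    interval = initial
    for _ in range(attempts - 1):  # assumption: no wait after the final attempt
        delays.append(interval + rng() * jitter)
        interval = min(interval * 2, cap)  # double the interval up to the cap
    return delays


delays = backoff_delays()
print(delays[:6])   # [0.5, 1.0, 2.0, 4.0, 8.0, 10.0]
print(sum(delays))  # 155.5
```

With a 0.5 s start, a 10 s cap, and 20 attempts, a job gets a little over two and a half minutes before polling gives up. Jitter of up to 0.2 s per wait adds at most 3.8 s to that total.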

Polling in Python

import os
import time
import requests

def poll_until_done(
    job_id: str,
    initial_interval: float = 0.5,
    max_interval: float = 10.0,
    max_attempts: int = 20,
) -> str:
    interval = initial_interval
    api_key = os.environ["FUNBREW_PDF_API_KEY"]

    for attempt in range(1, max_attempts + 1):
        resp = requests.get(
            f"https://pdf.funbrew.cloud/api/v1/pdf/status/{job_id}",
            headers={"X-API-Key": api_key},
            timeout=10,
        )
        resp.raise_for_status()
        data = resp.json()

        if data["status"] == "done":
            return data["url"]
        if data["status"] == "failed":
            raise RuntimeError(f"PDF job failed: {data.get('error')}")

        if attempt < max_attempts:
            time.sleep(interval)
            interval = min(interval * 2, max_interval)

    raise TimeoutError(f"Job {job_id} did not complete in {max_attempts} attempts")

Polling Trade-offs

Pros of polling:

  • Simple to implement — no public endpoint needed
  • Works in any environment (serverless, CLI scripts, cron jobs)
  • Easy to debug — you can see every status check in logs
  • No webhook infrastructure required

Cons of polling:

  • Wasted requests when the job takes longer than expected
  • Added latency equal to one polling interval after job completion
  • At high concurrency, many simultaneous pollers increase API load
  • Keeping a process alive during polling ties up server resources

Polling is the right choice for low-to-medium concurrency use cases, CLI tools, scripts, and any environment where receiving incoming HTTP requests is not possible.

Strategy 2: Webhooks

How Webhooks Work

1. Configure a webhook URL in the FUNBREW PDF dashboard
2. POST /pdf/generate-async  →  { job_id: "pdf_abc123" }  (returns immediately)
3. [API generates PDF in background]
4. API calls your endpoint: POST /webhooks/pdf  →  { event: "pdf.generated", data: {...} }

Your code receives a push notification when the job completes, rather than checking repeatedly.

Webhook Endpoint (Node.js / Express)

const express = require('express');
const crypto  = require('crypto');
const app     = express();

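// Keep the raw request bytes: the HMAC must be computed over the exact payload
// that was sent, not over a re-serialized copy of the parsed JSON.
app.use(express.json({
  verify: (req, _res, buf) => { req.rawBody = buf; },
}));

app.post('/webhooks/pdf', (req, res) => {
  // Step 1: Verify the signature
  const signature = req.headers['x-funbrew-signature'] || '';
  const expected  = crypto
    .createHmac('sha256', process.env.WEBHOOK_SECRET)
    .update(req.rawBody || '')
    .digest('hex');

  // timingSafeEqual throws on a length mismatch, so compare lengths first
  const sigBuf = Buffer.from(signature, 'hex');
  const expBuf = Buffer.from(expected, 'hex');
  if (sigBuf.length !== expBuf.length || !crypto.timingSafeEqual(sigBuf, expBuf)) {
    return res.status(401).json({ error: 'Invalid signature' });
  }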

  // Step 2: Acknowledge receipt immediately (before processing)
  res.status(200).json({ received: true });

  // Step 3: Handle the event asynchronously
  const { event, data } = req.body;
  handlePdfEvent(event, data).catch(err => console.error('Webhook handler error:', err));
});

async function handlePdfEvent(event, data) {
  switch (event) {
    case 'pdf.generated':
      console.log(`PDF ready: ${data.filename} (${data.file_size} bytes)`);
      // Download, store, or notify the user
      await notifyUser(data.job_id, data.url);
      break;

    case 'pdf.failed':
      console.error(`PDF failed: ${data.job_id} — ${data.error}`);
      await recordFailure(data.job_id, data.error);
      break;

    case 'batch.completed':
      console.log(`Batch done: ${data.succeeded}/${data.total} succeeded`);
      break;
  }
}

app.listen(3000);

Webhook Endpoint (Python / FastAPI)

import hmac
import hashlib
import os
from fastapi import FastAPI, Request, HTTPException, BackgroundTasks

app = FastAPI()

@app.post("/webhooks/pdf")
async def handle_webhook(request: Request, background_tasks: BackgroundTasks):
    body = await request.body()
    signature = request.headers.get("x-funbrew-signature", "")

    # Verify signature
    expected = hmac.new(
        os.environ["WEBHOOK_SECRET"].encode(),
        body,
        hashlib.sha256,
    ).hexdigest()

    if not hmac.compare_digest(signature, expected):
        raise HTTPException(status_code=401, detail="Invalid signature")

    # Queue the processing as a background task, then acknowledge immediately
    payload = await request.json()
    background_tasks.add_task(process_event, payload["event"], payload["data"])

    return {"received": True}

async def process_event(event: str, data: dict):
    if event == "pdf.generated":
        print(f"PDF ready: {data['filename']}")
        await notify_user(data["job_id"], data["url"])
    elif event == "pdf.failed":
        print(f"PDF failed: {data['job_id']} — {data['error']}")

Signature Verification

Always verify the webhook signature before processing the payload. Without verification, anyone who knows your endpoint URL can send fake events.

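// Correct: use timing-safe comparison.
// timingSafeEqual throws if the buffers differ in length, so compare lengths first.
const sigBuf = Buffer.from(signature, 'hex');
const expBuf = Buffer.from(expected, 'hex');
const isValid = sigBuf.length === expBuf.length &&
  crypto.timingSafeEqual(sigBuf, expBuf);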

// Wrong: use === (vulnerable to timing attacks)
// const isValid = signature === expected;

Using timingSafeEqual (Node.js) or hmac.compare_digest (Python) prevents timing attacks where an attacker could infer the expected signature by measuring response time.
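For local testing it helps to be able to produce a valid signature yourself. The following is a minimal sketch, assuming (as the endpoint examples in this article do) that the signature is a hex-encoded HMAC-SHA256 of the raw request body. `sign_payload` and `verify_signature` are hypothetical helper names, and the secret is a placeholder.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed signature scheme)."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, signature: str) -> bool:
    """Constant-time check of a received signature against the expected one."""
    expected = sign_payload(secret, body)
    # Compare as bytes so a non-ASCII header value cannot raise TypeError
    return hmac.compare_digest(signature.encode(), expected.encode())


secret = "test-secret"  # placeholder value for local testing only
body = b'{"event": "pdf.generated", "data": {"job_id": "pdf_abc123"}}'
sig = sign_payload(secret, body)

print(verify_signature(secret, body, sig))         # True
print(verify_signature(secret, body + b" ", sig))  # False
```

The second call fails because a single extra byte changes the digest, which is why the check has to run on the raw bytes as received rather than on a re-serialized copy.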

Handling Webhook Retries

The FUNBREW PDF API retries webhook delivery if your endpoint returns a non-2xx status code or times out. Your endpoint must handle duplicate deliveries gracefully.

// Idempotent handler: check if the job was already processed
async function handlePdfEvent(event, data) {
  if (event !== 'pdf.generated') return;

  // Check if already processed
  const existing = await db.query(
    'SELECT id FROM processed_pdfs WHERE job_id = $1',
    [data.job_id]
  );
  if (existing.rows.length > 0) {
    console.log(`Job ${data.job_id} already processed — skipping`);
    return;
  }

  // Process and mark as done
  await downloadAndStorePdf(data.url, data.filename);
  await db.query(
    'INSERT INTO processed_pdfs (job_id, processed_at) VALUES ($1, NOW())',
    [data.job_id]
  );
}

Store processed job IDs and skip re-processing if you see the same job_id twice. This is the standard idempotency pattern for webhook consumers.
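A check followed by a separate insert leaves a small window in which two concurrent deliveries of the same event can both pass the check. One way to close it is to let a unique key decide who wins. The sketch below uses Python's built-in sqlite3 module purely for illustration; `open_store` and `claim_job` are hypothetical names, and the table layout is an assumption modeled on the `processed_pdfs` table above.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the store of processed job IDs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_pdfs ("
        " job_id TEXT PRIMARY KEY,"
        " processed_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def claim_job(conn, job_id):
    """Return True only for the first caller that claims this job_id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_pdfs (job_id) VALUES (?)",
            (job_id,),
        )
    # rowcount is 1 when the row was inserted, 0 when the key already existed
    return cur.rowcount == 1


conn = open_store()
print(claim_job(conn, "pdf_abc123"))  # True
print(claim_job(conn, "pdf_abc123"))  # False
```

Claiming before processing trades one problem for another: if the download then fails, the claim has to be released (or the row marked as failed) so that a later retry can pick the job up again.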

Webhook Trade-offs

Pros of webhooks:

  • Real-time notification — no polling delay
  • No wasted requests on checking status
  • Scales well: 1,000 concurrent jobs generate 1,000 delivery calls, rather than N × 1,000 polling calls (N being the number of status checks each job needs)
  • Your server is free while the PDF generates

Cons of webhooks:

  • Requires a publicly accessible HTTP endpoint
  • Not suitable for CLI scripts or one-off jobs
  • More complex to set up (endpoint, signature verification, retry handling)
  • Local development requires a tunnel (ngrok, etc.)
  • You must handle idempotency (duplicate deliveries)

Webhooks are the right choice for production applications where you are processing many PDFs and want real-time status updates with minimal API load.
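The difference in API load is easy to estimate. The numbers below are assumptions chosen for illustration (1,000 jobs, 30 seconds per job, a fixed 2-second polling interval), not measurements of the FUNBREW PDF API, and both function names are hypothetical.

```python
import math


def polling_requests(jobs, job_seconds, interval_seconds):
    """Status checks needed when every job is polled at a fixed interval."""
    checks_per_job = math.ceil(job_seconds / interval_seconds)
    return jobs * checks_per_job


def webhook_deliveries(jobs):
    """One delivery per job, ignoring retries."""
    return jobs


print(polling_requests(1000, 30, 2))  # 15000
print(webhook_deliveries(1000))       # 1000
```

Under these assumptions polling costs fifteen times as many requests. Backoff narrows the gap, but it cannot reach one call per job.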

Choosing Between Polling and Webhooks

| Situation | Recommendation |
| --- | --- |
| Script or CLI tool | Polling |
| Serverless function triggered per request | Polling |
| Background worker processing many jobs | Webhooks |
| Real-time user notification required | Webhooks |
| Local development or testing | Polling (simpler) |
| High concurrency (100+ concurrent jobs) | Webhooks |
| Public endpoint not available | Polling |

For most production web applications, webhooks are preferable. For scripting and tooling, polling is simpler. Both strategies compose well: use polling in development and webhooks in production.

Strategy 3: SSE (Server-Sent Events)

The May 31, 2026 release added SSE endpoints as a third option alongside polling and webhooks.

How SSE Works

1. POST /pdf/generate-async  →  response header contains X-Job-Id
2. GET /api/pdf/jobs/{jobId}/events  →  connect with Accept: text/event-stream
3. Server pushes progress events for each processing phase
4. Stream closes automatically on done or failed

Your browser subscribes directly using the native EventSource API — no public HTTP server required on your side.

SSE Characteristics

  • One-way stream from server to browser — no need for WebSocket-style bidirectional communication
  • Lower latency than polling — events arrive at each phase boundary, no polling interval overhead
  • No receiving server required — unlike webhooks, the browser subscribes directly via EventSource
  • Fine-grained phase progress — receive queued / preprocessing / rendering / postprocess / uploading / done events in real time
  • Built-in reconnection safety — browsers automatically send Last-Event-ID on reconnect, so no events are missed
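On the wire, an SSE stream is plain text: `event:`, `data:`, and `id:` lines, with a blank line ending each event. The sketch below parses that framing in Python to show what a non-browser client would see and where the value sent back as Last-Event-ID comes from. The `progress` event name and the `phase` and `progress` fields follow the browser example in this article; the `id` values and the sample stream are made up for illustration.

```python
import json


def parse_sse(lines):
    """Yield one dict per event from an iterable of decoded SSE text lines."""
    event, data, last_id = "message", [], None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line dispatches the buffered event
            if data:
                yield {"event": event, "id": last_id, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):  # comment line, often used as a keep-alive
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is part of the framing
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
        elif field == "id":
            last_id = value  # the ID a client reports after reconnecting


sample = [
    "id: 1", "event: progress",
    'data: {"phase": "rendering", "progress": 0.4}', "",
    ": keep-alive",
    "id: 2", "event: progress",
    'data: {"phase": "done", "progress": 1.0}', "",
]

for evt in parse_sse(sample):
    payload = json.loads(evt["data"])
    print(evt["id"], evt["event"], payload["phase"])
```

A client that reconnects sends the last `id` it saw, which lets a server that keeps recent events replay anything delivered after that point.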

Browser Implementation (EventSource)

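// Step 1 (on your server): submit a PDF job to get a job_id.
// The API key is a secret, so this call belongs in backend code, not in the browser.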
const submitResp = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/generate-async', {
  method: 'POST',
  headers: {
    'X-API-Key': process.env.FUNBREW_PDF_API_KEY,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ html, options: { format: 'A4' } }),
});
const { job_id } = await submitResp.json();

// Step 2 (in the browser): subscribe to progress via SSE,
// using the job_id your backend passed to the page
const es = new EventSource(`/api/pdf/jobs/${job_id}/events`);

es.addEventListener('progress', (event) => {
  const { phase, progress, message } = JSON.parse(event.data);

  // Update a progress bar (0.0 to 1.0)
  updateProgressBar(progress);
  showStatus(phase, message);

  if (phase === 'done' || phase === 'failed') {
    es.close();
  }
});

For full implementation details including curl, Node.js, and SDK examples, see the PDF Real-Time Progress Streaming Guide.

Decision Matrix: All Three Strategies

| Scenario | Recommendation |
| --- | --- |
| Browser UI with live progress bar | SSE |
| Server-to-server reliable delivery with idempotency | Webhook |
| Simple job check, CLI script, or cron job | Polling |
| No receiving server available | SSE or Polling |
| Long-running batch (minutes to tens of minutes) | SSE + Webhook combined |
| High concurrency (100+ simultaneous jobs) | Webhook (SSE has connection limits) |

Combining SSE and Webhooks: SSE handles the in-browser experience while the job runs; webhooks handle reliable backend processing after completion. Using both together is the recommended production pattern.

Hybrid: Webhooks with Polling Fallback

For critical jobs where you cannot miss a completion event, combine both:

async function generatePdfReliably(html) {
  // submitJob and checkStatus wrap the submit and status calls shown earlier
  const { job_id } = await submitJob(html);

  // Primary: wait for webhook (handled separately)
  // Fallback: poll in case webhook delivery fails
  const WEBHOOK_TIMEOUT_MS = 60_000; // 60 seconds
  const startTime = Date.now();

  while (Date.now() - startTime < WEBHOOK_TIMEOUT_MS) {
    const { status, url } = await checkStatus(job_id);

    if (status === 'done') return url;
    if (status === 'failed') throw new Error('PDF generation failed');

    await new Promise(r => setTimeout(r, 5_000)); // 5-second poll interval
  }

  throw new Error(`Job ${job_id} did not complete within timeout`);
}

In this pattern, the webhook handler updates a shared store (database or cache) when it fires. The polling loop reads from that store. Whichever arrives first wins, and the other is a no-op.
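The shared-store idea can be sketched as follows. This is an illustration under stated assumptions rather than the API's prescribed approach: the store is a plain dict standing in for a database or cache, `check_status` is a caller-supplied function that returns the status payload, and `record_completion` and `wait_for_result` are hypothetical names.

```python
import time


def record_completion(store, job_id, url):
    """Called by the webhook handler when a pdf.generated event arrives."""
    store.setdefault(job_id, url)  # first writer wins, repeats are no-ops


def wait_for_result(job_id, store, check_status, timeout=60.0, interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Return the PDF URL from whichever path reports completion first."""
    deadline = clock() + timeout
    while True:
        # Webhook path: the handler may already have recorded the result
        if job_id in store:
            return store[job_id]
        # Fallback path: ask the status endpoint directly
        result = check_status(job_id)
        if result["status"] == "done":
            record_completion(store, job_id, result["url"])
            return store[job_id]
        if result["status"] == "failed":
            raise RuntimeError(f"PDF job {job_id} failed")
        if clock() >= deadline:
            raise TimeoutError(f"Job {job_id} did not complete within {timeout}s")
        sleep(interval)
```

Because both paths write through `record_completion`, a late webhook for a job that polling already resolved changes nothing, and the reverse holds too.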

Error Handling and Retry

Both polling and webhook handlers should implement retry logic for transient failures.

async function downloadPdfWithRetry(url, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch(url);
      if (!response.ok) throw new Error(`HTTP ${response.status}`);
      return Buffer.from(await response.arrayBuffer());
    } catch (err) {
      if (attempt === maxRetries) throw err;
      // Exponential backoff: 1s after the first failure, 2s after the second
      await new Promise(r => setTimeout(r, 1000 * 2 ** (attempt - 1)));
    }
  }
}

For comprehensive error handling patterns including rate-limit handling and 5xx recovery, see the PDF API error handling guide.
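Not every failure is worth retrying. As a rough sketch of that distinction, the Python helper below retries only errors that are usually transient (connection problems, timeouts, HTTP 429, and 5xx) and fails fast on everything else. `HttpError`, `is_transient`, and `with_retry` are hypothetical names, and the classification is a common convention, not something the FUNBREW PDF API specifies.

```python
import time


class HttpError(Exception):
    """Raised by the caller's request function for non-2xx responses."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_transient(exc):
    """Assumption: rate limits, server errors, and network faults may succeed later."""
    if isinstance(exc, HttpError):
        return exc.status == 429 or exc.status >= 500
    return isinstance(exc, (ConnectionError, TimeoutError))


def with_retry(fn, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying transient failures with exponential backoff."""
    for attempt in range(1, max_retries + 1):
        try:
            return fn()
        except Exception as exc:
            # Give up on the last attempt or on errors a retry cannot fix
            if attempt == max_retries or not is_transient(exc):
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

A 404 on the PDF URL, for example, is raised immediately, while a 503 is retried after 1 s and then 2 s before the error reaches the caller.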

Summary

| | Polling | Webhooks | SSE |
| --- | --- | --- | --- |
| Setup complexity | Low | Medium | Low–Medium |
| Real-time notification | No (interval delay) | Yes | Yes |
| Requires public endpoint | No | Yes | No |
| Wasted API calls | Yes (checking status) | No | No |
| Works directly in browser | Yes | No | Yes |
| Intermediate progress (%) | No | No | Yes |
| Good for scripts/CLI | Yes | No | Possible |
| Good for production apps | Depends on concurrency | Yes | Ideal for UI |
| Idempotency handling | Not needed | Required | Not needed |

Start with polling for simplicity. Use SSE when you need a live progress bar in the browser. Move to webhooks when you have multiple concurrent jobs, need reliable server-to-server delivery, or want to minimize unnecessary API calls. For production, combining SSE for the UI experience and webhooks for backend processing is the recommended pattern.
