Async PDF Generation Status: Polling vs Webhook
Synchronous PDF generation — POST HTML, wait for the response, receive PDF bytes — works well for simple documents. For large batches, complex templates, or high-concurrency workloads, asynchronous generation is the better model: submit a job, then track its status separately.
There are two strategies for tracking async job status: polling (you check the API repeatedly until the job is done) and webhooks (the API notifies your endpoint when the job completes). Each has different trade-offs.
This article covers both strategies with working code. For full webhook setup including Slack integration, see the PDF API Webhook Integration Guide. For batch processing patterns, see the PDF API Batch Processing Guide.
Why Async PDF Generation
Synchronous generation is fine when:
- You are generating a single PDF per request
- The generation time is under 5 seconds
- The client (browser or mobile app) can wait for the response
Async generation becomes necessary when:
- Templates are complex and rendering takes 10+ seconds
- You are generating hundreds or thousands of PDFs at once
- You want to decouple PDF generation from the HTTP request lifecycle
- You need to retry failed jobs without re-running the full request
For batch jobs, the PDF API batch processing guide covers concurrency control and queue patterns.
Strategy 1: Polling
How Polling Works
1. POST /pdf/generate → { job_id: "pdf_abc123", status: "queued" }
2. GET /pdf/status/pdf_abc123 → { status: "processing" }
3. GET /pdf/status/pdf_abc123 → { status: "processing" }
4. GET /pdf/status/pdf_abc123 → { status: "done", url: "https://..." }
Your code checks the status endpoint at regular intervals until the job completes or fails.
Basic Polling (Node.js)
async function pollUntilDone(jobId, intervalMs = 2000, maxAttempts = 30) {
for (let attempt = 1; attempt <= maxAttempts; attempt++) {
const response = await fetch(
`https://pdf.funbrew.cloud/api/v1/pdf/status/${jobId}`,
{
headers: { 'X-API-Key': process.env.FUNBREW_PDF_API_KEY },
}
);
if (!response.ok) throw new Error(`Status check failed: ${response.status}`);
const { status, url, error } = await response.json();
if (status === 'done') return url;
if (status === 'failed') throw new Error(`PDF generation failed: ${error}`);
// Still processing — wait before next check
if (attempt < maxAttempts) {
await new Promise(r => setTimeout(r, intervalMs));
}
}
throw new Error(`PDF job ${jobId} did not complete after ${maxAttempts} attempts`);
}
// Usage
async function generateAndWait(html) {
// Step 1: Submit the job
const submitResp = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/generate-async', {
method: 'POST',
headers: {
'X-API-Key': process.env.FUNBREW_PDF_API_KEY,
'Content-Type': 'application/json',
},
body: JSON.stringify({ html, options: { format: 'A4' } }),
});
if (!submitResp.ok) throw new Error(`Job submission failed: ${submitResp.status}`);
const { job_id } = await submitResp.json();
// Step 2: Poll until done
const pdfUrl = await pollUntilDone(job_id);
return pdfUrl;
}
Exponential Backoff Polling
A fixed interval wastes requests for long-running jobs and adds latency for quick ones. Exponential backoff adjusts the wait time dynamically.
async function pollWithBackoff(jobId, {
  initialIntervalMs = 500,
  maxIntervalMs = 10000,
  maxAttempts = 20,
} = {}) {
  let intervalMs = initialIntervalMs;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetch(
      `https://pdf.funbrew.cloud/api/v1/pdf/status/${jobId}`,
      { headers: { 'X-API-Key': process.env.FUNBREW_PDF_API_KEY } }
    );
    // Surface HTTP errors instead of parsing an error body as a job status
    if (!response.ok) throw new Error(`Status check failed: ${response.status}`);
    const { status, url, error } = await response.json();
    if (status === 'done') return url;
    if (status === 'failed') throw new Error(`PDF generation failed: ${error}`);
    // No point waiting after the final attempt
    if (attempt === maxAttempts) break;
    // Exponential backoff with jitter
    const jitter = Math.random() * 200;
    await new Promise(r => setTimeout(r, intervalMs + jitter));
    // Double the interval up to the cap
    intervalMs = Math.min(intervalMs * 2, maxIntervalMs);
  }
  throw new Error(`Polling timed out after ${maxAttempts} attempts`);
}
The jitter (random offset) prevents multiple clients from all checking at the same moment — important when polling from multiple workers.
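To see what a doubling-with-cap schedule looks like in numbers, the following sketch lists the wait scheduled after each attempt. `backoff_delays` is an illustrative helper, not part of the FUNBREW API, and jitter is left out so the output is deterministic.

```python
def backoff_delays(initial: float, cap: float, attempts: int) -> list[float]:
    """Return the wait (in seconds) scheduled after each attempt.

    The wait doubles after every attempt and never exceeds `cap`.
    """
    delays = []
    interval = initial
    for _ in range(attempts):
        delays.append(interval)
        interval = min(interval * 2, cap)
    return delays


delays = backoff_delays(initial=0.5, cap=10.0, attempts=6)
print(delays)       # [0.5, 1.0, 2.0, 4.0, 8.0, 10.0]
print(sum(delays))  # 25.5
```

With a 500 ms start and a 10-second cap, the cap is reached on the sixth wait. A job that finishes within a few seconds is noticed quickly, while a slow job costs roughly one request every ten seconds.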
Polling in Python
import os
import time
import requests

def poll_until_done(
    job_id: str,
    initial_interval: float = 0.5,
    max_interval: float = 10.0,
    max_attempts: int = 20,
) -> str:
    interval = initial_interval
    api_key = os.environ["FUNBREW_PDF_API_KEY"]
    for attempt in range(1, max_attempts + 1):
        resp = requests.get(
            f"https://pdf.funbrew.cloud/api/v1/pdf/status/{job_id}",
            headers={"X-API-Key": api_key},
            timeout=10,
        )
        resp.raise_for_status()
        data = resp.json()
        if data["status"] == "done":
            return data["url"]
        if data["status"] == "failed":
            raise RuntimeError(f"PDF job failed: {data.get('error')}")
        if attempt < max_attempts:
            time.sleep(interval)
            interval = min(interval * 2, max_interval)
    raise TimeoutError(f"Job {job_id} did not complete in {max_attempts} attempts")
Polling Trade-offs
Pros of polling:
- Simple to implement — no public endpoint needed
- Works in any environment (serverless, CLI scripts, cron jobs)
- Easy to debug — you can see every status check in logs
- No webhook infrastructure required
Cons of polling:
- Wasted requests when the job takes longer than expected
- Added latency of up to one polling interval after the job completes
- At high concurrency, many simultaneous pollers increase API load
- Keeping a process alive during polling ties up server resources
Polling is the right choice for low-to-medium concurrency use cases, CLI tools, scripts, and any environment where receiving incoming HTTP requests is not possible.
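The cost of a fixed interval can be estimated with simple arithmetic. The sketch below assumes the first check happens one interval after submission and that each check is instantaneous; `fixed_interval_cost` is an illustrative helper, not an API function.

```python
import math


def fixed_interval_cost(job_seconds: float, interval_seconds: float) -> tuple[int, float]:
    """Return (status_checks, added_latency_seconds) for one job.

    Assumes checks happen at interval, 2*interval, ... and that the job is
    detected by the first check at or after the moment it finishes.
    """
    checks = max(1, math.ceil(job_seconds / interval_seconds))
    detected_at = checks * interval_seconds
    return checks, detected_at - job_seconds


# A 7-second job polled every 2 seconds
print(fixed_interval_cost(7, 2))   # (4, 1)
# A 45-second job polled every 2 seconds
print(fixed_interval_cost(45, 2))  # (23, 1)
```

Under these assumptions, 1,000 jobs of 45 seconds each would generate 23,000 status checks at a 2-second interval.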
Strategy 2: Webhooks
How Webhooks Work
1. Configure a webhook URL in the FUNBREW PDF dashboard
2. POST /pdf/generate → { job_id: "pdf_abc123" } (returns immediately)
3. [API generates PDF in background]
4. API calls your endpoint: POST /webhooks/pdf → { event: "pdf.generated", data: {...} }
Your code receives a push notification when the job completes, rather than checking repeatedly.
Webhook Endpoint (Node.js / Express)
const express = require('express');
const crypto = require('crypto');
const app = express();
// Keep the raw bytes: the signature is checked against the exact body received,
// and JSON.stringify(req.body) is not guaranteed to reproduce it byte for byte
app.use(express.json({ verify: (req, _res, buf) => { req.rawBody = buf; } }));
app.post('/webhooks/pdf', (req, res) => {
  // Step 1: Verify the signature
  const signature = Buffer.from(req.headers['x-funbrew-signature'] || '', 'hex');
  const expected = crypto
    .createHmac('sha256', process.env.WEBHOOK_SECRET)
    .update(req.rawBody || '')
    .digest();
  // timingSafeEqual throws on a length mismatch, so check the length first
  if (signature.length !== expected.length || !crypto.timingSafeEqual(signature, expected)) {
    return res.status(401).json({ error: 'Invalid signature' });
  }
  // Step 2: Acknowledge receipt immediately (before processing)
  res.status(200).json({ received: true });
  // Step 3: Handle the event asynchronously
  const { event, data } = req.body;
  handlePdfEvent(event, data).catch(err => console.error('Webhook handler error:', err));
});
async function handlePdfEvent(event, data) {
switch (event) {
case 'pdf.generated':
console.log(`PDF ready: ${data.filename} (${data.file_size} bytes)`);
// Download, store, or notify the user
await notifyUser(data.job_id, data.url);
break;
case 'pdf.failed':
console.error(`PDF failed: ${data.job_id} — ${data.error}`);
await recordFailure(data.job_id, data.error);
break;
case 'batch.completed':
console.log(`Batch done: ${data.succeeded}/${data.total} succeeded`);
break;
}
}
app.listen(3000);
Webhook Endpoint (Python / FastAPI)
import hmac
import hashlib
import os
from fastapi import FastAPI, Request, HTTPException, BackgroundTasks

app = FastAPI()

@app.post("/webhooks/pdf")
async def handle_webhook(request: Request, background_tasks: BackgroundTasks):
    body = await request.body()
    signature = request.headers.get("x-funbrew-signature", "")
    # Verify signature
    expected = hmac.new(
        os.environ["WEBHOOK_SECRET"].encode(),
        body,
        hashlib.sha256,
    ).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise HTTPException(status_code=401, detail="Invalid signature")
    # Acknowledge immediately
    payload = await request.json()
    background_tasks.add_task(process_event, payload["event"], payload["data"])
    return {"received": True}

async def process_event(event: str, data: dict):
    if event == "pdf.generated":
        print(f"PDF ready: {data['filename']}")
        await notify_user(data["job_id"], data["url"])
    elif event == "pdf.failed":
        print(f"PDF failed: {data['job_id']}: {data['error']}")
Signature Verification
Always verify the webhook signature before processing the payload. Without verification, anyone who knows your endpoint URL can send fake events.
// Correct: use timing-safe comparison
const isValid = crypto.timingSafeEqual(
Buffer.from(signature, 'hex'),
Buffer.from(expected, 'hex')
);
// Wrong: use === (vulnerable to timing attacks)
// const isValid = signature === expected;
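Using timingSafeEqual (Node.js) or hmac.compare_digest (Python) prevents timing attacks where an attacker could infer the expected signature by measuring response time. Note that timingSafeEqual throws if the two buffers differ in length, so compare the lengths first or catch the error and treat it as an invalid signature.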
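A self-contained sketch of the check is useful for testing an endpoint locally. It assumes, as the endpoint examples in this article do, that the signature is a hex-encoded HMAC-SHA256 of the raw request body. The secret and payload are placeholder values, and `sign` and `is_valid` are illustrative helper names.

```python
import hashlib
import hmac
import json


def sign(body: bytes, secret: str) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def is_valid(body: bytes, signature: str, secret: str) -> bool:
    """Constant-time check of a received signature against the expected one."""
    return hmac.compare_digest(signature, sign(body, secret))


secret = "test-secret"  # placeholder value for the example
raw = b'{"event":"pdf.generated","data":{"job_id":"pdf_abc123"}}'
signature = sign(raw, secret)

print(is_valid(raw, signature, secret))  # True

# Parsing and re-serializing changes whitespace, so the bytes no longer match.
reserialized = json.dumps(json.loads(raw)).encode()
print(is_valid(reserialized, signature, secret))  # False
```

The second check fails even though the JSON content is identical, which is why verification should run over the bytes exactly as they arrived rather than over a re-serialized object.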
Handling Webhook Retries
The FUNBREW PDF API retries webhook delivery if your endpoint returns a non-2xx status code or times out. Your endpoint must handle duplicate deliveries gracefully.
// Idempotent handler: check if the job was already processed
async function handlePdfEvent(event, data) {
if (event !== 'pdf.generated') return;
// Check if already processed
const existing = await db.query(
'SELECT id FROM processed_pdfs WHERE job_id = $1',
[data.job_id]
);
if (existing.rows.length > 0) {
console.log(`Job ${data.job_id} already processed — skipping`);
return;
}
// Process and mark as done
await downloadAndStorePdf(data.url, data.filename);
await db.query(
'INSERT INTO processed_pdfs (job_id, processed_at) VALUES ($1, NOW())',
[data.job_id]
);
}
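Store processed job IDs and skip re-processing if you see the same job_id twice. This is the standard idempotency pattern for webhook consumers. Because the check and the insert are separate statements, two deliveries that arrive at the same moment can both pass the check; a unique constraint on job_id closes that gap.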
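One way to make the duplicate check atomic is to let the database decide who processes a job. The sketch below uses an in-memory SQLite database as a stand-in for the application database; the table layout and the `claim_job` and `handle_pdf_generated` helpers are assumptions for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for the application database
db.execute("CREATE TABLE processed_pdfs (job_id TEXT PRIMARY KEY)")


def claim_job(job_id: str) -> bool:
    """Record job_id; return True only for the first caller."""
    cur = db.execute(
        "INSERT OR IGNORE INTO processed_pdfs (job_id) VALUES (?)", (job_id,)
    )
    db.commit()
    return cur.rowcount == 1


def handle_pdf_generated(data: dict, process) -> bool:
    """Run `process(data)` at most once per job_id; return True if it ran."""
    if not claim_job(data["job_id"]):
        return False  # duplicate delivery
    try:
        process(data)
    except Exception:
        # Release the claim so a later retry can process the job.
        db.execute("DELETE FROM processed_pdfs WHERE job_id = ?", (data["job_id"],))
        db.commit()
        raise
    return True


seen = []
event = {"job_id": "pdf_abc123", "url": "https://example.com/a.pdf"}
print(handle_pdf_generated(event, seen.append))  # True
print(handle_pdf_generated(event, seen.append))  # False
print(len(seen))                                 # 1
```

The primary key makes the insert itself the duplicate check, so there is no window between checking and marking. In PostgreSQL the equivalent statement would be `INSERT ... ON CONFLICT DO NOTHING`.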
Webhook Trade-offs
Pros of webhooks:
- Real-time notification — no polling delay
- No wasted requests on checking status
- Scales well: 1,000 concurrent jobs generate 1,000 delivery calls, rather than 1,000 × N polling calls, where N is the number of status checks per job
- Your server is free while the PDF generates
Cons of webhooks:
- Requires a publicly accessible HTTP endpoint
- Not suitable for CLI scripts or one-off jobs
- More complex to set up (endpoint, signature verification, retry handling)
- Local development requires a tunnel (ngrok, etc.)
- You must handle idempotency (duplicate deliveries)
Webhooks are the right choice for production applications where you are processing many PDFs and want real-time status updates with minimal API load.
Choosing Between Polling and Webhooks
| Situation | Recommendation |
|---|---|
| Script or CLI tool | Polling |
| Serverless function triggered per request | Polling |
| Background worker processing many jobs | Webhooks |
| Real-time user notification required | Webhooks |
| Local development or testing | Polling (simpler) |
| High concurrency (100+ concurrent jobs) | Webhooks |
| Public endpoint not available | Polling |
For most production web applications, webhooks are preferable. For scripting and tooling, polling is simpler. Both strategies compose well: use polling in development and webhooks in production.
Hybrid: Webhooks with Polling Fallback
For critical jobs where you cannot miss a completion event, combine both:
async function generatePdfReliably(html) {
const { job_id } = await submitJob(html);
// Primary: wait for webhook (handled separately)
// Fallback: poll in case webhook delivery fails
const WEBHOOK_TIMEOUT_MS = 60_000; // 60 seconds
const startTime = Date.now();
while (Date.now() - startTime < WEBHOOK_TIMEOUT_MS) {
const { status, url } = await checkStatus(job_id);
if (status === 'done') return url;
if (status === 'failed') throw new Error('PDF generation failed');
await new Promise(r => setTimeout(r, 5_000)); // 5-second poll interval
}
throw new Error(`Job ${job_id} did not complete within timeout`);
}
In this pattern, the webhook handler updates a shared store (database or cache) when it fires. The polling loop reads from that store. Whichever arrives first wins, and the other is a no-op.
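The shared-store idea can be sketched as follows. A plain dictionary stands in for the database or cache, and `check_status` is an injected callable that represents a call to the status endpoint. All names here are illustrative assumptions rather than part of the API.

```python
import time

completed = {}  # job_id -> PDF URL; stand-in for a database or cache


def on_webhook(event: str, data: dict) -> None:
    """Webhook handler: record the result so waiters can see it."""
    if event == "pdf.generated":
        completed.setdefault(data["job_id"], data["url"])


def wait_for_pdf(job_id, check_status, timeout=60.0, interval=5.0, sleep=time.sleep):
    """Return the PDF URL from the store or, failing that, the status API."""
    deadline = time.monotonic() + timeout
    while True:
        if job_id in completed:            # webhook already arrived
            return completed[job_id]
        status = check_status(job_id)      # fallback: ask the API directly
        if status["status"] == "done":
            return completed.setdefault(job_id, status["url"])
        if status["status"] == "failed":
            raise RuntimeError(f"PDF job {job_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} did not complete within {timeout}s")
        sleep(interval)


# Simulated run: the API still reports "processing", then the webhook fires.
def fake_status(job_id):
    return {"status": "processing"}

def fake_sleep(_seconds):
    on_webhook("pdf.generated", {"job_id": "pdf_abc123", "url": "https://example.com/a.pdf"})

print(wait_for_pdf("pdf_abc123", fake_status, sleep=fake_sleep))
# https://example.com/a.pdf
```

Using `setdefault` on both paths means whichever result lands first is kept and the later one changes nothing.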
Error Handling and Retry
Both polling and webhook handlers should implement retry logic for transient failures.
async function downloadPdfWithRetry(url, maxRetries = 3) {
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
const response = await fetch(url);
if (!response.ok) throw new Error(`HTTP ${response.status}`);
return Buffer.from(await response.arrayBuffer());
} catch (err) {
if (attempt === maxRetries) throw err;
// Exponential backoff: 1s, 2s, 4s
await new Promise(r => setTimeout(r, 1000 * 2 ** (attempt - 1)));
}
}
}
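The function above retries on every error, including responses that will not succeed on a second try. A minimal sketch of a stricter variant retries only failures the caller has marked as transient. Which failures count as transient (timeouts, HTTP 429, HTTP 5xx) is an assumption here, and `TransientError` and `retry` are illustrative names.

```python
import time


class TransientError(Exception):
    """Raised for failures worth retrying (timeouts, HTTP 429, HTTP 5xx)."""


def retry(operation, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call `operation()` until it succeeds, retrying only TransientError.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Example: an operation that fails twice, then succeeds.
calls = {"count": 0}

def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("HTTP 503")
    return b"%PDF-1.7"

waits = []
print(retry(flaky_download, sleep=waits.append))  # b'%PDF-1.7'
print(waits)                                       # [1.0, 2.0]
```

Any other exception propagates immediately, so a permanent failure such as a missing file is reported without burning the retry budget.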
For comprehensive error handling patterns including rate-limit handling and 5xx recovery, see the PDF API error handling guide.
Summary
| | Polling | Webhooks |
|---|---|---|
| Setup complexity | Low | Medium |
| Real-time notification | No (interval delay) | Yes |
| Requires public endpoint | No | Yes |
| Wasted API calls | Yes (checking status) | No |
| Good for scripts/CLI | Yes | No |
| Good for production apps | Depends on concurrency | Yes |
| Idempotency handling | Not needed | Required |
Start with polling for simplicity. Move to webhooks when you have multiple concurrent jobs, need real-time status in a production app, or want to minimize unnecessary API calls.
Related
- PDF API Webhook Integration Guide — Full webhook setup with Slack notifications and PHP/Python/Node.js endpoints
- PDF API Batch Processing Guide — Concurrency control and queue patterns for generating many PDFs
- PDF API Error Handling Guide — Retry logic, rate-limit handling, and resilience patterns
- PDF API Production Guide — Production hardening checklist
- PDF API Quickstart by Language — First API call in Node.js, Python, PHP, Ruby, Go