Getting PDF API integration to work in development is one thing. Keeping it stable in production is another. APIs that behave perfectly in testing can start timing out under traffic spikes, rate limit violations can halt batch jobs mid-run, and unexpected costs can show up at the end of the month.
This article is a production checklist for teams deploying FUNBREW PDF or any PDF generation API to production. It integrates and extends the detailed guidance from three companion articles — error handling, security, and batch processing — into a single pre-launch reference.
If you are new to PDF APIs, start with the HTML to PDF complete guide or the quickstart by language.
## 1. API Key Management and Rotation
Your API key is the credential to your PDF generation service. A leaked key means unauthorized usage billed to your account and potential exposure of generated documents.
### Separate Keys per Environment

```bash
# Production
FUNBREW_PDF_API_KEY=sk-prod-xxxxxxxxxxxxxxxxxxxx

# Staging
FUNBREW_PDF_API_KEY=sk-stg-xxxxxxxxxxxxxxxxxxxx

# Local development
FUNBREW_PDF_API_KEY=sk-dev-xxxxxxxxxxxxxxxxxxxx
```
Sharing one key across environments means developer laptops consume production quota and mistakes in local testing can affect production state.
### Secret Management Tools
.env files work fine for small teams. As the team grows, adopt a dedicated secrets manager.
| Tool | Best for |
|---|---|
| AWS Secrets Manager | Applications hosted on AWS |
| HashiCorp Vault | Multi-cloud or on-premise environments |
| Doppler | Small-to-medium teams wanting centralized secrets |
| GitHub Actions Secrets | CI/CD pipelines only |
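Whichever tool holds the secret, resolve the key through a single code path at application startup. A minimal sketch, assuming an env-var-first fallback to AWS Secrets Manager (the fallback order and region are choices to adapt; the secret name matches the rotation example below):

```python
import os

def get_pdf_api_key(secret_id: str = "funbrew-pdf-api-key-prod") -> str:
    """Resolve the API key: environment variable first, then AWS Secrets Manager."""
    key = os.environ.get("FUNBREW_PDF_API_KEY")
    if key:
        return key
    # Fall back to Secrets Manager; boto3 is imported lazily so local
    # development without AWS credentials works off the env var alone.
    import boto3
    client = boto3.client("secretsmanager", region_name="us-east-1")
    return client.get_secret_value(SecretId=secret_id)["SecretString"]
```

Centralizing key lookup also means rotation only has to update one place.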
### Automating Key Rotation

```python
import boto3

def rotate_pdf_api_key():
    """Rotate the API key using AWS Secrets Manager."""
    client = boto3.client('secretsmanager', region_name='us-east-1')

    # Provision a new key from your dashboard API
    new_key = provision_new_api_key()

    # Update Secrets Manager
    client.put_secret_value(
        SecretId='funbrew-pdf-api-key-prod',
        SecretString=new_key,
    )

    # Invalidate the old key after an overlap window
    print("API key rotation complete")

# Run on a 90-day schedule via CloudWatch Events
rotate_pdf_api_key()
```
Rotate every 90 days. When rotating, keep the old key valid for a few hours to avoid dropping in-flight requests during deployment.
For the full security guide including IP restrictions and input validation, see PDF API Security Guide.
### Checklist

- Separate API keys for production, staging, and development
- `.env` excluded from version control via `.gitignore`
- No API keys in frontend JavaScript
- 90-day rotation schedule established
- Key revocation procedure documented
## 2. Rate Limits and Application-Side Throttling
PDF generation APIs enforce per-minute and per-day request limits depending on your plan. Exceeding them returns `429 Too Many Requests`, which halts your processing.
### Inspect Your Rate Limit Headers

```bash
# Check rate limit status from response headers
# (-D - dumps headers to stdout; -o /dev/null discards the PDF body)
curl -s -D - -o /dev/null -X POST "https://pdf.funbrew.cloud/api/v1/pdf/generate" \
  -H "X-API-Key: $FUNBREW_PDF_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"html": "<h1>test</h1>"}' | grep -i "x-rate"

# Example response headers:
# X-RateLimit-Limit: 100
# X-RateLimit-Remaining: 87
# X-RateLimit-Reset: 1743465600
```
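In application code, the same headers can drive a simple headroom check before each batch. A small sketch using the header names from the example above:

```python
def rate_limit_headroom(headers: dict) -> float:
    """Return the fraction of the per-minute quota still available (0.0 to 1.0)."""
    limit = int(headers.get("X-RateLimit-Limit", 0))
    remaining = int(headers.get("X-RateLimit-Remaining", 0))
    if limit <= 0:
        return 0.0  # Headers missing or malformed: assume no headroom
    return remaining / limit
```

A headroom below 0.3 is a reasonable point to pause non-urgent batch work, matching the monitoring thresholds in section 4.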
### Implement a Token Bucket on Your Side
Rather than hitting the API limit and handling 429 responses reactively, throttle proactively from your application.
```javascript
// Token bucket rate limiter (Node.js)
class RateLimiter {
  constructor(requestsPerMinute) {
    this.tokens = requestsPerMinute;
    this.maxTokens = requestsPerMinute;
    this.refillRate = requestsPerMinute / 60; // tokens per second
    this.lastRefill = Date.now();
  }

  async acquire() {
    this._refill();
    if (this.tokens < 1) {
      const waitMs = ((1 - this.tokens) / this.refillRate) * 1000;
      await new Promise(resolve => setTimeout(resolve, waitMs));
      this._refill();
    }
    this.tokens -= 1;
  }

  _refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.maxTokens,
      this.tokens + elapsed * this.refillRate
    );
    this.lastRefill = now;
  }
}

// Target 80% of the plan limit to preserve headroom for spikes
const limiter = new RateLimiter(80);

async function generatePdf(html) {
  await limiter.acquire();
  // call PDF API
}
```
Operating at 80% of your plan limit keeps a buffer for traffic spikes without triggering rate limit errors.
For plan-by-plan pricing and rate limit comparisons, see PDF API Pricing Comparison.
### Checklist

- Rate limits (per-minute and per-day) for your current plan are documented
- Application-side throttling is implemented
- Peak request volume is within plan limits
- `X-RateLimit-Remaining` header is monitored
## 3. Error Handling and Retry Design
Production environments have transient failures. Network blips, API maintenance windows, and rendering timeouts happen. Your integration must handle them without losing data or crashing.
### Classify Errors Before Retrying

```python
RETRYABLE_STATUS_CODES = {408, 429, 500, 502, 503, 504}
NON_RETRYABLE_STATUS_CODES = {400, 401, 403, 404}

def should_retry(status_code: int) -> bool:
    return status_code in RETRYABLE_STATUS_CODES
```
| Status Code | Meaning | Retry? | Action |
|---|---|---|---|
| 400 | Bad request | No | Fix HTML or options |
| 401 / 403 | Auth error | No | Check / regenerate API key |
| 408 | Timeout | Yes | Exponential backoff |
| 429 | Rate limited | Yes | Wait for Retry-After header value |
| 500 / 502 / 503 | Server error | Yes | Exponential backoff |
### Exponential Backoff with Jitter

```typescript
interface RetryConfig {
  maxRetries: number;
  initialDelayMs: number;
  maxDelayMs: number;
  backoffMultiplier: number;
}

const DEFAULT_RETRY_CONFIG: RetryConfig = {
  maxRetries: 5,
  initialDelayMs: 1000,
  maxDelayMs: 60000,
  backoffMultiplier: 2,
};

async function generatePdfWithRetry(
  html: string,
  apiKey: string,
  config: RetryConfig = DEFAULT_RETRY_CONFIG
): Promise<Buffer> {
  let delay = config.initialDelayMs;

  for (let attempt = 0; attempt <= config.maxRetries; attempt++) {
    const response = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/generate', {
      method: 'POST',
      headers: {
        'X-API-Key': apiKey,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ html }),
      signal: AbortSignal.timeout(120_000),
    });

    if (response.ok) {
      return Buffer.from(await response.arrayBuffer());
    }

    const isRetryable = [408, 429, 500, 502, 503, 504].includes(response.status);
    if (!isRetryable || attempt === config.maxRetries) {
      throw new Error(`PDF generation failed: HTTP ${response.status}`);
    }

    // Respect the Retry-After header for 429 responses
    const retryAfter = response.headers.get('retry-after');
    const waitMs = retryAfter
      ? parseFloat(retryAfter) * 1000
      : Math.min(delay + Math.random() * 1000, config.maxDelayMs);

    console.warn(`Retry ${attempt + 1}/${config.maxRetries}: waiting ${(waitMs / 1000).toFixed(1)}s`);
    await new Promise(resolve => setTimeout(resolve, waitMs));
    delay = Math.min(delay * config.backoffMultiplier, config.maxDelayMs);
  }

  throw new Error('Max retries exceeded');
}
```
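To make the schedule concrete, the same backoff math can be sketched in a few lines of Python. With the default config (1s initial delay, multiplier 2, 60s cap), the base waits grow 1s, 2s, 4s, 8s, 16s before jitter is added:

```python
import random

def backoff_delays(max_retries=5, initial_ms=1000, multiplier=2,
                   max_ms=60000, jitter_ms=1000):
    """Yield the capped, jittered wait (in ms) before each retry attempt."""
    delay = initial_ms
    for _ in range(max_retries):
        # Jitter spreads retries out so clients don't stampede in sync
        yield min(delay + random.uniform(0, jitter_ms), max_ms)
        delay = min(delay * multiplier, max_ms)
```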
The full retry implementation in curl, Python, Node.js, and PHP is covered in the PDF API Error Handling Guide.
### Checklist
- Retryable vs. non-retryable errors are classified
- Exponential backoff with jitter is implemented
- Max retry count and max wait time are capped
- Alerts fire when max retries are exhausted
- Request IDs are logged for traceability
## 4. Monitoring and Alerting
The goal is to detect problems before users report them. This requires tracking the right metrics and setting actionable alert thresholds.
### Key Metrics to Track

| Metric | Warning Threshold | Critical Threshold |
|---|---|---|
| PDF generation failure rate | > 1% | > 5% |
| p50 response time | > 5s | > 15s |
| p99 response time | > 30s | > 60s |
| Retry rate (per minute) | > 10% | > 30% |
| `X-RateLimit-Remaining` headroom | < 30% | < 10% |
### Sending Metrics to Datadog

```python
import functools
import os
import time

import requests
from datadog import statsd

def track_pdf_generation(func):
    """Decorator that auto-collects PDF generation metrics."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        tags = ['service:pdf-generator', 'env:production']
        try:
            result = func(*args, **kwargs)
            duration_ms = (time.perf_counter() - start) * 1000
            statsd.histogram('pdf.generation.duration_ms', duration_ms, tags=tags)
            statsd.increment('pdf.generation.success', tags=tags)
            return result
        except Exception as e:
            duration_ms = (time.perf_counter() - start) * 1000
            error_tags = tags + [f'error_type:{type(e).__name__}']
            statsd.histogram('pdf.generation.duration_ms', duration_ms, tags=error_tags)
            statsd.increment('pdf.generation.failure', tags=error_tags)
            raise
    return wrapper

@track_pdf_generation
def generate_invoice_pdf(customer_data):
    response = requests.post(
        'https://pdf.funbrew.cloud/api/v1/pdf/generate',
        headers={'X-API-Key': os.environ['FUNBREW_PDF_API_KEY']},
        json={'html': build_invoice_html(customer_data)},
        timeout=120,
    )
    response.raise_for_status()
    return response.content
```
### Combining Monitoring with Webhooks
Webhook integration lets the API push completion and failure events to your server rather than polling. This simplifies async job tracking.
```json
{
  "event": "pdf.generation.failed",
  "job_id": "job_abc123",
  "timestamp": "2026-04-01T12:00:00Z",
  "error": {
    "code": "RENDER_TIMEOUT",
    "message": "Rendering exceeded 60 seconds",
    "html_size_bytes": 245120
  }
}
```
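On the receiving side, a thin handler that routes on the `event` field keeps the webhook endpoint trivial to test. A sketch under assumptions: only `pdf.generation.failed` is documented above, so the success event name (`pdf.generation.completed`) is hypothetical:

```python
def handle_pdf_webhook(payload: dict) -> str:
    """Route an incoming webhook event; returns the action taken (for logging)."""
    event = payload.get("event", "")
    if event == "pdf.generation.failed":
        error = payload.get("error", {})
        # Alert or re-enqueue here; RENDER_TIMEOUT often means the HTML is too heavy
        return f"alert:{error.get('code', 'UNKNOWN')}"
    if event == "pdf.generation.completed":  # assumed event name
        return "store_result"
    return "ignore"  # Unknown events: acknowledge but do nothing
```

Always return HTTP 200 quickly and do heavy work (storage, alerting) off the request path, so the API's webhook delivery doesn't retry unnecessarily.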
### Prometheus Alert Rules

```yaml
groups:
  - name: pdf-api
    rules:
      - alert: PdfGenerationHighFailureRate
        expr: |
          rate(pdf_generation_failure_total[5m]) /
          rate(pdf_generation_total[5m]) > 0.05
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "PDF generation failure rate exceeds 5%"
          description: "Failure rate over last 5m: {{ $value | humanizePercentage }}"

      - alert: PdfGenerationHighLatency
        expr: |
          histogram_quantile(0.99,
            sum by (le) (rate(pdf_generation_duration_ms_bucket[5m]))
          ) > 30000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "PDF generation p99 latency exceeds 30 seconds"
```
### Checklist
- Generation failure rate tracked in real time
- Response time (p50 and p99) tracked
- Rate limit headroom alert configured
- On-call notification for critical failures
- Async job completion detected via webhook or polling
## 5. Cost Optimization
API costs scale linearly with request count. Eliminating redundant generation and batching requests together can meaningfully reduce spend.
### Strategy 1: Cache Identical PDFs
For PDFs generated from static content (terms of service, standard agreements), caching is highly effective.
```python
import hashlib
import os

import redis
import requests

class PdfCache:
    def __init__(self, redis_client, ttl_seconds=86400):
        self.redis = redis_client
        self.ttl = ttl_seconds  # Default: 24 hours

    def get_cache_key(self, html: str, options: dict) -> str:
        """Generate a deterministic cache key from HTML and options."""
        content = f"{html}{str(sorted(options.items()))}"
        return f"pdf_cache:{hashlib.sha256(content.encode()).hexdigest()}"

    def generate_with_cache(self, html: str, options: dict = None) -> bytes:
        options = options or {}
        key = self.get_cache_key(html, options)

        cached = self.redis.get(key)
        if cached:
            return cached

        response = requests.post(
            'https://pdf.funbrew.cloud/api/v1/pdf/generate',
            headers={'X-API-Key': os.environ['FUNBREW_PDF_API_KEY']},
            json={'html': html, 'options': options},
            timeout=120,
        )
        response.raise_for_status()

        pdf_bytes = response.content
        self.redis.setex(key, self.ttl, pdf_bytes)
        return pdf_bytes

cache = PdfCache(redis.Redis(host='localhost', port=6379))
pdf = cache.generate_with_cache(
    html='<h1>Terms of Service</h1><p>...</p>',
    options={'format': 'A4'}
)
```
### Strategy 2: Batch Multiple PDFs per Request
A single batch request generates multiple PDFs at once, reducing API call count. See the PDF Batch Processing Guide for the full implementation.
```bash
# One API call generates three PDFs
curl -X POST "https://pdf.funbrew.cloud/api/v1/pdf/generate" \
  -H "X-API-Key: $FUNBREW_PDF_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "batch": [
      {
        "html": "<h1>Invoice #001</h1>",
        "filename": "invoice-001.pdf",
        "options": { "format": "A4" }
      },
      {
        "html": "<h1>Invoice #002</h1>",
        "filename": "invoice-002.pdf",
        "options": { "format": "A4" }
      },
      {
        "html": "<h1>Invoice #003</h1>",
        "filename": "invoice-003.pdf",
        "options": { "format": "A4" }
      }
    ]
  }'
```
### Strategy 3: Regenerate Only When Data Changes

```python
from datetime import datetime

class PdfGenerationRecord:
    """Track generation history and skip regeneration when data hasn't changed."""

    def generate_if_outdated(
        self,
        record_id: str,
        html: str,
        data_updated_at: datetime,
    ) -> bytes:
        last_generated = self._get_last_generated(record_id)
        if last_generated and last_generated >= data_updated_at:
            return self._get_cached_pdf(record_id)

        pdf_bytes = self._call_pdf_api(html)
        self._store(record_id, pdf_bytes, generated_at=datetime.utcnow())
        return pdf_bytes
```
### Estimated Savings
| Monthly PDF Volume | No Optimization | 30% Cache Hit | 50% Batch Reduction |
|---|---|---|---|
| 10,000 | Baseline | −3,000 requests | −5,000 requests |
| 100,000 | Baseline | −30,000 requests | −50,000 requests |
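The arithmetic behind the table is simple enough to sketch: cache hits cost nothing, and batching N PDFs per call divides the request count by N (so a 50% reduction corresponds to a batch factor of 2):

```python
def monthly_requests(volume: int, cache_hit_rate: float = 0.0,
                     batch_factor: float = 1.0) -> int:
    """Billable API requests after caching and batching.

    cache_hit_rate: fraction of PDFs served from cache (0.0 to 1.0)
    batch_factor:   average PDFs generated per API call
    """
    return round(volume * (1 - cache_hit_rate) / batch_factor)
```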
For plan pricing details, see PDF API Pricing Comparison.
### Checklist
- Caching implemented for identical or rarely-changing PDFs
- Multiple PDFs batched into single requests where possible
- Unnecessary re-generation prevented by checking data change timestamps
- Monthly request count reviewed to confirm plan is appropriate
## 6. Scaling with Queues and Async Processing
Synchronous PDF generation (request → wait → response) works for small volumes. Under heavy load or batch jobs, queue-based async processing is more resilient.
### When to Use Each Pattern
| Scenario | Recommended Pattern | Reason |
|---|---|---|
| User clicks "Download" | Synchronous (max 15s) | Immediate feedback required |
| Monthly invoice batch (1,000+) | Async + queue | Too slow to block a request |
| Scheduled report generation | Async + scheduler | Runs fully in background |
| Bulk certificate issuance | Async + batch | Minimizes API call count |
### Redis Queue Pattern with BullMQ (Node.js)

```javascript
import { Queue, Worker } from 'bullmq';
import { Redis } from 'ioredis';

const connection = new Redis({ host: 'localhost', port: 6379 });
const pdfQueue = new Queue('pdf-generation', { connection });

// Enqueue a job (called from your API endpoint)
export async function enqueuePdfGeneration(jobData) {
  const job = await pdfQueue.add('generate', jobData, {
    attempts: 5,
    backoff: {
      type: 'exponential',
      delay: 1000,
    },
    removeOnComplete: { count: 1000 },
    removeOnFail: { count: 500 },
  });
  return { jobId: job.id };
}

// Worker (horizontally scalable)
const worker = new Worker(
  'pdf-generation',
  async (job) => {
    const { html, options, webhookUrl } = job.data;

    const response = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/generate', {
      method: 'POST',
      headers: {
        'X-API-Key': process.env.FUNBREW_PDF_API_KEY,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ html, options }),
      signal: AbortSignal.timeout(120_000),
    });

    if (!response.ok) {
      throw new Error(`API error: HTTP ${response.status}`);
    }

    const pdfBuffer = Buffer.from(await response.arrayBuffer());
    const downloadUrl = await uploadToStorage(pdfBuffer);

    if (webhookUrl) {
      await notifyWebhook(webhookUrl, { downloadUrl, jobId: job.id });
    }

    return { downloadUrl };
  },
  {
    connection,
    concurrency: 10, // Tune so total workers × concurrency stays within rate limit
  }
);

worker.on('failed', (job, err) => {
  console.error(`Job ${job?.id} failed:`, err.message);
  // Trigger PagerDuty / Slack alert
});
```
### Kubernetes Horizontal Pod Autoscaler

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pdf-worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pdf-worker
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: External
      external:
        metric:
          name: bullmq_queue_size
          selector:
            matchLabels:
              queue: pdf-generation
        target:
          type: AverageValue
          averageValue: "50" # Max 50 queued jobs per worker pod
```
When scaling out workers, remember the API rate limit stays fixed. Keep workers × concurrency within your plan's per-minute limit.
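A back-of-the-envelope sizing check: each concurrent worker slot issues roughly 60 / (seconds per PDF) requests per minute. A sketch, assuming an average generation time you measure yourself and the 80% headroom target from section 2:

```python
import math

def max_safe_concurrency(plan_limit_per_min: int,
                         avg_seconds_per_pdf: float,
                         headroom: float = 0.8) -> int:
    """Upper bound on total workers x concurrency for a given plan limit.

    Each concurrent slot completes ~60/avg_seconds_per_pdf requests per minute,
    so the product of all slots' throughput must stay under the throttled limit.
    """
    requests_per_slot_per_min = 60 / avg_seconds_per_pdf
    return math.floor(plan_limit_per_min * headroom / requests_per_slot_per_min)
```

For example, on a 100 requests/minute plan with 6-second PDFs, that caps total concurrency at 8 slots, e.g. 2 worker pods with `concurrency: 4` each.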
### Checklist
- Batch and bulk jobs processed via queue, not synchronous requests
- Worker concurrency tuned to stay within API rate limits
- Queue depth (backlog size) monitored
- Dead Letter Queue configured for failed jobs
- Worker auto-scaling (HPA or equivalent) verified
## 7. Security Checklist
Before going live, verify the following security controls. The full guide with code examples is in PDF API Security Guide.
### Input Validation

```javascript
// Escape all user input before embedding in HTML
function escapeHtml(str) {
  return String(str)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Enforce an HTML size limit (e.g., 1MB)
const MAX_HTML_SIZE_BYTES = 1_048_576;

function validateHtmlInput(html) {
  if (Buffer.byteLength(html, 'utf8') > MAX_HTML_SIZE_BYTES) {
    throw new Error('HTML is too large. Maximum size is 1MB.');
  }
}
```
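The same two checks in Python can lean on the standard library's `html.escape`, which handles all five characters above. A minimal sketch (the template-based `build_safe_html` helper is an illustration, not part of the API):

```python
import html

MAX_HTML_SIZE_BYTES = 1_048_576  # 1 MB, matching the limit above

def build_safe_html(template: str, **fields: str) -> str:
    """Escape every user-supplied field, then enforce the size limit."""
    rendered = template.format(**{k: html.escape(v) for k, v in fields.items()})
    if len(rendered.encode("utf-8")) > MAX_HTML_SIZE_BYTES:
        raise ValueError("HTML is too large. Maximum size is 1MB.")
    return rendered
```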
### Security Checklist
| Category | Check Item | Priority |
|---|---|---|
| Auth | API keys in environment variables, not source code | Required |
| Auth | Separate keys for production, staging, and development | Required |
| Transport | HTTPS (TLS 1.2+) only | Required |
| Input | User input escaped before embedding in HTML | Required |
| Input | HTML size validated before sending to API | Recommended |
| Access | IP allowlisting to production servers only | Recommended |
| Access | API never called directly from frontend JavaScript | Required |
| Data | Auto-deletion policy for generated files confirmed | Recommended |
| Audit | All API calls logged with timestamps and user IDs | Recommended |
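For the audit row, one structured log record per API call is usually enough. A sketch of the shape such a record might take (field names are a suggestion, not a requirement):

```python
import json
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("pdf_audit")

def log_pdf_call(user_id: str, request_id: str, status_code: int) -> dict:
    """Emit one structured audit record per PDF API call; returns it for inspection."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "request_id": request_id,
        "status_code": status_code,
    }
    audit_log.info(json.dumps(record))
    return record
```

Structured (JSON) records make it straightforward to answer "who generated what, when" during an incident.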
## 8. Pre-Launch Deployment Checklist
Use this as a PR template or release checklist before every production deployment.
### Setup

- Production API key provisioned from the dashboard
- API key stored in secrets manager or environment variable, not in code
- `.env` confirmed absent from Git history
- E2E tests passing on staging environment
### Error Handling
- Request timeout set to 120 seconds or more
- Exponential backoff retry logic implemented
- Non-retryable errors trigger immediate alerts (no silent failures)
- Error logs include request ID, status code, and attempt count
### Performance and Scaling
- Batch and bulk jobs use queue-based async processing
- Application-side throttling implemented
- Worker concurrency stays within API rate limits
- PDF caching implemented where appropriate
### Monitoring and Alerting
- Generation failure rate alert configured (threshold: 5%)
- Response time alert configured (p99 > 30s)
- Rate limit headroom alert configured
- Queue depth monitored
- Monthly usage visible in dashboard
### Security
- No API keys in frontend code
- User input HTML-escaped before PDF generation
- HTTPS (TLS 1.2+) enforced for all API calls
- IP allowlisting configured for production servers
### Cost Management
- Monthly request volume estimated and within plan limits
- Caching or change-detection prevents redundant generation
- Monthly cost review process established
## Conclusion
Moving from "it works" to "it works reliably in production" is the real work in PDF API integration. You do not need to implement everything at once. Start with the essentials — API key management, error handling, and basic monitoring — then layer in throttling, batching, caching, and queue-based scaling as traffic grows.
Each topic in this checklist has a dedicated deep-dive:
- Error handling: PDF API Error Handling Guide
- Security: PDF API Security Guide
- Batch processing: PDF Batch Processing Guide
- Webhook integration: PDF API Webhook Integration
- Pricing: PDF API Pricing Comparison
Try the API in the Playground, review the full API documentation, and explore real-world implementations in the use cases section.