
Getting PDF API integration to work in development is one thing. Keeping it stable in production is another. APIs that behave perfectly in testing can start timing out under traffic spikes, rate limit violations can halt batch jobs mid-run, and unexpected costs can show up at the end of the month.

This article is a production checklist for teams deploying FUNBREW PDF or any PDF generation API to production. It integrates and extends the detailed guidance from three companion articles — error handling, security, and batch processing — into a single pre-launch reference.

If you are new to PDF APIs, start with the HTML to PDF complete guide or the quickstart by language.


1. API Key Management and Rotation

Your API key is the credential to your PDF generation service. A leaked key means unauthorized usage billed to your account and potential exposure of generated documents.

Separate Keys per Environment

# Production
FUNBREW_PDF_API_KEY=sk-prod-xxxxxxxxxxxxxxxxxxxx

# Staging
FUNBREW_PDF_API_KEY=sk-stg-xxxxxxxxxxxxxxxxxxxx

# Local development
FUNBREW_PDF_API_KEY=sk-dev-xxxxxxxxxxxxxxxxxxxx

Sharing one key across environments means developer laptops consume production quota and mistakes in local testing can affect production state.
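A cheap guard against cross-environment mix-ups is to assert at startup that the key's prefix matches the running environment. A sketch, assuming the `sk-prod-` / `sk-stg-` / `sk-dev-` prefix convention shown above (the function name and environment labels are illustrative):

```python
import os

# Expected key prefix per environment, following the convention above
KEY_PREFIXES = {
    "production": "sk-prod-",
    "staging": "sk-stg-",
    "development": "sk-dev-",
}

def assert_key_matches_environment(api_key: str, environment: str) -> None:
    """Fail fast at startup if the wrong key is deployed to this environment."""
    expected = KEY_PREFIXES.get(environment)
    if expected is None:
        raise ValueError(f"Unknown environment: {environment!r}")
    if not api_key.startswith(expected):
        raise RuntimeError(
            f"API key does not look like a {environment} key "
            f"(expected prefix {expected!r})"
        )

# Called once at application startup, e.g.:
# assert_key_matches_environment(os.environ["FUNBREW_PDF_API_KEY"], "production")
```

Failing at boot is far cheaper than discovering mid-month that developer laptops have been consuming production quota.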

Secret Management Tools

.env files work fine for small teams. As the team grows, adopt a dedicated secrets manager.

Tool Best for
AWS Secrets Manager Applications hosted on AWS
HashiCorp Vault Multi-cloud or on-premise environments
Doppler Small-to-medium teams wanting centralized secrets
GitHub Actions Secrets CI/CD pipelines only

Automating Key Rotation

import boto3
import os

def rotate_pdf_api_key():
    """Rotate the API key using AWS Secrets Manager."""
    client = boto3.client('secretsmanager', region_name='us-east-1')
    
    # Provision a new key (provision_new_api_key is a placeholder for your
    # provider's key-management API)
    new_key = provision_new_api_key()
    
    # Update Secrets Manager
    client.put_secret_value(
        SecretId='funbrew-pdf-api-key-prod',
        SecretString=new_key,
    )
    
    # Invalidate the old key after an overlap window (revocation is another
    # call to your key-management API; keep both keys valid during rollout)
    print("API key rotation complete")

# Run on a 90-day schedule (e.g. via Amazon EventBridge)
if __name__ == "__main__":
    rotate_pdf_api_key()

Rotate every 90 days. When rotating, keep the old key valid for a few hours to avoid dropping in-flight requests during deployment.

For the full security guide including IP restrictions and input validation, see PDF API Security Guide.

Checklist

  • Separate API keys for production, staging, and development
  • .env excluded from version control via .gitignore
  • No API keys in frontend JavaScript
  • 90-day rotation schedule established
  • Key revocation procedure documented

2. Rate Limits and Application-Side Throttling

PDF generation APIs enforce per-minute and per-day request limits depending on your plan. Exceeding these returns 429 Too Many Requests, which halts your processing.

Inspect Your Rate Limit Headers

# Check rate limit status from response headers
curl -s -I -X POST "https://pdf.funbrew.cloud/api/v1/pdf/generate" \
  -H "X-API-Key: $FUNBREW_PDF_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"html": "<h1>test</h1>"}' | grep -i "x-rate"

# Example response headers:
# X-RateLimit-Limit: 100
# X-RateLimit-Remaining: 87
# X-RateLimit-Reset: 1743465600
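The headers above translate directly into a headroom check you can feed into monitoring. A minimal sketch, using the header names from the example response (the function name is illustrative):

```python
import time

def rate_limit_headroom(headers: dict) -> dict:
    """Compute remaining-request headroom from X-RateLimit-* response headers."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_epoch = int(headers["X-RateLimit-Reset"])  # Unix timestamp

    return {
        "remaining_fraction": remaining / limit,
        "seconds_until_reset": max(0.0, reset_epoch - time.time()),
    }

headroom = rate_limit_headroom({
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "87",
    "X-RateLimit-Reset": "1743465600",
})
# With the example headers above, remaining_fraction is 0.87
```

When `remaining_fraction` drops below your alert threshold (see the monitoring section), slow down or pause non-urgent jobs until the reset.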

Implement a Token Bucket on Your Side

Rather than hitting the API limit and handling 429 responses reactively, throttle proactively from your application.

// Token bucket rate limiter (Node.js)
class RateLimiter {
  constructor(requestsPerMinute) {
    this.tokens = requestsPerMinute;
    this.maxTokens = requestsPerMinute;
    this.refillRate = requestsPerMinute / 60; // tokens per second
    this.lastRefill = Date.now();
  }

  async acquire() {
    this._refill();

    if (this.tokens < 1) {
      const waitMs = (1 - this.tokens) / this.refillRate * 1000;
      await new Promise(resolve => setTimeout(resolve, waitMs));
      this._refill();
    }

    this.tokens -= 1;
  }

  _refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.maxTokens,
      this.tokens + elapsed * this.refillRate
    );
    this.lastRefill = now;
  }
}

// Target 80% of the plan limit to preserve headroom for spikes
const limiter = new RateLimiter(80);

async function generatePdf(html) {
  await limiter.acquire();
  // call PDF API
}

Operating at 80% of your plan limit keeps a buffer for traffic spikes without triggering rate limit errors.

For plan-by-plan pricing and rate limit comparisons, see PDF API Pricing Comparison.

Checklist

  • Rate limits (per-minute and per-day) for your current plan are documented
  • Application-side throttling is implemented
  • Peak request volume is within plan limits
  • X-RateLimit-Remaining header is monitored

3. Error Handling and Retry Design

Production environments have transient failures. Network blips, API maintenance windows, and rendering timeouts happen. Your integration must handle them without losing data or crashing.

Classify Errors Before Retrying

RETRYABLE_STATUS_CODES = {408, 429, 500, 502, 503, 504}
NON_RETRYABLE_STATUS_CODES = {400, 401, 403, 404}

def should_retry(status_code: int) -> bool:
    return status_code in RETRYABLE_STATUS_CODES

Status Code Meaning Retry? Action
400 Bad request No Fix HTML or options
401 / 403 Auth error No Check / regenerate API key
408 Timeout Yes Exponential backoff
429 Rate limited Yes Wait for Retry-After header value
500 / 502 / 503 Server error Yes Exponential backoff

Exponential Backoff with Jitter

interface RetryConfig {
  maxRetries: number;
  initialDelayMs: number;
  maxDelayMs: number;
  backoffMultiplier: number;
}

const DEFAULT_RETRY_CONFIG: RetryConfig = {
  maxRetries: 5,
  initialDelayMs: 1000,
  maxDelayMs: 60000,
  backoffMultiplier: 2,
};

async function generatePdfWithRetry(
  html: string,
  apiKey: string,
  config: RetryConfig = DEFAULT_RETRY_CONFIG
): Promise<Buffer> {
  let delay = config.initialDelayMs;

  for (let attempt = 0; attempt <= config.maxRetries; attempt++) {
    let response: Response;
    try {
      response = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/generate', {
        method: 'POST',
        headers: {
          'X-API-Key': apiKey,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({ html }),
        signal: AbortSignal.timeout(120_000),
      });
    } catch (err) {
      // Network failures and AbortSignal timeouts throw; treat them as
      // retryable, just like HTTP 408
      if (attempt === config.maxRetries) throw err;
      const waitMs = Math.min(delay + Math.random() * 1000, config.maxDelayMs);
      await new Promise(resolve => setTimeout(resolve, waitMs));
      delay = Math.min(delay * config.backoffMultiplier, config.maxDelayMs);
      continue;
    }

    if (response.ok) {
      return Buffer.from(await response.arrayBuffer());
    }

    const isRetryable = [408, 429, 500, 502, 503, 504].includes(response.status);
    if (!isRetryable || attempt === config.maxRetries) {
      throw new Error(`PDF generation failed: HTTP ${response.status}`);
    }

    // Respect Retry-After (delta-seconds) for 429; fall back to backoff + jitter
    const retryAfter = Number(response.headers.get('retry-after'));
    const waitMs = Number.isFinite(retryAfter) && retryAfter > 0
      ? retryAfter * 1000
      : Math.min(delay + Math.random() * 1000, config.maxDelayMs);

    console.warn(`Retry ${attempt + 1}/${config.maxRetries}: waiting ${(waitMs / 1000).toFixed(1)}s`);
    await new Promise(resolve => setTimeout(resolve, waitMs));
    delay = Math.min(delay * config.backoffMultiplier, config.maxDelayMs);
  }

  throw new Error('Max retries exceeded');
}

The full retry implementation in curl, Python, Node.js, and PHP is covered in the PDF API Error Handling Guide.

Checklist

  • Retryable vs. non-retryable errors are classified
  • Exponential backoff with jitter is implemented
  • Max retry count and max wait time are capped
  • Alerts fire when max retries are exhausted
  • Request IDs are logged for traceability

4. Monitoring and Alerting

The goal is to detect problems before users report them. This requires tracking the right metrics and setting actionable alert thresholds.

Key Metrics to Track

Metric Warning Threshold Critical Threshold
PDF generation failure rate > 1% > 5%
p50 response time > 5s > 15s
p99 response time > 30s > 60s
Retry rate (per minute) > 10% > 30%
Rate limit headroom (X-RateLimit-Remaining ÷ limit) < 30% < 10%

Sending Metrics to Datadog

import time
import functools
from datadog import statsd

def track_pdf_generation(func):
    """Decorator that auto-collects PDF generation metrics."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        tags = ['service:pdf-generator', 'env:production']

        try:
            result = func(*args, **kwargs)
            duration_ms = (time.perf_counter() - start) * 1000

            statsd.histogram('pdf.generation.duration_ms', duration_ms, tags=tags)
            statsd.increment('pdf.generation.success', tags=tags)
            return result

        except Exception as e:
            duration_ms = (time.perf_counter() - start) * 1000
            error_tags = tags + [f'error_type:{type(e).__name__}']

            statsd.histogram('pdf.generation.duration_ms', duration_ms, tags=error_tags)
            statsd.increment('pdf.generation.failure', tags=error_tags)
            raise

    return wrapper

@track_pdf_generation
def generate_invoice_pdf(customer_data):
    import requests, os
    response = requests.post(
        'https://pdf.funbrew.cloud/api/v1/pdf/generate',
        headers={'X-API-Key': os.environ['FUNBREW_PDF_API_KEY']},
        json={'html': build_invoice_html(customer_data)},
        timeout=120,
    )
    response.raise_for_status()
    return response.content

Combining Monitoring with Webhooks

Webhook integration lets the API push completion and failure events to your server rather than polling. This simplifies async job tracking.

{
  "event": "pdf.generation.failed",
  "job_id": "job_abc123",
  "timestamp": "2026-04-01T12:00:00Z",
  "error": {
    "code": "RENDER_TIMEOUT",
    "message": "Rendering exceeded 60 seconds",
    "html_size_bytes": 245120
  }
}
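On the receiving side, a minimal dispatcher for these events might look like the sketch below. The event names follow the payload above; the alerting and storage actions are placeholders for your own infrastructure:

```python
def handle_pdf_webhook(payload: dict) -> str:
    """Route a webhook payload like the one above to the right handler."""
    event = payload.get("event")
    job_id = payload.get("job_id", "unknown")

    if event == "pdf.generation.failed":
        error = payload.get("error", {})
        # Placeholder: page the on-call channel with the error code
        return f"alerted: job {job_id} failed with {error.get('code')}"

    if event == "pdf.generation.completed":
        # Placeholder: mark the job done and store the download URL
        return f"completed: job {job_id}"

    # Acknowledge unknown events so the sender does not keep retrying
    return f"ignored: {event}"
```

Always return a 2xx response quickly and do heavy work (alerting, storage) asynchronously, so the webhook sender does not time out and retry.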

Prometheus Alert Rules

groups:
  - name: pdf-api
    rules:
      - alert: PdfGenerationHighFailureRate
        expr: |
          rate(pdf_generation_failure_total[5m]) /
          rate(pdf_generation_total[5m]) > 0.05
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "PDF generation failure rate exceeds 5%"
          description: "Failure rate over last 5m: {{ $value | humanizePercentage }}"

      - alert: PdfGenerationHighLatency
        expr: |
          histogram_quantile(
            0.99,
            sum(rate(pdf_generation_duration_ms_bucket[5m])) by (le)
          ) > 30000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "PDF generation p99 latency exceeds 30 seconds"

Checklist

  • Generation failure rate tracked in real time
  • Response time (p50 and p99) tracked
  • Rate limit headroom alert configured
  • On-call notification for critical failures
  • Async job completion detected via webhook or polling

5. Cost Optimization

API costs scale linearly with request count. Eliminating redundant generation and batching requests together can meaningfully reduce spend.

Strategy 1: Cache Identical PDFs

For PDFs generated from static content (terms of service, standard agreements), caching is highly effective.

import hashlib
import redis
import requests
import os

class PdfCache:
    def __init__(self, redis_client, ttl_seconds=86400):
        self.redis = redis_client
        self.ttl = ttl_seconds  # Default: 24 hours

    def get_cache_key(self, html: str, options: dict) -> str:
        """Generate a deterministic cache key from HTML and options."""
        content = f"{html}{str(sorted(options.items()))}"
        return f"pdf_cache:{hashlib.sha256(content.encode()).hexdigest()}"

    def generate_with_cache(self, html: str, options: dict = None) -> bytes:
        options = options or {}
        key = self.get_cache_key(html, options)

        cached = self.redis.get(key)
        if cached:
            return cached

        response = requests.post(
            'https://pdf.funbrew.cloud/api/v1/pdf/generate',
            headers={'X-API-Key': os.environ['FUNBREW_PDF_API_KEY']},
            json={'html': html, 'options': options},
            timeout=120,
        )
        response.raise_for_status()

        pdf_bytes = response.content
        self.redis.setex(key, self.ttl, pdf_bytes)
        return pdf_bytes

cache = PdfCache(redis.Redis(host='localhost', port=6379))
pdf = cache.generate_with_cache(
    html='<h1>Terms of Service</h1><p>...</p>',
    options={'format': 'A4'}
)

Strategy 2: Batch Multiple PDFs per Request

A single batch request generates multiple PDFs at once, reducing API call count. See the PDF Batch Processing Guide for the full implementation.

# One API call generates three PDFs
curl -X POST "https://pdf.funbrew.cloud/api/v1/pdf/generate" \
  -H "X-API-Key: $FUNBREW_PDF_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "batch": [
      {
        "html": "<h1>Invoice #001</h1>",
        "filename": "invoice-001.pdf",
        "options": { "format": "A4" }
      },
      {
        "html": "<h1>Invoice #002</h1>",
        "filename": "invoice-002.pdf",
        "options": { "format": "A4" }
      },
      {
        "html": "<h1>Invoice #003</h1>",
        "filename": "invoice-003.pdf",
        "options": { "format": "A4" }
      }
    ]
  }'

Strategy 3: Regenerate Only When Data Changes

from datetime import datetime, timezone

class PdfGenerationRecord:
    """Track generation history and skip regeneration when data hasn't changed.

    The _get_last_generated, _get_cached_pdf, _call_pdf_api, and _store
    helpers are storage and API hooks to implement for your own stack.
    """

    def generate_if_outdated(
        self,
        record_id: str,
        html: str,
        data_updated_at: datetime,
    ) -> bytes:
        last_generated = self._get_last_generated(record_id)

        if last_generated and last_generated >= data_updated_at:
            return self._get_cached_pdf(record_id)

        pdf_bytes = self._call_pdf_api(html)
        self._store(record_id, pdf_bytes, generated_at=datetime.now(timezone.utc))
        return pdf_bytes

Estimated Savings

Monthly PDF Volume No Optimization Savings at 30% Cache Hit Savings at 50% Batch Reduction
10,000 Baseline −3,000 requests −5,000 requests
100,000 Baseline −30,000 requests −50,000 requests
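The savings figures are straight multiplication of volume by the reduction rate; a quick sketch of the arithmetic:

```python
def estimate_saved_requests(monthly_volume: int, reduction_rate: float) -> int:
    """Requests avoided at a given cache-hit or batch-reduction rate."""
    return round(monthly_volume * reduction_rate)

# Matches the table: 10,000 PDFs/month at a 30% cache hit rate
assert estimate_saved_requests(10_000, 0.30) == 3_000
# and 100,000 PDFs/month at a 50% batch reduction
assert estimate_saved_requests(100_000, 0.50) == 50_000
```

Run this against your own volume and cache-hit estimates before choosing a plan tier.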

For plan pricing details, see PDF API Pricing Comparison.

Checklist

  • Caching implemented for identical or rarely-changing PDFs
  • Multiple PDFs batched into single requests where possible
  • Unnecessary re-generation prevented by checking data change timestamps
  • Monthly request count reviewed to confirm plan is appropriate

6. Scaling with Queues and Async Processing

Synchronous PDF generation (request → wait → response) works for small volumes. Under heavy load or batch jobs, queue-based async processing is more resilient.

When to Use Each Pattern

Scenario Recommended Pattern Reason
User clicks "Download" Synchronous (max 15s) Immediate feedback required
Monthly invoice batch (1,000+) Async + queue Too slow to block a request
Scheduled report generation Async + scheduler Runs fully in background
Bulk certificate issuance Async + batch Minimizes API call count

Redis Queue Pattern with BullMQ (Node.js)

import { Queue, Worker } from 'bullmq';
import { Redis } from 'ioredis';

const connection = new Redis({ host: 'localhost', port: 6379 });

const pdfQueue = new Queue('pdf-generation', { connection });

// Enqueue a job (called from your API endpoint)
export async function enqueuePdfGeneration(jobData) {
  const job = await pdfQueue.add('generate', jobData, {
    attempts: 5,
    backoff: {
      type: 'exponential',
      delay: 1000,
    },
    removeOnComplete: { count: 1000 },
    removeOnFail: { count: 500 },
  });

  return { jobId: job.id };
}

// Worker (horizontally scalable)
const worker = new Worker(
  'pdf-generation',
  async (job) => {
    const { html, options, webhookUrl } = job.data;

    const response = await fetch('https://pdf.funbrew.cloud/api/v1/pdf/generate', {
      method: 'POST',
      headers: {
        'X-API-Key': process.env.FUNBREW_PDF_API_KEY,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ html, options }),
      signal: AbortSignal.timeout(120_000),
    });

    if (!response.ok) {
      throw new Error(`API error: HTTP ${response.status}`);
    }

    const pdfBuffer = Buffer.from(await response.arrayBuffer());
    const downloadUrl = await uploadToStorage(pdfBuffer);

    if (webhookUrl) {
      await notifyWebhook(webhookUrl, { downloadUrl, jobId: job.id });
    }

    return { downloadUrl };
  },
  {
    connection,
    concurrency: 10, // Tune so total workers × concurrency stays within rate limit
  }
);

worker.on('failed', (job, err) => {
  console.error(`Job ${job?.id} failed:`, err.message);
  // Trigger PagerDuty / Slack alert
});

Kubernetes Horizontal Pod Autoscaler

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pdf-worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pdf-worker
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: External
      external:
        metric:
          name: bullmq_queue_size
          selector:
            matchLabels:
              queue: pdf-generation
        target:
          type: AverageValue
          averageValue: "50" # Max 50 queued jobs per worker pod

When scaling out workers, remember the API rate limit stays fixed. Keep workers × concurrency within your plan's per-minute limit.
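A back-of-the-envelope check for that constraint (the numbers are illustrative; substitute your own plan limit and measured job duration):

```python
def peak_requests_per_minute(workers: int, concurrency: int, avg_job_seconds: float) -> float:
    """Upper bound on API calls per minute generated by a worker fleet."""
    # Each slot completes roughly 60 / avg_job_seconds jobs per minute
    return workers * concurrency * (60.0 / avg_job_seconds)

# Example: 4 worker pods x concurrency 10, jobs averaging 30s each
peak = peak_requests_per_minute(workers=4, concurrency=10, avg_job_seconds=30)
assert peak == 80.0  # stays under a 100 req/min plan limit
```

Re-run this check whenever you raise `maxReplicas` in the HPA or the worker `concurrency` setting, since either change raises the fleet's peak request rate.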

Checklist

  • Batch and bulk jobs processed via queue, not synchronous requests
  • Worker concurrency tuned to stay within API rate limits
  • Queue depth (backlog size) monitored
  • Dead Letter Queue configured for failed jobs
  • Worker auto-scaling (HPA or equivalent) verified

7. Security Checklist

Before going live, verify the following security controls. The full guide with code examples is in PDF API Security Guide.

Input Validation

// Escape all user input before embedding in HTML
function escapeHtml(str) {
  return String(str)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

// Enforce an HTML size limit (e.g., 1MB)
const MAX_HTML_SIZE_BYTES = 1_048_576;

function validateHtmlInput(html) {
  if (Buffer.byteLength(html, 'utf8') > MAX_HTML_SIZE_BYTES) {
    throw new Error('HTML is too large. Maximum size is 1MB.');
  }
}

Security Checklist

Category Check Item Priority
Auth API keys in environment variables, not source code Required
Auth Separate keys for production, staging, and development Required
Transport HTTPS (TLS 1.2+) only Required
Input User input escaped before embedding in HTML Required
Input HTML size validated before sending to API Recommended
Access IP allowlisting to production servers only Recommended
Access API never called directly from frontend JavaScript Required
Data Auto-deletion policy for generated files confirmed Recommended
Audit All API calls logged with timestamps and user IDs Recommended
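For the audit row, a sketch of a call-site wrapper that records every generation request with the user ID and outcome (the logger name and field layout are illustrative; timestamps come from your logging formatter's `asctime`):

```python
import functools
import logging

audit_log = logging.getLogger("pdf.audit")

def audited(func):
    """Log every PDF API call with user ID and outcome for audit trails."""
    @functools.wraps(func)
    def wrapper(*args, user_id="anonymous", **kwargs):
        try:
            result = func(*args, **kwargs)
            audit_log.info("pdf_call user=%s func=%s outcome=success",
                           user_id, func.__name__)
            return result
        except Exception as exc:
            audit_log.info("pdf_call user=%s func=%s outcome=error:%s",
                           user_id, func.__name__, type(exc).__name__)
            raise
    return wrapper

@audited
def generate_pdf(html):
    return b"%PDF-..."  # stand-in for the real API call
```

Emitting the audit line on both success and failure means the log answers "who generated what, when, and did it work" without touching business logic.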

8. Pre-Launch Deployment Checklist

Use this as a PR template or release checklist before every production deployment.

Setup

  • Production API key provisioned from the dashboard
  • API key stored in secrets manager or environment variable, not in code
  • .env confirmed absent from Git history
  • E2E tests passing on staging environment

Error Handling

  • Request timeout set to 120 seconds or more
  • Exponential backoff retry logic implemented
  • Non-retryable errors trigger immediate alerts (no silent failures)
  • Error logs include request ID, status code, and attempt count

Performance and Scaling

  • Batch and bulk jobs use queue-based async processing
  • Application-side throttling implemented
  • Worker concurrency stays within API rate limits
  • PDF caching implemented where appropriate

Monitoring and Alerting

  • Generation failure rate alert configured (threshold: 5%)
  • Response time alert configured (p99 > 30s)
  • Rate limit headroom alert configured
  • Queue depth monitored
  • Monthly usage visible in dashboard

Security

  • No API keys in frontend code
  • User input HTML-escaped before PDF generation
  • HTTPS (TLS 1.2+) enforced for all API calls
  • IP allowlisting configured for production servers

Cost Management

  • Monthly request volume estimated and within plan limits
  • Caching or change-detection prevents redundant generation
  • Monthly cost review process established

Conclusion

Moving from "it works" to "it works reliably in production" is the real work in PDF API integration. You do not need to implement everything at once. Start with the essentials — API key management, error handling, and basic monitoring — then layer in throttling, batching, caching, and queue-based scaling as traffic grows.

Each topic in this checklist has a dedicated deep-dive: the PDF API Error Handling Guide, the PDF API Security Guide, the PDF Batch Processing Guide, and the PDF API Pricing Comparison.

Try the API in the Playground, review the full API documentation, and explore real-world implementations in the use cases section.

Powered by FUNBREW PDF