May 12, 2026

PDF API Rate Limiting | 429 Errors & Exponential Backoff

API, rate limiting, 429 error, backoff, PDF generation

As your PDF generation volume grows, you'll eventually hit 429 Too Many Requests. Naive retries just repeat the same error. This guide explains how rate limits work and how to implement proper retry strategies — exponential backoff with jitter, preemptive throttling, and batch concurrency control.

This is a deep-dive companion to the rate limiting section in the PDF API Production Guide.

How Rate Limits Work

429 Response Headers

When you hit the rate limit, the response includes headers that tell you how long to wait and how much quota remains.

HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1748736030
Content-Type: application/json

{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Retry after 30 seconds.",
  "retry_after": 30
}

Header                  Meaning
Retry-After             Seconds to wait before the next request
X-RateLimit-Limit       Maximum requests allowed in the time window
X-RateLimit-Remaining   Requests remaining in the current window
X-RateLimit-Reset       UNIX timestamp when the window resets
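
As a rough illustration of reading these headers in code, the sketch below converts them to numbers and a Date. The header names come from the sample response above; the function name `parseRateLimitHeaders` and the returned field names are assumptions made for this example, not part of the API.

```javascript
// Sketch: turn rate-limit headers into numbers. `headers` is anything with a
// get(name) method, such as the `headers` property of a fetch Response.
function parseRateLimitHeaders(headers) {
  const toInt = (name) => {
    const raw = headers.get(name);
    if (raw == null) return null; // header absent
    const value = Number.parseInt(raw, 10);
    return Number.isNaN(value) ? null : value;
  };

  const resetSeconds = toInt('x-ratelimit-reset');
  return {
    limit: toInt('x-ratelimit-limit'),
    remaining: toInt('x-ratelimit-remaining'),
    // X-RateLimit-Reset is in seconds; Date expects milliseconds.
    resetAt: resetSeconds === null ? null : new Date(resetSeconds * 1000),
    retryAfterSeconds: toInt('retry-after'),
  };
}
```

With fetch, `parseRateLimitHeaders(response.headers)` works directly because header lookups there are case-insensitive.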

Time Windows

FUNBREW PDF enforces rate limits across multiple time windows.

Window       Limit type
Per minute   Burst (concentrated requests)
Per hour     Mid-term usage
Per day      Plan-level total request cap

For specific numbers per plan, see the API Documentation.
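
To make the idea of overlapping windows concrete, here is a minimal client-side sketch that checks a request against several windows at once. The class name `MultiWindowCounter`, the sliding-window behavior, and every limit passed to it are assumptions for illustration; the real limits and the server's exact window semantics are in the API Documentation.

```javascript
// Sketch: track request timestamps against several windows at once.
// The limits passed in are placeholders, not FUNBREW PDF's real numbers.
class MultiWindowCounter {
  constructor(windows, now = Date.now) {
    this.windows = windows; // e.g. [{ name: 'minute', ms: 60_000, limit: 60 }]
    this.now = now;         // injectable clock for testing
    this.timestamps = [];
  }

  // Returns the name of the first exhausted window, or null if a request fits.
  blockedBy() {
    const current = this.now();
    const longest = Math.max(...this.windows.map((w) => w.ms));
    // Drop timestamps older than the longest window.
    this.timestamps = this.timestamps.filter((t) => current - t < longest);
    for (const w of this.windows) {
      const used = this.timestamps.filter((t) => current - t < w.ms).length;
      if (used >= w.limit) return w.name;
    }
    return null;
  }

  // Call once for every request that is actually sent.
  record() {
    this.timestamps.push(this.now());
  }
}
```

A caller would check `blockedBy()` before each request and call `record()` after sending it. A request can pass the per-minute check and still be blocked by the hourly or daily window.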

Basic Retry with Retry-After

The minimum correct implementation: when you receive a 429, wait at least as long as Retry-After specifies before retrying.

async function callWithRetry(requestFn, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await requestFn();

    if (response.status === 429) {
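      // Fall back to 10s when the header is missing or is not a number of seconds
      const parsed = parseInt(response.headers.get('retry-after') ?? '', 10);
      const retryAfter = Number.isNaN(parsed) ? 10 : parsed;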
      console.log(`Rate limited. Waiting ${retryAfter}s (attempt ${attempt}/${maxAttempts})`);

      if (attempt === maxAttempts) {
        throw new Error(`Rate limit exceeded after ${maxAttempts} attempts`);
      }

      await sleep(retryAfter * 1000);
      continue;
    }

    if (!response.ok) {
      throw new Error(`API error: ${response.status}`);
    }

    return response;
  }
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
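
The HTTP specification (RFC 9110) allows Retry-After to carry either a number of seconds or an HTTP-date. The sample response above uses seconds, but a proxy or gateway in front of the API could in principle send the date form. The helper below is a sketch that accepts both; the name `retryAfterToMs` and the 10-second fallback (chosen to match the default used above) are assumptions.

```javascript
// Sketch: convert a Retry-After header value to a wait in milliseconds.
function retryAfterToMs(value, nowMs = Date.now(), fallbackMs = 10_000) {
  if (value == null || value.trim() === '') return fallbackMs;
  const text = value.trim();

  // Form 1: delay-seconds, a non-negative integer such as "30".
  if (/^\d+$/.test(text)) return Number(text) * 1000;

  // Form 2: HTTP-date, such as "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(text);
  if (!Number.isNaN(dateMs)) return Math.max(0, dateMs - nowMs);

  return fallbackMs; // unparseable value
}
```

In `callWithRetry`, this could replace the `parseInt` call: `await sleep(retryAfterToMs(response.headers.get('retry-after')))`.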

Exponential Backoff with Jitter

For a more robust implementation that also handles transient errors (500s, timeouts), use exponential backoff with randomized jitter to prevent thundering-herd retries.

const RETRYABLE_STATUS_CODES = new Set([408, 429, 500, 502, 503, 504]);

async function fetchWithExponentialBackoff(url, options = {}, config = {}) {
  const {
    maxAttempts = 5,
    baseDelayMs = 500,
    maxDelayMs = 30000,
    backoffMultiplier = 2,
    jitterFactor = 0.2, // add up to 20% random extra delay to spread concurrent retries
  } = config;

  let delay = baseDelayMs;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await fetch(url, options);

      if (response.status === 429) {
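        // Honor Retry-After over the backoff delay; fall back to the backoff
        // delay when the header is missing or is not a number of seconds
        const retryAfterSec = parseInt(response.headers.get('retry-after') ?? '', 10);
        const waitMs = Number.isNaN(retryAfterSec) ? delay : retryAfterSec * 1000;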

        if (attempt === maxAttempts) throw new Error('Max attempts reached (429)');

        const jitter = waitMs * jitterFactor * Math.random();
        await sleep(waitMs + jitter);
        delay = Math.min(delay * backoffMultiplier, maxDelayMs);
        continue;
      }

      if (RETRYABLE_STATUS_CODES.has(response.status) && attempt < maxAttempts) {
        const jitter = delay * jitterFactor * Math.random();
        await sleep(delay + jitter);
        delay = Math.min(delay * backoffMultiplier, maxDelayMs);
        continue;
      }

      return response;
    } catch (networkError) {
      if (attempt === maxAttempts) throw networkError;
      await sleep(delay);
      delay = Math.min(delay * backoffMultiplier, maxDelayMs);
    }
  }
}

Example delay progression (baseDelayMs=500, backoffMultiplier=2, jitter adds 0–20%):

Wait after attempt   Base delay   With jitter
1st                  500ms        500–600ms
2nd                  1s           1.0–1.2s
3rd                  2s           2.0–2.4s
4th                  4s           4.0–4.8s
5th (final)          none         response is returned or the error is thrown
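
The progression can also be computed rather than read from a table. The sketch below mirrors the arithmetic in `fetchWithExponentialBackoff`, where jitter adds between 0 and `jitterFactor × delay` to each wait. The function name `backoffSchedule` and the injectable `random` parameter are assumptions added so the result can be reproduced.

```javascript
// Sketch: list the waits (in ms) between attempts for a given backoff config.
function backoffSchedule({
  maxAttempts = 5,
  baseDelayMs = 500,
  maxDelayMs = 30000,
  backoffMultiplier = 2,
  jitterFactor = 0.2,
  random = Math.random, // injectable so tests are deterministic
} = {}) {
  const waits = [];
  let delay = baseDelayMs;
  // There is one wait after every attempt except the last.
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    const jitter = delay * jitterFactor * random(); // 0 to +jitterFactor
    waits.push(Math.round(delay + jitter));
    delay = Math.min(delay * backoffMultiplier, maxDelayMs);
  }
  return waits;
}

console.log(backoffSchedule({ random: () => 0 })); // [ 500, 1000, 2000, 4000 ]
console.log(backoffSchedule({ random: () => 1 })); // [ 600, 1200, 2400, 4800 ]
```

The two calls print the lower and upper bounds of each wait. With the defaults, a request that fails on all five attempts spends between 7.5 and 9 seconds waiting in total.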

Preemptive Throttling

Rather than waiting for a 429, control your request rate proactively to avoid hitting the limit in the first place.

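Token Bucket Rate Limiter

The limiter below is a simplified variant: it spaces requests at a fixed interval (equivalent to a token bucket with a capacity of one) and runs queued calls one at a time.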

class RateLimiter {
  constructor({ requestsPerMinute }) {
    this.interval = (60 * 1000) / requestsPerMinute; // ms between requests
    this.lastRequestTime = 0;
    this.queue = [];
    this.processing = false;
  }

  async schedule(fn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ fn, resolve, reject });
      this.processQueue();
    });
  }

  async processQueue() {
    if (this.processing) return;
    this.processing = true;

    while (this.queue.length > 0) {
      const elapsed = Date.now() - this.lastRequestTime;
      const waitTime = Math.max(0, this.interval - elapsed);
      if (waitTime > 0) await sleep(waitTime);

      const { fn, resolve, reject } = this.queue.shift();
      this.lastRequestTime = Date.now();

      try {
        resolve(await fn());
      } catch (err) {
        reject(err);
      }
    }

    this.processing = false;
  }
}

// Limit to 50 requests per minute
const limiter = new RateLimiter({ requestsPerMinute: 50 });

async function generatePdf(html, filename) {
  return limiter.schedule(() =>
    fetch('https://pdf.funbrew.cloud/api/v1/pdf', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.API_KEY}`,
        'Content-Type': 'application/json', // the body is JSON
      },
      body: JSON.stringify({ html, filename }),
    })
  );
}
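
The `RateLimiter` class above spaces requests evenly and never allows a burst. A token bucket in the usual sense also lets a limited number of requests go out back to back, which suits a per-minute burst limit. The following is a minimal sketch of that variant; the class name `TokenBucket`, its method names, and the injectable `now` clock are all assumptions for this example.

```javascript
// Sketch of a token bucket: up to `capacity` requests may burst, and tokens
// refill continuously at `refillPerSecond`.
class TokenBucket {
  constructor({ capacity, refillPerSecond, now = Date.now }) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now; // injectable clock for testing
    this.tokens = capacity;
    this.lastRefill = now();
  }

  // Add the tokens earned since the last call, capped at capacity.
  refill() {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = current;
  }

  // Take one token if available; returns true when the request may proceed.
  tryRemove() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }

  // Milliseconds until the next token is available (0 if one is ready).
  msUntilNextToken() {
    this.refill();
    if (this.tokens >= 1) return 0;
    return Math.ceil(((1 - this.tokens) / this.refillPerSecond) * 1000);
  }
}
```

For a 50 requests per minute budget with bursts of up to 10, one possible configuration is `new TokenBucket({ capacity: 10, refillPerSecond: 50 / 60 })`. A caller sleeps for `msUntilNextToken()` whenever `tryRemove()` returns false.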

Adaptive Throttling from X-RateLimit-Remaining

Read remaining capacity from response headers and automatically slow down when it gets low.

class AdaptiveRateLimiter {
  constructor({ safetyMargin = 0.2 } = {}) {
    // Slow down once remaining drops below limit × safetyMargin
    this.safetyMargin = safetyMargin;
    this.limit = null;
    this.remainingRequests = Infinity;
    this.resetTime = null;
  }

  updateFromHeaders(headers) {
    const limit = headers.get('x-ratelimit-limit');
    const remaining = headers.get('x-ratelimit-remaining');
    const reset = headers.get('x-ratelimit-reset');
    if (limit !== null) this.limit = parseInt(limit, 10);
    if (remaining !== null) this.remainingRequests = parseInt(remaining, 10);
    if (reset !== null) this.resetTime = parseInt(reset, 10) * 1000; // seconds to ms
  }

  async waitIfNeeded() {
    if (this.remainingRequests === Infinity) return;

    if (this.remainingRequests <= 0 && this.resetTime) {
      const waitMs = Math.max(0, this.resetTime - Date.now());
      console.log(`Remaining = 0. Waiting ${Math.ceil(waitMs / 1000)}s for reset.`);
      await sleep(waitMs + 100);
      return;
    }

    // Add a small delay when remaining drops below the safety threshold
    // (falls back to 10 requests until X-RateLimit-Limit has been seen)
    const threshold = this.limit ? this.limit * this.safetyMargin : 10;
    if (this.remainingRequests < threshold) {
      await sleep(500);
    }
  }
}

const adaptive = new AdaptiveRateLimiter();

async function generatePdfAdaptive(html, filename) {
  await adaptive.waitIfNeeded();

  const response = await fetch('https://pdf.funbrew.cloud/api/v1/pdf', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.API_KEY}`,
      'Content-Type': 'application/json', // the body is JSON
    },
    body: JSON.stringify({ html, filename }),
  });

  adaptive.updateFromHeaders(response.headers);
  return response;
}
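
A fixed 500 ms pause is one option. Another is to spread the remaining quota evenly over the time left until the window resets, so the delay grows as quota shrinks. The helper below sketches that calculation; the name `pacingDelayMs` is an assumption, and it takes X-RateLimit-Reset in seconds as shown in the sample response.

```javascript
// Sketch: spread the remaining quota evenly over the time left in the window.
function pacingDelayMs(remaining, resetEpochSeconds, nowMs = Date.now()) {
  const msUntilReset = Math.max(0, resetEpochSeconds * 1000 - nowMs);
  if (remaining <= 0) return msUntilReset; // nothing left: wait for the reset
  return Math.floor(msUntilReset / remaining);
}
```

For example, with 10 requests remaining and 30 seconds until the reset, the helper suggests waiting 3 seconds between requests. Once the reset time has passed it returns 0.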

Batch Processing with Concurrency Control

When generating many PDFs, limit parallelism to stay within rate limits.

async function generateBatch(items, { concurrency = 5, requestsPerMinute = 50 } = {}) {
  const limiter = new RateLimiter({ requestsPerMinute });
  const results = [];
  const errors = [];

  const chunks = chunkArray(items, concurrency);

  for (const chunk of chunks) {
    const chunkResults = await Promise.allSettled(
      chunk.map(item =>
        limiter.schedule(() => generatePdf(item.html, item.filename))
      )
    );

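    for (const result of chunkResults) {
      // fetch resolves on HTTP error statuses too, so also check response.ok
      if (result.status === 'fulfilled' && result.value.ok) {
        results.push(result.value);
      } else {
        errors.push(
          result.status === 'fulfilled'
            ? new Error(`API error: ${result.value.status}`)
            : result.reason
        );
      }
    }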
  }

  return { results, errors };
}

function chunkArray(arr, size) {
  const chunks = [];
  for (let i = 0; i < arr.length; i += size) chunks.push(arr.slice(i, i + size));
  return chunks;
}
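
Chunking waits for the slowest request in each chunk before starting the next one. An alternative is a small worker pool that starts a new task as soon as any slot frees up. The sketch below shows that pattern under the assumption of a generic async `worker` callback; the name `mapWithConcurrency` is hypothetical, and results use the same shape as `Promise.allSettled`.

```javascript
// Sketch: run `worker(item)` over `items` with at most `concurrency` in flight.
// Results keep the input order.
async function mapWithConcurrency(items, concurrency, worker) {
  const results = new Array(items.length);
  let nextIndex = 0;

  async function runner() {
    while (nextIndex < items.length) {
      // Safe without a lock: there is no await between the check and the increment.
      const index = nextIndex++;
      try {
        const value = await worker(items[index], index);
        results[index] = { status: 'fulfilled', value };
      } catch (reason) {
        results[index] = { status: 'rejected', reason };
      }
    }
  }

  const runnerCount = Math.min(concurrency, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

With the functions above, a call might look like `mapWithConcurrency(items, 5, item => generatePdf(item.html, item.filename))`. The concurrency cap bounds how many requests are open at once, while the rate limiter inside `generatePdf` still bounds requests per minute.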

Python Implementation

import asyncio
import time
from aiohttp import ClientSession

class RateLimiter:
    def __init__(self, requests_per_minute: int):
        self.interval = 60.0 / requests_per_minute
        self.last_request_time = 0.0
        self._lock = asyncio.Lock()

    async def acquire(self):
        async with self._lock:
            elapsed = time.monotonic() - self.last_request_time
            wait_time = max(0, self.interval - elapsed)
            if wait_time > 0:
                await asyncio.sleep(wait_time)
            self.last_request_time = time.monotonic()

async def generate_pdf_with_retry(
    session: ClientSession,
    limiter: RateLimiter,
    payload: dict,
    max_attempts: int = 5,
) -> dict:
    delay = 0.5
    for attempt in range(1, max_attempts + 1):
        await limiter.acquire()
        async with session.post(
            "https://pdf.funbrew.cloud/api/v1/pdf",
            json=payload,
        ) as resp:
            if resp.status == 429:
                # Use Retry-After when present; otherwise fall back to the backoff delay
                retry_after = float(resp.headers.get("Retry-After", delay))
                if attempt == max_attempts:
                    raise RuntimeError("Max retry attempts reached")
                await asyncio.sleep(retry_after)
                delay = min(delay * 2, 30)
                continue
            resp.raise_for_status()
            return await resp.json()
    raise RuntimeError("Exhausted retry loop")

Monitoring Rate Limit Health

Track your rate limit hit rate over time to decide when to upgrade your plan or redesign batch jobs.

const metrics = { totalRequests: 0, rateLimitHits: 0 };

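async function generatePdfWithMetrics(html, filename) {
  metrics.totalRequests++;
  try {
    return await fetchWithExponentialBackoff(
      'https://pdf.funbrew.cloud/api/v1/pdf',
      {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${process.env.API_KEY}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({ html, filename }),
      }
    );
  } catch (err) {
    // Counts only requests that still got a 429 on the final attempt;
    // 429s that a later retry recovered from are not counted here.
    if (err.message.includes('429')) metrics.rateLimitHits++;
    throw err;
  }
}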

// Log hit rate every hour
setInterval(() => {
  const hitRate = metrics.totalRequests > 0
    ? (metrics.rateLimitHits / metrics.totalRequests * 100).toFixed(1)
    : 0;
  console.log(`Rate limit hit rate: ${hitRate}% (${metrics.rateLimitHits}/${metrics.totalRequests})`);
  if (parseFloat(hitRate) > 5) {
    console.warn('Hit rate above 5% — consider upgrading plan or reducing concurrency');
  }
}, 60 * 60 * 1000);
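
The counter above only sees a 429 when every retry has failed. To measure how often the API pushes back at all, each response status can be recorded inside the retry loop instead. The class below is a sketch of that idea; the name `RateLimitMetrics` and its methods are assumptions for illustration.

```javascript
// Sketch: count every 429 response, including ones a later retry recovers from.
class RateLimitMetrics {
  constructor() {
    this.totalResponses = 0;
    this.rateLimitHits = 0;
  }

  // Call once per HTTP response, inside the retry loop.
  record(status) {
    this.totalResponses++;
    if (status === 429) this.rateLimitHits++;
  }

  // Percentage of responses that were 429s (0 when nothing was recorded).
  hitRatePercent() {
    if (this.totalResponses === 0) return 0;
    return (this.rateLimitHits / this.totalResponses) * 100;
  }
}
```

Calling `record(response.status)` right after each `fetch` in `fetchWithExponentialBackoff` would feed it. A rising percentage with few final failures suggests the retries are coping but the request rate is close to the limit.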

Summary

Key principles for rate limit handling:

  • Always read Retry-After: never retry sooner than the header specifies
  • Add jitter to backoff: spreads retries from multiple workers to avoid synchronized hammering
  • Throttle preemptively: a local rate limiter prevents hitting the API limit in the first place
  • Limit batch concurrency: cap parallel requests, and keep workers × requests per minute per worker at or below the per-minute limit
  • Monitor X-RateLimit-Remaining: back off automatically before you hit zero

For broader production deployment advice, see the PDF API Production Guide. For error handling across all status codes, see the PDF API Error Handling Guide.
