PDF API Rate Limiting | 429 Errors & Exponential Backoff
As your PDF generation volume grows, you'll eventually hit 429 Too Many Requests. Naive retries just repeat the same error. This guide explains how rate limits work and how to implement proper retry strategies — exponential backoff with jitter, preemptive throttling, and batch concurrency control.
This is a deep-dive companion to the rate limiting section in the PDF API Production Guide.
How Rate Limits Work
429 Response Headers
When you hit the rate limit, the response includes metadata you should use.
HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1748736030
Content-Type: application/json

{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Retry after 30 seconds.",
  "retry_after": 30
}
| Header | Meaning |
|---|---|
| Retry-After | Seconds to wait before the next request |
| X-RateLimit-Limit | Maximum requests allowed in the time window |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | UNIX timestamp when the window resets |
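The headers above can be collected into one object before deciding how long to wait. The following is a minimal sketch, not part of any FUNBREW PDF client: `parseRateLimitHeaders` is a hypothetical helper name, and it assumes the header names from the sample response. Because the HTTP specification also allows Retry-After to carry an HTTP date instead of a number of seconds, the sketch accepts both forms.

```javascript
// Hypothetical helper: turns rate-limit headers into numbers.
// `headers` is anything with a get(name) method, such as fetch's response.headers.
function parseRateLimitHeaders(headers, nowMs = Date.now()) {
  const toInt = (value) => {
    const n = parseInt(value, 10);
    return Number.isNaN(n) ? null : n;
  };

  const resetSec = toInt(headers.get('x-ratelimit-reset'));
  const retryAfterRaw = headers.get('retry-after');

  let retryAfterMs = null;
  if (retryAfterRaw != null) {
    if (/^\s*\d+\s*$/.test(retryAfterRaw)) {
      // Delay in seconds, as in the sample response.
      retryAfterMs = parseInt(retryAfterRaw, 10) * 1000;
    } else {
      // Otherwise try to read the value as an HTTP date.
      const dateMs = Date.parse(retryAfterRaw);
      if (!Number.isNaN(dateMs)) retryAfterMs = Math.max(0, dateMs - nowMs);
    }
  }

  return {
    limit: toInt(headers.get('x-ratelimit-limit')),
    remaining: toInt(headers.get('x-ratelimit-remaining')),
    resetAtMs: resetSec === null ? null : resetSec * 1000, // seconds -> ms
    retryAfterMs,
  };
}
```

Fields whose header is missing or unparseable come back as null, so callers can fall back to their own backoff delay.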
Time Windows
FUNBREW PDF enforces rate limits across multiple time windows.
| Window | Limit type |
|---|---|
| Per minute | Burst (concentrated requests) |
| Per hour | Mid-term usage |
| Per day | Plan-level total request cap |
For specific numbers per plan, see the API Documentation.
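Because several windows apply at once, a client-side check has to pass every window, not only the per-minute one. A rough sketch of that idea follows. The class name `MultiWindowCounter` is made up for illustration, the limits passed to it would have to come from the API Documentation, and the sliding-window counting used here is an assumption that may not match how the server counts.

```javascript
// Hypothetical sketch: a request is allowed only if every window has room.
class MultiWindowCounter {
  // windows: array of { windowMs, max }
  constructor(windows) {
    this.windows = windows;
    this.timestamps = []; // send times of past requests, oldest first
  }

  // Returns true and records the request if all windows have capacity.
  tryAcquire(nowMs = Date.now()) {
    const longest = Math.max(...this.windows.map((w) => w.windowMs));
    // Drop timestamps that have aged out of the longest window.
    while (this.timestamps.length > 0 && nowMs - this.timestamps[0] >= longest) {
      this.timestamps.shift();
    }
    for (const { windowMs, max } of this.windows) {
      const used = this.timestamps.filter((t) => nowMs - t < windowMs).length;
      if (used >= max) return false;
    }
    this.timestamps.push(nowMs);
    return true;
  }
}
```

A caller that gets false would wait and try again, which keeps a burst that fits the per-minute window from silently exhausting the hourly or daily one.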
Basic Retry with Retry-After
The minimum correct implementation: when you receive a 429, wait at least as long as Retry-After specifies before retrying.
async function callWithRetry(requestFn, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await requestFn();

    if (response.status === 429) {
      if (attempt === maxAttempts) {
        throw new Error(`Rate limit exceeded after ${maxAttempts} attempts`);
      }
      // Retry-After is read as seconds; fall back to 10s if it is missing or not numeric
      const parsed = parseInt(response.headers.get('retry-after'), 10);
      const retryAfter = Number.isNaN(parsed) ? 10 : parsed;
      console.log(`Rate limited. Waiting ${retryAfter}s (attempt ${attempt}/${maxAttempts})`);
      await sleep(retryAfter * 1000);
      continue;
    }

    if (!response.ok) {
      throw new Error(`API error: ${response.status}`);
    }
    return response;
  }
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
Exponential Backoff with Jitter
For a more robust implementation that also handles transient errors (500s, timeouts), use exponential backoff with randomized jitter to prevent thundering-herd retries.
const RETRYABLE_STATUS_CODES = new Set([408, 429, 500, 502, 503, 504]);

async function fetchWithExponentialBackoff(url, options = {}, config = {}) {
  const {
    maxAttempts = 5,
    baseDelayMs = 500,
    maxDelayMs = 30000,
    backoffMultiplier = 2,
    jitterFactor = 0.2, // add up to +20% random delay to spread concurrent retries
  } = config;
  let delay = baseDelayMs;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await fetch(url, options);

      if (response.status === 429) {
        // Honor Retry-After over the backoff delay; fall back if it is missing or not numeric
        const retryAfterSec = parseInt(response.headers.get('retry-after'), 10);
        const waitMs = Number.isNaN(retryAfterSec) ? delay : retryAfterSec * 1000;
        // Thrown inside try: the catch below rethrows it because this is the last attempt
        if (attempt === maxAttempts) throw new Error('Max attempts reached (429)');
        // Jitter is only added, never subtracted, so we never retry before Retry-After
        const jitter = waitMs * jitterFactor * Math.random();
        await sleep(waitMs + jitter);
        delay = Math.min(delay * backoffMultiplier, maxDelayMs);
        continue;
      }

      if (RETRYABLE_STATUS_CODES.has(response.status) && attempt < maxAttempts) {
        const jitter = delay * jitterFactor * Math.random();
        await sleep(delay + jitter);
        delay = Math.min(delay * backoffMultiplier, maxDelayMs);
        continue;
      }

      // Success, a non-retryable status, or a retryable status on the final attempt
      return response;
    } catch (networkError) {
      if (attempt === maxAttempts) throw networkError;
      const jitter = delay * jitterFactor * Math.random();
      await sleep(delay + jitter);
      delay = Math.min(delay * backoffMultiplier, maxDelayMs);
    }
  }
}
Example delay progression (baseDelayMs=500, backoffMultiplier=2, jitter up to +20%), ignoring Retry-After:
| Failed attempt | Base delay before next attempt | With jitter |
|---|---|---|
| 1st | 500ms | 500–600ms |
| 2nd | 1s | 1.0–1.2s |
| 3rd | 2s | 2.0–2.4s |
| 4th | 4s | 4.0–4.8s |
| 5th (final) | none | no wait: the last response or error is returned |
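The progression can be reproduced from the configuration. The sketch below uses a hypothetical helper, `backoffSchedule`, that mirrors the delay arithmetic in fetchWithExponentialBackoff as the code computes it: jitter is added on top of the base delay, and Retry-After is ignored. It is meant for sanity-checking a configuration, not as part of the retry path.

```javascript
// Hypothetical helper: base delay and jitter range before each retry.
// With maxAttempts attempts there are maxAttempts - 1 waits.
function backoffSchedule({
  maxAttempts = 5,
  baseDelayMs = 500,
  maxDelayMs = 30000,
  backoffMultiplier = 2,
  jitterFactor = 0.2,
} = {}) {
  const schedule = [];
  let delay = baseDelayMs;
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    schedule.push({
      afterAttempt: attempt,
      minMs: delay,                      // Math.random() returned 0
      maxMs: delay * (1 + jitterFactor), // upper bound as Math.random() approaches 1
    });
    delay = Math.min(delay * backoffMultiplier, maxDelayMs);
  }
  return schedule;
}
```

Summing the maxMs values gives the worst-case time a caller spends waiting before the final attempt, which is useful when the caller has its own timeout.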
Preemptive Throttling
Rather than waiting for a 429, control your request rate proactively to avoid hitting the limit in the first place.
Fixed-Interval Rate Limiter
class RateLimiter {
  constructor({ requestsPerMinute }) {
    this.interval = (60 * 1000) / requestsPerMinute; // ms between requests
    this.lastRequestTime = 0;
    this.queue = [];
    this.processing = false;
  }

  async schedule(fn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ fn, resolve, reject });
      this.processQueue();
    });
  }

  async processQueue() {
    if (this.processing) return;
    this.processing = true;
    while (this.queue.length > 0) {
      const elapsed = Date.now() - this.lastRequestTime;
      const waitTime = Math.max(0, this.interval - elapsed);
      if (waitTime > 0) await sleep(waitTime);
      const { fn, resolve, reject } = this.queue.shift();
      this.lastRequestTime = Date.now();
      // Start the request without awaiting it, so a slow response does not
      // stall the queue; only the spacing between request starts is enforced.
      Promise.resolve().then(fn).then(resolve, reject);
    }
    this.processing = false;
  }
}

// Limit to 50 requests per minute
const limiter = new RateLimiter({ requestsPerMinute: 50 });

async function generatePdf(html, filename) {
  return limiter.schedule(() =>
    fetch('https://pdf.funbrew.cloud/api/v1/pdf', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.API_KEY}`,
        'Content-Type': 'application/json', // the body is JSON
      },
      body: JSON.stringify({ html, filename }),
    })
  );
}
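The limiter above spaces requests at a fixed interval. A classic token bucket differs in that it lets a short burst through as long as tokens have accumulated, then falls back to the steady refill rate. A minimal sketch of that variant follows; `TokenBucket` and its option names are hypothetical, and the injectable `now` clock exists only to make the class easy to test.

```javascript
// Hypothetical token bucket: holds up to `capacity` tokens and refills
// continuously at `refillPerSecond`. Each request consumes one token.
class TokenBucket {
  constructor({ capacity, refillPerSecond, now = () => Date.now() }) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now; // injectable clock
    this.tokens = capacity; // start full, so an initial burst is allowed
    this.lastRefillMs = now();
  }

  // Credits tokens for the time elapsed since the last call, up to capacity.
  refill() {
    const nowMs = this.now();
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefillMs = nowMs;
  }

  // Takes a token if one is available; returns whether it succeeded.
  tryRemoveToken() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }

  // Milliseconds until the next token becomes available (0 if one is ready).
  msUntilNextToken() {
    this.refill();
    if (this.tokens >= 1) return 0;
    return Math.ceil(((1 - this.tokens) / this.refillPerSecond) * 1000);
  }
}
```

A caller would loop on tryRemoveToken(), sleeping for msUntilNextToken() between attempts. Setting capacity to 1 makes the bucket behave much like the fixed-interval limiter.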
Adaptive Throttling from X-RateLimit-Remaining
Read remaining capacity from response headers and automatically slow down when it gets low.
class AdaptiveRateLimiter {
  constructor({ safetyMargin = 0.2 } = {}) {
    // Fraction of the limit below which requests are slowed down
    this.safetyMargin = safetyMargin;
    this.limit = null;
    this.remainingRequests = Infinity;
    this.resetTime = null;
  }

  updateFromHeaders(headers) {
    const limit = headers.get('x-ratelimit-limit');
    const remaining = headers.get('x-ratelimit-remaining');
    const reset = headers.get('x-ratelimit-reset');
    if (limit !== null) this.limit = parseInt(limit, 10);
    if (remaining !== null) this.remainingRequests = parseInt(remaining, 10);
    if (reset !== null) this.resetTime = parseInt(reset, 10) * 1000; // seconds -> ms
  }

  async waitIfNeeded() {
    if (this.remainingRequests === Infinity) return;
    if (this.remainingRequests <= 0 && this.resetTime) {
      const waitMs = Math.max(0, this.resetTime - Date.now());
      console.log(`Remaining = 0. Waiting ${Math.ceil(waitMs / 1000)}s for reset.`);
      await sleep(waitMs + 100);
      return;
    }
    // Add a small delay once remaining capacity drops below the safety margin
    // (or below 10 while the limit header has not been seen yet)
    const threshold = this.limit ? this.limit * this.safetyMargin : 10;
    if (this.remainingRequests < threshold) {
      await sleep(500);
    }
  }
}

const adaptive = new AdaptiveRateLimiter();

async function generatePdfAdaptive(html, filename) {
  await adaptive.waitIfNeeded();
  const response = await fetch('https://pdf.funbrew.cloud/api/v1/pdf', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.API_KEY}`,
      'Content-Type': 'application/json', // the body is JSON
    },
    body: JSON.stringify({ html, filename }),
  });
  adaptive.updateFromHeaders(response.headers);
  return response;
}
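A fixed 500 ms pause is easy to reason about but does not scale with how much quota is left. One alternative, sketched here under the assumption that X-RateLimit-Remaining and X-RateLimit-Reset describe the same window, is to spread the remaining requests evenly over the time until the reset. `pacingDelayMs` is a hypothetical helper, not something the API provides.

```javascript
// Hypothetical helper: how long to pause before the next request so that the
// remaining quota lasts until the window resets.
function pacingDelayMs(remaining, resetAtMs, nowMs = Date.now()) {
  const msUntilReset = Math.max(0, resetAtMs - nowMs);
  if (remaining <= 0) return msUntilReset;     // nothing left: wait for the reset
  return Math.floor(msUntilReset / remaining); // otherwise pace evenly
}
```

With 10 requests left and 30 seconds until the reset this yields a 3 second pause, and the pause shrinks to zero once the reset time has passed.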
Batch Processing with Concurrency Control
When generating many PDFs, limit parallelism to stay within rate limits.
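async function generateBatch(items, { concurrency = 5, requestsPerMinute = 50 } = {}) {
  // Batch-level limiter. generatePdf also passes through the shared module-level
  // limiter, so the stricter of the two settings determines the actual rate.
  const limiter = new RateLimiter({ requestsPerMinute });
  const results = [];
  const errors = [];

  // At most `concurrency` requests are in flight per chunk
  const chunks = chunkArray(items, concurrency);
  for (const chunk of chunks) {
    const chunkResults = await Promise.allSettled(
      chunk.map(item =>
        limiter.schedule(() => generatePdf(item.html, item.filename))
      )
    );
    for (const result of chunkResults) {
      if (result.status === 'rejected') {
        errors.push(result.reason);
      } else if (!result.value.ok) {
        // fetch() resolves on HTTP errors (including 429), so check the status explicitly
        errors.push(new Error(`API error: ${result.value.status}`));
      } else {
        results.push(result.value);
      }
    }
  }
  return { results, errors };
}

function chunkArray(arr, size) {
  const chunks = [];
  for (let i = 0; i < arr.length; i += size) chunks.push(arr.slice(i, i + size));
  return chunks;
}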
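Chunking waits for the slowest request in each chunk before starting the next one. A worker-pool variant keeps a fixed number of requests in flight at all times instead. The following is a sketch of that alternative; `mapWithConcurrency` is a hypothetical helper, and its result entries copy the shape that Promise.allSettled returns so they can be sorted into results and errors the same way.

```javascript
// Hypothetical helper: runs worker(item) for every item with at most
// `concurrency` calls in flight. Results keep the order of `items`.
async function mapWithConcurrency(items, concurrency, worker) {
  const results = new Array(items.length);
  let nextIndex = 0;

  async function runLane() {
    while (nextIndex < items.length) {
      const index = nextIndex++; // claimed synchronously, so lanes never share an item
      try {
        results[index] = { status: 'fulfilled', value: await worker(items[index], index) };
      } catch (reason) {
        results[index] = { status: 'rejected', reason };
      }
    }
  }

  const laneCount = Math.min(concurrency, items.length);
  await Promise.all(Array.from({ length: laneCount }, runLane));
  return results;
}
```

Because a lane picks up the next item as soon as its current request finishes, one slow PDF does not leave the other slots idle.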
Python Implementation
import asyncio
import time

from aiohttp import ClientSession


class RateLimiter:
    """Spaces requests at least `interval` seconds apart."""

    def __init__(self, requests_per_minute: int):
        self.interval = 60.0 / requests_per_minute
        self.last_request_time = 0.0
        self._lock = asyncio.Lock()

    async def acquire(self):
        async with self._lock:
            elapsed = time.monotonic() - self.last_request_time
            wait_time = max(0, self.interval - elapsed)
            if wait_time > 0:
                await asyncio.sleep(wait_time)
            self.last_request_time = time.monotonic()


async def generate_pdf_with_retry(
    session: ClientSession,
    limiter: RateLimiter,
    payload: dict,
    max_attempts: int = 5,
) -> dict:
    delay = 0.5
    for attempt in range(1, max_attempts + 1):
        await limiter.acquire()
        async with session.post(
            "https://pdf.funbrew.cloud/api/v1/pdf",
            json=payload,
        ) as resp:
            if resp.status == 429:
                if attempt == max_attempts:
                    raise RuntimeError("Max retry attempts reached")
                # float(), not int(): the fallback delay can be fractional (0.5)
                try:
                    retry_after = float(resp.headers.get("Retry-After", delay))
                except ValueError:
                    retry_after = delay  # header present but not a number of seconds
                await asyncio.sleep(retry_after)
                delay = min(delay * 2, 30)
                continue
            resp.raise_for_status()
            return await resp.json()
    raise RuntimeError("Exhausted retry loop")
Monitoring Rate Limit Health
Track your rate limit hit rate over time to decide when to upgrade your plan or redesign batch jobs.
const metrics = { totalRequests: 0, rateLimitHits: 0 };

async function generatePdfWithMetrics(html, filename) {
  metrics.totalRequests++;
  try {
    return await fetchWithExponentialBackoff(
      'https://pdf.funbrew.cloud/api/v1/pdf',
      {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${process.env.API_KEY}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({ html, filename }),
      }
    );
  } catch (err) {
    // Only 429s that exhaust every retry surface here ('Max attempts reached (429)');
    // 429s that a later retry recovers from are not counted.
    if (err.message.includes('429')) metrics.rateLimitHits++;
    throw err;
  }
}

// Log hit rate every hour
setInterval(() => {
  const hitRate = metrics.totalRequests > 0
    ? (metrics.rateLimitHits / metrics.totalRequests) * 100
    : 0;
  console.log(`Rate limit hit rate: ${hitRate.toFixed(1)}% (${metrics.rateLimitHits}/${metrics.totalRequests})`);
  if (hitRate > 5) {
    console.warn('Hit rate above 5%: consider upgrading plan or reducing concurrency');
  }
}, 60 * 60 * 1000);
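The counter above only sees a 429 when the retry helper gives up, so 429s that a later attempt recovers from go unrecorded. To count every 429, the measurement has to sit below the retry layer. One way to do that, sketched here with a hypothetical `withStatusCounter` wrapper, is to wrap the fetch function itself and tally each response by status code.

```javascript
// Hypothetical wrapper: counts every response by status code, including
// 429s that a retry layer later recovers from.
function withStatusCounter(fetchFn) {
  const counts = new Map(); // status code -> number of responses

  async function countingFetch(...args) {
    const response = await fetchFn(...args);
    counts.set(response.status, (counts.get(response.status) || 0) + 1);
    return response;
  }

  // Share of all responses that were 429, as a percentage.
  function rateLimitHitRate() {
    let total = 0;
    for (const n of counts.values()) total += n;
    return total === 0 ? 0 : ((counts.get(429) || 0) / total) * 100;
  }

  return { countingFetch, counts, rateLimitHitRate };
}
```

For this to take effect, the retry helper would need to call countingFetch instead of the global fetch, for example through an extra parameter. That parameter is an assumption and is not part of fetchWithExponentialBackoff as shown earlier.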
Summary
Key principles for rate limit handling:
- Always read Retry-After: never retry sooner than the header specifies
- Add jitter to backoff: spreads retries from multiple workers to avoid synchronized hammering
- Throttle preemptively: a local rate limiter prevents hitting the API limit in the first place
- Limit batch concurrency: set concurrency so workers × concurrency ≤ per-minute limit
- Monitor X-RateLimit-Remaining: back off automatically before you hit zero
For broader production deployment advice, see the PDF API Production Guide. For error handling across all status codes, see the PDF API Error Handling Guide.
Related
- PDF API Production Guide — Broader production stability guide
- PDF API Error Handling Guide — All error codes and responses
- PDF API Batch Processing Guide — High-volume PDF generation patterns
- API Documentation — Rate limit specifications per plan
- Playground — Test API behavior interactively