API Rate Limit Error Handling

Understand Skytells API rate limits, how they work, and how to handle them gracefully.

Rate Limits

The Skytells API enforces per-account rate limits to ensure fair usage and platform stability. When you exceed a limit, the API returns a 429 Too Many Requests response.

How Rate Limits Work

Rate limits are applied at the account level across all API keys belonging to the same account. Limits are measured over a rolling time window.

Error Response

When rate-limited, the API responds with the unified error envelope (same as other REST errors):

{
  "status": false,
  "response": "Too many requests",
  "error": {
    "http_status": 429,
    "message": "Too many requests. Please wait before retrying.",
    "details": "Too many requests. Please wait before retrying.",
    "error_id": "RATE_LIMIT_EXCEEDED"
  }
}
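For clients that inspect the response body rather than only the status code, the envelope above can be modelled as a type with a small check. This is a minimal sketch: the field names come from the sample response, while `SkytellsErrorEnvelope` and `isRateLimitError` are hypothetical names chosen for illustration, not part of any official SDK.

```typescript
// Shape of the error envelope, with field names taken from the sample response above.
// The interface name is an assumption made for this example.
interface SkytellsErrorEnvelope {
  status: false;
  response: string;
  error: {
    http_status: number;
    message: string;
    details: string;
    error_id: string;
  };
}

// Hypothetical helper: returns true when a parsed JSON body looks like a rate-limit error.
function isRateLimitError(body: unknown): boolean {
  if (typeof body !== 'object' || body === null) return false;
  const err = (body as Partial<SkytellsErrorEnvelope>).error;
  if (typeof err !== 'object' || err === null) return false;
  // Either signal is enough: the HTTP status or the error identifier.
  return err.http_status === 429 || err.error_id === 'RATE_LIMIT_EXCEEDED';
}
```

Checking `res.status === 429` is usually sufficient; a body check like this is mainly useful when the parsed body is passed around without the original response.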

The response also includes a Retry-After header:

Retry-After: 12

This tells you the number of seconds to wait before the next request will be accepted.

Always read the Retry-After header instead of guessing wait times. Requests retried before that interval has elapsed will keep returning 429.
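Reading the header defensively might look like the sketch below. The example above shows the seconds form; the HTTP specification also allows `Retry-After` to carry an HTTP date, so this helper handles both. Whether the Skytells API ever sends the date form is not stated here, and `parseRetryAfter` is a hypothetical name.

```typescript
// Hypothetical helper: converts a Retry-After header value into milliseconds to wait.
// Returns null when the header is missing or cannot be parsed.
function parseRetryAfter(value: string | null, now: number = Date.now()): number | null {
  if (value === null) return null;
  const trimmed = value.trim();
  // Seconds form, e.g. "12" (the form shown in this guide).
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  // HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT" (permitted by the HTTP spec).
  const date = Date.parse(trimmed);
  if (Number.isNaN(date)) return null;
  // A date in the past means no further wait is needed.
  return Math.max(0, date - now);
}
```

A caller could then write `parseRetryAfter(res.headers.get('Retry-After')) ?? 1000` to fall back to a one-second wait when the header is absent.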

Handling Rate Limits

The recommended strategy is exponential backoff with jitter: wait progressively longer between retries, and add a small random offset so that multiple clients do not all retry at the same moment.
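The delay schedule itself can be sketched as a pure function, which makes it easy to test in isolation. The base delay, cap, and jitter range below are illustrative assumptions, not values documented by Skytells, and `backoffDelayMs` is a hypothetical name.

```typescript
// Hypothetical helper: delay in milliseconds before retry number `attempt` (0-based).
// baseMs, capMs and jitterMs are assumed defaults chosen for illustration.
function backoffDelayMs(
  attempt: number,
  baseMs = 1000,
  capMs = 30000,
  jitterMs = 500,
  random: () => number = Math.random,
): number {
  // Exponential part: baseMs, 2*baseMs, 4*baseMs, ... capped at capMs.
  const exponential = Math.min(capMs, baseMs * 2 ** attempt);
  // Small random offset in [0, jitterMs) to desynchronize clients.
  return exponential + random() * jitterMs;
}
```

With the defaults, successive attempts wait roughly 1 s, 2 s, 4 s, 8 s and so on, each plus up to half a second of jitter, and never more than about 30.5 s.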

Retry with exponential backoff

TypeScript

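// Retries on 429, honoring Retry-After and backing off exponentially between attempts.
async function fetchWithRetry(url: string, options: RequestInit = {}, retries = 5): Promise<Response> {
  for (let attempt = 0; attempt < retries; attempt++) {
    const res = await fetch(url, options);
    if (res.status !== 429) return res;
    if (attempt === retries - 1) break; // no point waiting after the final attempt

    // Retry-After is in seconds; a missing or non-numeric header yields 0 or NaN.
    const retryAfter = Number(res.headers.get('Retry-After'));
    // Exponential floor of 1s, 2s, 4s, ... so waits grow even without the header.
    const backoff = 2 ** attempt;
    const waitSeconds = Number.isFinite(retryAfter) && retryAfter > 0 ? Math.max(retryAfter, backoff) : backoff;
    // Up to 500 ms of jitter keeps multiple clients from retrying in lockstep.
    const jitter = Math.random() * 500;
    await new Promise((resolve) => setTimeout(resolve, waitSeconds * 1000 + jitter));
  }
  throw new Error('Max retries exceeded');
}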

Best Practices

Practice	Why
Read `Retry-After` header	Avoid guessing wait durations
Use exponential backoff	Spread retry load over time
Add random jitter	Prevent synchronized retries from multiple clients
Cache model listings	`GET /v1/models` results change infrequently — avoid polling it
Batch polling into longer intervals	Poll prediction status every 2–5s rather than as fast as possible
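The caching advice in the table could be implemented with a small time-based cache, sketched below. It wraps any async loader, so it makes no assumption about how `GET /v1/models` is called; the `cached` helper name and the idea of a fixed time-to-live are illustrative choices, not Skytells recommendations.

```typescript
// Hypothetical helper: wraps an async loader so its result is reused for ttlMs milliseconds.
// Concurrent callers share one in-flight request; failed loads are not cached.
function cached<T>(
  load: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let value: Promise<T> | undefined;
  let expiresAt = 0;
  return () => {
    if (value === undefined || now() >= expiresAt) {
      expiresAt = now() + ttlMs;
      const pending: Promise<T> = load().catch((err) => {
        // Drop the failed entry so the next call retries, unless a newer load replaced it.
        if (value === pending) value = undefined;
        throw err;
      });
      value = pending;
    }
    return value;
  };
}
```

For example, passing the function that fetches the model listing together with a time-to-live of a few minutes means repeated calls within that period reuse one response instead of issuing new requests.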

Sustained high-frequency polling for prediction status is the most common cause of hitting rate limits. Switch to webhooks to eliminate polling entirely — see the Webhooks reference.
