Advanced · 40 min · Module 3 of 4

Rate Limits & Error Handling

Handle 429s gracefully, build a prediction queue, write user-friendly error messages, and set budget guardrails — for a resilient production app.

What you'll be able to do after this module

Build an error handling layer that never crashes under load — automatic retry with backoff, per-user queuing, budget guardrails, and user-friendly error messages that don't leak implementation details.


Rate limits by plan

| Plan | Standard req/min | Burst | Edge |
|---|---|---|---|
| Free | 60 | Limited | — |
| Pro | 600 | Allowed | — |
| Business | Custom | Custom | ✓ |
| Enterprise | Custom | Custom | ✓ |

When you exceed your limit, the API returns 429 Too Many Requests with a Retry-After header telling you exactly when to retry.
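
The Retry-After value can arrive as delta-seconds ("2") or as an HTTP date. A small parser normalizes both forms to a wait in milliseconds (a sketch — `parseRetryAfter` is a hypothetical helper, not part of the SDK):

```typescript
// Parses a Retry-After header into a delay in milliseconds.
// Accepts delta-seconds ("2") or an HTTP-date; falls back when absent/invalid.
function parseRetryAfter(header: string | null, fallbackMs = 1000): number {
  if (!header) return fallbackMs;

  const seconds = Number(header);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);

  const date = Date.parse(header); // HTTP-date form
  if (!Number.isNaN(date)) return Math.max(0, date - Date.now());

  return fallbackMs;
}
```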


Error taxonomy — retryable vs. not

The most important distinction in error handling:

| HTTP Status | Category | Action |
|---|---|---|
| 400 | Bad request | Fix your code — do not retry |
| 401 | Invalid API key | Fix your key — do not retry |
| 403 | Plan/permission issue | Upgrade or fix permissions — do not retry |
| 422 | Invalid input | Fix the request body — do not retry |
| 429 | Rate limited | Retry with backoff after Retry-After |
| 500 | Server error | Retry with backoff |
| 502, 503 | Service unavailable | Retry with backoff |
| 504 | Gateway timeout | Retry with backoff |
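
The table reduces to a one-line predicate (a minimal sketch; if your client exposes numeric status codes, this is more robust than matching on message strings):

```typescript
// Encodes the taxonomy above: 429 and any 5xx are retryable;
// every other 4xx is a caller error that retrying will never fix.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}
```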

Exponential backoff with jitter

Never retry immediately on 429. Add a delay that grows with each attempt, plus random jitter to prevent synchronized retry storms:

async function withRetry<T>(
  fn: () => Promise<T>,
  { maxAttempts = 5, baseDelayMs = 500 }: { maxAttempts?: number; baseDelayMs?: number } = {},
): Promise<T> {
  let lastError: Error | undefined;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err: unknown) {
      lastError = err instanceof Error ? err : new Error(String(err));

      // Matching on the message is a fallback — prefer a structured status
      // field if your SDK exposes one. Word boundaries keep an unrelated
      // number like "1500" from being mistaken for a 5xx status.
      const isRetryable =
        lastError.message.includes('429') ||
        /\b5\d{2}\b/.test(lastError.message);

      if (!isRetryable) throw lastError; // non-retryable — give up immediately
      if (attempt === maxAttempts - 1) break;

      // Exponential backoff + ±20% jitter
      const base = baseDelayMs * Math.pow(2, attempt);
      const jitter = base * 0.2 * (Math.random() * 2 - 1);
      const delay = Math.round(base + jitter);
      console.warn(`Attempt ${attempt + 1} failed. Retrying in ${delay}ms...`);
      await new Promise(r => setTimeout(r, delay));
    }
  }

  throw lastError!;
}

// Usage — completely transparent to the calling code
const prediction = await withRetry(() =>
  client.predictions.create({
    model: 'truefusion-pro',
    input: { prompt, width: 1024, height: 1024 },
  })
);

Backoff progression with the defaults: 500ms → 1s → 2s → 4s between the five attempts (±20% jitter)
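
The delay math is easiest to unit-test when pulled into a pure function with an injectable random source (a sketch — `computeDelay` is a hypothetical helper, not part of the example above):

```typescript
// Pure version of the backoff-plus-jitter calculation: base * 2^attempt,
// nudged by up to ±20%. Passing rand makes the jitter deterministic in tests.
function computeDelay(
  attempt: number,
  baseDelayMs = 500,
  rand: () => number = Math.random,
): number {
  const base = baseDelayMs * Math.pow(2, attempt);
  const jitter = base * 0.2 * (rand() * 2 - 1); // ±20%
  return Math.round(base + jitter);
}
```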


Prediction queue

For high-throughput apps (batch processing, multiple concurrent users), queue predictions to stay safely under your rate limit:

// lib/prediction-queue.ts
import PQueue from 'p-queue';

// Pro plan: 600 req/min = 10/sec. Stay at 8 to be safe.
const queue = new PQueue({
  concurrency: 10,
  interval: 1000,
  intervalCap: 8,
});

export function enqueuePrediction<T>(fn: () => Promise<T>): Promise<T> {
  // p-queue types add() as possibly resolving void (e.g. aborted tasks),
  // hence the cast.
  return queue.add(fn) as Promise<T>;
}

export function getQueueStats() {
  return { pending: queue.size, running: queue.pending };
}
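
If you can't add a dependency, the concurrency half of this can be sketched by hand (`SimpleQueue` is illustrative only — it caps concurrent tasks but has no per-interval cap, so unlike p-queue it is not a full rate limiter):

```typescript
// Dependency-free FIFO queue that limits how many tasks run at once.
class SimpleQueue {
  private running = 0;
  private waiting: Array<() => void> = [];

  constructor(private concurrency: number) {}

  async add<T>(fn: () => Promise<T>): Promise<T> {
    // Re-check after every wake-up in case another caller took the slot.
    while (this.running >= this.concurrency) {
      await new Promise<void>(resolve => this.waiting.push(resolve));
    }
    this.running++;
    try {
      return await fn();
    } finally {
      this.running--;
      this.waiting.shift()?.(); // wake the next waiter, if any
    }
  }
}
```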

Budget guardrails

Prevent unexpected bills with a daily spend limit:

// lib/budget.ts
import { Redis } from '@upstash/redis';

const redis = Redis.fromEnv();
const DAILY_BUDGET_USD = 50;

const MODEL_COSTS: Record<string, number> = {
  'truefusion-edge': 0.01,
  'truefusion': 0.02,
  'truefusion-pro': 0.04,
  'truefusion-2.0': 0.06,
  'truefusion-ultra': 0.08,
  'beatfusion-2.0': 0.75,
};

function today() { return new Date().toISOString().split('T')[0]; }

export async function checkBudget(model: string): Promise<void> {
  const cost = MODEL_COSTS[model] ?? 0.04;
  const key = `budget:global:${today()}`;
  const current = parseFloat(await redis.get<string>(key) ?? '0');

  if (current + cost > DAILY_BUDGET_USD) {
    throw new BudgetExceededError(
      `Daily budget of $${DAILY_BUDGET_USD} would be exceeded. Current: $${current.toFixed(2)}`
    );
  }
}

export async function recordSpend(model: string): Promise<void> {
  const cost = MODEL_COSTS[model] ?? 0.04;
  const key = `budget:global:${today()}`;
  await redis.incrbyfloat(key, cost);
  await redis.expire(key, 86_400 * 2); // expire after 2 days
}

// Exported so API routes can catch it and return a specific error response.
export class BudgetExceededError extends Error {
  readonly name = 'BudgetExceededError';
}
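
The same flow can be exercised without Redis using an in-memory store (`InMemoryBudget` is a hypothetical test double, not production code — note it shares the check-then-record race of the Redis version under concurrent requests):

```typescript
// In-memory stand-in for the Redis-backed budget flow, useful in unit tests.
class InMemoryBudget {
  private spend = new Map<string, number>();

  constructor(
    private dailyLimitUsd: number,
    private costs: Record<string, number>,
  ) {}

  // Mirrors checkBudget: throws if the next prediction would bust the cap.
  check(model: string, day: string): void {
    const cost = this.costs[model] ?? 0.04;
    const current = this.spend.get(day) ?? 0;
    if (current + cost > this.dailyLimitUsd) {
      throw new Error(`Daily budget of $${this.dailyLimitUsd} would be exceeded`);
    }
  }

  // Mirrors recordSpend: adds the model's cost to today's total.
  record(model: string, day: string): void {
    const cost = this.costs[model] ?? 0.04;
    this.spend.set(day, (this.spend.get(day) ?? 0) + cost);
  }

  total(day: string): number {
    return this.spend.get(day) ?? 0;
  }
}
```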

User-facing error messages

Never show raw API error messages to users. Map them to clear, actionable copy:

interface UserFacingError {
  message: string;
  action?: string;
  retryable: boolean;
}

function toUserError(httpStatus: number, detail?: string): UserFacingError {
  switch (httpStatus) {
    case 401:
      return {
        message: 'Authentication failed.',
        action: 'Please refresh the page and try again.',
        retryable: false,
      };
    case 422:
      return {
        message: 'Your request contained an invalid prompt.',
        action: detail?.includes('nsfw') ? 'Try a different description.' : undefined,
        retryable: false,
      };
    case 429:
      return {
        message: 'Our AI service is busy right now.',
        action: 'Please try again in a few seconds.',
        retryable: true,
      };
    case 500:
    case 502:
    case 503:
    case 504:
      return {
        message: "We're experiencing a temporary issue.",
        action: "We're on it. Please try again shortly.",
        retryable: true,
      };
    default:
      return {
        message: 'Something went wrong.',
        action: 'Please try again.',
        retryable: true,
      };
  }
}

Monitoring and alerting

Set up alerts so you know about problems before your users do:

| Metric | Alert threshold | Why |
|---|---|---|
| Error rate | > 1% of requests over 5 min | Something is wrong |
| Rate limit rate | > 5% of requests hitting 429 | Approaching quota ceiling |
| P99 prediction latency | > 30s for images | Model backlog or degradation |
| Daily spend | > 80% of budget | Approaching limit |
| Failed webhook deliveries | > 0 per hour | Processing pipeline broken |

// Log structured events — send to Datadog, Logtail, Sentry, etc.
function logPredictionEvent(event: {
  predictionId: string;
  model: string;
  status: 'created' | 'succeeded' | 'failed' | 'rate_limited';
  latencyMs?: number;
  httpStatus?: number;
}) {
  console.log(JSON.stringify({
    service: 'skytells',
    timestamp: new Date().toISOString(),
    ...event,
  }));
}
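
To populate latencyMs, a tiny timing wrapper is enough (`withTiming` is a hypothetical helper, not part of the SDK):

```typescript
// Times an async call and returns the result alongside elapsed milliseconds.
async function withTiming<T>(
  fn: () => Promise<T>,
): Promise<{ result: T; latencyMs: number }> {
  const start = Date.now();
  const result = await fn();
  return { result, latencyMs: Date.now() - start };
}
```

You would then pass the measured latencyMs into logPredictionEvent alongside the prediction's id and status.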

Summary

The four resilience patterns:

  1. Exponential backoff — only for 429 and 5xx. Never for other 4xx.
  2. Prediction queue — throttle concurrent requests to stay under your rate limit
  3. Budget guardrails — daily spend cap with Redis, checked before each prediction
  4. User-friendly errors — map HTTP status codes to clear, non-technical messages

Next: the Edge API — ultra-low latency for Business and Enterprise accounts.
