Sync vs. Async Patterns
Choose the right generation architecture for every use case — real-time response, optimistic UI, background queue, and push notification patterns.
What you'll be able to build after this module
Select the right architecture pattern based on your generation time and user experience requirements — and implement it correctly the first time.
The fundamental decision tree
Pattern 1: Direct synchronous response
When: Generation completes in < 3 seconds. User is waiting at the screen.
Models: truefusion-edge (~1.5s), chat endpoints, fast audio
// app/api/preview/route.ts
import Skytells from '@skytells/sdk';
const client = Skytells(process.env.SKYTELLS_API_KEY, {
baseUrl: 'https://edge.skytells.ai/v1', // Edge for < 2s response
});
export const runtime = 'edge';
export async function POST(req: Request) {
const { prompt } = await req.json();
const prediction = await client.predictions.create({
model: 'truefusion-edge',
input: { prompt, width: 512, height: 512, num_inference_steps: 4 },
});
// The SDK waits for completion by default, so the prediction has already succeeded here
return Response.json({ url: prediction.output![0] });
}
Pattern 1 requires the Edge API (Business/Enterprise plans). On lower plans, use truefusion-pro with Pattern 2 — it averages ~8s, which is perfectly smooth behind a loading spinner.
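From the browser, calling this route is a single awaited fetch. A sketch (the route path matches the handler above; `fetchFn` is a parameter only so the helper is easy to test without a server):

```typescript
// Call the synchronous preview route and return the generated image URL.
async function generatePreview(
  prompt: string,
  fetchFn: typeof fetch = fetch,
): Promise<string> {
  const res = await fetchFn('/api/preview', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  });
  if (!res.ok) throw new Error(`Preview failed with status ${res.status}`);
  const { url } = (await res.json()) as { url: string };
  return url;
}
```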
Pattern 2: Optimistic UI + polling
When: Generation takes 5–15 seconds. User is waiting, but a loading state is acceptable.
Models: truefusion-pro (~8s), truefusion-2.0 (~12s)
// app/api/generate/route.ts
// `client` is the shared Skytells SDK instance (created as in Pattern 1).
export async function POST(req: Request) {
const { prompt } = await req.json();
// Don't wait for completion — return the prediction ID immediately
const prediction = await client.predictions.create({
model: 'truefusion-pro',
input: { prompt, width: 1024, height: 1024 },
wait: false, // Return immediately
});
return Response.json({
predictionId: prediction.id,
status: prediction.status, // 'queued' or 'processing'
});
}
// app/api/generate/[id]/route.ts
export async function GET(_req: Request, { params }: { params: { id: string } }) {
const prediction = await client.predictions.get(params.id);
return Response.json({
status: prediction.status,
output: prediction.output ?? null,
error: prediction.error ?? null,
});
}
Pattern 3: Fire-and-forget + webhook notification
When: Generation takes > 15 seconds (video, audio, high-quality batch).
Models: truefusion-video-pro (30–120s), beatfusion-2.0 (30–90s), mera (2–5min)
// POST /api/generate
export async function POST(req: Request) {
const { prompt, userId } = await req.json();
const prediction = await client.predictions.create({
model: 'truefusion-video-pro',
input: { prompt, duration_seconds: 10 },
webhook: `${process.env.BASE_URL}/api/webhooks/skytells`,
webhookEventsFilter: ['completed'],
wait: false,
});
// Store job
await db.jobs.create({
data: { id: prediction.id, userId, status: 'pending' },
});
// Return immediately — don't make the user wait
return Response.json({ jobId: prediction.id });
}
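Webhook deliveries are typically retried on failure, so the handler that follows should tolerate duplicate events. A minimal in-memory guard sketch (`markProcessed` is a hypothetical helper; in production, a unique constraint or upsert in your database does this job, since in-memory state doesn't survive serverless or multi-node deployments):

```typescript
// Track processed webhook event IDs so a retried delivery is ignored.
// In-memory only: fine for a single long-lived process, not for serverless.
const processedEvents = new Set<string>();

function markProcessed(eventId: string): boolean {
  if (processedEvents.has(eventId)) return false; // duplicate delivery
  processedEvents.add(eventId);
  return true; // first time we've seen this event
}
```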
// POST /api/webhooks/skytells
export async function POST(req: Request) {
const raw = await req.text();
// ... (verify signature — see Webhooks module)
const prediction = JSON.parse(raw);
await db.jobs.update({
where: { id: prediction.id },
data: { status: prediction.status, outputUrl: prediction.output?.[0] },
});
// Notify user via your push system (Pusher, SSE, WebSocket, email, etc.)
await notifyUser(prediction.id, prediction.status);
return Response.json({ received: true });
}
Pattern 4: Background queue (batch processing)
When: You need to process many requests at once — batch generation, scheduled jobs, bulk exports.
// workers/generation.ts
import { Queue, Worker } from 'bullmq';
import Skytells from '@skytells/sdk';
const client = Skytells(process.env.SKYTELLS_API_KEY);
export const generationQueue = new Queue('generation', {
connection: { host: process.env.REDIS_HOST, port: 6379 },
defaultJobOptions: {
attempts: 3,
backoff: { type: 'exponential', delay: 2000 },
},
});
// Worker — processes jobs at a controlled rate.
// Producers enqueue work elsewhere, e.g.:
//   await generationQueue.add('generate', { prompt, userId, jobId });
new Worker('generation', async (job) => {
const { prompt, userId, jobId } = job.data;
const prediction = await client.predictions.create({
model: 'truefusion-pro',
input: { prompt, width: 1024, height: 1024 },
});
await db.results.create({
data: { jobId, userId, outputUrl: prediction.output![0], status: 'done' },
});
return { predictionId: prediction.id };
}, {
connection: { host: process.env.REDIS_HOST, port: 6379 },
concurrency: 5, // process 5 jobs at once
limiter: { max: 8, duration: 1000 }, // max 8/second
});
Choosing the right pattern
| Pattern | Latency | Complexity | When to use |
|---|---|---|---|
| 1: Direct sync | < 3s | Low | Real-time previews, Edge API |
| 2: Optimistic UI + poll | 5–15s | Low | Most image generation |
| 3: Fire-and-forget + webhook | Any | Medium | Video, audio, long jobs |
| 4: Background queue | Any | High | Batch processing, high volume |
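The table can be read as a tiny decision function. A sketch (the names are illustrative, not part of the SDK):

```typescript
type Pattern = 'direct-sync' | 'optimistic-poll' | 'webhook' | 'queue';

// Map expected generation time (and batch needs) to a pattern from the table.
function choosePattern(
  expectedSeconds: number,
  opts: { batch?: boolean } = {},
): Pattern {
  if (opts.batch) return 'queue'; // Pattern 4: many jobs, controlled rate
  if (expectedSeconds < 3) return 'direct-sync'; // Pattern 1: user waits inline
  if (expectedSeconds <= 15) return 'optimistic-poll'; // Pattern 2: spinner + poll
  return 'webhook'; // Pattern 3: push the result later
}
```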
Don't over-engineer. Pattern 2 (optimistic UI + polling) handles 80% of use cases beautifully. Only add webhooks (Pattern 3) or queues (Pattern 4) when generation time or scale genuinely requires it.
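On the frontend, Pattern 2's polling loop can be a small helper around the status route. This sketch assumes the two routes shown above; `getStatus` is injected so the loop is testable without a server, and a 2-second interval is a reasonable default:

```typescript
type PredictionStatus = {
  status: 'queued' | 'processing' | 'succeeded' | 'failed';
  output: string[] | null;
  error: string | null;
};

// Poll the status endpoint until the prediction settles or we give up.
async function pollPrediction(
  id: string,
  getStatus: (id: string) => Promise<PredictionStatus>,
  intervalMs = 2000,
  maxAttempts = 60,
): Promise<PredictionStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await getStatus(id);
    if (res.status === 'succeeded' || res.status === 'failed') return res;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Prediction ${id} did not settle in time`);
}
```

In a real page, `getStatus` would be a thin wrapper around `fetch('/api/generate/' + id)`.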
Summary
You can now design the right architecture for any generation use case. Pick the pattern that matches your latency requirements and user expectations — not the most complex one.
The four patterns:
- Direct sync — sub-3s, Edge API, simple
- Optimistic UI + poll — 5–15s, return job ID, poll every 2s on the frontend
- Webhook — any duration, fire-and-forget, push result to user
- Background queue — high volume, controlled rate, Celery or BullMQ
Next: caching strategies to reduce costs and improve response times.