Intermediate · 40 min · Module 2 of 6

Workflow Patterns

Master the three foundational workflow patterns — prompt chaining, routing, and parallelization — that form the building blocks of every AI system.

What you'll learn in this module

  • How to build reliable prompt chains with gates
  • How to route inputs to specialized handlers
  • How to parallelize independent LLM calls for speed
  • When each pattern is the right choice

Pattern 1: Prompt Chaining

The simplest multi-step pattern. Each LLM call processes the output of the previous one, transforming data through a pipeline.

Input → Step 1: Generate → Gate (Pass/Fail) → Step 2: Refine → Gate (Pass/Fail) → Step 3: Format → Output. A failed gate triggers Error / Retry.

How it works

  1. Break a complex task into sequential subtasks
  2. Each step has a focused prompt optimized for its specific job
  3. Between steps, add gates — programmatic checks that verify quality before proceeding

When to use it

  • The task naturally decomposes into distinct phases (draft → review → polish)
  • You need higher accuracy than a single call can deliver
  • You want to trade latency for quality

Example: Content Pipeline

| Step | Task | Gate |
| --- | --- | --- |
| 1 | Generate a blog outline from a topic and audience | Check outline has 3–7 sections |
| 2 | Expand each section into paragraphs | Check word count > 200 per section |
| 3 | Edit for tone and grammar | Verify no placeholder text remains |
| 4 | Generate meta description and title | Check length constraints |

Implementation sketch

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SKYTELLS_API_KEY,
  baseURL: "https://api.skytells.ai/v1",
});

async function contentPipeline(topic: string, audience: string) {
  // Step 1: Generate outline
  const outline = await client.chat.completions.create({
    model: "deepbrain-router",
    messages: [
      { role: "user", content: `Create a blog outline about "${topic}" for ${audience}. Return 3-7 sections as JSON.` },
    ],
  });
  const outlineText = outline.choices[0].message.content!;

  // Gate: verify outline structure (also reject non-array JSON, which would
  // otherwise slip through a bare length check)
  const sections = JSON.parse(outlineText);
  if (!Array.isArray(sections) || sections.length < 3 || sections.length > 7) {
    throw new Error("Outline must have 3-7 sections");
  }

  // Step 2: Expand each section
  const expanded = await client.chat.completions.create({
    model: "deepbrain-router",
    messages: [
      { role: "user", content: `Expand this outline into full paragraphs:\n${outlineText}` },
    ],
  });
  const expandedText = expanded.choices[0].message.content!;

  // Step 3: Edit for tone
  const edited = await client.chat.completions.create({
    model: "deepbrain-router",
    messages: [
      { role: "user", content: `Edit the following for a professional tone. Remove any placeholder text:\n${expandedText}` },
    ],
  });

  return edited.choices[0].message.content;
}
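The sketch above simply throws when a gate fails. The Error / Retry path from the diagram can be added with a small generic helper that re-runs a step until its gate passes — a minimal sketch, where the helper name `withGate` and the `maxAttempts` default are illustrative, not part of any API:

```typescript
// Re-run `step` until `gate` accepts its output, up to maxAttempts times.
// The final failure surfaces as an error so the caller can decide what to do.
async function withGate<T>(
  step: () => Promise<T>,
  gate: (result: T) => boolean,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await step();
    if (gate(result)) return result;
  }
  throw new Error(`Gate still failing after ${maxAttempts} attempts`);
}
```

Step 1 could then be wrapped as `withGate(() => generateOutline(topic), (s) => s.length >= 3 && s.length <= 7)`, where `generateOutline` is a hypothetical wrapper around the Step 1 call — keeping gate logic separate from prompt logic.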

Pattern 2: Routing

A classifier LLM examines the input and directs it to a specialized handler. This is how you get both breadth (handling many input types) and depth (each handler is optimized for its specific case).

Input → Classifier LLM → Handler A / Handler B / Handler C (by type) → Output

How it works

  1. A lightweight LLM call classifies the input into categories
  2. Based on the classification, route to a specialized prompt/workflow
  3. Each handler can use different models, prompts, or even entirely different tools

When to use it

  • Inputs vary widely in type or complexity
  • Different input types need fundamentally different processing
  • You want to optimize cost by using cheaper models for simple inputs

Example: Support Ticket Router

Incoming ticket → Classify: billing / technical / general → Billing Agent (access to payment APIs) | Technical Agent (access to logs, docs) | General Agent (FAQ + knowledge base)
| Category | Model | Tools available | Response template |
| --- | --- | --- | --- |
| Billing | GPT-4-class | Payment API, refund system | Formal, include account details |
| Technical | GPT-4-class | Log search, documentation RAG | Technical, include steps to reproduce |
| General | GPT-3.5-class | Knowledge base search | Friendly, link to help articles |

The classifier prompt

The quality of routing depends entirely on the classifier. Design it carefully:

Classify the following support ticket into exactly one category:
- billing: payment issues, subscription changes, refunds, invoices
- technical: bugs, errors, API issues, integration problems
- general: how-to questions, feature requests, feedback

Respond with only the category name, nothing else.

Ticket: {{input}}

Tips for reliable classification:

  • Give clear, mutually exclusive category definitions
  • Include 2-3 example keywords per category
  • Ask for a single-word response to avoid parsing issues
  • Test with edge cases that sit between categories
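Wiring the classifier and handlers together might look like the sketch below. To keep the routing logic testable, the completion call is passed in as a function; in production it would wrap `client.chat.completions.create` as in the chaining example. The names `parseCategory` and `routeTicket` are illustrative, and falling back to `general` on an off-script reply is one possible design choice, not a requirement:

```typescript
type Category = "billing" | "technical" | "general";
type Complete = (prompt: string) => Promise<string>;

// Map whatever the classifier says onto a known category,
// falling back to the safest handler on an off-script reply.
function parseCategory(label: string): Category {
  const l = label.trim().toLowerCase();
  return l === "billing" || l === "technical" ? l : "general";
}

async function routeTicket(ticket: string, complete: Complete): Promise<string> {
  const category = parseCategory(
    await complete(
      `Classify the following support ticket into exactly one category:\n` +
        `- billing: payment issues, subscription changes, refunds, invoices\n` +
        `- technical: bugs, errors, API issues, integration problems\n` +
        `- general: how-to questions, feature requests, feedback\n\n` +
        `Respond with only the category name, nothing else.\n\nTicket: ${ticket}`,
    ),
  );

  // Each handler gets its own prompt; in a real system it could also
  // get its own model and tools, as in the table above.
  const handlers: Record<Category, (t: string) => Promise<string>> = {
    billing: (t) => complete(`You are a formal billing agent. Respond to:\n${t}`),
    technical: (t) => complete(`You are a technical support agent. Respond to:\n${t}`),
    general: (t) => complete(`You are a friendly support agent. Respond to:\n${t}`),
  };
  return handlers[category](ticket);
}
```

Normalizing the classifier's reply (`trim().toLowerCase()`) matters in practice: models occasionally add whitespace or capitalization even when told to respond with a single word.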

Pattern 3: Parallelization

When subtasks are independent, run them simultaneously instead of sequentially. This dramatically reduces latency.

Sectioning

Split a task into independent parallel subtasks:

Input → Split → [LLM 1: Summarize | LLM 2: Extract entities | LLM 3: Sentiment analysis] → Combine → Output

Voting

Run the same task multiple times and aggregate results for higher confidence:

Input → [LLM Call 1 | LLM Call 2 | LLM Call 3] → Majority Vote / Aggregate → Output

When to use parallelization

| Variant | Use when | Benefit |
| --- | --- | --- |
| Sectioning | Task has independent subtasks | Lower latency (wall-clock time) |
| Voting | Task needs high confidence on a single answer | Higher accuracy |

Example: Document Analysis

Process a document three ways simultaneously:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SKYTELLS_API_KEY,
  baseURL: "https://api.skytells.ai/v1",
});

async function analyzeDocument(document: string) {
  const [summary, entities, sentiment] = await Promise.all([
    client.chat.completions.create({
      model: "deepbrain-router",
      messages: [{ role: "user", content: `Summarize this document in 3 sentences:\n${document}` }],
    }),
    client.chat.completions.create({
      model: "deepbrain-router",
      messages: [{ role: "user", content: `Extract all named entities (people, companies, locations) as JSON:\n${document}` }],
    }),
    client.chat.completions.create({
      model: "deepbrain-router",
      messages: [{ role: "user", content: `Analyze the sentiment (positive/neutral/negative) with confidence score:\n${document}` }],
    }),
  ]);

  return {
    summary: summary.choices[0].message.content,
    entities: JSON.parse(entities.choices[0].message.content!),
    sentiment: sentiment.choices[0].message.content,
  };
}
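The voting variant sends the identical prompt several times in parallel and keeps the most frequent answer. A minimal sketch, again taking the completion call as a parameter so the aggregation stays testable — `majorityVote` and `askWithVoting` are illustrative names, and in production `complete` would wrap `client.chat.completions.create` as above:

```typescript
type Complete = (prompt: string) => Promise<string>;

// Return the most frequent answer, normalizing whitespace and case
// so "Positive" and "positive " count as the same vote.
function majorityVote(answers: string[]): string {
  const counts = new Map<string, number>();
  for (const answer of answers) {
    const key = answer.trim().toLowerCase();
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1])[0][0];
}

// Ask the same question `votes` times concurrently and aggregate.
async function askWithVoting(
  prompt: string,
  complete: Complete,
  votes = 3,
): Promise<string> {
  const answers = await Promise.all(
    Array.from({ length: votes }, () => complete(prompt)),
  );
  return majorityVote(answers);
}
```

An odd vote count avoids two-way ties; voting works best on constrained outputs (a category, a yes/no, a single number) where exact-match aggregation is meaningful.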

Combining Patterns

The real power comes from composing these patterns. A production system often looks like:

Input → Route: classify input type → (Simple) Chain: quick response → Output, or (Complex) Parallel: multi-analysis → Chain: synthesize + review → Output

Decision framework

When designing a workflow, evaluate at each step:

| Question | If yes → |
| --- | --- |
| Can this step be broken into sequential phases? | Chain them |
| Does the input need different handling by type? | Route first |
| Are there independent subtasks? | Parallelize them |
| Does the output need to be high-confidence? | Use voting |

Start simple. Add complexity only when measurement shows you need it.
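As one illustration, the routed pipeline from the diagram can be wired as a thin skeleton where each branch is one of the patterns above. The prompts and the `handleInput` name are placeholders, and the completion call is injected so the control flow stands on its own:

```typescript
type Complete = (prompt: string) => Promise<string>;

async function handleInput(input: string, complete: Complete): Promise<string> {
  // Route: a cheap classification picks the path.
  const label = (
    await complete(`Is this request "simple" or "complex"? Answer with one word.\n${input}`)
  )
    .trim()
    .toLowerCase();

  if (label === "simple") {
    // Simple path: one quick call is enough.
    return complete(`Answer briefly:\n${input}`);
  }

  // Parallel: independent analyses run concurrently.
  const [facts, risks] = await Promise.all([
    complete(`List the key facts in:\n${input}`),
    complete(`List open questions or risks in:\n${input}`),
  ]);

  // Chain: synthesize, then review.
  const draft = await complete(
    `Write a response to:\n${input}\n\nUsing these facts:\n${facts}\n\nAnd these risks:\n${risks}`,
  );
  return complete(`Review and tighten this response:\n${draft}`);
}
```

The simple path costs two calls; the complex path costs five — which is exactly the trade the router exists to make.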


What you now understand

| Pattern | Architecture | Key benefit |
| --- | --- | --- |
| Prompt Chaining | Sequential LLM calls with gates | Higher accuracy through decomposition |
| Routing | Classifier + specialized handlers | Optimized handling per input type |
| Parallelization | Concurrent independent calls | Lower latency or higher confidence |
| Composition | Patterns combined | Production-grade flexibility |

Up next: Advanced Orchestration Patterns — orchestrator-workers, evaluator-optimizer loops, and autonomous agents.
