Workflow Patterns
Master the three foundational workflow patterns — prompt chaining, routing, and parallelization — that form the building blocks of every AI system.
What you'll learn in this module
- How to build reliable prompt chains with gates
- How to route inputs to specialized handlers
- How to parallelize independent LLM calls for speed
- When each pattern is the right choice
Pattern 1: Prompt Chaining
The simplest multi-step pattern. Each LLM call processes the output of the previous one, transforming data through a pipeline.
How it works
- Break a complex task into sequential subtasks
- Each step has a focused prompt optimized for its specific job
- Between steps, add gates — programmatic checks that verify quality before proceeding
When to use it
- The task naturally decomposes into distinct phases (draft → review → polish)
- You need higher accuracy than a single call can deliver
- You want to trade latency for quality
Example: Content Pipeline
| Step | Task | Gate |
|---|---|---|
| 1 | Generate a blog outline from a topic and audience | Check outline has 3–7 sections |
| 2 | Expand each section into paragraphs | Check word count > 200 per section |
| 3 | Edit for tone and grammar | Verify no placeholder text remains |
| 4 | Generate meta description and title | Check length constraints |
Implementation sketch
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SKYTELLS_API_KEY,
  baseURL: "https://api.skytells.ai/v1",
});

async function contentPipeline(topic: string, audience: string) {
  // Step 1: Generate outline
  const outline = await client.chat.completions.create({
    model: "deepbrain-router",
    messages: [
      { role: "user", content: `Create a blog outline about "${topic}" for ${audience}. Return 3-7 sections as a JSON array.` },
    ],
  });
  const outlineText = outline.choices[0].message.content!;

  // Gate: verify outline structure before spending tokens on expansion
  const sections = JSON.parse(outlineText);
  if (!Array.isArray(sections) || sections.length < 3 || sections.length > 7) {
    throw new Error("Outline must have 3-7 sections");
  }

  // Step 2: Expand each section
  const expanded = await client.chat.completions.create({
    model: "deepbrain-router",
    messages: [
      { role: "user", content: `Expand this outline into full paragraphs:\n${outlineText}` },
    ],
  });
  const expandedText = expanded.choices[0].message.content!;

  // Step 3: Edit for tone
  const edited = await client.chat.completions.create({
    model: "deepbrain-router",
    messages: [
      { role: "user", content: `Edit the following for a professional tone. Remove any placeholder text:\n${expandedText}` },
    ],
  });
  return edited.choices[0].message.content;
}
```

Orchestrator mapping: A prompt chain maps directly to a linear sequence of action nodes. Gates become Condition nodes between steps — if a gate fails, the workflow can branch to an error handler or a retry path.
Pattern 2: Routing
A classifier LLM examines the input and directs it to a specialized handler. This is how you get both breadth (handling many input types) and depth (each handler is optimized for its specific case).
How it works
- A lightweight LLM call classifies the input into categories
- Based on the classification, route to a specialized prompt/workflow
- Each handler can use different models, prompts, or even entirely different tools
When to use it
- Inputs vary widely in type or complexity
- Different input types need fundamentally different processing
- You want to optimize cost by using cheaper models for simple inputs
Example: Support Ticket Router
| Category | Model | Tools available | Response template |
|---|---|---|---|
| Billing | GPT-4-class | Payment API, refund system | Formal, include account details |
| Technical | GPT-4-class | Log search, documentation RAG | Technical, include steps to reproduce |
| General | GPT-3.5-class | Knowledge base search | Friendly, link to help articles |
The classifier prompt
The quality of routing depends entirely on the classifier. Design it carefully:
```text
Classify the following support ticket into exactly one category:

- billing: payment issues, subscription changes, refunds, invoices
- technical: bugs, errors, API issues, integration problems
- general: how-to questions, feature requests, feedback

Respond with only the category name, nothing else.

Ticket: {{input}}
```

Tips for reliable classification:
- Give clear, mutually exclusive category definitions
- Include 2-3 example keywords per category
- Ask for a single-word response to avoid parsing issues
- Test with edge cases that sit between categories
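The dispatch side of routing is plain code. Here is a minimal sketch: `classify` stands in for the classifier LLM call, and the handler names are hypothetical — the pattern is a normalized lookup with a safe fallback category.

```typescript
// The three categories from the classifier prompt above.
type Category = "billing" | "technical" | "general";

const CATEGORIES: Category[] = ["billing", "technical", "general"];

// Normalize the raw model output: trim, lowercase, and fall back to
// "general" if the classifier returned something unexpected.
function parseCategory(raw: string): Category {
  const cleaned = raw.trim().toLowerCase();
  return (CATEGORIES as string[]).includes(cleaned)
    ? (cleaned as Category)
    : "general";
}

// Route a ticket: classify it, then invoke the matching handler.
// `classify` and each handler would each wrap an LLM call in practice.
async function routeTicket(
  ticket: string,
  classify: (input: string) => Promise<string>,
  handlers: Record<Category, (input: string) => Promise<string>>,
): Promise<string> {
  const category = parseCategory(await classify(ticket));
  return handlers[category](ticket);
}
```

The fallback matters: models occasionally return unexpected casing, whitespace, or an off-list label, and a default branch keeps the workflow from crashing on those inputs.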
Orchestrator mapping: The classifier is an AI action node. The routing is a Condition node that checks the classifier's output and branches to different action sequences. Each branch can have its own integrations and tools.
Pattern 3: Parallelization
When subtasks are independent, run them simultaneously instead of sequentially. This dramatically reduces latency.
Sectioning
Split a task into independent subtasks, run them side by side, and merge the results afterward. Each subtask gets its own focused prompt.
Voting
Run the same task multiple times and aggregate the results (for example, by majority vote) for higher confidence.
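The aggregation logic for voting is straightforward. A minimal sketch, where `ask` is a placeholder for a repeated LLM call with the same prompt:

```typescript
// Voting: issue the same query n times concurrently and take the
// majority answer after normalizing whitespace and casing.
async function vote(ask: () => Promise<string>, n: number): Promise<string> {
  const answers = await Promise.all(Array.from({ length: n }, () => ask()));

  // Tally normalized answers
  const counts = new Map<string, number>();
  for (const a of answers) {
    const key = a.trim().toLowerCase();
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }

  // Return the most frequent answer
  let best = "";
  let bestCount = 0;
  for (const [key, count] of counts) {
    if (count > bestCount) {
      best = key;
      bestCount = count;
    }
  }
  return best;
}
```

Because the calls run through `Promise.all`, voting costs n times the tokens but only one call's worth of latency.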
When to use parallelization
| Variant | Use when | Benefit |
|---|---|---|
| Sectioning | Task has independent subtasks | Lower latency (wall-clock time) |
| Voting | Task needs high confidence on a single answer | Higher accuracy |
Example: Document Analysis
Process a document three ways simultaneously:
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SKYTELLS_API_KEY,
  baseURL: "https://api.skytells.ai/v1",
});

async function analyzeDocument(document: string) {
  // All three calls are independent, so they run concurrently
  const [summary, entities, sentiment] = await Promise.all([
    client.chat.completions.create({
      model: "deepbrain-router",
      messages: [{ role: "user", content: `Summarize this document in 3 sentences:\n${document}` }],
    }),
    client.chat.completions.create({
      model: "deepbrain-router",
      messages: [{ role: "user", content: `Extract all named entities (people, companies, locations) as JSON:\n${document}` }],
    }),
    client.chat.completions.create({
      model: "deepbrain-router",
      messages: [{ role: "user", content: `Analyze the sentiment (positive/neutral/negative) with confidence score:\n${document}` }],
    }),
  ]);

  return {
    summary: summary.choices[0].message.content,
    entities: JSON.parse(entities.choices[0].message.content!),
    sentiment: sentiment.choices[0].message.content,
  };
}
```

Orchestrator mapping: In Orchestrator, create multiple action nodes that all originate from the same parent node. The execution engine runs independent branches concurrently. Fan-in happens at the next Condition or action node that references outputs from multiple branches.
Combining Patterns
The real power comes from composing these patterns. A production system often routes at the entry point, runs a prompt chain inside each branch, and parallelizes any steps within a chain that are independent of one another.
Decision framework
When designing a workflow, evaluate at each step:
| Question | If yes → |
|---|---|
| Can this step be broken into sequential phases? | Chain them |
| Does the input need different handling by type? | Route first |
| Are there independent subtasks? | Parallelize them |
| Does the output need to be high-confidence? | Use voting |
Start simple. Add complexity only when measurement shows you need it.
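To make composition concrete, here is a sketch that applies all three patterns in one handler: route first, fan out independent analyses in parallel, then finish with a gated chain step. The `llm` signature and the prompts are illustrative assumptions, not the client API shown earlier.

```typescript
// Any chat-completion call, reduced to its essentials for this sketch.
type LLM = (prompt: string) => Promise<string>;

async function handleRequest(input: string, llm: LLM): Promise<string> {
  // 1. Route: a cheap classification call picks the branch.
  const kind = (await llm(`Classify as "report" or "question": ${input}`)).trim();

  if (kind === "question") {
    // Simple inputs take the short, cheap path.
    return llm(`Answer briefly: ${input}`);
  }

  // 2. Parallelize: independent analyses fan out concurrently.
  const [facts, tone] = await Promise.all([
    llm(`Extract key facts: ${input}`),
    llm(`Describe the tone: ${input}`),
  ]);

  // 3. Chain: a final step synthesizes the parallel outputs, with a gate.
  const report = await llm(`Write a report using facts (${facts}) and tone (${tone}).`);
  if (report.length === 0) throw new Error("gate failed: empty report");
  return report;
}
```

Each pattern stays visible as a distinct stage, which is what makes composed workflows debuggable: a failure points to a specific route, branch, or gate rather than to one opaque mega-prompt.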
What you now understand
| Pattern | Architecture | Key benefit |
|---|---|---|
| Prompt Chaining | Sequential LLM calls with gates | Higher accuracy through decomposition |
| Routing | Classifier + specialized handlers | Optimized handling per input type |
| Parallelization | Concurrent independent calls | Lower latency or higher confidence |
| Composition | Patterns combined | Production-grade flexibility |
Up next: Advanced Orchestration Patterns — orchestrator-workers, evaluator-optimizer loops, and autonomous agents.