Python SDK Deep Dive

Master the Skytells Python SDK — sync and async predictions, typed errors, concurrent generation, webhooks, and production patterns.

What you'll be able to do after this module

Write clean, production-ready Python code that creates predictions, handles errors gracefully, runs generations in parallel, and integrates webhooks — without touching urllib directly.

Installation

pip install skytells

The Python SDK has zero external dependencies: it uses only Python's standard library (urllib, json, os), so installing it cannot conflict with other packages in your project.

Your first prediction

import skytells
import os

client = skytells.Client(api_key=os.environ["SKYTELLS_API_KEY"])

prediction = client.predictions.create(
    model="truefusion-pro",
    input={
        "prompt": "A photorealistic wolf in a snowy forest, cinematic lighting",
        "width": 1024,
        "height": 1024,
    },
)

print(prediction.status)   # "succeeded"
print(prediction.output)   # ["https://cdn.skytells.ai/..."]

The SDK automatically polls until the prediction completes. For fast models, the call returns almost immediately. For slower ones, the SDK runs a polling loop internally so your code stays synchronous and clean.

Client configuration

import os
import skytells

client = skytells.Client(
    api_key=os.environ["SKYTELLS_API_KEY"],
    base_url="https://api.skytells.ai/v1",  # default
    timeout=120,       # seconds (default: 60)
    max_retries=3,     # auto-retry on 5xx errors
)

For the Edge API, set base_url="https://edge.skytells.ai/v1". Edge is available on Business and Enterprise plans for supported models only (truefusion-edge, flux-1-edge).

Use the client as a context manager for automatic resource cleanup:

with skytells.Client(api_key=os.environ["SKYTELLS_API_KEY"]) as client:
    prediction = client.predictions.create(
        model="truefusion-edge",
        input={"prompt": "A quick preview sketch", "width": 512, "height": 512},
    )
    print(prediction.output[0])

Prediction patterns

Wait for completion (default — most use cases)

# SDK blocks until succeeded or failed
prediction = client.predictions.create(
    model="truefusion-pro",
    input={"prompt": "A sunset over the Sahara", "width": 1024, "height": 1024},
)

assert prediction.status == "succeeded"
output_url = prediction.output[0]

Non-blocking (for webhooks, long-running jobs)

# Returns immediately with status="queued" or "processing"
prediction = client.predictions.create(
    model="truefusion-video-pro",
    input={"prompt": "Time-lapse of a city at night", "duration_seconds": 10},
    wait=False,
)
print(f"Job queued: {prediction.id}")
# Register a webhook URL to receive the result instead of polling

Manual polling (when you need to show progress)

import time

prediction_id = "pred_abc123"

while True:
    prediction = client.predictions.get(prediction_id)
    print(f"[{prediction.status}] {prediction.id}")

    if prediction.status in ("succeeded", "failed", "canceled"):
        break
    time.sleep(3)

if prediction.status == "succeeded":
    print("Output:", prediction.output[0])
else:
    print("Failed:", prediction.error)
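
The loop above keeps running if a prediction never reaches a terminal state. A minimal sketch of a bounded version follows. `poll_until_done` and its parameters are hypothetical names, not part of the SDK, and `fetch` stands in for any zero-argument callable that returns an object with a `.status` attribute.

```python
import time
from typing import Any, Callable

# Same terminal statuses the manual loop checks for
TERMINAL_STATUSES = ("succeeded", "failed", "canceled")


def poll_until_done(
    fetch: Callable[[], Any],
    interval: float = 3.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Call fetch() until it returns a terminal status or the timeout elapses.

    Hypothetical helper: sleep and clock are injectable so the loop can be
    tested without real waiting.
    """
    deadline = clock() + timeout
    while True:
        prediction = fetch()
        if prediction.status in TERMINAL_STATUSES:
            return prediction
        # Stop before sleeping if the next poll would land past the deadline
        if clock() + interval > deadline:
            raise TimeoutError(f"No terminal status after {timeout}s")
        sleep(interval)
```

With the SDK, this might be called as `poll_until_done(lambda: client.predictions.get(prediction_id))`, assuming `get` behaves as in the loop above.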

Async — run predictions in parallel

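Use skytells.AsyncClient with asyncio.gather to run multiple predictions concurrently. For a handful of predictions this is roughly 3× faster than running them sequentially, because the waiting time of each request overlaps with the others.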

import asyncio
import skytells
import os

async def generate_variations(prompts: list[str]) -> list[str]:
    async with skytells.AsyncClient(api_key=os.environ["SKYTELLS_API_KEY"]) as client:
        tasks = [
            client.predictions.create(
                model="truefusion-pro",
                input={"prompt": p, "width": 1024, "height": 1024},
            )
            for p in prompts
        ]
        predictions = await asyncio.gather(*tasks)
        return [p.output[0] for p in predictions]

# Example: generate 4 variations in parallel
prompts = [
    "A red fox in snow, morning light",
    "A red fox in snow, golden hour",
    "A red fox in snow, moonlight",
    "A red fox in snow, overcast",
]

urls = asyncio.run(generate_variations(prompts))
for url in urls:
    print(url)

Performance tip: 4 predictions run in parallel finish in roughly the time of the slowest single prediction, rather than the sum of all four. Use asyncio.gather whenever you need multiple variations or batch processing.
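
asyncio.gather starts every request at once, which for large batches may trigger the 429 RateLimitError listed in the Error handling section. One common way to cap concurrency is asyncio.Semaphore. The sketch below is an assumption-laden illustration: `gather_limited` is a hypothetical helper, and `fake_create` is a stand-in for an awaitable SDK call so the example runs on its own.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def gather_limited(
    factories: list[Callable[[], Awaitable[T]]],
    limit: int,
) -> list[T]:
    """Await the coroutines produced by factories, at most `limit` at a time.

    Results are returned in input order, as with asyncio.gather.
    """
    semaphore = asyncio.Semaphore(limit)

    async def run(factory: Callable[[], Awaitable[T]]) -> T:
        # The coroutine is only created once a slot is free
        async with semaphore:
            return await factory()

    return await asyncio.gather(*(run(f) for f in factories))


async def fake_create(prompt: str) -> str:
    """Stand-in for an awaitable SDK call; returns a placeholder string."""
    await asyncio.sleep(0.01)
    return f"output for {prompt}"


prompts = ["morning light", "golden hour", "moonlight", "overcast"]
results = asyncio.run(
    gather_limited([lambda p=p: fake_create(p) for p in prompts], limit=2)
)
```

With the real client, each factory would wrap a `client.predictions.create(...)` call instead of `fake_create`. Passing factories rather than coroutines keeps requests from being created before a slot is available.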

Error handling

The SDK raises specific, typed exceptions for each error type. Always catch the most specific exception first:

import time

from skytells.exceptions import (
    AuthenticationError,   # 401 — invalid API key
    RateLimitError,        # 429 — too many requests
    InvalidInputError,     # 422 — bad parameters
    PredictionError,       # prediction itself failed
    SkytellsError,         # base class for all SDK errors
)

def safe_generate(prompt: str) -> str | None:
    try:
        prediction = client.predictions.create(
            model="truefusion-pro",
            input={"prompt": prompt, "width": 1024, "height": 1024},
        )
        return prediction.output[0]

    except AuthenticationError:
        print("❌ Invalid API key — check SKYTELLS_API_KEY")
        return None

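    except RateLimitError as e:
        print(f"⏳ Rate limited, retrying after {e.retry_after}s")
        time.sleep(e.retry_after + 1)
        try:
            # Retry exactly once; recursing here could loop forever under sustained limits
            prediction = client.predictions.create(
                model="truefusion-pro",
                input={"prompt": prompt, "width": 1024, "height": 1024},
            )
            return prediction.output[0]
        except RateLimitError:
            print("⏳ Still rate limited, giving up")
            return None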

    except InvalidInputError as e:
        print(f"❌ Bad input: {e.detail}")
        return None

    except PredictionError as e:
        print(f"❌ Prediction failed: {e.message}")
        return None

    except SkytellsError as e:
        print(f"❌ Unexpected SDK error: {e}")
        raise

Do not swallow SkytellsError silently in production. Unexpected errors should be re-raised or logged to your observability system (Sentry, Datadog, etc.) so you can debug them.
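
As a minimal sketch of "log, then re-raise", the wrapper below uses the standard logging module. `call_logged` and the logger name are hypothetical, and the `SkytellsError` class defined here is only a stand-in for `skytells.exceptions.SkytellsError` so the example runs without the SDK installed.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("skytells_app")  # hypothetical logger name


class SkytellsError(Exception):
    """Stand-in for skytells.exceptions.SkytellsError (assumption for this sketch)."""


def call_logged(operation: Callable[[], T]) -> T:
    """Run operation(); log any SDK error with its traceback, then re-raise it."""
    try:
        return operation()
    except SkytellsError:
        # logger.exception logs at ERROR level and attaches the active traceback
        logger.exception("Skytells call failed")
        raise
```

Tools such as Sentry or Datadog typically hook into the logging module or provide their own capture call, so the `logger.exception` line is where that integration would go.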

Webhooks with the SDK

For long-running predictions (video, high-quality images), register a webhook instead of polling:

# Create prediction — don't wait, webhook will notify
prediction = client.predictions.create(
    model="truefusion-video-pro",
    input={
        "prompt": "Ocean waves crashing at sunset, cinematic",
        "duration_seconds": 8,
        "aspect_ratio": "16:9",
    },
    webhook="https://yourapp.com/api/webhooks/skytells",
    webhook_events_filter=["completed"],
    wait=False,
)

print(f"Queued: {prediction.id}")
# Your webhook handler will receive the result in 30–120 seconds
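
On the receiving side, your handler needs to parse the request body and branch on the outcome. The sketch below assumes the webhook JSON carries the same id, status, output, and error fields as the prediction objects used on this page. That payload shape is an assumption, not something this module specifies, and `handle_webhook` is a hypothetical function name. Check the webhook reference for the actual schema and for any signature verification it requires.

```python
import json


def handle_webhook(body: bytes) -> dict:
    """Parse a webhook request body and return a small summary dict.

    Assumption: the payload has id / status / output / error fields,
    mirroring the prediction objects shown elsewhere on this page.
    """
    payload = json.loads(body)
    prediction_id = payload.get("id")

    if payload.get("status") == "succeeded":
        outputs = payload.get("output") or []
        return {"id": prediction_id, "ok": True, "outputs": list(outputs)}

    # Any other status ("failed", "canceled", ...) is treated as not usable
    return {"id": prediction_id, "ok": False, "error": payload.get("error")}
```

Keeping the parsing in a plain function like this lets you call it from any web framework's route handler and test it without starting a server.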

Managing predictions

# List recent predictions (great for debugging + billing audits)
predictions = client.predictions.list(limit=20)
for p in predictions:
    cost = getattr(p.metrics, 'billing', {}).get('credits_used', 'N/A')
    print(f"{p.id}  {p.status:<12}  model={p.model}  cost=${cost}")

# Filter by status
failed = client.predictions.list(status="failed", limit=10)

# Cancel a running prediction (user navigated away)
client.predictions.cancel("pred_abc123")

# Delete a prediction and its outputs
client.predictions.delete("pred_abc123")

Models API

# Discover all models with live pricing
models = client.models.list()

print(f"{'Model':<25} {'Type':<8} {'$/prediction'}")
print("-" * 50)
for model in models:
    print(
        f"{model.id:<25} {model.type:<8} "
        f"${model.pricing.per_prediction:.4f}"
    )

# Get one model's details
model = client.models.get("truefusion-pro")
print(f"\n{model.id}: {model.description}")
print(f"Cost: ${model.pricing.per_prediction}/prediction")

Production pattern: retry with backoff

import time
import random
from skytells.exceptions import RateLimitError, SkytellsError

def generate_with_retry(
    prompt: str,
    model: str = "truefusion-pro",
    max_attempts: int = 3,
) -> str:
    for attempt in range(max_attempts):
        try:
            prediction = client.predictions.create(
                model=model,
                input={"prompt": prompt, "width": 1024, "height": 1024},
            )
            return prediction.output[0]

        except RateLimitError as e:
            wait = e.retry_after + random.uniform(0, 1)
            print(f"Rate limited on attempt {attempt + 1}. Waiting {wait:.1f}s...")
            time.sleep(wait)

        except SkytellsError:
            if attempt == max_attempts - 1:
                raise
            backoff = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(backoff)

    raise RuntimeError(f"Failed after {max_attempts} attempts")
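
To see how long the SkytellsError branch above can wait in total, it helps to compute the delay schedule on its own. `backoff_delays` is a hypothetical helper written for this illustration. It reproduces the `(2 ** attempt) + random.uniform(0, 1)` formula, and takes a seeded `random.Random` only so runs are repeatable.

```python
import random


def backoff_delays(max_attempts: int, rng: random.Random) -> list[float]:
    """Delays slept between attempts: 2**attempt seconds plus up to 1s of jitter.

    There is no delay after the final attempt, because the error is re-raised there.
    """
    return [
        (2 ** attempt) + rng.uniform(0, 1)
        for attempt in range(max_attempts - 1)
    ]


delays = backoff_delays(max_attempts=3, rng=random.Random(0))
print([round(d, 2) for d in delays])
print(f"worst-case total wait: under {sum(2 ** a + 1 for a in range(2))}s")
```

With max_attempts=3 there are two sleeps, in the ranges 1 to 2 seconds and 2 to 3 seconds. The jitter spreads out retries from clients that failed at the same moment. The rate-limit branch is separate and waits for `e.retry_after` instead.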

Summary

You now have a production-ready Python integration. The SDK handles polling, retries, and typed errors so you can focus on your product.

Key patterns:

client.predictions.create() — synchronous by default, returns a completed prediction
wait=False — for webhooks and long-running jobs
skytells.AsyncClient + asyncio.gather — ~3× faster for parallel generation
Specific exceptions — catch AuthenticationError, RateLimitError, InvalidInputError individually
Context manager — with skytells.Client() as client: for clean resource management

Next: the TypeScript SDK for frontend and full-stack developers.
