Creating AI Videos
Generate your first AI video, handle long generation times with webhooks, save and serve outputs — with complete code examples.
What you'll be able to do after this module
Generate cinematic AI videos, handle the async nature of video generation correctly (webhooks, not polling), and build a complete video generation endpoint.
Video vs. image generation
| Aspect | Image | Video |
|---|---|---|
| Typical generation time | 5–15s | 30s – 5min |
| Output format | PNG/JPG URL | MP4 URL |
| Recommended pattern | Sync + poll | Webhooks |
| Output CDN expiry | 24 hours | 24 hours |
Never use synchronous polling with a 2-second interval for video. It holds a connection open for minutes and burns through your polling quota. Use webhooks for any generation > 30 seconds.
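To make the webhook pattern more concrete, here is a minimal sketch of a receiver built only on the Python standard library. It assumes the webhook body is JSON carrying the same `id`, `status`, `output`, and `error` fields as a prediction object, with `output` being a list of URLs; that payload shape is an assumption, so check the Skytells webhook reference (including any signature verification) before relying on it. `parse_webhook`, `WebhookHandler`, and `serve` are hypothetical names used for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook(body: bytes) -> dict:
    """Pull out the fields this sketch cares about from a webhook body.

    ASSUMPTION: the payload mirrors the prediction object
    (id, status, output, error). Verify against the real API docs.
    """
    payload = json.loads(body)
    output = payload.get("output") or []
    return {
        "id": payload.get("id"),
        "status": payload.get("status"),
        # First output URL if present, otherwise None.
        "video_url": output[0] if output else None,
        "error": payload.get("error"),
    }


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = parse_webhook(self.rfile.read(length))
        except (ValueError, AttributeError):
            # Body was not valid JSON, or not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        if event["status"] == "succeeded" and event["video_url"]:
            # Download here: the CDN URL expires after 24 hours.
            print(f"Ready: {event['id']} -> {event['video_url']}")
        else:
            print(f"Not successful: {event['id']} ({event['status']})")
        # Acknowledge receipt.
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted; call this from your entry point."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Keeping the parsing in a plain function makes it easy to unit-test without starting a server, and the same function can be reused inside whatever web framework you deploy with.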
Your first video prediction
curl -X POST https://api.skytells.ai/v1/predictions \
  -H "x-api-key: $SKYTELLS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "truefusion-video-pro",
    "input": {
      "prompt": "A golden retriever running through autumn leaves in a park, slow motion, cinematic, 4K",
      "duration_seconds": 5,
      "aspect_ratio": "16:9",
      "fps": 24
    },
    "webhook": "https://yourapp.com/api/webhooks/skytells",
    "webhook_events_filter": ["completed"]
  }'

The immediate response:
{
  "id": "pred_vid_abc123",
  "status": "queued",
  "model": "truefusion-video-pro",
  "output": null
}

Video models overview
| Model | Avg time | Cost | Best for |
|---|---|---|---|
| truefusion-video-pro | 60s | $1.50 | Best Skytells quality |
| truefusion-video | 45s | $0.80 | Standard, faster |
| mera | 2–5min | $2.00 | Cinematic quality |
| lumo | 60s | $1.20 | Artistic/stylized |
| lipfusion | 30–60s | $0.60 | Lip-sync (face + audio) |
| veo-3.1 | 2–4min | $3.00 | Google Veo quality |
| sora-2 | 2–3min | $2.50 | OpenAI Sora style |
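If you want to narrow the table down programmatically, the following sketch filters the text-to-video models by a wait budget and a cost budget. The times and costs are copied from the table; the table does not state the cost unit, so treat the numbers as listed values only. `lipfusion` is left out because it takes a face and audio rather than a text prompt. `MODELS` and `models_within` are hypothetical names, not part of any SDK.

```python
# (lower_seconds, upper_seconds, listed_cost_usd), copied from the table above.
# A single listed time is stored as both the lower and upper value.
MODELS = {
    "truefusion-video-pro": (60, 60, 1.50),
    "truefusion-video": (45, 45, 0.80),
    "mera": (120, 300, 2.00),
    "lumo": (60, 60, 1.20),
    "veo-3.1": (120, 240, 3.00),
    "sora-2": (120, 180, 2.50),
}


def models_within(max_wait_s: int, max_cost: float) -> list[str]:
    """Return models whose upper listed time and listed cost fit, cheapest first."""
    fits = [
        (cost, name)
        for name, (_, upper, cost) in MODELS.items()
        if upper <= max_wait_s and cost <= max_cost
    ]
    return [name for _, name in sorted(fits)]


print(models_within(max_wait_s=60, max_cost=1.50))
```

This only filters on the two numeric columns; the "Best for" column still needs a human decision.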
Input parameters
| Parameter | Type | Description |
|---|---|---|
| prompt | string | Required. Describe the video |
| duration_seconds | int | Length: 3–30 depending on model |
| aspect_ratio | string | "16:9", "9:16", "1:1", "4:3" |
| fps | int | Frame rate: 24 (cinematic), 30 (smooth) |
| negative_prompt | string | What to avoid in the video |
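Checking inputs locally before sending a request can save a failed (and possibly billed) prediction. The sketch below encodes the constraints from the table. It assumes that only `prompt` is required, that 3–30 seconds is the outer bound (individual models may allow less), and that 24 and 30 are the only accepted frame rates; the table lists those two values but does not say whether others are rejected. `validate_video_input` is a hypothetical helper, not an SDK function.

```python
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3"}
FRAME_RATES = {24, 30}  # ASSUMPTION: only the two listed values are accepted


def validate_video_input(params: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []

    prompt = params.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")

    duration = params.get("duration_seconds")
    if duration is not None and not (isinstance(duration, int) and 3 <= duration <= 30):
        problems.append("duration_seconds must be an integer from 3 to 30")

    ratio = params.get("aspect_ratio")
    if ratio is not None and ratio not in ASPECT_RATIOS:
        problems.append(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")

    fps = params.get("fps")
    if fps is not None and fps not in FRAME_RATES:
        problems.append("fps must be 24 or 30")

    negative = params.get("negative_prompt")
    if negative is not None and not isinstance(negative, str):
        problems.append("negative_prompt must be a string")

    return problems


print(validate_video_input({"prompt": "A dog running", "duration_seconds": 60}))
```

Returning a list of messages rather than raising on the first error lets a caller report every problem at once.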
Polling manually (for simple scripts/testing)
For scripts where you just need the result and don't need webhooks:
import os
import time

import skytells

client = skytells.Client(api_key=os.environ["SKYTELLS_API_KEY"])

def generate_video(prompt: str, duration: int = 5) -> str:
    """Generate a video and wait for it. Returns the MP4 URL."""
    prediction = client.predictions.create(
        model="truefusion-video-pro",
        input={"prompt": prompt, "duration_seconds": duration, "aspect_ratio": "16:9"},
        wait=False,
    )
    print(f"Queued: {prediction.id}")

    # Poll every 5 seconds until the prediction reaches a terminal state.
    while prediction.status not in ("succeeded", "failed", "canceled"):
        time.sleep(5)
        prediction = client.predictions.get(prediction.id)
        print(f"  [{prediction.status}] {prediction.id}")

    if prediction.status != "succeeded":
        raise RuntimeError(f"Video failed: {prediction.error}")

    print(f"Done: {prediction.output[0]}")
    return prediction.output[0]

url = generate_video("A drone flyover of a mountain range at golden hour, cinematic")

Writing great video prompts
Video prompts need motion cues — static photo descriptions produce still-looking videos.
| Add to prompts | Avoid |
|---|---|
| "slow motion", "timelapse", "dolly shot" | Pure static descriptions |
| "camera panning left", "drone flyover" | No camera direction |
| "cinematic", "4K", "film grain" | "high quality" (too vague) |
| "24fps", "shallow depth of field" | — |
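One way to make sure every prompt carries a motion cue is to assemble it from named parts: the subject and its action, the camera movement, the style or mood, and the technical quality. The helper below is a minimal sketch of that idea; `build_video_prompt` is a hypothetical name, and joining the parts with commas is simply a convention chosen here.

```python
def build_video_prompt(subject_action: str, camera: str, style: str, quality: str) -> str:
    """Join the four parts into one comma-separated prompt, skipping empty ones."""
    parts = [subject_action, camera, style, quality]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_video_prompt(
    "A campfire flickering in the forest at night",
    "camera slowly pulling back",
    "cinematic, film grain, atmospheric",
    "4K",
)
print(prompt)
```

Making the camera movement its own argument means a missing motion cue shows up as an obviously empty value at the call site.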
Prompt formula for video:
[Subject doing action] + [camera movement] + [style/mood] + [technical quality]
Example: "A campfire flickering in the forest at night, camera slowly pulling back,
cinematic, film grain, 4K, atmospheric"

Saving video outputs
Always download the MP4 before the 24-hour CDN expiry:
import os
import urllib.request

def download_video(cdn_url: str, output_path: str) -> str:
    """Download a video from Skytells CDN to local storage."""
    # dirname is "" for a bare filename, and os.makedirs("") raises, so guard it.
    directory = os.path.dirname(output_path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    urllib.request.urlretrieve(cdn_url, output_path)
    size_mb = os.path.getsize(output_path) / 1024 / 1024
    print(f"Saved {size_mb:.1f}MB → {output_path}")
    return output_path

# video_url is the MP4 URL from a finished prediction, e.g. the value returned by generate_video().
download_video(video_url, "outputs/video_abc123.mp4")

Summary
You can now generate AI videos and handle them correctly in production.
Key points for video generation:
- Video takes 30s–5min — always use webhooks in production, not polling loops
- Output CDN URL expires in 24h — download immediately in your webhook handler
- Add motion language to prompts: camera movements, action verbs, timing cues
- truefusion-video-pro is the right default; veo-3.1 for maximum quality
Next: BeatFusion for AI-generated music and LipFusion for audio-synced video.