Model Catalog

Complete catalog of all supported models on the Skytells platform with pricing, capabilities, and input schemas. Last updated March 17, 2026.

Every model on Skytells has its own namespace, pricing structure, capabilities, and input/output schema. This page is a complete reference for all currently supported models, grouped by type.

Use the namespace value when making API calls. Text models use the Inference API endpoints (/v1/chat/completions, /v1/responses). All other models use the Predictions API endpoint (/v1/predictions).

# Text models — Inference API
curl https://api.skytells.ai/v1/chat/completions \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "deepbrain-router", "messages": [{"role": "user", "content": "Hello!"}]}'

# Image / Video / Audio models — Predictions API
curl -X POST https://api.skytells.ai/v1/predictions \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "truefusion-pro", "input": {"prompt": "A sunset over mountains"}}'
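
The endpoint rule above can also be sketched in Python. This is a minimal sketch using only the standard library; the helper name `build_request` and the `TEXT_ENDPOINTS` mapping are illustrative assumptions, not part of a Skytells SDK. Routing `gpt-5.3-codex` to `/responses` is inferred from its parameter table further down this page. The function only builds the request and leaves sending to the caller.

```python
import json
import urllib.request

BASE_URL = "https://api.skytells.ai/v1"

# Assumption: text-model namespaces mapped to their Inference API path.
# Any namespace not listed here is treated as a Predictions API model.
TEXT_ENDPOINTS = {
    "gpt-5": "/chat/completions",
    "gpt-5.4": "/chat/completions",
    "deepbrain-router": "/chat/completions",
    "gpt-5.3-codex": "/responses",
}


def build_request(api_key: str, model: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the given model namespace."""
    path = TEXT_ENDPOINTS.get(model, "/predictions")
    payload = json.dumps({"model": model, **body}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=payload,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    "YOUR_API_KEY",
    "truefusion-pro",
    {"input": {"prompt": "A sunset over mountains"}},
)
print(request.full_url)
```

Pass the result to `urllib.request.urlopen(request)` to send it.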

Text Models

Text models are accessed via the Inference API, which is fully OpenAI-compatible. You can use the OpenAI SDK by pointing base_url to https://api.skytells.ai/v1.

GPT-5

| Property | Value |
| --- | --- |
| Namespace | gpt-5 |
| Vendor | OpenAI |
| Pricing | $0.50 / 1M input tokens · $1.25 / 1M output tokens |
| Capabilities | text-to-text, coding, writing, reasoning, chat, analysis, summarization, instruction-following, problem-solving |
| Status | Operational |

OpenAI Compatible

OpenAI's flagship fifth-generation model, offering state-of-the-art reasoning, coding, and instruction-following capabilities.

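| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | | | Must be gpt-5 |
| messages | array | | | Array of {role, content} objects. Roles: system, user, assistant |
| stream | boolean | | false | Enable server-sent events streaming |
| max_tokens | integer | | 8192 | Maximum tokens to generate |
| temperature | number | | 0.7 | Sampling temperature (0–2) |
| top_p | number | | 0.95 | Nucleus sampling probability (0–1) |
| frequency_penalty | number | | 0.0 | Penalise tokens in proportion to how often they have appeared (-2.0–2.0) |
| presence_penalty | number | | 0.0 | Penalise tokens that have already appeared, encouraging new topics (-2.0–2.0) |
| stop | string or array | | | Stop sequences (up to 4) |

{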
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "gpt-5",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 25,
    "total_tokens": 35
  }
}

| Token Type | Price |
| --- | --- |
| Input tokens | $0.50 / 1M tokens |
| Output tokens | $1.25 / 1M tokens |

Billed separately for input and output. See Pricing for details.
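
As a rough illustration of how the two rates combine, the sketch below estimates the cost of one request from the `usage` object in the sample response. The function name `estimate_cost_usd` is hypothetical, and the calculation assumes billing is strictly linear in token counts with no minimum charge or rounding.

```python
def estimate_cost_usd(
    prompt_tokens: int,
    completion_tokens: int,
    input_per_million: float = 0.50,
    output_per_million: float = 1.25,
) -> float:
    """Estimate the USD cost of one request from its usage counts."""
    input_cost = prompt_tokens / 1_000_000 * input_per_million
    output_cost = completion_tokens / 1_000_000 * output_per_million
    return input_cost + output_cost


# Usage counts copied from the sample response above.
usage = {"prompt_tokens": 10, "completion_tokens": 25, "total_tokens": 35}
cost = estimate_cost_usd(usage["prompt_tokens"], usage["completion_tokens"])
print(f"${cost:.8f}")
```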


GPT-5.4

| Property | Value |
| --- | --- |
| Namespace | gpt-5.4 |
| Vendor | OpenAI |
| Pricing | $0.50 / 1M input tokens · $1.25 / 1M output tokens |
| Capabilities | text-to-text, coding, writing, reasoning, chat, analysis, summarization, instruction-following, problem-solving |
| Status | Operational |

OpenAI Compatible

An incremental update to GPT-5 with improved accuracy, stronger instruction adherence, and refined reasoning in complex multi-turn conversations.

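| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | | | Must be gpt-5.4 |
| messages | array | | | Array of {role, content} objects. Roles: system, user, assistant |
| stream | boolean | | false | Enable server-sent events streaming |
| max_tokens | integer | | 8192 | Maximum tokens to generate |
| temperature | number | | 0.7 | Sampling temperature (0–2) |
| top_p | number | | 0.95 | Nucleus sampling probability (0–1) |
| frequency_penalty | number | | 0.0 | Penalise tokens in proportion to how often they have appeared (-2.0–2.0) |
| presence_penalty | number | | 0.0 | Penalise tokens that have already appeared, encouraging new topics (-2.0–2.0) |
| stop | string or array | | | Stop sequences (up to 4) |

{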
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "gpt-5.4",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 25,
    "total_tokens": 35
  }
}

| Token Type | Price |
| --- | --- |
| Input tokens | $0.50 / 1M tokens |
| Output tokens | $1.25 / 1M tokens |

Billed separately for input and output. See Pricing for details.


GPT-5.3 Codex

| Property | Value |
| --- | --- |
| Namespace | gpt-5.3-codex |
| Vendor | OpenAI |
| Pricing | $1.75 / 1M input tokens · $0.175 / 1M cached input tokens · $14 / 1M output tokens |
| Capabilities | text-to-text, coding, writing, reasoning, chat, analysis, problem-solving, quality |
| Status | Operational |

OpenAI Compatible
Edge Compatible

GPT-5.3 Codex achieves state-of-the-art performance on SWE-Bench Pro, a rigorous evaluation of real-world software engineering. SWE-Bench Pro spans four languages and is more contamination-resistant, challenging, diverse, and industry-relevant than SWE-bench Verified (which only tests Python). It also far exceeds the previous state-of-the-art on Terminal-Bench 2.0, which measures terminal skills for coding agents. GPT-5.3 Codex does so with fewer tokens than prior models, so you can build more per dollar.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | | | Must be gpt-5.3-codex |
| input | string or array | | | User prompt as a string, or an array of chat-style message objects (same idea as /v1/chat/completions messages) |
| instructions | string | | | System-level instructions (equivalent to a system message) |
| previous_response_id | string | | | Continue a prior turn without resending full history |
| stream | boolean | | false | Stream the response as SSE |
| max_output_tokens | integer | | 8192 | Maximum tokens to generate |
| temperature | number | | 0.7 | Sampling temperature (0–2) |
| top_p | number | | 0.95 | Nucleus sampling (0–1) |

See the Responses API reference for the full schema.
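
To make the role of `instructions` and `previous_response_id` concrete, here is a minimal sketch that builds request bodies for a first turn and a follow-up turn. The helper `build_responses_body` and the id `resp_123` are hypothetical; only the parameter names come from the table above.

```python
from typing import Optional, Union


def build_responses_body(
    user_input: Union[str, list],
    instructions: Optional[str] = None,
    previous_response_id: Optional[str] = None,
    max_output_tokens: int = 8192,
) -> dict:
    """Build a JSON-serialisable body for a gpt-5.3-codex request."""
    body = {
        "model": "gpt-5.3-codex",
        "input": user_input,
        "max_output_tokens": max_output_tokens,
    }
    # Optional fields are left out entirely when not supplied.
    if instructions is not None:
        body["instructions"] = instructions
    if previous_response_id is not None:
        body["previous_response_id"] = previous_response_id
    return body


first_turn = build_responses_body(
    "Write a function that reverses a string.",
    instructions="You are a careful code reviewer.",
)
# "resp_123" stands in for the id returned by the first turn.
follow_up = build_responses_body("Now add type hints.", previous_response_id="resp_123")
```

The follow-up body carries only the new input plus the prior response id, which is the point of `previous_response_id`: the earlier history is not resent.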

{
  "id": "resp_...",
  "object": "response",
  "model": "gpt-5.3-codex",
  "status": "completed",
  "output_text": "..."
}

Shape follows the OpenAI Responses object; see Responses — Response object for all fields.

| Token Type | Price |
| --- | --- |
| Input tokens | $1.75 / 1M tokens |
| Cached input tokens | $0.175 / 1M tokens |
| Output tokens | $14 / 1M tokens |

Billed separately for input, cached input, and output. See Pricing for details.
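
The three rates can be combined as in the sketch below. It assumes that cached input tokens are a subset of the total input tokens and are billed at the cached rate instead of the full input rate; this page does not confirm that, so check the Pricing page before relying on it. The names `CodexRates` and `codex_cost_usd` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodexRates:
    """USD per 1M tokens, copied from the pricing table above."""
    input_per_million: float = 1.75
    cached_input_per_million: float = 0.175
    output_per_million: float = 14.0


def codex_cost_usd(
    input_tokens: int,
    cached_input_tokens: int,
    output_tokens: int,
    rates: CodexRates = CodexRates(),
) -> float:
    """Estimate request cost, assuming cached tokens are part of input_tokens."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    total = (
        uncached * rates.input_per_million
        + cached_input_tokens * rates.cached_input_per_million
        + output_tokens * rates.output_per_million
    )
    return total / 1_000_000


example_cost = codex_cost_usd(1_000_000, 400_000, 100_000)
```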


DeepBrain Router

| Property | Value |
| --- | --- |
| Namespace | deepbrain-router |
| Vendor | Skytells ✓ |
| Pricing | $0.50 / 1M input tokens · $1.25 / 1M output tokens |
| Capabilities | text-to-text, coding, writing, reasoning, chat, analysis, summarization, instruction-following, problem-solving, quality, fast, router |
| Status | Operational |
| Cold Boot | No |

OpenAI Compatible
Edge Compatible

DeepBrain Router is Skytells' advanced model orchestration layer, built to intelligently choose the right model for the right task. Optimized for coding, writing, reasoning, and complex multi-domain workloads, it dynamically routes requests across a curated set of flagship models from leading providers. The result is stronger output quality, improved cost-performance balance, and a more reliable AI experience at scale.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | | | Must be deepbrain-router |
| messages | array | | | Array of {role, content} objects. Roles: system, user, assistant |
| stream | boolean | | false | Enable server-sent events streaming |
| max_tokens | integer | | 8192 | Maximum tokens to generate |
| temperature | number | | 0.7 | Sampling temperature (0–2). Lower = more deterministic |
| top_p | number | | 0.95 | Nucleus sampling probability (0–1) |
| frequency_penalty | number | | 0.0 | Penalise tokens in proportion to how often they have appeared (-2.0–2.0) |
| presence_penalty | number | | 0.0 | Penalise tokens that have already appeared, encouraging new topics (-2.0–2.0) |
| stop | string or array | | | Stop sequences (up to 4) |

DeepBrain Router works differently from a conventional LLM or reasoning model: rather than answering with one fixed model, it selects the best underlying model for each request. You get consistently high quality without managing model selection yourself, which reduces overhead and adds flexibility in agentic workflows.

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "deepbrain-router",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 25,
    "total_tokens": 35
  }
}

DeepBrain Router dynamically selects from the following models based on task complexity, domain, and quality requirements:

| Model | Version | Provider |
| --- | --- | --- |
| DeepBrain Mini | | Skytells |
| DeepBrain 2.0 | | Skytells |
| GPT-4o | 2024-11-20 | OpenAI |
| GPT-4o Mini | 2024-07-18 | OpenAI |
| GPT-4.1 Nano | 2025-04-14 | OpenAI |
| GPT-4.1 Mini | 2025-04-14 | OpenAI |
| GPT-4.1 | 2025-04-14 | OpenAI |
| o4-mini | 2025-04-16 | OpenAI |
| GPT-5 Nano | 2025-08-07 | OpenAI |
| GPT-5 Mini | 2025-08-07 | OpenAI |
| GPT-5 Chat | 2025-08-07 | OpenAI |
| GPT-5 | 2025-08-07 | OpenAI |
| GPT-5.2 Chat | 2025-12-11 | OpenAI |
| GPT-5.2 | 2025-12-11 | OpenAI |
| GPT-OSS 120B | | OpenAI |
| Llama-4 Maverick 17B-128E (FP8) | | Meta |
| DeepSeek-V3.1 | | DeepSeek |
| DeepSeek-V3.2 | | DeepSeek |
| Grok-4 | | xAI |
| Grok-4 Fast Reasoning | | xAI |
| Claude Haiku 4.5 | 20251001 | Anthropic |
| Claude Sonnet 4.5 | 20250929 | Anthropic |
| Claude Opus 4.1 | 20250805 | Anthropic |
| Claude Opus 4.6 | | Anthropic |

You cannot control which model is selected. The router optimises for quality and task fit. All routed requests are billed at the DeepBrain Router rate regardless of the underlying model chosen.

| Token Type | Price |
| --- | --- |
| Input tokens | $0.50 / 1M tokens |
| Output tokens | $1.25 / 1M tokens |

Billed separately for input and output. See Pricing for details.


Image Models

TrueFusion

| Property | Value |
| --- | --- |
| Namespace | truefusion |
| Vendor | Skytells |
| Pricing | $0.03 / image |
| Capabilities | text-to-image |
| Status | Operational |

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt for generation |
| aspect_ratio | enum | | 1:1 | 1:1, 16:9, 4:3, 3:2, 2:3, 3:4, 9:16, 21:9 |
| number_of_images | integer | | 1 | Number of images (1–9) |
| prompt_optimizer | boolean | | true | Use prompt optimizer |

{
  "type": "array",
  "items": { "type": "string", "format": "uri" }
}

Returns an array of image URLs.


TrueFusion Pro

| Property | Value |
| --- | --- |
| Namespace | truefusion-pro |
| Vendor | Skytells |
| Pricing | $0.05 / image |
| Capabilities | text-to-image, image-to-image |
| Status | Operational |

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Prompt for generated image |
| aspect_ratio | enum | | 1:1 | 1:1, 16:9, 21:9, 3:2, 2:3, 4:5, 5:4, 3:4, 4:3, 9:16, 9:21 |
| image | string (URI) | | | Input image for img2img mode |
| prompt_strength | number | | 0.8 | Prompt strength for img2img (0–1) |
| num_outputs | integer | | 1 | Number of outputs (1–4) |
| num_inference_steps | integer | | 28 | Denoising steps (1–50). Recommended: 28–50 |
| guidance | number | | 3 | Guidance scale (0–10) |
| seed | integer | | | Random seed for reproducibility |
| output_format | enum | | webp | webp, jpg, png |
| output_quality | integer | | 80 | Output quality (0–100) |
| go_fast | boolean | | true | Use fp8 quantized model for speed |
| megapixels | enum | | 1 | 1, 0.25 |
| disable_safety_checker | boolean | | false | Disable safety checker |

{
  "type": "array",
  "items": { "type": "string", "format": "uri" }
}

Returns an array of image URLs.


TrueFusion Max

| Property | Value |
| --- | --- |
| Namespace | truefusion-max |
| Vendor | Skytells |
| Pricing | $0.12 / image |
| Capabilities | text-to-image, image-to-image, quality |
| Status | Operational |

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt for image generation |
| aspect_ratio | enum | | 1:1 | 11 ratios including 21:9, 9:21 |
| image_prompt | string (URI) | | | Image reference to guide composition |
| image_prompt_strength | number | | 0.1 | Blend between prompt and image (0–1) |
| safety_tolerance | integer | | 2 | Safety level (1=strict, 6=permissive) |
| seed | integer | | | Random seed |
| raw | boolean | | false | Generate less processed, more natural images |
| output_format | enum | | jpg | jpg, png |

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.


TrueFusion Ultra

| Property | Value |
| --- | --- |
| Namespace | truefusion-ultra |
| Vendor | Skytells |
| Pricing | $0.15 / image |
| Capabilities | text-to-image, image-to-image, quality |
| Status | Operational |

Flagship model with stunning photorealism, artistic creativity, and unmatched consistency across styles. Supports inpainting, style references, and magic prompt optimization.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt for image generation |
| aspect_ratio | enum | | 1:1 | 15 ratios including 1:3, 3:1 |
| resolution | enum | | None | Specific resolution override (e.g., 1024x1024, 1536x640) |
| magic_prompt_option | enum | | Auto | Auto, On, Off. Optimizes prompt for quality |
| image | string (URI) | | | Image for inpainting (requires mask) |
| mask | string (URI) | | | Black/white mask for inpainting |
| style_type | enum | | None | Auto, General, Realistic, Design |
| style_reference_images | array (URI) | | | Style reference images |
| seed | integer | | | Random seed (max 2,147,483,647) |

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.


TrueFusion 2.0

| Property | Value |
| --- | --- |
| Namespace | truefusion-2 |
| Vendor | Skytells |
| Pricing | $0.07–$0.10 / image (resolution-based) |
| Capabilities | text-to-image, image-to-image, reference, quality |
| Status | Operational |

Attach up to 3 reference images as ground truth and tag them in your prompt using @tag_name. Preserves identity, style, and materials while giving control over angle, composition, and lighting.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt for image generation |
| aspect_ratio | enum | | 16:9 | 16:9, 9:16, 4:3, 3:4, 1:1, 21:9 |
| resolution | enum | | 1080p | 720p, 1080p |
| reference_images | array (URI) | | [] | Up to 3 reference images (0.5–2 aspect ratio) |
| reference_tags | array (string) | | [] | Tags for references (use @tag_name in prompt) |
| seed | integer | | | Random seed |
| remix_id | string | | | Remix from another Skytells prediction |

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.
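
The tagging scheme can be sketched as a small input builder. This is a sketch under stated assumptions, not the documented behaviour: it assumes `reference_tags[i]` labels `reference_images[i]` by position, and that tags are stored without the leading `@`. The helper `build_truefusion2_input` and the example URL are hypothetical.

```python
import re

MAX_REFERENCES = 3  # limit stated in the parameter table above


def build_truefusion2_input(prompt: str, references: dict[str, str]) -> dict:
    """Build an `input` object from a prompt and a {tag: image_url} mapping."""
    if len(references) > MAX_REFERENCES:
        raise ValueError(f"at most {MAX_REFERENCES} reference images are allowed")
    # Every @tag used in the prompt should correspond to a supplied reference.
    used_tags = set(re.findall(r"@(\w+)", prompt))
    missing = used_tags - set(references)
    if missing:
        raise ValueError(f"prompt uses undefined tags: {sorted(missing)}")
    return {
        "prompt": prompt,
        "reference_images": list(references.values()),
        "reference_tags": list(references.keys()),
    }


tf2_input = build_truefusion2_input(
    "@hero standing in a neon-lit alley, low angle",
    {"hero": "https://example.com/hero.png"},
)
```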

| Resolution | Price |
| --- | --- |
| 720p | $0.07 / image |
| 1080p | $0.10 / image |

TrueFusion Optima

| Property | Value |
| --- | --- |
| Namespace | truefusion-2-optima |
| Vendor | Skytells |
| Pricing | $0.008 / computing second |
| Capabilities | text-to-image, image-to-image, quality |
| Status | Operational |
| Cold Boot | Yes (CPU-based deployment) |

Next-generation MoE architecture delivering unmatched realism, lifelike lighting, and film-grade image precision. Billed by compute time rather than per-image.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt for image generation |
| aspect_ratio | enum | | 1:1 | 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, 1:3, 3:1 |

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.


TrueFusion X

| Property | Value |
| --- | --- |
| Namespace | truefusion-x |
| Vendor | Skytells |
| Pricing | $0.10 / image |
| Capabilities | text-to-image, image-to-image, quality |
| Status | Operational |

Ultra-fast, ultra-high-resolution with inpainting support and quality control.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt |
| n | number | | 1 | Number of images |
| image | array (URI) | | | Reference images |
| mask | string (URI) | | | Inpainting mask |
| quality | enum | | medium | low, medium, high |
| size | enum | | 1024x1024 | 768x1152, 1152x768, 1792x1024, 1920x822, etc. |
| output_format | enum | | jpg | jpg, png |
| output_compression | number | | 100 | Compression ratio (1–100) |

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.


TrueFusion Edge

| Property | Value |
| --- | --- |
| Namespace | truefusion-edge |
| Vendor | Skytells |
| Pricing | $0.01 / image |
| Capabilities | text-to-image, image-to-image, fast |
| Status | Operational |

Ultra-fast, lightweight model optimized for speed. Only 4 denoising steps needed.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Prompt for generated image |
| aspect_ratio | enum | | 1:1 | 11 ratios |
| num_outputs | integer | | 1 | Number of outputs (1–4) |
| num_inference_steps | integer | | 4 | Denoising steps (1–4) |
| seed | integer | | | Random seed |
| output_format | enum | | webp | webp, jpg, png |
| output_quality | integer | | 80 | Quality (0–100) |
| go_fast | boolean | | true | Use fp8 quantized model |
| megapixels | enum | | 1 | 1, 0.25 |
| disable_safety_checker | boolean | | false | Disable safety checker |

{
  "type": "array",
  "items": { "type": "string", "format": "uri" }
}

Returns an array of image URLs.


TrueFusion Pano

| Property | Value |
| --- | --- |
| Namespace | truefusion-pano |
| Vendor | Skytells |
| Pricing | $0.02 / GPU second |
| Capabilities | text-to-image, image-to-image |
| Status | Operational |

Multiple model variants available including multilingual (English + Chinese).

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Input prompt |
| negative_prompt | string | | "" | Things to exclude |
| model_variant | enum | | 1600M-1024px | 1600M-1024px, 1600M-1024px-multilang, 1600M-512px, 600M-1024px-multilang, 600M-512px-multilang |
| width | integer | | 1024 | Output width |
| height | integer | | 1024 | Output height |
| num_inference_steps | integer | | 18 | Denoising steps |
| guidance_scale | number | | 5 | CFG scale (1–20) |
| pag_guidance_scale | number | | 2 | PAG guidance (1–20) |
| seed | integer | | | Random seed |

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.


TrueFusion Standard

| Property | Value |
| --- | --- |
| Namespace | truefusion-standard |
| Vendor | Skytells |
| Pricing | $0.05 / image |
| Capabilities | text-to-image |
| Status | Operational |

Full-featured model with LoRA support. Load custom LoRA weights from Skytells, HuggingFace, or CivitAI.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Prompt for generated image |
| aspect_ratio | enum | | 1:1 | 11 ratios |
| image | string (URI) | | | Input image for img2img |
| prompt_strength | number | | 0.8 | Prompt strength for img2img (0–1) |
| num_outputs | integer | | 1 | Number of outputs (1–4) |
| num_inference_steps | integer | | 28 | Denoising steps (1–50) |
| guidance | number | | 3 | Guidance (0–10) |
| lora_weights | string | | | LoRA weights URL or identifier (e.g., skytells/truefusion-base) |
| lora_scale | number | | 1 | LoRA strength (-1 to 3) |
| seed | integer | | | Random seed |
| output_format | enum | | webp | webp, jpg, png |
| output_quality | integer | | 80 | Quality (0–100) |
| go_fast | boolean | | true | Use fp8 quantized model |
| megapixels | enum | | 1 | 1, 0.25 |
| disable_safety_checker | boolean | | false | Disable safety checker |

{
  "type": "array",
  "items": { "type": "string", "format": "uri" }
}

Returns an array of image URLs.


TrueFusion Variant

| Property | Value |
| --- | --- |
| Namespace | truefusion-variant |
| Vendor | Skytells |
| Pricing | $0.05 / image |
| Capabilities | text-to-image, image-to-image |
| Status | Operational |

Schema not yet published — contact support for input details.


Flux.1 Edge

| Property | Value |
| --- | --- |
| Namespace | flux-fast |
| Vendor | Skytells |
| Pricing | $0.01 / image |
| Capabilities | text-to-image, image-to-image, quality |
| Status | Operational |

Super-fast Flux model optimized by Skytells for instant generation, with configurable speed modes.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt |
| speed_mode | enum | | Extra Juiced 🔥 | Speed optimization level |
| guidance | number | | 3.5 | Guidance scale |
| image_size | integer | | 1024 | Base image size (longest side) |
| aspect_ratio | enum | | 1:1 | 11 ratios |
| output_format | enum | | jpg | png, jpg, webp |
| output_quality | integer | | 80 | Quality (1–100) |
| num_inference_steps | integer | | 28 | Inference steps |
| seed | integer | | -1 | Random seed |

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.


Imagen 3

| Property | Value |
| --- | --- |
| Namespace | google-imagen-3 |
| Vendor | Google |
| Pricing | $0.08 / image |
| Capabilities | text-to-image, image-to-image |
| Status | Operational |

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt |
| negative_prompt | string | | | What to discourage |
| aspect_ratio | enum | | 1:1 | 1:1, 9:16, 16:9, 3:4, 4:3 |
| safety_filter_level | enum | | block_medium_and_above | block_low_and_above, block_medium_and_above, block_only_high |

{
  "type": "string",
  "format": "uri"
}

Imagen 4

| Property | Value |
| --- | --- |
| Namespace | google-imagen-4 |
| Vendor | Google |
| Pricing | $0.08 / image |
| Capabilities | text-to-image, image-to-image, quality |
| Status | Operational |

Google's flagship text-to-image model.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt |
| aspect_ratio | enum | | 1:1 | 1:1, 9:16, 16:9, 3:4, 4:3 |
| safety_filter_level | enum | | block_medium_and_above | block_low_and_above, block_medium_and_above, block_only_high |

{
  "type": "string",
  "format": "uri"
}

Nano Banana

| Property | Value |
| --- | --- |
| Namespace | nano-banana |
| Vendor | Google |
| Pricing | $0.06 / image |
| Capabilities | text-to-image, image-to-image, fast |
| Status | Operational |

Google's Gemini 2.5-based image editing model with multi-image input support.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt |
| image_input | array (URI) | | [] | Input images for transformation (multiple supported) |
| aspect_ratio | enum | | match_input_image | match_input_image, 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 |
| output_format | enum | | jpg | jpg, png |

{
  "type": "string",
  "format": "uri"
}

Sana

| Property | Value |
| --- | --- |
| Namespace | nvidia-sana |
| Vendor | Nvidia |
| Pricing | $0.05 / image |
| Capabilities | text-to-image |
| Status | Operational |

Fast image model with wide artistic range and resolutions up to 4096×4096. Multiple model variants including multilingual.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Input prompt |
| negative_prompt | string | | "" | Things to exclude |
| model_variant | enum | | 1600M-1024px | 1600M-1024px, 1600M-1024px-multilang, 1600M-512px, 600M-1024px-multilang, 600M-512px-multilang |
| width | integer | | 1024 | Output width |
| height | integer | | 1024 | Output height |
| num_inference_steps | integer | | 18 | Denoising steps |
| guidance_scale | number | | 5 | CFG scale (1–20) |
| pag_guidance_scale | number | | 2 | PAG guidance (1–20) |
| seed | integer | | | Random seed |

{
  "type": "string",
  "format": "uri"
}

GPT-Image-1

| Property | Value |
| --- | --- |
| Namespace | gpt-image-1 |
| Vendor | OpenAI |
| Pricing | $0.002 / image |
| Capabilities | text-to-image, image-to-image, quality |
| Status | Operational |
| Inference | Partner-served |

Requires your own OpenAI API key.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| openai_api_key | string (password) | | | Your OpenAI API key |
| prompt | string | | | Text description |
| aspect_ratio | enum | | 1:1 | 1:1, 3:2, 2:3 |
| input_images | array (URI) | | | Input images for editing |
| number_of_images | integer | | 1 | Number of images (1–10) |
| quality | enum | | auto | low, medium, high, auto |
| background | enum | | auto | auto, transparent, opaque |
| output_format | enum | | webp | png, jpeg, webp |
| output_compression | integer | | 90 | Compression (0–100%) |
| moderation | enum | | auto | auto, low |
| user_id | string | | | End-user identifier for abuse monitoring |

{
  "type": "array",
  "items": { "type": "string", "format": "uri" }
}

Returns an array of image URLs.


FLUX-1.1 Pro

| Property | Value |
| --- | --- |
| Namespace | FLUX-1.1-pro |
| Vendor | Black Forest Labs |
| Pricing | $0.04 / image |
| Capabilities | text-to-image, image-to-image, quality |
| Status | Operational |

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt |
| n | number | | 1 | Number of images |
| size | enum | | 1024x1024 | 1024x1024, 768x1152, 1152x768, 1792x1024, etc. |
| output_format | enum | | jpg | jpg, png |

{
  "type": "string",
  "format": "uri"
}

FLUX.2 Pro

| Property | Value |
| --- | --- |
| Namespace | FLUX.2-pro |
| Vendor | Black Forest Labs |
| Pricing | $0.02 / image megapixel |
| Capabilities | text-to-image, image-to-image, quality |
| Status | Operational |

Megapixel-based pricing. Supports up to 8 input reference images.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt |
| input_images | array (URI) | | [] | Up to 8 reference images |
| aspect_ratio | enum | | 1:1 | match_input_image, custom, plus standard ratios |
| resolution | enum | | 1 MP | match_input_image, 0.5 MP, 1 MP, 2 MP, 4 MP |
| width | integer | | | Custom width (256–2048, multiples of 32) |
| height | integer | | | Custom height (256–2048, multiples of 32) |
| safety_tolerance | integer | | 2 | Safety level (1–5) |
| seed | integer | | | Random seed |
| output_format | enum | | png | webp, jpg, png |
| output_quality | integer | | 100 | Quality (0–100) |

{
  "type": "string",
  "format": "uri"
}

| Resolution | Price per image |
| --- | --- |
| 0.5 MP | $0.01 |
| 1 MP | $0.02 |
| 2 MP | $0.04 |
| 4 MP | $0.08 |

Flux 2 Pro Legacy

| Property | Value |
| --- | --- |
| Namespace | flux-2-pro-legacy |
| Vendor | Black Forest Labs |
| Pricing | $0.02 / image megapixel |
| Capabilities | text-to-image, image-to-image, editing, quality |
| Status | Operational |

Supports up to 8 reference images. Same schema as FLUX.2 Pro with additional safety_tolerance control.


Flux 2 Flex

| Property | Value |
| --- | --- |
| Namespace | flux-2-flex |
| Vendor | Black Forest Labs |
| Pricing | $0.08 / image megapixel |
| Capabilities | text-to-image, image-to-image, editing, quality |
| Status | Operational |

Max-quality generation with up to 10 reference images, configurable inference steps, guidance, and prompt upsampling.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt |
| input_images | array (URI) | | [] | Up to 10 reference images |
| aspect_ratio | enum | | 1:1 | match_input_image, custom, plus standard ratios |
| resolution | enum | | 1 MP | 0.5 MP, 1 MP, 2 MP, 4 MP |
| width / height | integer | | | Custom dimensions (256–2048) |
| steps | integer | | 30 | Inference steps (1–50) |
| guidance | number | | 4.5 | Guidance scale (1.5–10) |
| prompt_upsampling | boolean | | true | Auto-modify prompt for creativity |
| safety_tolerance | integer | | 2 | Safety (1–5) |
| seed | integer | | | Random seed |
| output_format | enum | | png | webp, jpg, png |
| output_quality | integer | | 100 | Quality (0–100) |

{
  "type": "string",
  "format": "uri"
}

Video Models

TrueFusion Video Pro

| Property | Value |
| --- | --- |
| Namespace | truefusion-video-pro |
| Vendor | Skytells |
| Pricing | $0.196 / second |
| Capabilities | text-to-video, image-to-video |
| Status | Operational |

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt for video generation |
| negative_prompt | string | | "" | Things to exclude |
| aspect_ratio | enum | | 16:9 | 16:9, 9:16, 1:1 |
| start_image | string (URI) | | | First frame of the video |
| end_image | string (URI) | | | Last frame of the video |
| cfg_scale | number | | 0.5 | Flexibility scale (0–1). Higher = more constrained |
| duration | enum | | 5 | 5, 10 seconds |

{
  "type": "string",
  "format": "uri"
}

Returns a video URL.


TrueFusion Video

| Property | Value |
| --- | --- |
| Namespace | truefusion-video |
| Vendor | Skytells |
| Pricing | $0.112 / second |
| Capabilities | text-to-video, image-to-video |
| Status | Operational |

Same schema as TrueFusion Video Pro but without end_image support. Lower cost option.

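| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt |
| negative_prompt | string | | "" | Things to exclude |
| aspect_ratio | enum | | 16:9 | 16:9, 9:16, 1:1 |
| start_image | string (URI) | | | First frame |
| cfg_scale | number | | 0.5 | Flexibility scale (0–1) |
| duration | enum | | 5 | 5, 10 seconds |

{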
  "type": "string",
  "format": "uri"
}

Mera

| Property | Value |
| --- | --- |
| Namespace | mera |
| Vendor | Skytells |
| Pricing | $3.42 / prediction |
| Capabilities | text-to-video, image-to-video, audio, quality |
| Status | Operational |

Skytells's latest video generation model — physically accurate, super realistic, and controllable. Supports reference images for subject consistency.

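| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt |
| seconds | enum | | 8 | 4, 6, 8, 12 seconds |
| size | enum | | 720x1280 | 720x1280, 1280x720 |
| input_reference | array (URI) | | [] | 1–3 reference images for R2V |

{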
  "type": "string",
  "format": "uri"
}

Lumo

| Property | Value |
| --- | --- |
| Namespace | lumo |
| Vendor | Skytells |
| Pricing | $1.12 / prediction |
| Capabilities | image-to-video |
| Status | Operational |

Specialized for motion, animations, and general use cases. Schema not yet published.


LipFusion

| Property | Value |
| --- | --- |
| Namespace | lipfusion |
| Vendor | Skytells |
| Pricing | $0.04 / second |
| Capabilities | video-to-video, audio-to-video |
| Status | Operational |

Ultra-realistic lip-syncing for videos, animations, avatars, and live streams. Supports audio file input or text-to-speech with 45+ voice presets.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| video_url | string (URI) | | | Target video (.mp4/.mov, under 100MB, 2–10s, 720p–1080p) |
| audio_file | string (URI) | | | Audio for lip sync (.mp3/.wav/.m4a/.aac, under 5MB) |
| text | string | | | Text for TTS lip sync (Enterprise only) |
| voice_id | enum | | en_AOT | 45+ voice presets (English + Chinese) |
| motion_aware | boolean | | true | Adjust sync based on subject movements |
| voice_speed | number | | 1 | Speech rate for TTS (0.8–2) |

{
  "type": "string",
  "format": "uri"
}
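
Because LipFusion accepts either an audio file or text for TTS, an input builder can make that choice explicit. The sketch below assumes exactly one of `audio_file` or `text` should be supplied per request, which this page implies but does not state; the helper `build_lipfusion_input` and the example URLs are hypothetical.

```python
from typing import Optional


def build_lipfusion_input(
    video_url: str,
    audio_file: Optional[str] = None,
    text: Optional[str] = None,
    voice_id: str = "en_AOT",
    voice_speed: float = 1.0,
    motion_aware: bool = True,
) -> dict:
    """Build a LipFusion `input` object for audio-driven or TTS-driven sync."""
    # Assumption: the two sync sources are mutually exclusive.
    if (audio_file is None) == (text is None):
        raise ValueError("provide exactly one of audio_file or text")
    body = {"video_url": video_url, "motion_aware": motion_aware}
    if audio_file is not None:
        body["audio_file"] = audio_file
    else:
        # voice_speed range taken from the parameter table (0.8–2).
        if not 0.8 <= voice_speed <= 2:
            raise ValueError("voice_speed must be between 0.8 and 2")
        body.update(text=text, voice_id=voice_id, voice_speed=voice_speed)
    return body


audio_driven = build_lipfusion_input(
    "https://example.com/clip.mp4", audio_file="https://example.com/voice.mp3"
)
tts_driven = build_lipfusion_input(
    "https://example.com/clip.mp4", text="Hello there!", voice_speed=1.2
)
```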

Veo 3.1

| Property | Value |
| --- | --- |
| Namespace | veo-3.1 |
| Vendor | Google |
| Pricing | $0.43 / second |
| Capabilities | text-to-video, image-to-video, quality |
| Status | Operational |

Google's next-gen video model with context-aware audio, reference images, and last-frame interpolation.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt |
| aspect_ratio | enum | | 16:9 | 16:9, 9:16 |
| duration | enum | | 8 | 4, 6, 8 seconds |
| image | string (URI) | | | Start image |
| last_frame | string (URI) | | | End image for interpolation |
| reference_images | array (URI) | | [] | 1–3 reference images for R2V (16:9, 8s only) |
| negative_prompt | string | | | What to exclude |
| resolution | enum | | 1080p | 720p, 1080p |
| generate_audio | boolean | | true | Generate audio with video |
| seed | integer | | | Random seed |

{
  "type": "string",
  "format": "uri"
}

Veo 3.1 Fast

| Property | Value |
| --- | --- |
| Namespace | veo-3.1-fast |
| Vendor | Google |
| Pricing | $0.13–$0.17 / second (audio-dependent) |
| Capabilities | text-to-video, image-to-video, fast |
| Status | Operational |

Faster variant of Veo 3.1 with dynamic pricing based on audio generation.

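| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt |
| aspect_ratio | enum | | 16:9 | 16:9, 9:16 |
| duration | enum | | 8 | 4, 6, 8 seconds |
| image | string (URI) | | | Start image |
| last_frame | string (URI) | | | End image for interpolation |
| negative_prompt | string | | | What to exclude |
| resolution | enum | | 1080p | 720p, 1080p |
| generate_audio | boolean | | true | Generate audio with video |
| seed | integer | | | Random seed |

{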
  "type": "string",
  "format": "uri"
}

| Audio Generation | Price per second |
| --- | --- |
| Enabled | $0.17 |
| Disabled | $0.13 |
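
Since `generate_audio` defaults to true, the higher rate applies unless audio is switched off explicitly. A minimal sketch of the resulting per-clip cost, assuming the price is simply the rate multiplied by the requested duration (the function name `veo_fast_cost_usd` is hypothetical):

```python
ALLOWED_DURATIONS = (4, 6, 8)  # seconds, from the parameter table above


def veo_fast_cost_usd(duration_seconds: int, generate_audio: bool = True) -> float:
    """Estimate the cost of one Veo 3.1 Fast clip in USD."""
    if duration_seconds not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}")
    rate = 0.17 if generate_audio else 0.13
    # Round to cents to avoid floating-point noise in the product.
    return round(duration_seconds * rate, 2)


with_audio = veo_fast_cost_usd(8)
without_audio = veo_fast_cost_usd(8, generate_audio=False)
```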

Veo 3.1 (Preview)

| Property | Value |
| --- | --- |
| Namespace | veo-3.1-preview |
| Vendor | Google |
| Pricing | $0.43 / second |
| Capabilities | text-to-video, image-to-video, quality, sound |
| Status | Operational |

Preview version with person_generation safety control. Same schema as Veo 3.1 with additional person generation options (allow_adult, dont_allow).


Sora 2

| Property | Value |
| --- | --- |
| Namespace | sora-2 |
| Vendor | OpenAI |
| Pricing | $0.002 / video |
| Capabilities | text-to-video, image-to-video |
| Status | Operational |
| Inference | Partner-served |

Requires your own OpenAI API key.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| openai_api_key | string (password) | | | Your OpenAI API key |
| prompt | string | | | Text description of the video |
| seconds | enum | | 4 | 4, 8, 12 |
| aspect_ratio | enum | | portrait | portrait (720×1280), landscape (1280×720) |
| input_reference | string (URI) | | | Reference image or video |

{
  "type": "string",
  "format": "uri"
}

Sora 2 Pro

| Property | Value |
| --- | --- |
| Namespace | sora-2-pro |
| Vendor | OpenAI |
| Pricing | $0.002 / video |
| Capabilities | text-to-video, image-to-video |
| Status | Operational |
| Inference | Partner-served |

Same as Sora 2 with additional resolution control (standard = 720p, high = 1024p). Requires your own OpenAI API key.


Wan 2.5-i2v

| Property | Value |
| --- | --- |
| Namespace | wan-2.5-i2v |
| Vendor | Alibaba |
| Pricing | $0.06–$0.16 / second (resolution-based) |
| Capabilities | image-to-video, reference, quality, audio, video |
| Status | Operational |

Alibaba's image-to-video model with background audio, prompt expansion, and multi-resolution support.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| image | string (URI) | | | Input image |
| prompt | string | | | Text prompt |
| negative_prompt | string | | "" | What to exclude |
| resolution | enum | | 720p | 480p, 720p, 1080p |
| duration | enum | | 5 | 5, 10 seconds |
| audio | string (URI) | | | Audio file for sync (wav/mp3, 3–30s, ≤15MB) |
| enable_prompt_expansion | boolean | | true | Enable prompt optimizer |
| seed | integer | | | Random seed |

{
  "type": "string",
  "format": "uri"
}

| Resolution | Price per second |
| --- | --- |
| 480p | $0.06 |
| 720p | $0.11 |
| 1080p | $0.16 |

Video Upscale

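| Property | Value |
| --- | --- |
| Namespace | video-upscale |
| Vendor | Topaz Labs |
| Pricing | $0.10 / 5 seconds |
| Capabilities | video-to-video |
| Status | Operational |

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| video | string (URI) | | | Video file to upscale |
| target_resolution | enum | | 1080p | 720p, 1080p, 4k |
| target_fps | integer | | 30 | Target FPS (15–60) |

{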
  "type": "string",
  "format": "uri"
}

Audio Models

BeatFusion 2.0

| Property | Value |
| --- | --- |
| Namespace | beatfusion-2.0 |
| Vendor | Skytells |
| Pricing | $0.75 / prediction |
| Capabilities | text-to-audio, music, quality, audio |
| Status | Operational |

Skytells's flagship music generation model. Generate full-length songs with vocals, lyrics, and rich instrumentation. Supports structure tags: [Intro], [Verse], [Pre Chorus], [Chorus], [Bridge], [Outro], [Hook], [Solo], and more.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| lyrics | string | | | Lyrics with structure tags (1–3500 chars). Use \n for line breaks |
| prompt | string | | "" | Music style description (0–2000 chars) |
| sample_rate | enum | | 44100 | 16000, 24000, 32000, 44100 |
| bitrate | enum | | 256000 | 32000, 64000, 128000, 256000 |
| audio_format | enum | | mp3 | mp3, wav, pcm |

{
  "type": "string",
  "format": "uri"
}

Returns an audio file URL.
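
A `lyrics` string with structure tags can be assembled from sections, as in the sketch below. It assumes each tag sits on its own line ahead of its lyric lines, which this page does not specify; the helper `build_lyrics` and the sample lyrics are illustrative only. The length check uses the 1–3500 character range from the parameter table.

```python
MAX_LYRICS_CHARS = 3500  # upper bound from the parameter table above


def build_lyrics(sections: list[tuple[str, list[str]]]) -> str:
    """Join (tag, lines) sections into one newline-separated lyrics string."""
    parts: list[str] = []
    for tag, lines in sections:
        parts.append(f"[{tag}]")
        parts.extend(lines)
    lyrics = "\n".join(parts)
    if not 1 <= len(lyrics) <= MAX_LYRICS_CHARS:
        raise ValueError(f"lyrics must be 1 to {MAX_LYRICS_CHARS} characters")
    return lyrics


beatfusion_input = {
    "lyrics": build_lyrics([
        ("Verse", ["City lights are fading slow", "I can hear the radio"]),
        ("Chorus", ["Sing it louder, let it go"]),
    ]),
    "prompt": "Upbeat synth-pop with a driving bassline",
    "audio_format": "mp3",
}
```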


BeatFusion 1.0

| Property | Value |
| --- | --- |
| Namespace | beatfusion-1.0 |
| Vendor | Skytells |
| Pricing | $0.45 / prediction |
| Capabilities | text-to-audio, music, quality, audio |
| Status | Operational |

First-generation music model. Supports [intro], [verse], [chorus], [bridge], [outro] tags.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | | | Music style description (10–300 chars) |
| lyrics | string | | | Lyrics with structure tags (10–600 chars) |
| sample_rate | enum | | 44100 | 16000, 24000, 32000, 44100 |
| bitrate | enum | | 256000 | 32000, 64000, 128000, 256000 |
| audio_format | enum | | mp3 | mp3, wav, pcm |

{
  "type": "string",
  "format": "uri"
}

Understanding Model Schemas

Every model on Skytells defines its own input_schema and output_schema. These schemas follow JSON Schema conventions and describe exactly what parameters a model accepts and what it returns.

Why schemas differ

Different models are built for different tasks, so their inputs vary significantly:

  • Simple models like TrueFusion only need a prompt and optional aspect_ratio
  • Advanced models like TrueFusion Pro add controls for guidance, num_inference_steps, seed, and output format
  • Flagship models like TrueFusion Ultra support inpainting (image + mask), style references, and resolution presets
  • Reference models like TrueFusion 2.0 accept tagged reference images you can invoke by name in your prompt
  • Video models add duration, start_image/end_image, and cfg_scale for temporal control
  • Audio models use lyrics and prompt for compositional control with structure tags
  • Partner models (GPT-Image-1, Sora 2) require your own API key via openai_api_key

Common input parameters

Most image models share these parameters:

| Parameter | Description |
| --- | --- |
| prompt | Always required. The text description of what to generate |
| aspect_ratio | Controls image dimensions. Available ratios vary by model |
| seed | For reproducible generations. Set the same seed to get identical outputs |
| output_format | Usually webp, jpg, or png |

Output schema patterns

Models return one of two patterns:

Single output — returns one URL:

{ "type": "string", "format": "uri" }

Multiple outputs — returns an array of URLs:

{ "type": "array", "items": { "type": "string", "format": "uri" } }

Models that support num_outputs or number_of_images typically use the array pattern.
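
Client code that works across models can normalise both patterns into a list. A minimal sketch, assuming you have already extracted the model's output value from the prediction response (the helper name `as_url_list` is hypothetical):

```python
from typing import Union


def as_url_list(output: Union[str, list[str], None]) -> list[str]:
    """Return the model output as a list of URLs, whichever pattern it uses."""
    if output is None:
        return []  # e.g. the prediction has not produced output yet
    if isinstance(output, str):
        return [output]  # single-output pattern
    return list(output)  # array pattern


single = as_url_list("https://example.com/image.png")
multiple = as_url_list(["https://example.com/a.webp", "https://example.com/b.webp"])
```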

Pricing units

| Unit | Description |
| --- | --- |
| image | Flat rate per generated image |
| second | Billed per second of generated video/audio |
| prediction | Flat rate per API call |
| gpu | Billed per GPU second used |
| computing_second | Billed per compute second |
| image_megapixel | Billed by output resolution in megapixels |
| 5 seconds | Billed per 5-second chunk |
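
The units differ mainly in what quantity the rate is multiplied by. The sketch below illustrates this with a hypothetical `estimate_price` helper. It assumes linear billing for every unit and that a partial 5-second chunk is rounded up to a full chunk; neither is confirmed on this page.

```python
import math

LINEAR_UNITS = {
    "image", "second", "prediction", "gpu", "computing_second", "image_megapixel",
}


def estimate_price(unit: str, rate: float, quantity: float) -> float:
    """Estimate a price in USD for `quantity` of the given pricing unit."""
    if unit == "5 seconds":
        # Here quantity is the duration in seconds.
        # Assumption: partial chunks are billed as whole chunks.
        return round(math.ceil(quantity / 5) * rate, 4)
    if unit in LINEAR_UNITS:
        return round(rate * quantity, 4)
    raise ValueError(f"unknown pricing unit: {unit}")


upscale_12s = estimate_price("5 seconds", 0.10, 12)  # Video Upscale, 12 s clip
four_images = estimate_price("image", 0.05, 4)       # TrueFusion Pro, 4 images
video_10s = estimate_price("second", 0.196, 10)      # TrueFusion Video Pro, 10 s
```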
