Model Catalog

Complete catalog of all supported models on the Skytells platform with pricing, capabilities, and input schemas. Last updated March 17, 2026.

Every model on Skytells has its own namespace, pricing structure, capabilities, and input/output schema. This page is a complete reference for all currently supported models, grouped by type.

Use the namespace value when making API calls. Text models use the Inference API endpoints (/v1/chat/completions, /v1/responses). All other models use the Predictions API endpoint (/v1/predictions).

# Text models — Inference API
curl https://api.skytells.ai/v1/chat/completions \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "deepbrain-router", "messages": [{"role": "user", "content": "Hello!"}]}'

# Image / Video / Audio models — Predictions API
curl -X POST https://api.skytells.ai/v1/predictions \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "truefusion-pro", "input": {"prompt": "A sunset over mountains"}}'

Text Models

Text models are accessed via the Inference API, which is fully OpenAI-compatible. You can use the OpenAI SDK by pointing base_url to https://api.skytells.ai/v1.
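If you are not using the SDK, the same base URL works with any HTTP client. The sketch below builds (without sending) a chat completion request using only the Python standard library. It assumes the `x-api-key` header from the curl example above; `build_chat_request` is a hypothetical helper name, not part of any Skytells SDK.

```python
import json
import urllib.request

BASE_URL = "https://api.skytells.ai/v1"


def build_chat_request(api_key: str, model: str, user_text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to /chat/completions."""
    body = {"model": model, "messages": [{"role": "user", "content": user_text}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("YOUR_API_KEY", "deepbrain-router", "Hello!")
print(req.get_method(), req.full_url)
# To actually send it: json.load(urllib.request.urlopen(req))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any tokens are billed.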

GPT-5

Property	Value
Namespace	`gpt-5`
Vendor	OpenAI
Pricing	$0.50 / 1M input tokens · $1.25 / 1M output tokens
Capabilities	`text-to-text`, `coding`, `writing`, `reasoning`, `chat`, `analysis`, `summarization`, `instruction-following`, `problem-solving`
Status	Operational
OpenAI Compatible	✅

OpenAI's flagship fifth-generation model, offering state-of-the-art reasoning, coding, and instruction-following capabilities.

Parameter	Type	Required	Default	Description
`model`	string	✅	—	Must be `gpt-5`
`messages`	array	✅	—	Array of `{role, content}` objects. Roles: `system`, `user`, `assistant`
`stream`	boolean	—	`false`	Enable server-sent events streaming
`max_tokens`	integer	—	`8192`	Maximum tokens to generate
`temperature`	number	—	`0.7`	Sampling temperature (0–2)
`top_p`	number	—	`0.95`	Nucleus sampling probability (0–1)
`frequency_penalty`	number	—	`0.0`	Penalise token frequency (-2.0–2.0)
`presence_penalty`	number	—	`0.0`	Penalise new topics (-2.0–2.0)
`stop`	string \| array	—	—	Stop sequences (up to 4)

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "gpt-5",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 25,
    "total_tokens": 35
  }
}

Token Type	Price
Input tokens	$0.50 / 1M tokens
Output tokens	$1.25 / 1M tokens

Billed separately for input and output. See Pricing for details.
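As a rough illustration of how the two rates combine, the sketch below estimates the cost of the sample response shown above. It assumes `prompt_tokens` is billed at the input rate and `completion_tokens` at the output rate; `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Rates from the table above, in USD per 1M tokens.
INPUT_RATE = Decimal("0.50")
OUTPUT_RATE = Decimal("1.25")
PER_MILLION = Decimal(1_000_000)


def estimate_cost(usage: dict) -> Decimal:
    """Estimate the USD cost of one response from its `usage` object."""
    input_cost = INPUT_RATE * usage["prompt_tokens"] / PER_MILLION
    output_cost = OUTPUT_RATE * usage["completion_tokens"] / PER_MILLION
    return input_cost + output_cost


usage = {"prompt_tokens": 10, "completion_tokens": 25, "total_tokens": 35}
cost = estimate_cost(usage)
print(cost)
```

`Decimal` is used instead of `float` so that very small per-request amounts add up without rounding drift.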

GPT-5.4

Property	Value
Namespace	`gpt-5.4`
Vendor	OpenAI
Pricing	$0.50 / 1M input tokens · $1.25 / 1M output tokens
Capabilities	`text-to-text`, `coding`, `writing`, `reasoning`, `chat`, `analysis`, `summarization`, `instruction-following`, `problem-solving`
Status	Operational
OpenAI Compatible	✅

An incremental update to GPT-5 with improved accuracy, stronger instruction adherence, and refined reasoning in complex multi-turn conversations.

Parameter	Type	Required	Default	Description
`model`	string	✅	—	Must be `gpt-5.4`
`messages`	array	✅	—	Array of `{role, content}` objects. Roles: `system`, `user`, `assistant`
`stream`	boolean	—	`false`	Enable server-sent events streaming
`max_tokens`	integer	—	`8192`	Maximum tokens to generate
`temperature`	number	—	`0.7`	Sampling temperature (0–2)
`top_p`	number	—	`0.95`	Nucleus sampling probability (0–1)
`frequency_penalty`	number	—	`0.0`	Penalise token frequency (-2.0–2.0)
`presence_penalty`	number	—	`0.0`	Penalise new topics (-2.0–2.0)
`stop`	string \| array	—	—	Stop sequences (up to 4)

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "gpt-5.4",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 25,
    "total_tokens": 35
  }
}

Token Type	Price
Input tokens	$0.50 / 1M tokens
Output tokens	$1.25 / 1M tokens

Billed separately for input and output. See Pricing for details.

GPT-5.4 Mini

Property	Value
Namespace	`gpt-5.4-mini`
Vendor	OpenAI
Pricing	$0.75 / 1M input tokens · $0.075 / 1M cached input tokens · $4.50 / 1M output tokens
Capabilities	`text-to-text`, `language`, `writing`, `reasoning`, `chat`, `analysis`, `problem-solving`, `quality`, `coding`, `responses`, `fast`, `mini`
Status	Operational
OpenAI Compatible	✅
Edge Compatible	✅

GPT-5.4 Mini brings the strengths of GPT-5.4 to a faster, more efficient model designed for high-volume workloads.

Parameter	Type	Required	Default	Description
`model`	string	✅	—	Must be `gpt-5.4-mini`
`messages`	array	✅	—	Array of `{role, content}` objects. Roles: `system`, `user`, `assistant`
`stream`	boolean	—	`false`	Enable server-sent events streaming
`max_tokens`	integer	—	`8192`	Maximum tokens to generate
`temperature`	number	—	`0.7`	Sampling temperature (0–2)
`top_p`	number	—	`0.95`	Nucleus sampling probability (0–1)
`frequency_penalty`	number	—	`0.0`	Penalise token frequency (-2.0–2.0)
`presence_penalty`	number	—	`0.0`	Penalise new topics (-2.0–2.0)
`stop`	string \| array	—	—	Stop sequences (up to 4)

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "gpt-5.4-mini",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 25,
    "total_tokens": 35
  }
}

Token Type	Price
Input tokens	$0.75 / 1M tokens
Cached input tokens	$0.075 / 1M tokens
Output tokens	$4.50 / 1M tokens

Billed separately for input, cached input, and output. See Pricing for details.
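With three rates, the estimate depends on how many input tokens were served from cache. The sketch below assumes cached tokens are a subset of the input tokens and are billed at the cached rate instead of the full input rate; this page does not spell out that rule, so treat it as an assumption and confirm it on the Pricing page. `estimate_mini_cost` is a hypothetical helper.

```python
from decimal import Decimal

# USD per 1M tokens, from the table above.
RATES = {
    "input": Decimal("0.75"),
    "cached_input": Decimal("0.075"),
    "output": Decimal("4.50"),
}


def estimate_mini_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> Decimal:
    """Assumes cached_tokens is the part of input_tokens served from cache."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        RATES["input"] * uncached_tokens
        + RATES["cached_input"] * cached_tokens
        + RATES["output"] * output_tokens
    )
    return total / Decimal(1_000_000)


cost = estimate_mini_cost(input_tokens=1_000_000, cached_tokens=400_000, output_tokens=200_000)
print(cost)
```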

GPT-5.3 Codex

Property	Value
Namespace	`gpt-5.3-codex`
Vendor	OpenAI
Pricing	$1.75 / 1M input tokens · $0.175 / 1M cached input tokens · $14 / 1M output tokens
Capabilities	`text-to-text`, `coding`, `writing`, `reasoning`, `chat`, `analysis`, `problem-solving`, `quality`
Status	Operational
OpenAI Compatible	✅
Edge Compatible	✅

Use this model with the Responses API — POST /v1/responses. Set "model": "gpt-5.3-codex" in the request body along with input (and optional instructions, previous_response_id, stream, and other Responses request fields).

GPT-5.3 Codex achieves state-of-the-art performance on SWE-Bench Pro, a rigorous evaluation of real-world software engineering. SWE-Bench Pro spans four languages and is more contamination-resistant, challenging, diverse, and industry-relevant than SWE-bench Verified (which only tests Python). It also far exceeds the previous state-of-the-art on Terminal-Bench 2.0, which measures terminal skills for coding agents. GPT-5.3 Codex does so with fewer tokens than prior models, so you can build more per dollar.

Parameter	Type	Required	Default	Description
`model`	string	✅	—	Must be `gpt-5.3-codex`
`input`	string \| array	✅	—	User prompt as a string, or an array of chat-style message objects (same idea as `/v1/chat/completions` `messages`)
`instructions`	string	—	—	System-level instructions (equivalent to a `system` message)
`previous_response_id`	string	—	—	Continue a prior turn without resending full history
`stream`	boolean	—	`false`	Stream the response as SSE
`max_output_tokens`	integer	—	`8192`	Maximum tokens to generate
`temperature`	number	—	`0.7`	Sampling temperature (0–2)
`top_p`	number	—	`0.95`	Nucleus sampling (0–1)

See the Responses API reference for the full schema.

{
  "id": "resp_...",
  "object": "response",
  "model": "gpt-5.3-codex",
  "status": "completed",
  "output_text": "..."
}

Shape follows the OpenAI Responses object; see Responses — Response object for all fields.
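A minimal sketch of assembling a Responses request body from the parameters above and reading the text back from a response shaped like the sample. Both helper names are hypothetical, and only fields shown on this page are used.

```python
import json


def build_responses_body(prompt, instructions=None, previous_response_id=None,
                         max_output_tokens=None):
    """Build a JSON body for POST /v1/responses, omitting unset optional fields."""
    body = {"model": "gpt-5.3-codex", "input": prompt}
    optional = {
        "instructions": instructions,
        "previous_response_id": previous_response_id,
        "max_output_tokens": max_output_tokens,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


def extract_text(response: dict) -> str:
    """Return output_text, refusing responses that did not complete."""
    if response.get("status") != "completed":
        raise RuntimeError(f"response not completed: {response.get('status')!r}")
    return response["output_text"]


body = build_responses_body(
    "Write a function that reverses a string.",
    instructions="Reply with code only.",
)
print(json.dumps(body, indent=2))
```

To continue a conversation, pass the `id` of the previous response as `previous_response_id` rather than resending the full history.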

Token Type	Price
Input tokens	$1.75 / 1M tokens
Cached input tokens	$0.175 / 1M tokens
Output tokens	$14 / 1M tokens

Billed separately for input, cached input, and output. See Pricing for details.

Llama 3.1 8B

Property	Value
Namespace	`llama-3.1-8b`
Vendor	Meta
Pricing	$0.12 / 1M input tokens · $0.12 / 1M cached input tokens · $0.12 / 1M output tokens
Capabilities	`text-to-text`, `language`, `writing`, `chat`, `analysis`, `quality`, `coding`, `responses`, `fast`
Status	Operational
OpenAI Compatible	✅
Edge Compatible	✅

Llama 3.1 8B brings powerful performance in a smaller, more efficient package. With improved multilingual support, tool use, and a 128K context length, it enables sophisticated use cases like interactive agents and compact coding assistants while remaining lightweight and accessible.

Parameter	Type	Required	Default	Description
`model`	string	✅	—	Must be `llama-3.1-8b`
`messages`	array	✅	—	Array of `{role, content}` objects. Roles: `system`, `user`, `assistant`
`stream`	boolean	—	`false`	Enable server-sent events streaming
`max_tokens`	integer	—	`8192`	Maximum tokens to generate
`temperature`	number	—	`0.7`	Sampling temperature (0–2)
`top_p`	number	—	`0.95`	Nucleus sampling probability (0–1)
`frequency_penalty`	number	—	`0.0`	Penalise token frequency (-2.0–2.0)
`presence_penalty`	number	—	`0.0`	Penalise new topics (-2.0–2.0)
`stop`	string \| array	—	—	Stop sequences (up to 4)

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "llama-3.1-8b",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 25,
    "total_tokens": 35
  }
}

Token Type	Price
Input tokens	$0.12 / 1M tokens
Cached input tokens	$0.12 / 1M tokens
Output tokens	$0.12 / 1M tokens

Billed separately for input, cached input, and output. See Pricing for details.

DeepBrain Router

Property	Value
Namespace	`deepbrain-router`
Vendor	Skytells
Pricing	$0.50 / 1M input tokens · $1.25 / 1M output tokens
Capabilities	`text-to-text`, `coding`, `writing`, `reasoning`, `chat`, `analysis`, `summarization`, `instruction-following`, `problem-solving`, `quality`, `fast`, `router`
Status	Operational
OpenAI Compatible	✅
Edge Compatible	✅
Cold Boot	No

DeepBrain Router is Skytells' advanced model orchestration layer, built to intelligently choose the right model for the right task. Optimized for coding, writing, reasoning, and complex multi-domain workloads, it dynamically routes requests across a curated set of flagship models from leading providers. The result is stronger output quality, improved cost-performance balance, and a more reliable AI experience at scale.

DeepBrain Router is the recommended default for most use cases. It automatically selects the best underlying model for each request, so you get consistently high quality without managing model selection yourself.

Parameter	Type	Required	Default	Description
`model`	string	✅	—	Must be `deepbrain-router`
`messages`	array	✅	—	Array of `{role, content}` objects. Roles: `system`, `user`, `assistant`
`stream`	boolean	—	`false`	Enable server-sent events streaming
`max_tokens`	integer	—	`8192`	Maximum tokens to generate
`temperature`	number	—	`0.7`	Sampling temperature (0–2). Lower = more deterministic
`top_p`	number	—	`0.95`	Nucleus sampling probability (0–1)
`frequency_penalty`	number	—	`0.0`	Penalise token frequency (-2.0–2.0)
`presence_penalty`	number	—	`0.0`	Penalise new topics (-2.0–2.0)
`stop`	string \| array	—	—	Stop sequences (up to 4)
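The documented ranges can be checked on the client before a request is sent. The sketch below is one way to do that; it assumes the bounds are inclusive, which this page does not state, and `check_chat_params` is a hypothetical helper.

```python
# Ranges taken from the parameter table above (bounds assumed inclusive).
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}
MAX_STOP_SEQUENCES = 4


def check_chat_params(params: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, (low, high) in RANGES.items():
        if name in params and not low <= params[name] <= high:
            problems.append(f"{name} must be between {low} and {high}")
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > MAX_STOP_SEQUENCES:
        problems.append(f"stop accepts at most {MAX_STOP_SEQUENCES} sequences")
    return problems


print(check_chat_params({"temperature": 2.5, "stop": ["a", "b", "c", "d", "e"]}))
```

Returning every problem at once, instead of raising on the first, is convenient when the parameters come from user-facing settings.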

Unlike traditional LLMs, reasoning models, and other flagship models, DeepBrain Router is not a single fixed model. It selects an underlying model per request, which gives agentic workflows consistent quality with less overhead and more flexibility.

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "deepbrain-router",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 25,
    "total_tokens": 35
  }
}

DeepBrain Router dynamically selects from the following models based on task complexity, domain, and quality requirements:

Model	Version	Provider
DeepBrain Mini	—	Skytells
DeepBrain 2.0	—	Skytells
GPT-4o	2024-11-20	OpenAI
GPT-4o Mini	2024-07-18	OpenAI
GPT-4.1 Nano	2025-04-14	OpenAI
GPT-4.1 Mini	2025-04-14	OpenAI
GPT-4.1	2025-04-14	OpenAI
o4-mini	2025-04-16	OpenAI
GPT-5 Nano	2025-08-07	OpenAI
GPT-5 Mini	2025-08-07	OpenAI
GPT-5 Chat	2025-08-07	OpenAI
GPT-5	2025-08-07	OpenAI
GPT-5.2 Chat	2025-12-11	OpenAI
GPT-5.2	2025-12-11	OpenAI
GPT-OSS 120B	—	OpenAI
Llama-4 Maverick 17B-128E (FP8)	—	Meta
DeepSeek-V3.1	—	DeepSeek
DeepSeek-V3.2	—	DeepSeek
Grok-4	—	xAI
Grok-4 Fast Reasoning	—	xAI
Claude Haiku 4.5	20251001	Anthropic
Claude Sonnet 4.5	20250929	Anthropic
Claude Opus 4.1	20250805	Anthropic
Claude Opus 4.6	—	Anthropic

You cannot control which model is selected. The router optimises for quality and task fit. All routed requests are billed at the DeepBrain Router rate regardless of the underlying model chosen.

Token Type	Price
Input tokens	$0.50 / 1M tokens
Output tokens	$1.25 / 1M tokens

Billed separately for input and output. See Pricing for details.

Image Models

TrueFusion

Property	Value
Namespace	`truefusion`
Vendor	Skytells
Pricing	$0.03 / image
Capabilities	`text-to-image`
Status	Operational

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt for generation
`aspect_ratio`	enum	—	`1:1`	`1:1`, `16:9`, `4:3`, `3:2`, `2:3`, `3:4`, `9:16`, `21:9`
`number_of_images`	integer	—	`1`	Number of images (1–9)
`prompt_optimizer`	boolean	—	`true`	Use prompt optimizer

{
  "type": "array",
  "items": { "type": "string", "format": "uri" }
}

Returns an array of image URLs.
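A short sketch of a Predictions API payload for this model, with the documented enum and range checked up front. The `{"model", "input"}` wrapper follows the curl example at the top of this page; `build_truefusion_payload` is a hypothetical helper.

```python
import json

ASPECT_RATIOS = {"1:1", "16:9", "4:3", "3:2", "2:3", "3:4", "9:16", "21:9"}


def build_truefusion_payload(prompt: str, aspect_ratio: str = "1:1",
                             number_of_images: int = 1) -> dict:
    """Validate inputs against the table above and build the request payload."""
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio}")
    if not 1 <= number_of_images <= 9:
        raise ValueError("number_of_images must be between 1 and 9")
    return {
        "model": "truefusion",
        "input": {
            "prompt": prompt,
            "aspect_ratio": aspect_ratio,
            "number_of_images": number_of_images,
        },
    }


payload = build_truefusion_payload("A sunset over mountains", aspect_ratio="16:9",
                                   number_of_images=2)
print(json.dumps(payload))
```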

TrueFusion Pro

Property	Value
Namespace	`truefusion-pro`
Vendor	Skytells
Pricing	$0.05 / image
Capabilities	`text-to-image`, `image-to-image`
Status	Operational

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Prompt for generated image
`aspect_ratio`	enum	—	`1:1`	`1:1`, `16:9`, `21:9`, `3:2`, `2:3`, `4:5`, `5:4`, `3:4`, `4:3`, `9:16`, `9:21`
`image`	string (URI)	—	—	Input image for img2img mode
`prompt_strength`	number	—	`0.8`	Prompt strength for img2img (0–1)
`num_outputs`	integer	—	`1`	Number of outputs (1–4)
`num_inference_steps`	integer	—	`28`	Denoising steps (1–50). Recommended: 28–50
`guidance`	number	—	`3`	Guidance scale (0–10)
`seed`	integer	—	—	Random seed for reproducibility
`output_format`	enum	—	`webp`	`webp`, `jpg`, `png`
`output_quality`	integer	—	`80`	Output quality (0–100)
`go_fast`	boolean	—	`true`	Use fp8 quantized model for speed
`megapixels`	enum	—	`1`	`1`, `0.25`
`disable_safety_checker`	boolean	—	`false`	Disable safety checker

{
  "type": "array",
  "items": { "type": "string", "format": "uri" }
}

Returns an array of image URLs.

TrueFusion Max

Property	Value
Namespace	`truefusion-max`
Vendor	Skytells
Pricing	$0.12 / image
Capabilities	`text-to-image`, `image-to-image`, `quality`
Status	Operational

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt for image generation
`aspect_ratio`	enum	—	`1:1`	11 ratios including `21:9`, `9:21`
`image_prompt`	string (URI)	—	—	Image reference to guide composition
`image_prompt_strength`	number	—	`0.1`	Blend between prompt and image (0–1)
`safety_tolerance`	integer	—	`2`	Safety level (1=strict, 6=permissive)
`seed`	integer	—	—	Random seed
`raw`	boolean	—	`false`	Generate less processed, more natural images
`output_format`	enum	—	`jpg`	`jpg`, `png`

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.
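Because some image models return an array of URLs and others a single URL, client code that handles several models may want one common shape. A minimal sketch, assuming you already have the model's output value in hand; `as_url_list` is a hypothetical helper.

```python
def as_url_list(output) -> list[str]:
    """Normalize a model output (one URL or a list of URLs) to a list."""
    if isinstance(output, str):
        return [output]
    if isinstance(output, list) and all(isinstance(url, str) for url in output):
        return list(output)
    raise TypeError(f"unexpected output type: {type(output).__name__}")


print(as_url_list("https://example.com/image.jpg"))
print(as_url_list(["https://example.com/a.webp", "https://example.com/b.webp"]))
```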

TrueFusion Ultra

Property	Value
Namespace	`truefusion-ultra`
Vendor	Skytells
Pricing	$0.15 / image
Capabilities	`text-to-image`, `image-to-image`, `quality`
Status	Operational

Flagship model with stunning photorealism, artistic creativity, and unmatched consistency across styles. Supports inpainting, style references, and magic prompt optimization.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt for image generation
`aspect_ratio`	enum	—	`1:1`	15 ratios including `1:3`, `3:1`
`resolution`	enum	—	`None`	Specific resolution override (e.g., `1024x1024`, `1536x640`)
`magic_prompt_option`	enum	—	`Auto`	`Auto`, `On`, `Off` — optimizes prompt for quality
`image`	string (URI)	—	—	Image for inpainting (requires mask)
`mask`	string (URI)	—	—	Black/white mask for inpainting
`style_type`	enum	—	`None`	`Auto`, `General`, `Realistic`, `Design`
`style_reference_images`	array (URI)	—	—	Style reference images
`seed`	integer	—	—	Random seed (max 2,147,483,647)

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.

TrueFusion 2.0

Property	Value
Namespace	`truefusion-2`
Vendor	Skytells
Pricing	$0.07–$0.10 / image (resolution-based)
Capabilities	`text-to-image`, `image-to-image`, `reference`, `quality`
Status	Operational

Attach up to 3 reference images as ground truth and tag them in your prompt using @tag_name. Preserves identity, style, and materials while giving control over angle, composition, and lighting.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt for image generation
`aspect_ratio`	enum	—	`16:9`	`16:9`, `9:16`, `4:3`, `3:4`, `1:1`, `21:9`
`resolution`	enum	—	`1080p`	`720p`, `1080p`
`reference_images`	array (URI)	—	`[]`	Up to 3 reference images (0.5–2 aspect ratio)
`reference_tags`	array (string)	—	`[]`	Tags for references (use `@tag_name` in prompt)
`seed`	integer	—	—	Random seed
`remix_id`	string	—	—	Remix from another Skytells prediction

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.

Resolution	Price
720p	$0.07 / image
1080p	$0.10 / image
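The reference workflow can be sketched as follows. The code assumes that tags are listed without the `@` prefix and pair positionally with `reference_images`; neither detail is confirmed on this page. `build_truefusion2_input` is a hypothetical helper.

```python
from decimal import Decimal

PRICE_BY_RESOLUTION = {"720p": Decimal("0.07"), "1080p": Decimal("0.10")}
MAX_REFERENCES = 3


def build_truefusion2_input(prompt, reference_images, reference_tags, resolution="1080p"):
    """Build the `input` object, checking reference limits and tag usage."""
    if resolution not in PRICE_BY_RESOLUTION:
        raise ValueError(f"unsupported resolution: {resolution}")
    if len(reference_images) > MAX_REFERENCES:
        raise ValueError("at most 3 reference images are allowed")
    if len(reference_images) != len(reference_tags):
        raise ValueError("expected one tag per reference image")  # assumption
    unused = [tag for tag in reference_tags if f"@{tag}" not in prompt]
    if unused:
        raise ValueError(f"tags not used in prompt: {unused}")
    return {
        "prompt": prompt,
        "resolution": resolution,
        "reference_images": list(reference_images),
        "reference_tags": list(reference_tags),
    }


inputs = build_truefusion2_input(
    "A studio photo of @mug on a marble counter",
    reference_images=["https://example.com/mug.png"],
    reference_tags=["mug"],
    resolution="720p",
)
price = PRICE_BY_RESOLUTION[inputs["resolution"]]
print(price)
```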

TrueFusion Optima

Property	Value
Namespace	`truefusion-2-optima`
Vendor	Skytells
Pricing	$0.008 / computing second
Capabilities	`text-to-image`, `image-to-image`, `quality`
Status	Operational
Cold Boot	Yes (CPU-based deployment)

Next-generation MoE architecture delivering unmatched realism, lifelike lighting, and film-grade image precision. Billed by compute time rather than per-image.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt for image generation
`aspect_ratio`	enum	—	`1:1`	`1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `2:3`, `1:3`, `3:1`

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.

TrueFusion X

Property	Value
Namespace	`truefusion-x`
Vendor	Skytells
Pricing	$0.10 / image
Capabilities	`text-to-image`, `image-to-image`, `quality`
Status	Operational

Ultra-fast, ultra-high-resolution with inpainting support and quality control.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt
`n`	number	—	`1`	Number of images
`image`	array (URI)	—	—	Reference images
`mask`	string (URI)	—	—	Inpainting mask
`quality`	enum	—	`medium`	`low`, `medium`, `high`
`size`	enum	—	`1024x1024`	`768x1152`, `1152x768`, `1792x1024`, `1920x822`, etc.
`output_format`	enum	—	`jpg`	`jpg`, `png`
`output_compression`	number	—	`100`	Compression ratio (1–100)

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.

TrueFusion Edge

Property	Value
Namespace	`truefusion-edge`
Vendor	Skytells
Pricing	$0.01 / image
Capabilities	`text-to-image`, `image-to-image`, `fast`
Status	Operational

Ultra-fast, lightweight model optimized for speed. Only 4 denoising steps needed.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Prompt for generated image
`aspect_ratio`	enum	—	`1:1`	11 ratios
`num_outputs`	integer	—	`1`	Number of outputs (1–4)
`num_inference_steps`	integer	—	`4`	Denoising steps (1–4)
`seed`	integer	—	—	Random seed
`output_format`	enum	—	`webp`	`webp`, `jpg`, `png`
`output_quality`	integer	—	`80`	Quality (0–100)
`go_fast`	boolean	—	`true`	Use fp8 quantized model
`megapixels`	enum	—	`1`	`1`, `0.25`
`disable_safety_checker`	boolean	—	`false`	Disable safety checker

{
  "type": "array",
  "items": { "type": "string", "format": "uri" }
}

Returns an array of image URLs.

TrueFusion Pano

Property	Value
Namespace	`truefusion-pano`
Vendor	Skytells
Pricing	$0.02 / GPU second
Capabilities	`text-to-image`, `image-to-image`
Status	Operational

Multiple model variants available including multilingual (English + Chinese).

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Input prompt
`negative_prompt`	string	—	`""`	Things to exclude
`model_variant`	enum	—	`1600M-1024px`	`1600M-1024px`, `1600M-1024px-multilang`, `1600M-512px`, `600M-1024px-multilang`, `600M-512px-multilang`
`width`	integer	—	`1024`	Output width
`height`	integer	—	`1024`	Output height
`num_inference_steps`	integer	—	`18`	Denoising steps
`guidance_scale`	number	—	`5`	CFG scale (1–20)
`pag_guidance_scale`	number	—	`2`	PAG guidance (1–20)
`seed`	integer	—	—	Random seed

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.

TrueFusion Standard

Property	Value
Namespace	`truefusion-standard`
Vendor	Skytells
Pricing	$0.05 / image
Capabilities	`text-to-image`
Status	Operational

Full-featured model with LoRA support. Load custom LoRA weights from Skytells, Hugging Face, or CivitAI.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Prompt for generated image
`aspect_ratio`	enum	—	`1:1`	11 ratios
`image`	string (URI)	—	—	Input image for img2img
`prompt_strength`	number	—	`0.8`	Prompt strength for img2img (0–1)
`num_outputs`	integer	—	`1`	Number of outputs (1–4)
`num_inference_steps`	integer	—	`28`	Denoising steps (1–50)
`guidance`	number	—	`3`	Guidance (0–10)
`lora_weights`	string	—	—	LoRA weights URL or identifier (e.g., `skytells/truefusion-base`)
`lora_scale`	number	—	`1`	LoRA strength (-1 to 3)
`seed`	integer	—	—	Random seed
`output_format`	enum	—	`webp`	`webp`, `jpg`, `png`
`output_quality`	integer	—	`80`	Quality (0–100)
`go_fast`	boolean	—	`true`	Use fp8 quantized model
`megapixels`	enum	—	`1`	`1`, `0.25`
`disable_safety_checker`	boolean	—	`false`	Disable safety checker

{
  "type": "array",
  "items": { "type": "string", "format": "uri" }
}

Returns an array of image URLs.

TrueFusion Variant

Property	Value
Namespace	`truefusion-variant`
Vendor	Skytells
Pricing	$0.05 / image
Capabilities	`text-to-image`, `image-to-image`
Status	Operational

Schema not yet published — contact support for input details.

Flux.1 Edge

Property	Value
Namespace	`flux-fast`
Vendor	Skytells
Pricing	$0.01 / image
Capabilities	`text-to-image`, `image-to-image`, `quality`
Status	Operational

Super-fast Flux model optimized by Skytells for instant generation, with configurable speed modes.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt
`speed_mode`	enum	—	`Extra Juiced 🔥`	Speed optimization level
`guidance`	number	—	`3.5`	Guidance scale
`image_size`	integer	—	`1024`	Base image size (longest side)
`aspect_ratio`	enum	—	`1:1`	11 ratios
`output_format`	enum	—	`jpg`	`png`, `jpg`, `webp`
`output_quality`	integer	—	`80`	Quality (1–100)
`num_inference_steps`	integer	—	`28`	Inference steps
`seed`	integer	—	`-1`	Random seed

{
  "type": "string",
  "format": "uri"
}

Returns a single image URL.

Imagen 3

Property	Value
Namespace	`google-imagen-3`
Vendor	Google
Pricing	$0.08 / image
Capabilities	`text-to-image`, `image-to-image`
Status	Operational

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt
`negative_prompt`	string	—	—	What to discourage
`aspect_ratio`	enum	—	`1:1`	`1:1`, `9:16`, `16:9`, `3:4`, `4:3`
`safety_filter_level`	enum	—	`block_medium_and_above`	`block_low_and_above`, `block_medium_and_above`, `block_only_high`

{
  "type": "string",
  "format": "uri"
}

Imagen 4

Property	Value
Namespace	`google-imagen-4`
Vendor	Google
Pricing	$0.08 / image
Capabilities	`text-to-image`, `image-to-image`, `quality`
Status	Operational

Google's flagship text-to-image model.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt
`aspect_ratio`	enum	—	`1:1`	`1:1`, `9:16`, `16:9`, `3:4`, `4:3`
`safety_filter_level`	enum	—	`block_medium_and_above`	`block_low_and_above`, `block_medium_and_above`, `block_only_high`

{
  "type": "string",
  "format": "uri"
}

Nano Banana

Property	Value
Namespace	`nano-banana`
Vendor	Google
Pricing	$0.06 / image
Capabilities	`text-to-image`, `image-to-image`, `fast`
Status	Operational

Google's Gemini 2.5-based image editing model with multi-image input support.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt
`image_input`	array (URI)	—	`[]`	Input images for transformation (multiple supported)
`aspect_ratio`	enum	—	`match_input_image`	`match_input_image`, `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9`
`output_format`	enum	—	`jpg`	`jpg`, `png`

{
  "type": "string",
  "format": "uri"
}

Sana

Property	Value
Namespace	`nvidia-sana`
Vendor	Nvidia
Pricing	$0.05 / image
Capabilities	`text-to-image`
Status	Operational

Fast image model with wide artistic range and resolutions up to 4096×4096. Multiple model variants including multilingual.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Input prompt
`negative_prompt`	string	—	`""`	Things to exclude
`model_variant`	enum	—	`1600M-1024px`	`1600M-1024px`, `1600M-1024px-multilang`, `1600M-512px`, `600M-1024px-multilang`, `600M-512px-multilang`
`width`	integer	—	`1024`	Output width
`height`	integer	—	`1024`	Output height
`num_inference_steps`	integer	—	`18`	Denoising steps
`guidance_scale`	number	—	`5`	CFG scale (1–20)
`pag_guidance_scale`	number	—	`2`	PAG guidance (1–20)
`seed`	integer	—	—	Random seed

{
  "type": "string",
  "format": "uri"
}

GPT-Image-1

Property	Value
Namespace	`gpt-image-1`
Vendor	OpenAI
Pricing	$0.002 / image
Capabilities	`text-to-image`, `image-to-image`, `quality`
Status	Operational
Inference	Partner-served

Requires your own OpenAI API key.

Parameter	Type	Required	Default	Description
`openai_api_key`	string (password)	✅	—	Your OpenAI API key
`prompt`	string	✅	—	Text description
`aspect_ratio`	enum	—	`1:1`	`1:1`, `3:2`, `2:3`
`input_images`	array (URI)	—	—	Input images for editing
`number_of_images`	integer	—	`1`	Number of images (1–10)
`quality`	enum	—	`auto`	`low`, `medium`, `high`, `auto`
`background`	enum	—	`auto`	`auto`, `transparent`, `opaque`
`output_format`	enum	—	`webp`	`png`, `jpeg`, `webp`
`output_compression`	integer	—	`90`	Compression (0–100%)
`moderation`	enum	—	`auto`	`auto`, `low`
`user_id`	string	—	—	End-user identifier for abuse monitoring

{
  "type": "array",
  "items": { "type": "string", "format": "uri" }
}

Returns an array of image URLs.

FLUX-1.1 Pro

Property	Value
Namespace	`FLUX-1.1-pro`
Vendor	Black Forest Labs
Pricing	$0.04 / image
Capabilities	`text-to-image`, `image-to-image`, `quality`
Status	Operational

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt
`n`	number	✅	`1`	Number of images
`size`	enum	✅	`1024x1024`	`1024x1024`, `768x1152`, `1152x768`, `1792x1024`, etc.
`output_format`	enum	✅	`jpg`	`jpg`, `png`

{
  "type": "string",
  "format": "uri"
}

FLUX.2 Pro

Property	Value
Namespace	`FLUX.2-pro`
Vendor	Black Forest Labs
Pricing	$0.02 / image megapixel
Capabilities	`text-to-image`, `image-to-image`, `quality`
Status	Operational

Megapixel-based pricing. Supports up to 8 input reference images.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt
`input_images`	array (URI)	—	`[]`	Up to 8 reference images
`aspect_ratio`	enum	—	`1:1`	`match_input_image`, `custom`, plus standard ratios
`resolution`	enum	—	`1 MP`	`match_input_image`, `0.5 MP`, `1 MP`, `2 MP`, `4 MP`
`width`	integer	—	—	Custom width (256–2048, multiples of 32)
`height`	integer	—	—	Custom height (256–2048, multiples of 32)
`safety_tolerance`	integer	—	`2`	Safety level (1–5)
`seed`	integer	—	—	Random seed
`output_format`	enum	—	`png`	`webp`, `jpg`, `png`
`output_quality`	integer	—	`100`	Quality (0–100)

{
  "type": "string",
  "format": "uri"
}

Resolution	Price per image
0.5 MP	$0.01
1 MP	$0.02
2 MP	$0.04
4 MP	$0.08
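The table above is the per-megapixel rate multiplied by the resolution preset. The sketch below encodes that lookup and the documented constraints on custom `width` and `height`. How custom sizes are billed is not stated here, so the sketch only validates them; both helper names are hypothetical.

```python
from decimal import Decimal

RATE_PER_MEGAPIXEL = Decimal("0.02")
MEGAPIXELS = {
    "0.5 MP": Decimal("0.5"),
    "1 MP": Decimal("1"),
    "2 MP": Decimal("2"),
    "4 MP": Decimal("4"),
}


def price_per_image(resolution: str) -> Decimal:
    """Price of one image at a fixed resolution preset."""
    return RATE_PER_MEGAPIXEL * MEGAPIXELS[resolution]


def check_custom_size(width: int, height: int) -> None:
    """Custom dimensions must be 256-2048 and multiples of 32."""
    for name, value in (("width", width), ("height", height)):
        if not 256 <= value <= 2048 or value % 32 != 0:
            raise ValueError(f"{name}={value} must be 256-2048 and a multiple of 32")


check_custom_size(1024, 768)
print(price_per_image("2 MP"))
```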

Flux 2 Pro Legacy

Property	Value
Namespace	`flux-2-pro-legacy`
Vendor	Black Forest Labs
Pricing	$0.02 / image megapixel
Capabilities	`text-to-image`, `image-to-image`, `editing`, `quality`
Status	Operational

Supports up to 8 reference images. Same schema as FLUX.2 Pro with additional safety_tolerance control.

Flux 2 Flex

Property	Value
Namespace	`flux-2-flex`
Vendor	Black Forest Labs
Pricing	$0.08 / image megapixel
Capabilities	`text-to-image`, `image-to-image`, `editing`, `quality`
Status	Operational

Max-quality generation with up to 10 reference images, configurable inference steps, guidance, and prompt upsampling.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt
`input_images`	array (URI)	—	`[]`	Up to 10 reference images
`aspect_ratio`	enum	—	`1:1`	`match_input_image`, `custom`, plus standard ratios
`resolution`	enum	—	`1 MP`	`0.5 MP`, `1 MP`, `2 MP`, `4 MP`
`width` / `height`	integer	—	—	Custom dimensions (256–2048)
`steps`	integer	—	`30`	Inference steps (1–50)
`guidance`	number	—	`4.5`	Guidance scale (1.5–10)
`prompt_upsampling`	boolean	—	`true`	Auto-modify prompt for creativity
`safety_tolerance`	integer	—	`2`	Safety (1–5)
`seed`	integer	—	—	Random seed
`output_format`	enum	—	`png`	`webp`, `jpg`, `png`
`output_quality`	integer	—	`100`	Quality (0–100)

{
  "type": "string",
  "format": "uri"
}

Video Models

TrueFusion Video Pro

Property	Value
Namespace	`truefusion-video-pro`
Vendor	Skytells
Pricing	$0.196 / second
Capabilities	`text-to-video`, `image-to-video`
Status	Operational

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt for video generation
`negative_prompt`	string	—	`""`	Things to exclude
`aspect_ratio`	enum	—	`16:9`	`16:9`, `9:16`, `1:1`
`start_image`	string (URI)	—	—	First frame of the video
`end_image`	string (URI)	—	—	Last frame of the video
`cfg_scale`	number	—	`0.5`	Flexibility scale (0–1). Higher = more constrained
`duration`	enum	—	`5`	`5`, `10` seconds

{
  "type": "string",
  "format": "uri"
}

Returns a video URL.
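A sketch of building the `input` object with optional first and last frames, plus a cost estimate from the per-second rate. It assumes `duration` is sent as an integer and that billing follows the requested duration; both helper names are hypothetical.

```python
from decimal import Decimal

RATE_PER_SECOND = Decimal("0.196")  # USD, from the table above
DURATIONS = (5, 10)


def build_video_input(prompt, duration=5, start_image=None, end_image=None):
    """Build the `input` object; optional frames are included only when set."""
    if duration not in DURATIONS:
        raise ValueError("duration must be 5 or 10 seconds")
    video_input = {"prompt": prompt, "duration": duration}
    if start_image is not None:
        video_input["start_image"] = start_image
    if end_image is not None:
        video_input["end_image"] = end_image
    return video_input


def estimate_video_cost(duration: int) -> Decimal:
    """Cost of one clip, assuming billing by requested duration."""
    return RATE_PER_SECOND * duration


video_input = build_video_input(
    "A paper boat drifting down a stream",
    duration=10,
    start_image="https://example.com/first.jpg",
)
print(video_input, estimate_video_cost(video_input["duration"]))
```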

TrueFusion Video

Property	Value
Namespace	`truefusion-video`
Vendor	Skytells
Pricing	$0.112 / second
Capabilities	`text-to-video`, `image-to-video`
Status	Operational

Same schema as TrueFusion Video Pro, but without end_image support. A lower-cost option.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt
`negative_prompt`	string	—	`""`	Things to exclude
`aspect_ratio`	enum	—	`16:9`	`16:9`, `9:16`, `1:1`
`start_image`	string (URI)	—	—	First frame
`cfg_scale`	number	—	`0.5`	Flexibility scale (0–1)
`duration`	enum	—	`5`	`5`, `10` seconds

{
  "type": "string",
  "format": "uri"
}

Mera

Property	Value
Namespace	`mera`
Vendor	Skytells
Pricing	$3.42 / prediction
Capabilities	`text-to-video`, `image-to-video`, `audio`, `quality`
Status	Operational

Skytells' latest video generation model: physically accurate, highly realistic, and controllable. Supports reference images for subject consistency.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt
`seconds`	enum	✅	`8`	`4`, `6`, `8`, `12` seconds
`size`	enum	✅	`720x1280`	`720x1280`, `1280x720`
`input_reference`	array (URI)	—	`[]`	1–3 reference images for R2V

{
  "type": "string",
  "format": "uri"
}

Lumo

Property	Value
Namespace	`lumo`
Vendor	Skytells
Pricing	$1.12 / prediction
Capabilities	`image-to-video`
Status	Operational

Specialized for motion, animations, and general use cases. Schema not yet published.

LipFusion

Property	Value
Namespace	`lipfusion`
Vendor	Skytells
Pricing	$0.04 / second
Capabilities	`video-to-video`, `audio-to-video`
Status	Operational

Ultra-realistic lip-syncing for videos, animations, avatars, and live streams. Supports audio file input or text-to-speech with 45+ voice presets.

Parameter	Type	Required	Default	Description
`video_url`	string (URI)	✅	—	Target video (.mp4/.mov, under 100MB, 2–10s, 720p–1080p)
`audio_file`	string (URI)	✅	—	Audio for lip sync (.mp3/.wav/.m4a/.aac, under 5MB)
`text`	string	—	—	Text for TTS lip sync (Enterprise only)
`voice_id`	enum	—	`en_AOT`	45+ voice presets (English + Chinese)
`motion_aware`	boolean	—	`true`	Adjust sync based on subject movements
`voice_speed`	number	—	`1`	Speech rate for TTS (0.8–2)

{
  "type": "string",
  "format": "uri"
}
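
The file constraints in the table above can be checked client-side before a request is submitted. The sketch below is illustrative only: lipfusion_preflight is a hypothetical helper, the clip duration is supplied by the caller, and the size and resolution limits are not checked because they require inspecting the files themselves.

```python
import os.path
from urllib.parse import urlparse

# Extensions and duration range copied from the LipFusion parameter table.
VIDEO_EXTS = {".mp4", ".mov"}
AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".aac"}


def _extension(url):
    """Return the lowercase file extension of the URL path, ignoring any query string."""
    return os.path.splitext(urlparse(url).path)[1].lower()


def lipfusion_preflight(video_url, audio_file, video_seconds):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    if _extension(video_url) not in VIDEO_EXTS:
        problems.append("video_url should point to a .mp4 or .mov file")
    if _extension(audio_file) not in AUDIO_EXTS:
        problems.append("audio_file should be .mp3, .wav, .m4a, or .aac")
    if not 2 <= video_seconds <= 10:
        problems.append("video should be between 2 and 10 seconds long")
    return problems


print(lipfusion_preflight(
    "https://example.com/clips/talk.mp4?token=abc",
    "https://example.com/audio/line.wav",
    video_seconds=6,
))
```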

Veo 3.1

Property	Value
Namespace	`veo-3.1`
Vendor	Google
Pricing	$0.43 / second
Capabilities	`text-to-video`, `image-to-video`, `quality`
Status	Operational

Google's next-gen video model with context-aware audio, reference images, and last-frame interpolation.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt
`aspect_ratio`	enum	—	`16:9`	`16:9`, `9:16`
`duration`	enum	—	`8`	`4`, `6`, `8` seconds
`image`	string (URI)	—	—	Start image
`last_frame`	string (URI)	—	—	End image for interpolation
`reference_images`	array (URI)	—	`[]`	1–3 reference images for R2V (16:9, 8s only)
`negative_prompt`	string	—	—	What to exclude
`resolution`	enum	—	`1080p`	`720p`, `1080p`
`generate_audio`	boolean	—	`true`	Generate audio with video
`seed`	integer	—	—	Random seed

{
  "type": "string",
  "format": "uri"
}
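
The reference_images row carries a cross-parameter constraint that is easy to miss. The following sketch reads "16:9, 8s only" as meaning that reference images may only be combined with aspect_ratio 16:9 and duration 8; that reading, and the helper name check_veo_reference_images, are assumptions rather than documented behavior.

```python
def check_veo_reference_images(model_input):
    """Raise ValueError if reference_images is combined with unsupported settings."""
    refs = model_input.get("reference_images", [])
    if not refs:
        return  # Nothing to check when no reference images are supplied.
    if len(refs) > 3:
        raise ValueError("reference_images accepts 1 to 3 images")

    # Fall back to the defaults from the parameter table when a key is omitted.
    aspect_ratio = model_input.get("aspect_ratio", "16:9")
    duration = model_input.get("duration", 8)
    if aspect_ratio != "16:9" or duration != 8:
        raise ValueError("reference_images requires aspect_ratio 16:9 and duration 8")


veo_input = {
    "prompt": "A fox running through snow",
    "reference_images": ["https://example.com/fox.png"],
}
check_veo_reference_images(veo_input)  # passes: defaults are 16:9 and 8 seconds
```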

Veo 3.1 Fast

Property	Value
Namespace	`veo-3.1-fast`
Vendor	Google
Pricing	$0.13–$0.17 / second (audio-dependent)
Capabilities	`text-to-video`, `image-to-video`, `fast`
Status	Operational

Faster variant of Veo 3.1 with dynamic pricing based on audio generation.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Text prompt
`aspect_ratio`	enum	—	`16:9`	`16:9`, `9:16`
`duration`	enum	—	`8`	`4`, `6`, `8` seconds
`image`	string (URI)	—	—	Start image
`last_frame`	string (URI)	—	—	End image for interpolation
`negative_prompt`	string	—	—	What to exclude
`resolution`	enum	—	`1080p`	`720p`, `1080p`
`generate_audio`	boolean	—	`true`	Generate audio with video
`seed`	integer	—	—	Random seed

{
  "type": "string",
  "format": "uri"
}

Audio Generation	Price per second
Enabled	$0.17
Disabled	$0.13
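
To make the audio-dependent pricing concrete, here is a small cost estimate based on the table above. The function name veo_fast_cost is hypothetical, and the sketch assumes billing is simply the per-second rate multiplied by the requested duration.

```python
from decimal import Decimal

# Per-second rates from the Veo 3.1 Fast pricing table, keyed by generate_audio.
PRICE_PER_SECOND = {True: Decimal("0.17"), False: Decimal("0.13")}


def veo_fast_cost(duration_seconds, generate_audio=True):
    """Estimate the cost of one Veo 3.1 Fast video in US dollars."""
    return PRICE_PER_SECOND[bool(generate_audio)] * duration_seconds


print(veo_fast_cost(8))                        # 1.36
print(veo_fast_cost(8, generate_audio=False))  # 1.04
```

Decimal is used instead of float so that the totals match the listed prices exactly.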

Veo 3.1 (Preview)

Property	Value
Namespace	`veo-3.1-preview`
Vendor	Google
Pricing	$0.43 / second
Capabilities	`text-to-video`, `image-to-video`, `quality`, `sound`
Status	Operational

Preview version that adds a person_generation safety control. It uses the same schema as Veo 3.1, plus the person generation options allow_adult and dont_allow.

Sora 2

Property	Value
Namespace	`sora-2`
Vendor	OpenAI
Pricing	$0.002 / video
Capabilities	`text-to-video`, `image-to-video`
Status	Operational
Inference	Partner-served

Requires your own OpenAI API key.

Parameter	Type	Required	Default	Description
`openai_api_key`	string (password)	✅	—	Your OpenAI API key
`prompt`	string	✅	—	Text description of the video
`seconds`	enum	—	`4`	`4`, `8`, `12`
`aspect_ratio`	enum	—	`portrait`	`portrait` (720×1280), `landscape` (1280×720)
`input_reference`	string (URI)	—	—	Reference image or video

{
  "type": "string",
  "format": "uri"
}
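
Because Sora 2 requests carry your own OpenAI API key, it helps to keep the key out of source code and logs. The sketch below is one possible approach: the helper names build_sora_request and redact are hypothetical, and reading the key from an OPENAI_API_KEY environment variable is a convention assumed here, not a platform requirement.

```python
import copy
import os


def build_sora_request(prompt, seconds=4, aspect_ratio="portrait",
                       input_reference=None, api_key=None):
    """Return a Predictions API body for the `sora-2` namespace (hypothetical helper)."""
    key = api_key or os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("an OpenAI API key is required for sora-2")
    if seconds not in (4, 8, 12):
        raise ValueError("seconds must be 4, 8, or 12")
    if aspect_ratio not in ("portrait", "landscape"):
        raise ValueError("aspect_ratio must be 'portrait' or 'landscape'")

    model_input = {
        "openai_api_key": key,
        "prompt": prompt,
        "seconds": seconds,
        "aspect_ratio": aspect_ratio,
    }
    if input_reference:
        model_input["input_reference"] = input_reference
    return {"model": "sora-2", "input": model_input}


def redact(body):
    """Return a copy of the body that is safe to log, with the key masked."""
    safe = copy.deepcopy(body)
    safe["input"]["openai_api_key"] = "***"
    return safe


# A placeholder key is passed explicitly so the example runs without any environment setup.
body = build_sora_request("A kite over a beach", seconds=8, api_key="sk-placeholder")
print(redact(body))
```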

Sora 2 Pro

Property	Value
Namespace	`sora-2-pro`
Vendor	OpenAI
Pricing	$0.002 / video
Capabilities	`text-to-video`, `image-to-video`
Status	Operational
Inference	Partner-served

Same schema as Sora 2, plus a resolution control (standard = 720p, high = 1024p). Requires your own OpenAI API key.

Wan 2.5-i2v

Property	Value
Namespace	`wan-2.5-i2v`
Vendor	Alibaba
Pricing	$0.06–$0.16 / second (resolution-based)
Capabilities	`image-to-video`, `reference`, `quality`, `audio`, `video`
Status	Operational

Alibaba's image-to-video model with background audio, prompt expansion, and multi-resolution support.

Parameter	Type	Required	Default	Description
`image`	string (URI)	✅	—	Input image
`prompt`	string	✅	—	Text prompt
`negative_prompt`	string	—	`""`	What to exclude
`resolution`	enum	—	`720p`	`480p`, `720p`, `1080p`
`duration`	enum	—	`5`	`5`, `10` seconds
`audio`	string (URI)	—	—	Audio file for sync (wav/mp3, 3–30s, ≤15MB)
`enable_prompt_expansion`	boolean	—	`true`	Enable prompt optimizer
`seed`	integer	—	—	Random seed

{
  "type": "string",
  "format": "uri"
}

Resolution	Price per second
480p	$0.06
720p	$0.11
1080p	$0.16

Video Upscale

Property	Value
Namespace	`video-upscale`
Vendor	Topaz Labs
Pricing	$0.10 / 5 seconds
Capabilities	`video-to-video`
Status	Operational

Parameter	Type	Required	Default	Description
`video`	string (URI)	✅	—	Video file to upscale
`target_resolution`	enum	—	`1080p`	`720p`, `1080p`, `4k`
`target_fps`	integer	—	`30`	Target FPS (15–60)

{
  "type": "string",
  "format": "uri"
}

Audio Models

BeatFusion 2.0

Property	Value
Namespace	`beatfusion-2.0`
Vendor	Skytells
Pricing	$0.75 / prediction
Capabilities	`text-to-audio`, `music`, `quality`, `audio`
Status	Operational

Skytells's flagship music generation model. Generate full-length songs with vocals, lyrics, and rich instrumentation. Supports structure tags: [Intro], [Verse], [Pre Chorus], [Chorus], [Bridge], [Outro], [Hook], [Solo], and more.

Parameter	Type	Required	Default	Description
`lyrics`	string	✅	—	Lyrics with structure tags (1–3500 chars). Use `\n` for line breaks
`prompt`	string	—	`""`	Music style description (0–2000 chars)
`sample_rate`	enum	—	`44100`	`16000`, `24000`, `32000`, `44100`
`bitrate`	enum	—	`256000`	`32000`, `64000`, `128000`, `256000`
`audio_format`	enum	—	`mp3`	`mp3`, `wav`, `pcm`

{
  "type": "string",
  "format": "uri"
}

Returns an audio file URL.
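
As an illustration of the lyrics format, the sketch below assembles tagged sections into a single string joined with line breaks and enforces the length limits from the table. The helper name build_beatfusion_input and the (tag, lines) input shape are hypothetical conveniences, not part of the API.

```python
import json


def build_beatfusion_input(sections, prompt=""):
    """Build the input object for beatfusion-2.0 from (tag, lines) pairs."""
    parts = []
    for tag, lines in sections:
        parts.append(f"[{tag}]")  # Structure tag such as [Verse] or [Chorus].
        parts.extend(lines)
    lyrics = "\n".join(parts)

    # Length limits copied from the parameter table.
    if not 1 <= len(lyrics) <= 3500:
        raise ValueError("lyrics must be 1 to 3500 characters")
    if len(prompt) > 2000:
        raise ValueError("prompt must be at most 2000 characters")
    return {"lyrics": lyrics, "prompt": prompt}


song = build_beatfusion_input(
    [
        ("Verse", ["City lights are fading slow", "I keep walking, letting go"]),
        ("Chorus", ["Carry me home tonight"]),
    ],
    prompt="Warm indie pop with acoustic guitar",
)
# json.dumps encodes each line break as \n inside the JSON string.
print(json.dumps({"model": "beatfusion-2.0", "input": song}))
```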

BeatFusion 1.0

Property	Value
Namespace	`beatfusion-1.0`
Vendor	Skytells
Pricing	$0.45 / prediction
Capabilities	`text-to-audio`, `music`, `quality`, `audio`
Status	Operational

First-generation music model. Supports [intro], [verse], [chorus], [bridge], [outro] tags.

Parameter	Type	Required	Default	Description
`prompt`	string	✅	—	Music style description (10–300 chars)
`lyrics`	string	✅	—	Lyrics with structure tags (10–600 chars)
`sample_rate`	enum	—	`44100`	`16000`, `24000`, `32000`, `44100`
`bitrate`	enum	—	`256000`	`32000`, `64000`, `128000`, `256000`
`audio_format`	enum	—	`mp3`	`mp3`, `wav`, `pcm`

{
  "type": "string",
  "format": "uri"
}

Understanding Model Schemas

Every model on Skytells defines its own input_schema and output_schema. These schemas follow JSON Schema conventions and describe exactly what parameters a model accepts and what it returns.

Why schemas differ

Different models are built for different tasks, so their inputs vary significantly:

Simple models like TrueFusion only need a prompt and optional aspect_ratio
Advanced models like TrueFusion Pro add controls for guidance, num_inference_steps, seed, and output format
Flagship models like TrueFusion Ultra support inpainting (image + mask), style references, and resolution presets
Reference models like TrueFusion 2.0 accept tagged reference images you can invoke by name in your prompt
Video models add duration, start_image/end_image, and cfg_scale for temporal control
Audio models use lyrics and prompt for compositional control with structure tags
Partner models (GPT-Image-1, Sora 2) require your own OpenAI API key, passed as openai_api_key

Common input parameters

Most image models share these parameters:

Parameter	Description
`prompt`	Always required. The text description of what to generate
`aspect_ratio`	Controls image dimensions. Available ratios vary by model
`seed`	For reproducible generations. Use the same seed with otherwise identical inputs to reproduce an output
`output_format`	Usually `webp`, `jpg`, or `png`

Output schema patterns

Models return one of two patterns:

Single output — returns one URL:

{ "type": "string", "format": "uri" }

Multiple outputs — returns an array of URLs:

{ "type": "array", "items": { "type": "string", "format": "uri" } }

Models that support num_outputs or number_of_images typically use the array pattern.
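
Client code that works with several models can normalize both patterns into a list of URLs. This is a minimal sketch: output_urls is a hypothetical helper, and it takes the raw output value directly, since where that value sits in the prediction response is not covered in this section.

```python
def output_urls(output):
    """Normalize a single-URL or multi-URL output into a list of URL strings."""
    if isinstance(output, str):
        return [output]  # Single output pattern.
    if isinstance(output, list) and all(isinstance(item, str) for item in output):
        return list(output)  # Multiple outputs pattern.
    raise TypeError("expected a URL string or a list of URL strings")


print(output_urls("https://example.com/video.mp4"))
print(output_urls(["https://example.com/1.png", "https://example.com/2.png"]))
```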

Pricing units

Unit	Description
`image`	Flat rate per generated image
`second`	Billed per second of generated video/audio
`prediction`	Flat rate per API call
`gpu`	Billed per GPU second used
`computing_second`	Billed per compute second
`image_megapixel`	Billed by output resolution in megapixels
`5 seconds`	Billed per 5-second chunk
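
A rough cost estimate for the simpler units in the table can be sketched as follows. The function estimate_cost is hypothetical, and rounding a partial 5-second chunk up to a full chunk is an assumption, since the table does not say how partial chunks are billed. The gpu, computing_second, and image_megapixel units are left out because their quantities are not known before a prediction runs.

```python
import math
from decimal import Decimal


def estimate_cost(unit, rate, quantity=1):
    """Estimate a cost in US dollars.

    quantity is the number of images for `image`, and the number of seconds
    for `second` and `5 seconds`. It is ignored for `prediction`.
    """
    rate = Decimal(str(rate))
    amount = Decimal(str(quantity))
    if unit == "prediction":
        return rate
    if unit in ("image", "second"):
        return rate * amount
    if unit == "5 seconds":
        chunks = math.ceil(amount / 5)  # Assumption: partial chunks round up.
        return rate * chunks
    raise ValueError(f"unsupported unit: {unit}")


print(estimate_cost("prediction", "3.42"))     # Mera: 3.42
print(estimate_cost("second", "0.11", 5))      # Wan 2.5-i2v at 720p, 5 s: 0.55
print(estimate_cost("5 seconds", "0.10", 12))  # Video Upscale, 12 s: 0.30
```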
