Model Schemas
Understand how model input and output schemas work, and how they differ across model types.
How Schemas Work
Every model on Skytells publishes a JSON Schema for its input and output. When you create a prediction, the input object you send must conform to the model's input_schema. The response will match the output_schema.
You can retrieve any model's schema programmatically:
```bash
curl https://api.skytells.ai/v1/models \
  -H "x-api-key: YOUR_API_KEY"
```

Or fetch a single model's schema by slug (`GET /v1/models/{slug}`):

```bash
curl "https://api.skytells.ai/v1/models/truefusion?fields=input_schema,output_schema" \
  -H "x-api-key: YOUR_API_KEY"
```

Each model in the response includes `input_schema` and `output_schema` fields.
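If you'd rather fetch a schema from code, the same request can be built with Python's standard library. This is a minimal sketch: the endpoint and header come from the curl examples above, and `YOUR_API_KEY` is a placeholder.

```python
import json
import urllib.request

API_BASE = "https://api.skytells.ai/v1"

def schema_request(slug: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for one model's input/output schemas."""
    url = f"{API_BASE}/models/{slug}?fields=input_schema,output_schema"
    return urllib.request.Request(url, headers={"x-api-key": api_key})

# To actually fetch (requires a valid key and network access):
# model = json.load(urllib.request.urlopen(schema_request("truefusion", "YOUR_API_KEY")))
```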
Input Schema Structure
Input schemas follow JSON Schema conventions:
```json
{
  "type": "object",
  "title": "Input",
  "required": ["prompt"],
  "properties": {
    "prompt": {
      "type": "string",
      "title": "Prompt",
      "description": "Text prompt for generation",
      "x-order": 0
    },
    "aspect_ratio": {
      "type": "string",
      "enum": ["1:1", "16:9", "9:16"],
      "default": "1:1",
      "x-order": 1
    }
  }
}
```

Key fields:
| Field | Meaning |
|---|---|
| `required` | Parameters you must include in your request |
| `type` | Data type: `string`, `integer`, `number`, `boolean`, or `array` |
| `enum` | Fixed set of allowed values |
| `default` | Value used if you omit the parameter |
| `minimum` / `maximum` | Numeric bounds |
| `format` | Special format: `uri` for URLs, `password` for secrets |
| `x-order` | Display ordering hint |
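To catch schema violations before spending credits, you can validate an input payload client-side. The sketch below handles only the fields described above (`required`, `type`, `enum`, `default`); it is illustrative rather than a full JSON Schema validator, so for production use a library such as `jsonschema`.

```python
TYPE_MAP = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
}

def validate_input(schema: dict, payload: dict) -> dict:
    """Check a payload against a model's input_schema and fill in defaults."""
    props = schema.get("properties", {})
    result = dict(payload)
    for name in schema.get("required", []):
        if name not in result:
            raise ValueError(f"missing required parameter: {name}")
    for name, value in result.items():
        spec = props.get(name, {})
        expected = TYPE_MAP.get(spec.get("type"))
        if expected and not isinstance(value, expected):
            raise ValueError(f"{name}: expected {spec['type']}")
        if "enum" in spec and value not in spec["enum"]:
            raise ValueError(f"{name}: must be one of {spec['enum']}")
    # Fill defaults for optional parameters that were omitted
    for name, spec in props.items():
        if name not in result and "default" in spec:
            result[name] = spec["default"]
    return result
```

Against the example schema above, `validate_input(schema, {"prompt": "hi"})` returns the payload with `aspect_ratio` defaulted to `"1:1"`, and an empty payload raises a `ValueError` for the missing `prompt`.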
Schema Complexity by Model Tier
Models range from minimal to highly configurable schemas. Here's how complexity scales:
Minimal schema (2–4 parameters)
Models like TrueFusion, Imagen 3, and Imagen 4 have streamlined schemas—just prompt, aspect_ratio, and maybe a negative_prompt or safety filter. These are ideal when you want fast results with minimal configuration.
```json
{
  "input": {
    "prompt": "A sunset over mountains",
    "aspect_ratio": "16:9"
  }
}
```

Standard schema (8–12 parameters)
Models like TrueFusion Pro, TrueFusion Edge, and Flux.1 Edge add control over generation quality: num_inference_steps, guidance, seed, output_format, and output_quality. You get fine-grained tuning without overwhelming options.
```json
{
  "input": {
    "prompt": "A cyberpunk cityscape at night",
    "aspect_ratio": "21:9",
    "num_inference_steps": 35,
    "guidance": 5,
    "seed": 42,
    "output_format": "png",
    "output_quality": 95
  }
}
```

Advanced schema (12+ parameters)
Models like TrueFusion Standard (with LoRA support), TrueFusion Ultra (with inpainting + style references), and Flux 2 Flex (with 10 reference images + prompt upsampling) expose the full range of creative control.
```json
{
  "input": {
    "prompt": "Portrait in the style of @ref, soft lighting",
    "reference_images": ["https://example.com/style.jpg"],
    "reference_tags": ["ref"],
    "resolution": "1080p",
    "aspect_ratio": "4:3",
    "seed": 12345
  }
}
```

Output Schema Patterns
There are only two output patterns across all models:
Single URL
Most models return a single URL string:
```json
{
  "type": "string",
  "format": "uri"
}
```

Response:
```json
{
  "output": "https://delivery.skytells.ai/abc123.jpg"
}
```

Used by: TrueFusion Max, TrueFusion Ultra, TrueFusion 2.0, all video models, all audio models, Imagen, FLUX.2 Pro.
Array of URLs
Models that support multiple outputs return an array:
```json
{
  "type": "array",
  "items": { "type": "string", "format": "uri" }
}
```

Response:
```json
{
  "output": [
    "https://delivery.skytells.ai/abc123.jpg",
    "https://delivery.skytells.ai/def456.jpg"
  ]
}
```

Used by: TrueFusion, TrueFusion Pro, TrueFusion Edge, TrueFusion Standard, GPT-Image-1.
Always check the output schema of the model you're using. If you expect an array but the model returns a single string (or vice versa), your parsing logic will break.
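One way to make parsing robust to both patterns is to normalize the output to a list before any downstream handling. A minimal sketch (the helper name is ours, not part of the API):

```python
def output_urls(output) -> list:
    """Normalize a prediction's `output` field to a list of URL strings,
    whether the model returns a single URL or an array of URLs."""
    if isinstance(output, str):
        return [output]
    if isinstance(output, list):
        return list(output)
    raise TypeError(f"unexpected output type: {type(output).__name__}")
```

With this in place, code that iterates over `output_urls(prediction["output"])` works unchanged across single-URL and array-output models.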
Type-Specific Schema Patterns
Image models
Common parameters: prompt, aspect_ratio, seed, output_format, output_quality
Advanced parameters vary by model:
- img2img: `image`, `prompt_strength`
- Inpainting: `image`, `mask`
- Reference-based: `reference_images`, `reference_tags`, `image_prompt`, `input_images`
- LoRA: `lora_weights`, `lora_scale`
- Quality control: `num_inference_steps`, `guidance`, `guidance_scale`
- Speed: `go_fast`, `speed_mode`, `megapixels`
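As a concrete illustration, an img2img request might combine the common and advanced parameters like this. Parameter names and value ranges vary by model, so treat this payload as a hypothetical sketch and check the target model's `input_schema` first.

```python
# Hypothetical img2img input; confirm parameter names and ranges
# against the model's input_schema before sending.
img2img_input = {
    "prompt": "Turn this photo into a watercolor painting",
    "image": "https://example.com/photo.jpg",  # source image URL
    "prompt_strength": 0.7,                    # how strongly the prompt overrides the source
    "seed": 42,                                # fixed seed for reproducible results
    "output_format": "png",
}
```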
Video models
Common parameters: prompt, aspect_ratio, duration/seconds
Model-specific patterns:
- Frame control: `start_image`, `end_image`, `last_frame`
- Reference: `reference_images`, `input_reference`
- Audio: `generate_audio` (Veo), `audio` (Wan 2.5)
- Flexibility: `cfg_scale` (lower = more creative)
- Safety: `negative_prompt`, `person_generation`
Audio models
Unique parameters:
- `lyrics`: song lyrics with structure tags (`[Verse]`, `[Chorus]`, etc.)
- `prompt`: style/mood description
- `sample_rate`, `bitrate`, `audio_format`: audio encoding settings
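If you assemble lyrics programmatically, the structure tags are just bracketed section markers inside the string. A small helper (ours, not part of the API) might look like:

```python
def build_lyrics(sections):
    """Join (tag, text) pairs into a tagged lyrics string,
    e.g. [("Verse", "..."), ("Chorus", "...")]."""
    return "\n\n".join(f"[{tag}]\n{text}" for tag, text in sections)

lyrics = build_lyrics([
    ("Verse", "City lights are fading out"),
    ("Chorus", "But we keep on singing loud"),
])
```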
Pricing Models
Different models use different billing units. Understanding these helps you estimate costs accurately.
| Pricing Unit | How it works | Example |
|---|---|---|
| Per image | Flat rate per generated image | TrueFusion: $0.03/image |
| Per second | Billed by output duration | Veo 3.1: $0.43/second |
| Per prediction | Flat rate per API call | Mera: $3.42/prediction |
| Per GPU second | Billed by GPU compute time | TrueFusion Pano: $0.02/GPU s |
| Per computing second | Billed by total compute time | TrueFusion Optima: $0.008/s |
| Per megapixel | Billed by output resolution | FLUX.2 Pro: $0.02/MP |
| Per 5 seconds | Chunked video billing | Video Upscale: $0.10/5s |
Some models have conditional pricing based on input parameters. For example, Veo 3.1 Fast charges differently depending on whether `generate_audio` is enabled. Check the model's `pricing.criterias` field in the API response for the exact rules, which may depend on resolution, audio generation, or other inputs.
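The table above translates directly into cost estimates. The sketch below assumes that chunked billing rounds a partial chunk up to a full chunk; confirm the rounding rule for your model before relying on it.

```python
import math

def per_second_cost(duration_s: float, rate: float) -> float:
    """Duration-billed models, e.g. Veo 3.1 at $0.43/second."""
    return duration_s * rate

def chunked_cost(duration_s: float, rate_per_chunk: float, chunk_s: int) -> float:
    """Chunk-billed models, e.g. Video Upscale at $0.10 per 5 s.
    Assumes a partial chunk is billed as a full chunk."""
    return math.ceil(duration_s / chunk_s) * rate_per_chunk
```

Under these assumptions, an 8-second Veo 3.1 clip costs 8 × $0.43 = $3.44, and a 12-second upscale spans three 5-second chunks at $0.10 each, or $0.30.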
Partner Models
Some models are served through partner APIs (OpenAI, Google). These models have "inference_party": "partner" in their metadata and typically require you to provide your own API key:
```json
{
  "input": {
    "openai_api_key": "sk-...",
    "prompt": "A watercolor painting of a cottage"
  }
}
```

Partner-served models: GPT-Image-1, Sora 2, Sora 2 Pro
The API key field uses "format": "password" and "x-cog-secret": true — it is never logged or stored by Skytells.