API Schema

Skytells API v1 — a RESTful, resource-based API for AI predictions, models, and more.

API v1

The Skytells API v1 is a RESTful, resource-based API. Every endpoint maps to a resource — models, predictions, webhooks — following standard HTTP conventions.

All endpoints are served under:

https://api.skytells.ai/v1

How It Works

A single request travels through four stages before you receive a response:

Your app sends a request with its API key in the x-api-key header, and the request passes through:

1. Auth — the key is checked; an invalid key returns a 401 error.
2. Validate — the input is checked; bad input returns a 400 error.
3. Process — the request is executed.
4. Response — the result is returned to your app.
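As a sketch, the same flow expressed with Python's standard library. `build_request` is a hypothetical helper (not part of any Skytells SDK), and the path and key values are placeholders:

```python
import json
import urllib.request

API_BASE = "https://api.skytells.ai/v1"

def build_request(path, api_key, payload=None):
    """Build an authenticated Skytells API request.

    The x-api-key header is checked at the Auth stage (invalid key
    -> 401); the body is checked at the Validate stage (bad input
    -> 400) before the request is processed.
    """
    data = json.dumps(payload).encode() if payload is not None else None
    return urllib.request.Request(
        API_BASE + path,
        data=data,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST" if payload is not None else "GET",
    )

# urllib.request.urlopen(build_request("/models", "YOUR_API_KEY")) would
# then perform the call; here we only construct the request object.
req = build_request("/models", "YOUR_API_KEY")
```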

API Schemas

The API is organized around two interaction models.

Predictions handles media generation, while Inference covers other tasks such as text, code, reasoning, and compute.

Difference between Predictions and Inference

| Task | Predictions | Inference |
| --- | --- | --- |
| Media generation | Yes | Depends on the model |
| Text, code, reasoning, compute | Depends on the model | Yes |
| Compute | Skytells-tasks | Yes |
| Synchronous | Yes | Yes |
| Webhooks | Yes | Task-specific |
| Polling | Yes | Yes |
| OpenAI-compatible | No | Yes — at the sub-API level: Chat Completions, Responses, and Embeddings each follow OpenAI-compatible schemas |
| Streaming | Task-specific | Yes, when stream is enabled |
| Schema | Prediction Object Schema | Sub-API specific: Chat Completions, Responses, Embeddings — all follow OpenAI-compatible schemas augmented with Skytells safety fields |

How models are mapped to APIs

Each model is mapped to the appropriate API based on its capabilities and requirements. Every model has a unique API type, and each API type has a unique schema. These schemas are defined in the Model Catalog, including the inference API for each model.

PLAYGROUND / CONSOLE

If you're running inference on a model or creating a prediction, you can use the Playground / Console to test the model and inspect the response schema. The schema and inference endpoint are clearly shown in the UI under the "API" tab.


API Flow

Here's how the API endpoints orchestrate the two interaction models:

Skytells API v1 routes every request through an Orchestrator Layer, which fans out to:

• Models API — GET /v1/models, GET /v1/models/:slug
• Predictions API — POST /v1/predictions, GET /v1/predictions, GET /v1/predictions/:id, DELETE /v1/predictions/:id
• Inference API — POST /v1/chat/completions, POST /v1/responses, POST /v1/embeddings
• Webhooks — task status events
• Status — service health

When API v1 receives a request, Skytells's Orchestrator Layer determines whether it should be handled by the Predictions API or the Inference API. As task state changes occur within either API, webhook events are emitted through the Webhook Dispatcher to notify subscribed systems.

For the Inference API flow, jump to the Inference API Lifecycle section.


Prediction Lifecycle

A prediction moves through a series of states from the moment you create it until it completes. At each transition, Skytells can fire a webhook to your server so you never need to poll.

POST /v1/predictions creates the prediction in the pending state. Once a worker picks it up, it moves to starting; once the model is loaded, to processing; and once output is ready, to succeeded. If an error occurs, the prediction transitions to failed, and if the user cancels it before completion, to cancelled.

Status Meanings

| Status | Description |
| --- | --- |
| pending | Request accepted, waiting for a worker to pick it up |
| starting | Worker assigned, model is loading into memory |
| processing | Model is actively running inference on your input |
| succeeded | Inference completed — outputs are available in output[] |
| failed | An error occurred — check the error field for details |
| cancelled | Cancelled by the user before completion |
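Since succeeded, failed, and cancelled are the only terminal statuses, a polling loop can stop as soon as one of them appears. A hedged sketch (`wait_for_prediction` and `fetch` are hypothetical names, not SDK functions; `fetch` stands in for a GET /v1/predictions/:id call):

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}

def wait_for_prediction(fetch, interval=2.0, sleep=time.sleep):
    """Poll a prediction until it reaches a terminal status.

    `fetch` is any callable returning the current prediction JSON.
    """
    while True:
        prediction = fetch()
        if prediction["status"] in TERMINAL_STATUSES:
            return prediction
        sleep(interval)

# Stubbed walk through the lifecycle: pending -> processing -> succeeded.
states = iter([{"status": "pending"}, {"status": "processing"},
               {"status": "succeeded", "output": ["..."]}])
final = wait_for_prediction(lambda: next(states), sleep=lambda _: None)
```

In production you would prefer webhooks over this loop, as the section above notes.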

Webhook Triggers

Every status transition can fire a webhook to your server. Subscribe to any combination of events on each prediction request:

| Event | Fired when status reaches |
| --- | --- |
| prediction.started | processing |
| prediction.succeeded | succeeded |
| prediction.failed | failed |
| prediction.cancelled | cancelled |
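The prediction object's webhook field carries a url and an events[] list, so a creation body that subscribes to two events might look like this sketch (the model slug, the input field, and the helper name are assumptions for illustration, not the documented request schema):

```python
def prediction_payload(model, inputs, webhook_url, events):
    """Assemble a POST /v1/predictions body that subscribes the given
    webhook URL to a set of lifecycle events."""
    return {
        "model": model,                     # hypothetical field name
        "input": inputs,                    # hypothetical field name
        "webhook": {"url": webhook_url, "events": list(events)},
    }

payload = prediction_payload(
    "example/image-model",                  # made-up model slug
    {"prompt": "a lighthouse at dusk"},
    "https://example.com/hooks/skytells",
    ["prediction.succeeded", "prediction.failed"],
)
```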

Prediction Object Schema

The prediction object has these top-level fields and nested objects:

• id: string, status: string, type: string, stream: boolean, source: string, privacy: string
• created_at, started_at, completed_at, updated_at: string
• model — name: string, type: string
• metrics — predict_time: float, total_time: float
• metadata — billing: Billing, where billing carries credits_used: float
• webhook — url: string, events: string[]
• urls — get: string, cancel: string, stream: string, delete: string

For detailed information on the Prediction Object JSON Schema, see the Prediction API reference.
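As a reading aid, a sketch that pulls a few fields out of a prediction object shaped like the schema above (the sample values are fabricated, and `summarize` is a hypothetical helper):

```python
def summarize(prediction):
    """Pull the fields a dashboard typically needs out of a prediction."""
    return {
        "id": prediction["id"],
        "status": prediction["status"],
        "model": prediction["model"]["name"],
        "predict_time": prediction["metrics"]["predict_time"],
        "credits_used": prediction["metadata"]["billing"]["credits_used"],
    }

# Fabricated sample following the schema's field names.
example = {
    "id": "pred_123", "status": "succeeded",
    "model": {"name": "example/image-model", "type": "prediction"},
    "metrics": {"predict_time": 1.8, "total_time": 2.4},
    "metadata": {"billing": {"credits_used": 0.5}},
}
info = summarize(example)
```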


Inference Lifecycle

Inference requests are synchronous — you send a request and receive a response in a single connection. There is no polling, no prediction ID, and no webhook needed.

Your app sends POST /v1/chat/completions (or /v1/responses, /v1/embeddings) to the Skytells Gateway. The Authorization Gateway authorizes the request, then the Model Router resolves the model namespace and makes a routing decision:

• Model served by Skytells — tokens are generated on Skytells's global GPU infrastructure.
• Model served by a partner — the request is forwarded to the partner's infrastructure via contract, which generates the tokens.

In both cases the response comes back in the sub-API schema (Chat / Responses / Embeddings).

Inference Infrastructure

Where a request is executed depends on the model:

| Model Source | Inference Location | Notes |
| --- | --- | --- |
| Skytells native models | Skytells global infrastructure | Low-latency, edge-compatible nodes. No data leaves Skytells. |
| Partner-provided models | Partner infrastructure | Execution is delegated to the partner under a formal infrastructure agreement. Skytells remains the billing and API layer. |
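Whichever infrastructure serves the model, the request body is the same. A minimal Chat Completions body following the OpenAI-compatible base schema (the model slug is a made-up placeholder):

```python
def chat_completion_body(model, user_message, stream=False):
    """Minimal OpenAI-compatible body for POST /v1/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": stream,
    }

body = chat_completion_body("example/chat-model", "Hello!")
```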

Inference API Schema

The Inference API is not a single schema — it is an umbrella for three sub-APIs, each with its own schema:

  • Chat Completions — OpenAI Chat Completions schema, plus Skytells content_filter_results and prompt_filter_results safety fields.
  • Responses — OpenAI Responses schema, plus Skytells content_filters[] array for prompt and completion safety.
  • Embeddings — OpenAI Embeddings schema.

Because the sub-APIs follow OpenAI-compatible schemas as their base, existing OpenAI SDK code requires only a base_url change. The additional Skytells safety fields are returned alongside the standard fields and are safe to ignore.
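For example, a response reader can stick to the standard fields and leave the safety fields untouched. The sample response below is fabricated, and the placement of the filter fields is illustrative only:

```python
def extract_text(response):
    """Read the standard Chat Completions field; the extra Skytells
    safety fields ride alongside and can simply be ignored."""
    return response["choices"][0]["message"]["content"]

sample = {
    "choices": [{
        "message": {"role": "assistant", "content": "Hi there!"},
        "finish_reason": "stop",
        "content_filter_results": {"hate": {"filtered": False}},  # Skytells extra
    }],
    "prompt_filter_results": [],  # Skytells extra
}
text = extract_text(sample)
```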

With base_url = https://api.skytells.ai/v1, your app calls the Skytells Gateway, which routes Chat / Responses / Embeddings traffic to the matching inference sub-API and returns the OpenAI-compatible schema plus Skytells safety fields.

Authentication

All requests require your API key in the x-api-key header. For setup, examples, and security best practices, see Authentication.


HTTP Methods

| Method | Usage | Example |
| --- | --- | --- |
| GET | Retrieve a resource or list | GET /v1/models |
| POST | Create a new resource | POST /v1/predictions |
| DELETE | Remove a resource | DELETE /v1/predictions/{id} |

Pagination

List endpoints support pagination via query parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| per_page | integer | 15 | Results per page (max 50) |
| page | integer | 1 | Page number |
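A page-walking sketch under the parameters above. `fetch_page` is a stand-in for the actual HTTP call, and stopping on a short page is an assumption about how the endpoint signals the final page:

```python
def paginate(fetch_page, per_page=15):
    """Walk a list endpoint page by page.

    `fetch_page(page, per_page)` should issue e.g.
    GET /v1/predictions?page=2&per_page=15 and return the list of items.
    """
    page = 1
    while True:
        items = fetch_page(page, per_page)
        yield from items
        if len(items) < per_page:  # short page -> assume no more pages
            break
        page += 1

# Stub: 5 items total, 2 per page -> pages of 2, 2, then 1.
data = list(range(5))
fetch = lambda page, per_page: data[(page - 1) * per_page : page * per_page]
collected = list(paginate(fetch, per_page=2))
```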

Error Codes

All errors return a JSON body with an error_id and message. See the full API Errors reference for descriptions and resolution steps.

| HTTP Status | error_id |
| --- | --- |
| 400 | INVALID_INPUT · INVALID_PARAMETER |
| 401 | UNAUTHORIZED · ACCOUNT_SUSPENDED |
| 402 | PAYMENT_REQUIRED · INSUFFICIENT_CREDITS |
| 403 | SECURITY_VIOLATION |
| 404 | MODEL_NOT_FOUND |
| 429 | RATE_LIMIT_EXCEEDED |
| 500 | INTERNAL_ERROR |
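A sketch of client-side error handling based on this table. Which error_ids are worth retrying is a client policy choice, not something the API mandates:

```python
# Client policy (assumption): rate limits and server faults are transient.
RETRYABLE = {"RATE_LIMIT_EXCEEDED", "INTERNAL_ERROR"}

def classify_error(status, body):
    """Decide how to react to an error response; every error body
    carries an error_id and a message."""
    error_id = body["error_id"]
    if error_id in RETRYABLE:
        return "retry"
    if status == 402:
        return "top_up_credits"
    return "fix_request"

action = classify_error(429, {"error_id": "RATE_LIMIT_EXCEEDED",
                              "message": "Too many requests"})
```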

Resources
