API Schema
Skytells API v1 — a RESTful, resource-based API for AI predictions, models, and more.
API v1
The Skytells API v1 is a RESTful, resource-based API. Every endpoint maps to a resource — models, predictions, webhooks — following standard HTTP conventions.
All endpoints are served under:
https://api.skytells.ai/v1
How It Works
A single request travels through four stages before you receive a response:
API Schemas
The API is organized around two interaction models.
Predictions for media generation, and Inference for other tasks such as text, code, reasoning, and compute.
Difference between Predictions and Inference
| Capability | Predictions | Inference |
|---|---|---|
| Media generation | Yes | Depending on the model |
| Text, code, reasoning, compute | Depends on the model | Yes |
| Compute | Skytells-tasks | Yes |
| Synchronous | Yes | Yes |
| Webhooks | Yes | Task Specific |
| Polling | Yes | Yes |
| OpenAI-compatible | No | Yes — at the sub-API level: Chat Completions, Responses, and Embeddings each follow OpenAI-compatible schemas |
| Streaming | Task Specific | Yes when stream is enabled |
| Schema | Prediction Object Schema | Sub-API specific: Chat Completions, Responses, Embeddings — all follow OpenAI-compatible schemas augmented with Skytells safety fields |
How models are mapped to APIs
Each model is mapped to the appropriate API based on its capabilities and requirements. Every model has a unique API type, and each API type has a unique schema. These schemas, including each model's inference API, are defined in the Model Catalog.
PLAYGROUND / CONSOLE
If you're running inference on a model or creating a prediction, you can use the Playground / Console to test the model and see the response schema. The schema and inference endpoint are clearly shown in the UI under the "API" tab.
Good to know:
- Predictions API always uses the Prediction Object Schema.
- Inference API operates through its sub-APIs — Chat Completions, Responses, and Embeddings — each with their own OpenAI-compatible schema augmented by Skytells safety fields.
API Flow
Here's how the API endpoints orchestrate the two interaction models:
When API v1 receives a request, Skytells's Orchestrator Layer determines whether it should be handled by the Predictions API or the Inference API. As task state changes occur within either API, webhook events are emitted through the Webhook Dispatcher to notify subscribed systems.
For Inference API flow, jump to the Inference API Lifecycle section.
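The routing step above can be sketched as a simple dispatch on the model's API type. This is an illustrative sketch only: the catalog shape and the api_type field are assumptions, not the real Model Catalog schema.

```python
# Sketch of the Orchestrator Layer's routing decision (illustrative only).
# The catalog entries and the "api_type" field are assumed for this example.

MODEL_CATALOG = {
    "gpt-5": {"api_type": "inference"},              # text model -> Inference API
    "example-image-model": {"api_type": "prediction"},  # media model -> Predictions API
}

def route_request(model_id: str) -> str:
    """Return which API family should handle a request for this model."""
    entry = MODEL_CATALOG.get(model_id)
    if entry is None:
        raise KeyError(f"MODEL_NOT_FOUND: {model_id}")
    return entry["api_type"]
```

In real integrations you would not hard-code this table; you would check the model's API type in the Model Catalog before choosing an endpoint.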
Prediction API
Asynchronous/Synchronous inference for image, video, and audio models. Responses are queued, tracked by ID, and delivered via webhook or polling.
Inference API
Synchronous/Streaming inference for LLMs, text, code, reasoning, and embeddings. Sub-APIs (Chat Completions, Responses, Embeddings) follow OpenAI-compatible schemas — swap base_url and keep your existing code.
Prediction Lifecycle
A prediction moves through a series of states from the moment you create it until it completes. At each transition, Skytells can fire a webhook to your server so you never need to poll.
Status Meanings
| Status | Description |
|---|---|
| pending | Request accepted, waiting for a worker to pick it up |
| starting | Worker assigned, model is loading into memory |
| processing | Model is actively running inference on your input |
| succeeded | Inference completed — outputs are available in output[] |
| failed | An error occurred — check error field for details |
| cancelled | Cancelled by the user before completion |
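If you do poll rather than use webhooks, the terminal statuses above (succeeded, failed, cancelled) tell you when to stop. A minimal polling loop might look like this; the fetch callable stands in for the authenticated GET /v1/predictions/:id request, so this sketch runs without any network access.

```python
import time

# succeeded, failed, and cancelled are terminal -- no further transitions occur.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}

def wait_for_prediction(fetch, prediction_id: str, interval: float = 1.0, max_polls: int = 100) -> dict:
    """Poll until the prediction reaches a terminal status.

    `fetch` is any callable returning the prediction object as a dict;
    in real code it would issue GET /v1/predictions/:id with your API key.
    """
    for _ in range(max_polls):
        prediction = fetch(prediction_id)
        if prediction["status"] in TERMINAL_STATUSES:
            return prediction
        time.sleep(interval)
    raise TimeoutError(f"prediction {prediction_id} did not finish in time")
```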
Webhook Triggers
Every status transition can fire a webhook to your server. Subscribe to any combination of events on each prediction request:
| Event | Fired when status reaches |
|---|---|
| prediction.started | processing |
| prediction.succeeded | succeeded |
| prediction.failed | failed |
| prediction.cancelled | cancelled |
Skip polling — use webhooks. Instead of repeatedly calling GET /v1/predictions/:id to check if a prediction is done, register a webhook URL in your prediction request. Skytells will POST the full prediction object to your server the moment anything changes. See the Webhooks reference for setup, event payloads, and signature verification.
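Webhook payloads should be verified before you trust them. The sketch below assumes an HMAC-SHA256 hex digest of the raw request body; the actual header name and signing scheme are defined in the Webhooks reference, so treat this as a pattern, not the official algorithm.

```python
import hashlib
import hmac

def verify_webhook(raw_body: bytes, signature: str, secret: str) -> bool:
    """Check a webhook signature against the raw request body.

    Assumes an HMAC-SHA256 hex digest scheme (an assumption for illustration;
    see the Webhooks reference for the real signing details). Uses a
    constant-time comparison to avoid timing attacks.
    """
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```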
Prediction Object Schema
For detailed information on the Prediction Object JSON Schema, see the Prediction API reference.
Please note that the Prediction Object Schema is model-dependent. Text models (deepbrain-router, gpt-5, gpt-5.4) expose the full OpenAI-compatible schema. Image, video, and audio models use the Predictions API which has a different request format. Always check the Model Catalog for a model's API type before integrating.
Inference Lifecycle
Inference requests are synchronous — you send a request and receive a response in a single connection. There is no polling, no prediction ID, and no webhook needed.
Inference Infrastructure
Where a request is executed depends on the model:
| Model Source | Inference Location | Notes |
|---|---|---|
| Skytells native models | Skytells global infrastructure | Low-latency, edge-compatible nodes. No data leaves Skytells. |
| Partner-provided models | Partner infrastructure | Execution is delegated to the partner under a formal infrastructure agreement. Skytells remains the billing and API layer. |
Regardless of where inference is executed, all requests are authenticated, billed, and governed by Skytells. Partner infrastructure arrangements are contractual and transparent — they do not change the API surface, pricing model, or your data handling obligations with Skytells.
Inference API Schema
The Inference API is not a single schema — it is an umbrella for three sub-APIs, each with its own schema:
- Chat Completions — OpenAI Chat Completions schema, plus Skytells content_filter_results and prompt_filter_results safety fields.
- Responses — OpenAI Responses schema, plus Skytells content_filters[] array for prompt and completion safety.
- Embeddings — OpenAI Embeddings schema.
Because the sub-APIs follow OpenAI-compatible schemas as their base, existing OpenAI SDK code requires only a base_url change. The additional Skytells safety fields are returned alongside the standard fields and are safe to ignore.
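Because the safety fields sit alongside the standard fields, code that reads a response the OpenAI way keeps working unchanged. The sample response below is illustrative; only the choices/message path is the standard OpenAI shape, and the placement of content_filter_results is an assumption for the example.

```python
# Existing OpenAI-style parsing ignores the extra Skytells safety fields.
# The sample response shape is illustrative, not an exact API transcript.

def extract_text(response: dict) -> str:
    """Read the assistant message exactly as OpenAI SDK code would."""
    return response["choices"][0]["message"]["content"]

sample_response = {
    "choices": [{
        "message": {"role": "assistant", "content": "Hello!"},
        "content_filter_results": {"hate": {"filtered": False}},  # Skytells extra field
    }],
}
```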
OpenAI compatibility applies at the sub-API level — not at the Inference API level as a whole. Image, video, and audio models use the Predictions API which has a different request format. Always check the Model Catalog for a model's API type before integrating.
Authentication
All requests require your API key in the x-api-key header. For setup, examples, and security best practices, see Authentication.
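Putting the base URL and the x-api-key header together, every request can be assembled from two small helpers. This is plain URL and header construction for illustration; with an OpenAI-compatible SDK you would instead point the client's base_url at https://api.skytells.ai/v1.

```python
# Building authenticated request components for the Skytells API.
BASE_URL = "https://api.skytells.ai/v1"

def endpoint(path: str) -> str:
    """Join a sub-API path onto the versioned base URL."""
    return f"{BASE_URL}/{path.lstrip('/')}"

def auth_headers(api_key: str) -> dict:
    """All requests carry the API key in the x-api-key header."""
    return {"x-api-key": api_key, "Content-Type": "application/json"}
```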
HTTP Methods
| Method | Usage | Example |
|---|---|---|
| GET | Retrieve a resource or list | GET /v1/models |
| POST | Create a new resource | POST /v1/predictions |
| DELETE | Remove a resource | DELETE /v1/predictions/{id} |
Pagination
List endpoints support pagination via query parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| per_page | integer | 15 | Results per page (max 50) |
| page | integer | 1 | Page number |
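Walking every page of a list endpoint follows directly from the two parameters above: request pages in order and stop when a page comes back short. The fetch_page callable stands in for the authenticated GET with ?page=N&per_page=M, so the sketch runs without network access.

```python
def iter_pages(fetch_page, per_page: int = 15):
    """Yield all items from a paginated list endpoint.

    `fetch_page(page, per_page)` stands in for the authenticated GET request
    with ?page=N&per_page=M and returns that page's items as a list.
    A page shorter than per_page signals the final page.
    """
    page = 1
    while True:
        items = fetch_page(page, per_page)
        yield from items
        if len(items) < per_page:
            break
        page += 1
```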
Error Codes
All errors return a JSON body with an error_id and message. See the full API Errors reference for descriptions and resolution steps.
| HTTP Status | error_id |
|---|---|
| 400 | INVALID_INPUT · INVALID_PARAMETER |
| 401 | UNAUTHORIZED · ACCOUNT_SUSPENDED |
| 402 | PAYMENT_REQUIRED · INSUFFICIENT_CREDITS |
| 403 | SECURITY_VIOLATION |
| 404 | MODEL_NOT_FOUND |
| 429 | RATE_LIMIT_EXCEEDED |
| 500 | INTERNAL_ERROR |
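A practical client treats these statuses differently: 429 and 500 are usually transient and worth retrying with backoff, while 4xx errors indicate a request or account problem to fix instead. This mapping is a common-sense sketch, not official Skytells retry guidance.

```python
# Sketch of a retry decision based on the error table above.
# Which statuses to retry is a judgment call, not official guidance.

RETRYABLE = {429, 500}                      # rate limits and server errors: back off, retry
CLIENT_ERRORS = {400, 401, 402, 403, 404}   # fix the request, key, or account instead

def should_retry(http_status: int) -> bool:
    """Return True if the request is worth retrying after a delay."""
    return http_status in RETRYABLE
```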
Resources
Models
Browse all available AI models, their capabilities, input schemas, and pricing.
Prediction API
Asynchronous inference for image, video, and audio models. Create, retrieve, cancel, and delete predictions.
Inference API
Synchronous text generation via /v1/chat/completions, /v1/responses, and /v1/embeddings. OpenAI-compatible.
Webhooks
Receive real-time notifications when prediction lifecycle events occur.
Status
Monitor real-time availability and incidents for all Skytells services.