Inference Errors

Complete error reference for the Skytells Inference API — all error_id values, HTTP status codes, and resolution guides.

Inference API errors

This catalog extends API errors. That page defines shared HTTP semantics, authentication, credits, 429 rate limiting, and client patterns for the API as a whole.

Inference sub-APIs like Chat Completions, Responses, and Embeddings return failures as an Inference error response (nested error field — see Fields on the nested error object on that page). Use error.error_id (and error.request_id for support), not free-form messages.


Using request_id

If you contact support or need to investigate a failed request, include error.request_id. It uniquely identifies the request in Skytells logs and allows the support team to trace the full execution path.
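Pulling error_id and request_id out of a failed response can be sketched as follows. The field names match the nested error object described in this guide; the sample payload and its values are illustrative, not a real API response:

```python
import json

# Illustrative failure body with the nested error object used by this API.
sample_body = """
{
  "error": {
    "error_id": "INFERENCE_TIMEOUT",
    "type": "timeout_error",
    "code": "timeout_error",
    "message": "The inference request exceeded the maximum allowed time.",
    "request_id": "req_123abc"
  }
}
"""

error = json.loads(sample_body).get("error", {})
error_id = error.get("error_id")      # identifier to branch on
request_id = error.get("request_id")  # include this when contacting support
print(error_id, request_id)
```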


Error schema

Inference failures return an OpenAI-compatible JSON body with a top-level error object. Field reference: Inference error response. Branch on error.error_id.
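Based on the fields referenced throughout this catalog (error_id, type, code, param, message, request_id), a failed request's body looks roughly like the following; the concrete values are illustrative:

```json
{
  "error": {
    "error_id": "INVALID_PARAMETER",
    "type": "invalid_request_error",
    "code": "invalid_parameter",
    "param": "messages",
    "message": "messages must be an array",
    "request_id": "req_123abc"
  }
}
```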


Inference Error Catalog

The Inference API extends the API errors catalog with the following errors:


Authentication & access

UNAUTHORIZED — 401

Field  Value
HTTP   401
type   authentication_error
code   invalid_api_key (typical)

When: x-api-key is missing or not accepted.

Resolution: Verify the key in the Console. Shared semantics: 401 — Unauthorized.


FORBIDDEN — 403

Field  Value
HTTP   403
type   Typically permission_error or invalid_request_error
code   Varies

When: The key is syntactically valid but the request is not allowed (plan, route, or account state).

Resolution: Check model access and account status; see 403 — Forbidden. Do not confuse with 401 (UNAUTHORIZED).


Request validation

Invalid JSON or parameters align with 400 — Bad Request / Infrastructure. Inference encodes them with the nested fields below.

INVALID_REQUEST — 400

Field  Value
HTTP   400
type   invalid_request_error
code   invalid_request

When: The request body is malformed JSON or has an invalid shape.

Resolution: Validate your JSON payload and ensure it matches the endpoint schema. Set Content-Type: application/json.


INVALID_PARAMETER — 400

Field  Value
HTTP   400
type   invalid_request_error
code   invalid_parameter

When: A required parameter is missing or a parameter has an invalid value (e.g. missing model, messages not an array).

Resolution: Check error.param for the offending field. Review the request schema.


MODEL_NOT_FOUND — 404

Field  Value
HTTP   404
type   server_error
code   model_not_found

When: The model namespace provided is unknown or not available on the Inference API.

Resolution: Use a valid namespace from the Text Models catalog: deepbrain-router, gpt-5, gpt-5.4, or llama-3.1-8b. See also 404 — Not Found for the same MODEL_NOT_FOUND pattern on other routes.


ENDPOINT_NOT_FOUND — 404

Field  Value
HTTP   404
type   server_error
code   endpoint_not_found

When: The requested path does not match any supported endpoint.

Resolution: Use a supported endpoint: /v1/chat/completions, /v1/responses, or /v1/embeddings. See the Inference overview.


Billing & Credits

INSUFFICIENT_CREDITS — 402

Field  Value
HTTP   402
type   server_error
code   insufficient_credits

When: Your account balance is insufficient for the request (including a safety margin).

Resolution: Same as 402 — Payment Required (INSUFFICIENT_CREDITS / credits). Add credits in Billing or reduce max_tokens / prompt size.


CREDIT_CHECK_FAILED — 402

Field  Value
HTTP   402
type   server_error
code   credit_check_failed

When: The credit-verification service was temporarily unreachable and could not confirm your balance.

Resolution: Retry the request. If the problem persists, contact Support with error.request_id.


Rate Limiting

RATE_LIMITED — 429

Field  Value
HTTP   429
type   rate_limit_error
code   rate_limit_exceeded

When: Too many requests sent in a short time window.

Resolution: Follow 429 — Too Many Requests and Rate limits (Retry-After, x-skytells-ratelimit-*, backoff). Inference uses error.error_id: RATE_LIMITED with error.code: rate_limit_exceeded.


INFERENCE_RATE_LIMITED — 503

Field  Value
HTTP   503
type   service_unavailable_error
code   service_unavailable

When: The inference layer is temporarily busy or rate-limited at the infrastructure level.

Resolution: Retry with exponential backoff. Consider reducing concurrency. This is transient.


Inference (Sanitized)

All inference-level errors are sanitized — no upstream or infrastructure details are ever exposed.

INFERENCE_TIMEOUT — 504

Field  Value
HTTP   504
type   timeout_error
code   timeout_error

When: The inference request exceeded the maximum allowed time.

Resolution: Retry the request. If it persists, reduce max_tokens or shorten the prompt. Avoid extremely long generations.


SERVICE_UNAVAILABLE — 503

Field  Value
HTTP   503
type   service_unavailable_error
code   service_unavailable

When: The inference service is temporarily unavailable.

Resolution: Retry with exponential backoff. Check status.skytells.ai for active incidents.


INFERENCE_ERROR — 500

Field  Value
HTTP   500
type   server_error
code   service_error

When: Inference failed for an unspecified reason (sanitized upstream error).

Resolution: Retry the request. If the issue persists, report error.request_id to Support.


INTERNAL_ERROR — 500

Field  Value
HTTP   500
type   server_error
code   service_error

When: An unexpected error occurred in the Skytells gateway.

Resolution: Retry the request. If it persists for more than a few minutes, check status.skytells.ai and report error.request_id to Support.


Safety & Policy

CONTENT_POLICY_VIOLATION — 422

Field  Value
HTTP   422
type   invalid_request_error
code   content_policy_violation

When: The request was blocked by Skytells' safety or content policy.

Resolution: Modify or rephrase the prompt; do not retry with the same content. See 422 — Unprocessable Entity for related API-level error_id values and Responsible AI.


Retry guidance

Error ID                  Safe to retry?
INFERENCE_RATE_LIMITED    ✅ Yes — with exponential backoff
SERVICE_UNAVAILABLE       ✅ Yes — with exponential backoff
INFERENCE_TIMEOUT         ✅ Yes — consider reducing max_tokens
INFERENCE_ERROR           ✅ Yes — if persistent, report request_id
INTERNAL_ERROR            ✅ Yes — if persistent, report request_id
CREDIT_CHECK_FAILED       ✅ Yes — transient verification failure
INVALID_REQUEST           ❌ Fix the request first
INVALID_PARAMETER         ❌ Fix the parameter first
MODEL_NOT_FOUND           ❌ Use a valid model namespace
ENDPOINT_NOT_FOUND        ❌ Use a supported endpoint path
UNAUTHORIZED              ❌ Fix the API key first
FORBIDDEN                 ❌ Check permissions first
INSUFFICIENT_CREDITS      ❌ Add credits before retrying
CONTENT_POLICY_VIOLATION  ❌ Do not retry with the same content
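The table above can be encoded directly in client code. A minimal sketch — the set mirrors the table, and the helper name is illustrative:

```python
# Error IDs the retry-guidance table marks as safe to retry.
RETRYABLE = {
    "INFERENCE_RATE_LIMITED",
    "SERVICE_UNAVAILABLE",
    "INFERENCE_TIMEOUT",
    "INFERENCE_ERROR",
    "INTERNAL_ERROR",
    "CREDIT_CHECK_FAILED",
    # RATE_LIMITED (429) is also retryable per the rate-limiting section,
    # but honor Retry-After rather than retrying immediately.
    "RATE_LIMITED",
}

def is_retryable(error_id: str) -> bool:
    """True if the request may be retried without changing it first."""
    return error_id in RETRYABLE

print(is_retryable("INFERENCE_TIMEOUT"))   # True
print(is_retryable("INVALID_PARAMETER"))   # False
```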

Handling Errors in Code

Inference error handling

OpenAI SDK
from openai import OpenAI, APIStatusError

client = OpenAI(
    api_key="YOUR_SKYTELLS_API_KEY",
    base_url="https://api.skytells.ai/v1",
)

try:
    response = client.chat.completions.create(
        model="deepbrain-router",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

except APIStatusError as e:
    body = e.response.json().get("error", {})
    error_id = body.get("error_id")
    request_id = body.get("request_id")

    if error_id == "INSUFFICIENT_CREDITS":
        print("Top up your balance at console.skytells.ai/billing")
    elif error_id == "RATE_LIMITED":
        # implement exponential backoff (a fixed sleep is shown for brevity)
        import time
        time.sleep(2)
    elif error_id in ("INFERENCE_ERROR", "INTERNAL_ERROR"):
        print(f"Transient error — retry. request_id={request_id}")
    elif error_id == "CONTENT_POLICY_VIOLATION":
        print("Content blocked — modify your prompt")
    else:
        print(f"Unhandled error: {error_id} ({body.get('message')})")
