Inference Errors
Complete error reference for the Skytells Inference API — all error_id values, HTTP status codes, and resolution guides.
Inference API errors
This catalog extends API errors. That page defines shared HTTP semantics, authentication, credits, 429 rate limiting, and client patterns for the API as a whole.
Inference sub-APIs such as Chat Completions, Responses, and Embeddings return failures as an Inference error response (a nested error field — see Fields on the nested error object on that page). Branch on error.error_id (and include error.request_id when contacting support), not on free-form messages.
Branch on error.error_id in your code — never on error.message. The error_id is a stable uppercase identifier. Messages are human-readable and may change between releases.
Using request_id
If you contact support or need to investigate a failed request, include error.request_id. It uniquely identifies the request in Skytells logs and allows the support team to trace the full execution path.
Error schema
Inference failures return an OpenAI-compatible JSON body with a top-level error object. Field reference: Inference error response. Branch on error.error_id.
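As a sketch, a failed request's body has the following shape. The field names match the reference above; the values here (including the message text and request ID) are illustrative only, and message wording may change between releases:

```json
{
  "error": {
    "error_id": "INVALID_PARAMETER",
    "type": "invalid_request_error",
    "code": "invalid_parameter",
    "message": "messages must be an array",
    "param": "messages",
    "request_id": "req_abc123"
  }
}
```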
Inference Error Catalog
The Inference API extends the API errors catalog with the following errors:
Authentication & access
UNAUTHORIZED — 401
| Field | Value |
|---|---|
| HTTP | 401 |
| type | authentication_error |
| code | invalid_api_key (typical) |
When: x-api-key is missing or not accepted.
Resolution: Verify the key in the Console. Shared semantics: 401 — Unauthorized.
FORBIDDEN — 403
| Field | Value |
|---|---|
| HTTP | 403 |
| type | Typically permission_error or invalid_request_error |
| code | Varies |
When: The key is syntactically valid but the request is not allowed (plan, route, or account state).
Resolution: Check model access and account status; see 403 — Forbidden. Do not confuse with 401 (UNAUTHORIZED).
Request validation
Invalid JSON or parameters align with 400 — Bad Request / Infrastructure. Inference encodes them with the nested fields below.
INVALID_REQUEST — 400
| Field | Value |
|---|---|
| HTTP | 400 |
| type | invalid_request_error |
| code | invalid_request |
When: The request body is malformed JSON or has an invalid shape.
Resolution: Validate your JSON payload and ensure it matches the endpoint schema. Set Content-Type: application/json.
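One way to rule out INVALID_REQUEST before the request leaves your machine is to build the payload as a native object and serialize it with the json module rather than hand-writing a JSON string. A minimal sketch (the API key is a placeholder):

```python
import json

# Build the payload as a Python dict, not a hand-written string.
payload = {
    "model": "deepbrain-router",
    "messages": [{"role": "user", "content": "Hello!"}],
}

# json.dumps guarantees well-formed JSON; hand-built strings are the
# usual source of INVALID_REQUEST responses.
body = json.dumps(payload)

headers = {
    "Content-Type": "application/json",
    "x-api-key": "YOUR_SKYTELLS_API_KEY",
}
```

Send `body` with those headers using whichever HTTP client you prefer; if the payload round-trips through `json.loads`, a 400 almost certainly points at the schema (check error.param), not at malformed JSON.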
INVALID_PARAMETER — 400
| Field | Value |
|---|---|
| HTTP | 400 |
| type | invalid_request_error |
| code | invalid_parameter |
When: A required parameter is missing or a parameter has an invalid value (e.g. missing model, messages not an array).
Resolution: Check error.param for the offending field. Review the request schema.
MODEL_NOT_FOUND — 404
| Field | Value |
|---|---|
| HTTP | 404 |
| type | server_error |
| code | model_not_found |
When: The model namespace provided is unknown or not available on the Inference API.
Resolution: Use a valid namespace from the Text Models catalog: deepbrain-router, gpt-5, gpt-5.4, or llama-3.1-8b. See also 404 — Not Found for the same MODEL_NOT_FOUND pattern on other routes.
ENDPOINT_NOT_FOUND — 404
| Field | Value |
|---|---|
| HTTP | 404 |
| type | server_error |
| code | endpoint_not_found |
When: The requested path does not match any supported endpoint.
Resolution: Use a supported endpoint: /v1/chat/completions, /v1/responses, or /v1/embeddings. See the Inference overview.
Billing & Credits
INSUFFICIENT_CREDITS — 402
| Field | Value |
|---|---|
| HTTP | 402 |
| type | server_error |
| code | insufficient_credits |
When: Your account balance is insufficient for the request (including a safety margin).
Resolution: Same as 402 — Payment Required (INSUFFICIENT_CREDITS / credits). Add credits in Billing or reduce max_tokens / prompt size.
CREDIT_CHECK_FAILED — 402
| Field | Value |
|---|---|
| HTTP | 402 |
| type | server_error |
| code | credit_check_failed |
When: The credit-verification service was temporarily unreachable and could not confirm your balance.
Resolution: Retry the request. If the problem persists, contact Support with error.request_id.
Rate Limiting
RATE_LIMITED — 429
| Field | Value |
|---|---|
| HTTP | 429 |
| type | rate_limit_error |
| code | rate_limit_exceeded |
When: Too many requests sent in a short time window.
Resolution: Follow 429 — Too Many Requests and Rate limits (Retry-After, x-skytells-ratelimit-*, backoff). Inference uses error.error_id: RATE_LIMITED with error.code: rate_limit_exceeded.
INFERENCE_RATE_LIMITED — 503
| Field | Value |
|---|---|
| HTTP | 503 |
| type | service_unavailable_error |
| code | service_unavailable |
When: The inference layer is temporarily busy or rate-limited at the infrastructure level.
Resolution: Retry with exponential backoff. Consider reducing concurrency. This is transient.
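A minimal backoff loop for the retryable cases above might look like the following. This is a sketch, not an official client: the delay schedule is illustrative, and `make_request` stands in for your real SDK call, raising a `ValueError` whose message is the error_id instead of a real SDK exception:

```python
import random
import time

# Error IDs this page marks safe to retry after a pause.
RETRYABLE = {"RATE_LIMITED", "INFERENCE_RATE_LIMITED", "SERVICE_UNAVAILABLE"}

def call_with_backoff(make_request, max_attempts=5, base_delay=1.0):
    """Retry make_request() on retryable error_ids with jittered exponential backoff.

    make_request must either return a result or raise a ValueError whose
    message is the error_id (a stand-in for real SDK exception handling).
    """
    for attempt in range(max_attempts):
        try:
            return make_request()
        except ValueError as exc:
            error_id = str(exc)
            if error_id not in RETRYABLE or attempt == max_attempts - 1:
                raise
            # Exponential backoff with full jitter: 1s, 2s, 4s, ... capped at 30s.
            delay = min(base_delay * 2 ** attempt, 30.0)
            time.sleep(delay * random.random())
```

In production, prefer honoring a Retry-After header (when present) over the computed delay, and cap total retry time so a sustained outage fails fast.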
Inference (Sanitized)
All inference-level errors are sanitized — no upstream or infrastructure details are ever exposed.
INFERENCE_TIMEOUT — 504
| Field | Value |
|---|---|
| HTTP | 504 |
| type | timeout_error |
| code | timeout_error |
When: The inference request exceeded the maximum allowed time.
Resolution: Retry the request. If it persists, reduce max_tokens or shorten the prompt. Avoid extremely long generations.
SERVICE_UNAVAILABLE — 503
| Field | Value |
|---|---|
| HTTP | 503 |
| type | service_unavailable_error |
| code | service_unavailable |
When: The inference service is temporarily unavailable.
Resolution: Retry with exponential backoff. Check status.skytells.ai for active incidents.
INFERENCE_ERROR — 500
| Field | Value |
|---|---|
| HTTP | 500 |
| type | server_error |
| code | service_error |
When: Inference failed for an unspecified reason (sanitized upstream error).
Resolution: Retry the request. If the issue persists, report error.request_id to Support.
INTERNAL_ERROR — 500
| Field | Value |
|---|---|
| HTTP | 500 |
| type | server_error |
| code | service_error |
When: An unexpected error occurred in the Skytells gateway.
Resolution: Retry the request. If it persists for more than a few minutes, check status.skytells.ai and report error.request_id to Support.
Safety & Policy
CONTENT_POLICY_VIOLATION — 422
| Field | Value |
|---|---|
| HTTP | 422 |
| type | invalid_request_error |
| code | content_policy_violation |
When: The request was blocked by Skytells' safety or content policy.
Resolution: Modify or rephrase the prompt; do not retry with the same content. See 422 — Unprocessable Entity for related API-level error_id values and Responsible AI.
Retry guidance
| Error ID | Safe to retry? |
|---|---|
| RATE_LIMITED | ✅ Yes — honor Retry-After, then back off |
| INFERENCE_RATE_LIMITED | ✅ Yes — with exponential backoff |
| SERVICE_UNAVAILABLE | ✅ Yes — with exponential backoff |
| INFERENCE_TIMEOUT | ✅ Yes — consider reducing max_tokens |
| INFERENCE_ERROR | ✅ Yes — if persistent, report request_id |
| INTERNAL_ERROR | ✅ Yes — if persistent, report request_id |
| CREDIT_CHECK_FAILED | ✅ Yes — transient verification failure |
| INVALID_REQUEST | ❌ Fix the request first |
| INVALID_PARAMETER | ❌ Fix the parameter first |
| MODEL_NOT_FOUND | ❌ Use a valid model namespace |
| ENDPOINT_NOT_FOUND | ❌ Use a supported endpoint path |
| UNAUTHORIZED | ❌ Fix the API key first |
| FORBIDDEN | ❌ Check permissions first |
| INSUFFICIENT_CREDITS | ❌ Add credits before retrying |
| CONTENT_POLICY_VIOLATION | ❌ Do not retry with the same content |
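The guidance above reduces to a small lookup you can reuse in client code. A sketch, with the error-ID sets copied from this catalog (RATE_LIMITED is retryable per the rate-limiting section, after honoring Retry-After):

```python
# Safe to retry after backing off.
RETRYABLE = {
    "RATE_LIMITED",
    "INFERENCE_RATE_LIMITED",
    "SERVICE_UNAVAILABLE",
    "INFERENCE_TIMEOUT",
    "INFERENCE_ERROR",
    "INTERNAL_ERROR",
    "CREDIT_CHECK_FAILED",
}

# Fix the request, key, credits, or content first; retrying verbatim will fail again.
NON_RETRYABLE = {
    "INVALID_REQUEST",
    "INVALID_PARAMETER",
    "MODEL_NOT_FOUND",
    "ENDPOINT_NOT_FOUND",
    "UNAUTHORIZED",
    "FORBIDDEN",
    "INSUFFICIENT_CREDITS",
    "CONTENT_POLICY_VIOLATION",
}

def is_retryable(error_id: str) -> bool:
    """Return True only for error_ids this catalog marks safe to retry."""
    return error_id in RETRYABLE
```

Treat unknown error_ids as non-retryable by default; new identifiers may be added in future releases.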
Handling Errors in Code
Inference error handling
```python
import time

from openai import OpenAI, APIStatusError

client = OpenAI(
    api_key="YOUR_SKYTELLS_API_KEY",
    base_url="https://api.skytells.ai/v1",
)

try:
    response = client.chat.completions.create(
        model="deepbrain-router",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)
except APIStatusError as e:
    body = e.response.json().get("error", {})
    error_id = body.get("error_id")
    request_id = body.get("request_id")
    if error_id == "INSUFFICIENT_CREDITS":
        print("Top up your balance at console.skytells.ai/billing")
    elif error_id == "RATE_LIMITED":
        # Implement exponential backoff before retrying
        time.sleep(2)
    elif error_id in ("INFERENCE_ERROR", "INTERNAL_ERROR"):
        print(f"Transient error — retry. request_id={request_id}")
    elif error_id == "CONTENT_POLICY_VIOLATION":
        print("Content blocked — modify your prompt")
    else:
        print(f"Unhandled error: {error_id} — {body.get('message')}")
```