Create Chat Completion
POST /v1/chat/completions — full parameter reference with code examples, streaming format, tool calling, and OpenAPI spec.
Given a list of messages forming a conversation, the model returns the next reply. Supports synchronous JSON responses and token-by-token SSE streaming.
RESPONSE Returns a ChatCompletion object, or a stream of ChatCompletionChunk objects when stream: true.
The Chat API is part of the Inference API; its request and response schemas are OpenAI-compatible, with Skytells-specific safety additions.
See the Inference API overview for architecture and model catalog.
Request Body
Requests should be sent as POST to /v1/chat/completions with a JSON body containing the following parameters:
model (string, required)
The model ID to use. Recommended: "deepbrain-router" (auto-routes to the best model). See Models for available identifiers. Other options: "gpt-5", "gpt-5.4", "gpt-5-nano".
messages (ChatCompletionMessageParam[], required)
The conversation history as an array of message objects. Each object must have:
- role: "system", "user", or "assistant"
- content: a string, or an array of content parts for multimodal inputs (text + image)
The model processes these in order and generates the next turn.
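The content field's multimodal form is easy to get wrong. The sketch below shows both shapes in one messages array; note the content-part objects ({"type": "text", ...} and {"type": "image_url", ...}) are assumed from the schema's OpenAI compatibility and are not spelled out in this reference, so verify them against the OpenAPI spec.

```python
# Sketch: a messages array mixing a plain-text turn and a multimodal turn.
# Content-part shapes are assumed OpenAI-style; confirm against the spec.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        ],
    },
]

# "content" may be a plain string or a list of content parts.
assert isinstance(messages[0]["content"], str)
assert isinstance(messages[1]["content"], list)
```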
stream (boolean)
When true, the response is a stream of server-sent events (SSE). Each event contains a ChatCompletionChunk. Also accepts ?stream=true as a query parameter. Default: false.
max_tokens (integer)
Maximum number of tokens to generate. The model may stop earlier if a natural stop sequence is reached. Default: 8192.
temperature (number)
Sampling temperature between 0 and 2. Lower values produce more focused and deterministic output; higher values produce more random output. Default: 0.7.
top_p (number)
Nucleus sampling: only tokens within the cumulative top_p probability mass are considered. Range: 0 to 1. Default: 0.95. Changing both temperature and top_p is not recommended.
n (integer)
Number of completion choices to generate for each prompt. Each is a separate ChatMessage in choices[]. Default: 1.
stop (string | string[])
Up to 4 stop sequences. Generation halts when any of these strings is produced. The stop sequence itself is not included in the output.
presence_penalty (number)
Penalises tokens that have appeared at all in the output so far, encouraging the model to explore new topics. Range: -2.0 to 2.0. Default: 0.0.
frequency_penalty (number)
Penalises tokens based on their frequency in the output so far, reducing verbatim repetition. Range: -2.0 to 2.0. Default: 0.0.
logprobs (boolean)
Whether to return log probabilities of the output tokens. Default: false.
top_logprobs (integer)
When logprobs is true, how many top token log probabilities to return per output token. Range: 0 to 20.
tools (ChatCompletionTool[])
A list of tools (functions) the model may call. Each tool has type: "function" and a function object with name, description, and parameters (JSON Schema). See Tool Calling below.
tool_choice (string | object)
Controls whether and how the model invokes tools. Options:
- "none": never call tools
- "auto": model decides (default when tools are present)
- "required": model must call at least one tool
- { type: "function", function: { name: "..." } }: force a specific function
response_format (object)
Output format hint. { type: "text" } (default) or { type: "json_object" } to guarantee valid JSON output. When using json_object, instruct the model to produce JSON in your system message.
user (string)
An end-user identifier for abuse monitoring and usage attribution. This string is never exposed to the model.
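A request body combining these parameters can be sketched as a plain dictionary; the parameter names come from the reference above, and per the response_format note the system message itself asks for JSON.

```python
# Sketch: a request payload asking for strict JSON output.
# Field names follow the parameter reference above.
payload = {
    "model": "deepbrain-router",
    "messages": [
        {
            "role": "system",
            "content": "Reply only with a JSON object with keys 'city' and 'summary'.",
        },
        {"role": "user", "content": "Describe Paris briefly."},
    ],
    "response_format": {"type": "json_object"},
    "max_tokens": 256,
}
```

Send this as the JSON body of a POST to /v1/chat/completions with your x-api-key header, as in the curl example below.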
Create a chat completion
curl https://api.skytells.ai/v1/chat/completions \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepbrain-router",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Write a short poem about the sea."}
],
"max_tokens": 256,
"temperature": 0.7
}'
Response
{
"id": "chatcmpl-DLam6kYB250Quuq81KvS9iPRLnVGz",
"object": "chat.completion",
"created": 1774043853,
"model": "gpt-5-nano-2025-08-07",
"system_fingerprint": "fp_1133e9eff6",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The sea rolls in with ancient grace...",
"refusal": null,
"annotations": []
},
"finish_reason": "stop",
"logprobs": null,
"content_filter_results": {
"hate": { "filtered": false, "severity": "safe" },
"self_harm": { "filtered": false, "severity": "safe" },
"sexual": { "filtered": false, "severity": "safe" },
"violence": { "filtered": false, "severity": "safe" },
"protected_material_code": { "detected": false, "filtered": false },
"protected_material_text": { "detected": false, "filtered": false }
}
}
],
"prompt_filter_results": [
{
"prompt_index": 0,
"content_filter_results": {
"hate": { "filtered": false, "severity": "safe" },
"jailbreak": { "detected": false, "filtered": false },
"self_harm": { "filtered": false, "severity": "safe" },
"sexual": { "filtered": false, "severity": "safe" },
"violence": { "filtered": false, "severity": "safe" }
}
}
],
"usage": {
"prompt_tokens": 24,
"completion_tokens": 38,
"total_tokens": 62,
"prompt_tokens_details": {
"audio_tokens": 0,
"cached_tokens": 0
},
"completion_tokens_details": {
"accepted_prediction_tokens": 0,
"audio_tokens": 0,
"reasoning_tokens": 0,
"rejected_prediction_tokens": 0
}
}
}
Streaming
Set stream: true. The connection stays open and emits newline-delimited SSE events, each a ChatCompletionChunk. The stream terminates with data: [DONE].
Stream sequence:
- First chunk: empty delta, carries prompt_filter_results at the root level (prompt safety is evaluated before generation begins)
- Mid-stream chunks: delta.content contains incremental text; content_filter_results per choice
- Final chunk: finish_reason: "stop", empty delta: {}
- data: [DONE]: sentinel line, not JSON
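The sequence above can be consumed with a small line parser. This sketch is offline: it walks sample SSE lines rather than a live connection, and the chunk shapes mirror the examples in this section.

```python
import json

def parse_sse_stream(lines):
    """Accumulate delta.content from ChatCompletionChunk SSE lines.

    Stops at the literal "data: [DONE]" sentinel, which is not JSON.
    """
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            text.append(delta["content"])
    return "".join(text)

# Sample lines shaped like the chunks shown in this section.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "The "}, "finish_reason": null}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "ocean"}, "finish_reason": null}]}',
    'data: {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}',
    "data: [DONE]",
]
print(parse_sse_stream(sample))  # -> The ocean
```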
data: {
"id": "chatcmpl-DLc8T...",
"object": "chat.completion.chunk",
"created": 1774043853,
"model": "gpt-5-nano-2025-08-07",
"choices": [{
"index": 0,
"delta": { "content": "The " },
"finish_reason": null,
"logprobs": null,
"content_filter_results": {
"hate": { "filtered": false, "severity": "safe" },
"self_harm": { "filtered": false, "severity": "safe" },
"sexual": { "filtered": false, "severity": "safe" },
"violence": { "filtered": false, "severity": "safe" }
}
}]
}
Streaming chat
curl https://api.skytells.ai/v1/chat/completions \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
--no-buffer \
-d '{
"model": "deepbrain-router",
"messages": [{"role": "user", "content": "Tell me about the ocean."}],
"stream": true
}'
SSE chunk
data: {
"id": "chatcmpl-DLc8T...",
"object": "chat.completion.chunk",
"created": 1774043853,
"model": "gpt-5-nano-2025-08-07",
"choices": [{
"index": 0,
"delta": { "content": "The ocean" },
"finish_reason": null,
"content_filter_results": {
"hate": { "filtered": false, "severity": "safe" },
"self_harm": { "filtered": false, "severity": "safe" },
"sexual": { "filtered": false, "severity": "safe" },
"violence": { "filtered": false, "severity": "safe" }
}
}]
}
data: [DONE]
Tool Calling
You must always submit tool results back. Leaving a tool_calls message without a subsequent tool response will cause the next completion call to fail.
Pass a tools array of function definitions. When the model decides to call one, finish_reason is "tool_calls" and choices[0].message.tool_calls contains the invocation(s). Execute the function locally, then send the result back as a "tool" role message to complete the loop.
Request fields
- tools: array of ChatCompletionTool definitions, each with a function object containing name, description, and parameters (JSON Schema)
- tool_choice: "auto" (default), "none", "required", or { type: "function", function: { name } } to force a specific function
Response fields (Step 1)
- choices[0].finish_reason: will be "tool_calls" when the model wants to invoke a function
- choices[0].message.tool_calls[]: array of ToolCall objects, each with id, function.name, and function.arguments (a JSON string)
- choices[0].message.content: null when finish_reason is "tool_calls"
Response fields (Step 2)
- Returns a normal ChatCompletion with finish_reason: "stop" and the final answer in choices[0].message.content
Possible errors
- INVALID_PARAMETER: malformed tools array or invalid JSON Schema in parameters
- CONTENT_POLICY_VIOLATION: tool call or result blocked by safety policy
- INFERENCE_ERROR: provider rejected the tool-calling request
- INFERENCE_TIMEOUT: agentic loops with many tool turns can exceed the time limit; reduce turns or max_tokens
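The two-step loop can be handled with a small helper. This sketch is offline: the Step 1 response is a hand-built dictionary mirroring the shapes described above, and get_weather is the hypothetical tool from the example below, stubbed locally.

```python
import json

def get_weather(city):
    # Local stand-in for the real tool; returns a canned result.
    return {"city": city, "temp": "18°C"}

TOOLS = {"get_weather": get_weather}

def handle_tool_calls(messages, response):
    """Append the assistant's tool_calls message plus one "tool" message
    per call, so the follow-up request completes the loop."""
    choice = response["choices"][0]
    if choice["finish_reason"] != "tool_calls":
        return messages  # nothing to do; final answer is in message.content
    msg = choice["message"]
    messages = messages + [msg]
    for call in msg["tool_calls"]:
        fn = TOOLS[call["function"]["name"]]
        # arguments arrive as a JSON string, not an object
        args = json.loads(call["function"]["arguments"])
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return messages

# Hand-built Step 1 response, shaped like the fields described above.
step1 = {
    "choices": [{
        "finish_reason": "tool_calls",
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_abc123",
                "type": "function",
                "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
            }],
        },
    }]
}
messages = handle_tool_calls([{"role": "user", "content": "Weather in Paris?"}], step1)
```

The resulting messages array (user turn, assistant tool_calls turn, tool result turn) is what Step 2 sends back.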
Step 1 — request with tools
curl https://api.skytells.ai/v1/chat/completions \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepbrain-router",
"messages": [{"role":"user","content":"Weather in Paris?"}],
"tools": [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get weather for a city",
"parameters": {
"type": "object",
"properties": {"city": {"type": "string"}},
"required": ["city"]
}
}
}],
"tool_choice": "auto"
}'
The model returns a ChatCompletion object with finish_reason: "tool_calls" and choices[0].message.tool_calls populated.
Step 2 — submit tool result
curl https://api.skytells.ai/v1/chat/completions \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepbrain-router",
"messages": [
{"role": "user", "content": "Weather in Paris?"},
{"role": "assistant", "tool_calls": [{"id": "call_abc123", "type": "function", "function": {"name": "get_weather", "arguments": "{\"city\":\"Paris\"}"}}]},
{"role": "tool", "tool_call_id": "call_abc123", "content": "{\"temp\":\"18°C\"}"}
]
}'
Chat Completions In Action
The interactive spec below is generated directly from the Skytells OpenAPI schema. It covers every accepted parameter with its type, constraints, and default value. Use the Try it panel to send a live request straight from the browser — set your x-api-key in the auth field first.
Authorization
apiKey Your Skytells API key. Obtain one from Dashboard → API Keys.
In: header
Query Parameters
stream (boolean)
Force streaming via query param. Equivalent to stream: true in the body.
Request Body
application/json
TypeScript Definitions
Use the request body type in TypeScript.
Response Body
application/json
curl -X POST "https://api.skytells.ai/v1/chat/completions" \
  -H "x-api-key: $SKYTELLS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepbrain-router",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Hello!" }
    ]
  }'
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1748000000,
"model": "deepbrain-router",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 24,
"completion_tokens": 38,
"total_tokens": 62
}
}
{
"error": {
"message": "The model 'unknown-model' was not found.",
"type": "server_error",
"code": "model_not_found",
"error_id": "MODEL_NOT_FOUND",
"status": 404,
"param": "model",
"request_id": "req_abc123xyz",
"details": {
"category": "request"
}
}
}
Expected Errors
You may encounter errors when calling this Chat API. Branch on error.error_id — never on error.message. See the Inference Error Reference for all error codes and retry guidance.
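Branching on error_id can be sketched as below. Which IDs are safe to retry is an assumption here; the Inference Error Reference is authoritative.

```python
import json

# Assumption: these error_ids are transient and worth retrying.
# Check the Inference Error Reference for the authoritative list.
RETRYABLE = {"INFERENCE_TIMEOUT", "INFERENCE_ERROR"}

def classify_error(body):
    """Branch on error.error_id, never on error.message."""
    err = json.loads(body).get("error", {})
    return "retry" if err.get("error_id") in RETRYABLE else "fail"

body = '{"error": {"message": "The model was not found.", "error_id": "MODEL_NOT_FOUND", "status": 404}}'
print(classify_error(body))  # -> fail
```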