Create Chat Completion
POST /v1/chat/completions — full parameter reference with code examples, streaming format, tool calling, and OpenAPI spec.
Given a list of messages forming a conversation, the model returns the next reply. Supports synchronous JSON responses and token-by-token SSE streaming.
RESPONSE Returns a ChatCompletion object, or a stream of ChatCompletionChunk objects when stream: true.
The Chat API is part of the Inference API; its request and response schemas are OpenAI-compatible, with Skytells-specific safety additions.
See the Inference API overview for architecture and model catalog.
Request Body
Requests should be sent as POST to /v1/chat/completions with a JSON body containing the following parameters:
model (string, required)
The model ID to use. Recommended: "deepbrain-router" (auto-routes to the best model). See Models for available identifiers. Other options: "gpt-5", "gpt-5.4", "gpt-5-nano".
messages (ChatCompletionMessageParam[], required)
The conversation history as an array of message objects. Each object must have:
- role: "system", "user", or "assistant"
- content: a string, or an array of content parts for multimodal inputs (text + image)
The model processes these in order and generates the next turn.
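The content field's multimodal form is easy to get wrong. The sketch below shows both shapes in one messages array; note the content-part objects ({"type": "text", ...} and {"type": "image_url", ...}) are assumed from the schema's OpenAI compatibility and are not spelled out in this reference, so verify them against the OpenAPI spec.

```python
# Sketch: a messages array mixing a plain-text turn and a multimodal turn.
# Content-part shapes are assumed OpenAI-style; confirm against the spec.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
        ],
    },
]

# "content" may be a plain string or a list of content parts.
assert isinstance(messages[0]["content"], str)
assert isinstance(messages[1]["content"], list)
```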
stream (boolean)
When true, the response is a stream of server-sent events (SSE). Each event contains a ChatCompletionChunk. Also accepts ?stream=true as a query parameter. Default: false.
max_tokens (integer)
Maximum number of tokens to generate. The model may stop earlier if a natural stop sequence is reached. Default: 8192.
temperature (number)
Sampling temperature between 0 and 2. Lower values produce more focused and deterministic output; higher values produce more random output. Default: 0.7.
top_p (number)
Nucleus sampling: only tokens within the cumulative top_p probability mass are considered. Range: 0 to 1. Default: 0.95. Changing both temperature and top_p is not recommended.
n (integer)
Number of completion choices to generate for each prompt. Each is a separate ChatMessage in choices[]. Default: 1.
stop (string | string[])
Up to 4 stop sequences. Generation halts when any of these strings is produced. The stop sequence itself is not included in the output.
presence_penalty (number)
Penalises tokens that have appeared at all in the output so far, encouraging the model to explore new topics. Range: -2.0 to 2.0. Default: 0.0.
frequency_penalty (number)
Penalises tokens based on their frequency in the output so far, reducing verbatim repetition. Range: -2.0 to 2.0. Default: 0.0.
logprobs (boolean)
Whether to return log probabilities of the output tokens. Default: false.
top_logprobs (integer)
When logprobs is true, how many top token log probabilities to return per output token. Range: 0 to 20.
tools (ChatCompletionTool[])
A list of tools (functions) the model may call. Each tool has type: "function" and a function object with name, description, and parameters (JSON Schema). See Tool Calling below.
tool_choice (string | object)
Controls whether and how the model invokes tools. Options:
- "none": never call tools
- "auto": model decides (default when tools are present)
- "required": model must call at least one tool
- { type: "function", function: { name: "..." } }: force a specific function
response_format (object)
Output format hint. { type: "text" } (default) or { type: "json_object" } to guarantee valid JSON output. When using json_object, instruct the model to produce JSON in your system message.
user (string)
An end-user identifier for abuse monitoring and usage attribution. This string is never exposed to the model.
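A request body combining these parameters can be sketched as a plain dictionary; the parameter names come from the reference above, and per the response_format note the system message itself asks for JSON.

```python
# Sketch: a request payload asking for strict JSON output.
# Field names follow the parameter reference above.
payload = {
    "model": "deepbrain-router",
    "messages": [
        {
            "role": "system",
            "content": "Reply only with a JSON object with keys 'city' and 'summary'.",
        },
        {"role": "user", "content": "Describe Paris briefly."},
    ],
    "response_format": {"type": "json_object"},
    "max_tokens": 256,
}
```

Send this as the JSON body of a POST to /v1/chat/completions with your x-api-key header, as in the curl example below.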
Create a chat completion
curl https://api.skytells.ai/v1/chat/completions \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepbrain-router",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Write a short poem about the sea."}
],
"max_tokens": 256,
"temperature": 0.7
}'
Response
{
"id": "chatcmpl-DLam6kYB250Quuq81KvS9iPRLnVGz",
"object": "chat.completion",
"created": 1774043853,
"model": "gpt-5-nano-2025-08-07",
"system_fingerprint": "fp_1133e9eff6",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The sea rolls in with ancient grace...",
"refusal": null,
"annotations": []
},
"finish_reason": "stop",
"logprobs": null,
"content_filter_results": {
"hate": { "filtered": false, "severity": "safe" },
"self_harm": { "filtered": false, "severity": "safe" },
"sexual": { "filtered": false, "severity": "safe" },
"violence": { "filtered": false, "severity": "safe" },
"protected_material_code": { "detected": false, "filtered": false },
"protected_material_text": { "detected": false, "filtered": false }
}
}
],
"prompt_filter_results": [
{
"prompt_index": 0,
"content_filter_results": {
"hate": { "filtered": false, "severity": "safe" },
"jailbreak": { "detected": false, "filtered": false },
"self_harm": { "filtered": false, "severity": "safe" },
"sexual": { "filtered": false, "severity": "safe" },
"violence": { "filtered": false, "severity": "safe" }
}
}
],
"usage": {
"prompt_tokens": 24,
"completion_tokens": 38,
"total_tokens": 62,
"prompt_tokens_details": {
"audio_tokens": 0,
"cached_tokens": 0
},
"completion_tokens_details": {
"accepted_prediction_tokens": 0,
"audio_tokens": 0,
"reasoning_tokens": 0,
"rejected_prediction_tokens": 0
}
}
}
Streaming
Set stream: true. The connection stays open and emits newline-delimited SSE events, each a ChatCompletionChunk. The stream terminates with data: [DONE].
Stream sequence:
- First chunk: empty delta, carries prompt_filter_results at the root level (prompt safety is evaluated before generation begins)
- Mid-stream chunks: delta.content contains incremental text; content_filter_results per choice
- Final chunk: finish_reason: "stop", empty delta: {}
- data: [DONE]: sentinel line, not JSON
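The sequence above can be consumed with a small line parser. This sketch is offline: it walks sample SSE lines rather than a live connection, and the chunk shapes mirror the examples in this section.

```python
import json

def parse_sse_stream(lines):
    """Accumulate delta.content from ChatCompletionChunk SSE lines.

    Stops at the literal "data: [DONE]" sentinel, which is not JSON.
    """
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            text.append(delta["content"])
    return "".join(text)

# Sample lines shaped like the chunks shown in this section.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "The "}, "finish_reason": null}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "ocean"}, "finish_reason": null}]}',
    'data: {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}',
    "data: [DONE]",
]
print(parse_sse_stream(sample))  # -> The ocean
```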
data: {
"id": "chatcmpl-DLc8T...",
"object": "chat.completion.chunk",
"created": 1774043853,
"model": "gpt-5-nano-2025-08-07",
"choices": [{
"index": 0,
"delta": { "content": "The " },
"finish_reason": null,
"logprobs": null,
"content_filter_results": {
"hate": { "filtered": false, "severity": "safe" },
"self_harm": { "filtered": false, "severity": "safe" },
"sexual": { "filtered": false, "severity": "safe" },
"violence": { "filtered": false, "severity": "safe" }
}
}]
}
Streaming chat
curl https://api.skytells.ai/v1/chat/completions \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
--no-buffer \
-d '{
"model": "deepbrain-router",
"messages": [{"role": "user", "content": "Tell me about the ocean."}],
"stream": true
}'
SSE chunk
data: {
"id": "chatcmpl-DLc8T...",
"object": "chat.completion.chunk",
"created": 1774043853,
"model": "gpt-5-nano-2025-08-07",
"choices": [{
"index": 0,
"delta": { "content": "The ocean" },
"finish_reason": null,
"content_filter_results": {
"hate": { "filtered": false, "severity": "safe" },
"self_harm": { "filtered": false, "severity": "safe" },
"sexual": { "filtered": false, "severity": "safe" },
"violence": { "filtered": false, "severity": "safe" }
}
}]
}
data: [DONE]
Tool Calling
You must always submit tool results back. Leaving a tool_calls message without a subsequent tool response will cause the next completion call to fail.
Pass a tools array of function definitions. When the model decides to call one, finish_reason is "tool_calls" and choices[0].message.tool_calls contains the invocation(s). Execute the function locally, then send the result back as a "tool" role message to complete the loop.
Request fields
- tools: array of ChatCompletionTool definitions, each with a function object containing name, description, and parameters (JSON Schema)
- tool_choice: "auto" (default), "none", "required", or { type: "function", function: { name } } to force a specific function
Response fields (Step 1)
- choices[0].finish_reason: will be "tool_calls" when the model wants to invoke a function
- choices[0].message.tool_calls[]: array of ToolCall objects, each with id, function.name, and function.arguments (a JSON string)
- choices[0].message.content: null when finish_reason is "tool_calls"
Response fields (Step 2)
- Returns a normal ChatCompletion with finish_reason: "stop" and the final answer in choices[0].message.content
Possible errors
- INVALID_PARAMETER: malformed tools array or invalid JSON Schema in parameters
- CONTENT_POLICY_VIOLATION: tool call or result blocked by safety policy
- INFERENCE_ERROR: provider rejected the tool-calling request
- INFERENCE_TIMEOUT: agentic loops with many tool turns can exceed the time limit; reduce turns or max_tokens
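The two-step loop can be handled with a small helper. This sketch is offline: the Step 1 response is a hand-built dictionary mirroring the shapes described above, and get_weather is the hypothetical tool from the example below, stubbed locally.

```python
import json

def get_weather(city):
    # Local stand-in for the real tool; returns a canned result.
    return {"city": city, "temp": "18°C"}

TOOLS = {"get_weather": get_weather}

def handle_tool_calls(messages, response):
    """Append the assistant's tool_calls message plus one "tool" message
    per call, so the follow-up request completes the loop."""
    choice = response["choices"][0]
    if choice["finish_reason"] != "tool_calls":
        return messages  # nothing to do; final answer is in message.content
    msg = choice["message"]
    messages = messages + [msg]
    for call in msg["tool_calls"]:
        fn = TOOLS[call["function"]["name"]]
        # arguments arrive as a JSON string, not an object
        args = json.loads(call["function"]["arguments"])
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return messages

# Hand-built Step 1 response, shaped like the fields described above.
step1 = {
    "choices": [{
        "finish_reason": "tool_calls",
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_abc123",
                "type": "function",
                "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
            }],
        },
    }]
}
messages = handle_tool_calls([{"role": "user", "content": "Weather in Paris?"}], step1)
```

The resulting messages array (user turn, assistant tool_calls turn, tool result turn) is what Step 2 sends back.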
Step 1 — request with tools
curl https://api.skytells.ai/v1/chat/completions \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepbrain-router",
"messages": [{"role":"user","content":"Weather in Paris?"}],
"tools": [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get weather for a city",
"parameters": {
"type": "object",
"properties": {"city": {"type": "string"}},
"required": ["city"]
}
}
}],
"tool_choice": "auto"
}'
The model returns a ChatCompletion object with finish_reason: "tool_calls" and choices[0].message.tool_calls populated.
Step 2 — submit tool result
curl https://api.skytells.ai/v1/chat/completions \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepbrain-router",
"messages": [
{"role": "user", "content": "Weather in Paris?"},
{"role": "assistant", "tool_calls": [{"id": "call_abc123", "type": "function", "function": {"name": "get_weather", "arguments": "{\"city\":\"Paris\"}"}}]},
{"role": "tool", "tool_call_id": "call_abc123", "content": "{\"temp\":\"18°C\"}"}
]
}'
Chat Completions In Action
The interactive spec below is generated directly from the Skytells OpenAPI schema. It covers every accepted parameter with its type, constraints, and default value. Use the Try it panel to send a live request straight from the browser — set your x-api-key in the auth field first.
Authorization
apiKey Your Skytells API key. Obtain one from Dashboard → API Keys.
In: header
Query Parameters
stream (boolean)
Force streaming via query param. Equivalent to stream: true in the body.
Request Body
application/json
TypeScript Definitions
Use the request body type in TypeScript.
Response Body
application/json
curl -X POST "https://api.skytells.ai/v1/chat/completions" \
  -H "x-api-key: $SKYTELLS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepbrain-router",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Hello!" }
    ]
  }'
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1748000000,
"model": "deepbrain-router",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 24,
"completion_tokens": 38,
"total_tokens": 62
}
}
{
"error": {
"message": "The model 'unknown-model' was not found.",
"type": "server_error",
"code": "model_not_found",
"error_id": "MODEL_NOT_FOUND",
"status": 404,
"param": "model",
"request_id": "req_abc123xyz",
"details": {
"category": "request"
}
}
}
Expected Errors
You may encounter errors when calling this Chat API. Branch on error.error_id — never on error.message. See the Inference Error Reference for all error codes and retry guidance.
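Branching on error_id can be sketched as below. Which IDs are safe to retry is an assumption here; the Inference Error Reference is authoritative.

```python
import json

# Assumption: these error_ids are transient and worth retrying.
# Check the Inference Error Reference for the authoritative list.
RETRYABLE = {"INFERENCE_TIMEOUT", "INFERENCE_ERROR"}

def classify_error(body):
    """Branch on error.error_id, never on error.message."""
    err = json.loads(body).get("error", {})
    return "retry" if err.get("error_id") in RETRYABLE else "fail"

body = '{"error": {"message": "The model was not found.", "error_id": "MODEL_NOT_FOUND", "status": 404}}'
print(classify_error(body))  # -> fail
```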