Responses

Create Response

POST /v1/responses — full parameter reference with code examples, streaming events, tool calling, and OpenAPI spec.

Creates a model response. Provide text or image inputs to generate text or JSON outputs. Have the model call your own custom code, or use built-in tools like web search or file search to bring your own data into the model's response. The Responses API stores context server-side: set store: true on a turn, then pass its id as previous_response_id on the next request to chain turns without resending history.

Returns: a Response object, or an async stream of ResponsesStreamEvent objects when stream: true.

This endpoint is part of the Inference API; its request and response schemas are OpenAI-compatible, with Skytells-specific safety additions.

See the Inference API overview for architecture and model catalog.


Request Body

model (string, required)

The model ID to use. See Models for available identifiers. Example: "skytells-3".

input (string | InputItem[], required)

The input for this turn. Pass a plain string for a single user message, or a structured InputItem[] array for multi-modal or multi-part inputs (text + images).

// String shorthand
"What is in this image?"

// Structured array
[
  { "role": "user", "content": [
      { "type": "input_text", "text": "What is in this image?" },
      { "type": "input_image", "image_url": "https://example.com/photo.jpg" }
  ]}
]
previous_response_id (string)

The id of a previous Response to continue from. When set, the server appends this request as the next turn — you do not need to resend earlier messages. Requires store: true on the prior response.

store (boolean)

Whether to persist this response in the server-side conversation store. Must be true for previous_response_id chaining in future turns. Defaults to false.

stream (boolean)

Whether to stream the response as Server-Sent Events. Defaults to false. When true, returns a stream of ResponsesStreamEvent objects instead of a single Response.

instructions (string)

A system-level instruction prepended to the conversation. Equivalent to a system role message. Ignored if previous_response_id is set (the previous response's instructions persist).

max_output_tokens (integer)

Maximum number of output tokens to generate. If the model reaches this limit, the response is returned with status "incomplete" rather than failing. Defaults to the model's maximum.

temperature (number)

Sampling temperature, between 0 and 2. Values closer to 0 produce more deterministic output; higher values increase randomness. Defaults to 1.

top_p (number)

Nucleus sampling — only tokens within the top top_p probability mass are sampled. Values between 0 and 1. Do not set both temperature and top_p simultaneously. Defaults to 0.98.

tools (Tool[])

A list of tools the model may call. Enabling tools allows the model to emit function_call output items.

[
  {
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a location.",
    "parameters": {
      "type": "object",
      "properties": {
        "location": { "type": "string" }
      },
      "required": ["location"]
    }
  }
]
tool_choice (string | object)

Controls which tool (if any) the model must call. Values: "none", "auto", "required", or { "type": "function", "name": "tool_name" } to force a specific tool. Defaults to "auto" when tools are present.

user (string)

An opaque identifier for the end-user. Used for abuse monitoring and rate-limit attribution. Does not affect model behavior.

truncation (string)

Truncation strategy for the context window. Values: "auto" (server truncates oldest turns when the context window fills) or "disabled" (request fails if context limit is exceeded). Defaults to "disabled".

include (string[])

Additional response fields to include. Supported values: "usage", "reasoning.encrypted_content" (reasoning models). Defaults to [].

metadata (object)

Up to 16 key-value string pairs attached to the response. Useful for logging and retrieval. Keys ≤ 64 characters, values ≤ 512 characters.
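Since requests with oversized metadata are rejected, it can be worth validating client-side before sending. A minimal sketch in plain JavaScript, assuming only the limits stated above (the helper name is illustrative, not part of the SDK):

```javascript
// Validate a metadata object against the documented limits:
// at most 16 key-value pairs, keys <= 64 chars, string values <= 512 chars.
function validateMetadata(metadata) {
  const entries = Object.entries(metadata);
  if (entries.length > 16) {
    throw new Error("metadata: at most 16 key-value pairs allowed");
  }
  for (const [key, value] of entries) {
    if (key.length > 64) {
      throw new Error(`metadata key too long: ${key}`);
    }
    if (typeof value !== "string" || value.length > 512) {
      throw new Error(`metadata value for "${key}" must be a string of <= 512 chars`);
    }
  }
  return metadata;
}
```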

frequency_penalty (number)

Penalize tokens proportional to how often they have already appeared in the output. Values between -2.0 and 2.0. Positive values reduce repetition. Defaults to 0.

presence_penalty (number)

Penalize any token that has appeared at all in the output so far. Values between -2.0 and 2.0. Positive values encourage topic diversity. Defaults to 0.

parallel_tool_calls (boolean)

Whether the model may invoke multiple tools in a single turn. Defaults to true.

service_tier (string)

Inference tier to use. "auto" lets Skytells select the optimal tier; "default" uses the standard inference layer. Defaults to "auto".

reasoning (object)

Reasoning configuration for thinking models. Set effort to "none" | "low" | "medium" | "high" to control how deeply the model thinks before responding. Set summary to "auto" to include a reasoning summary in the output. Defaults to { "effort": "none", "summary": null }.

text (object)

Text output format configuration. format.type accepts "text" (default) or "json_schema" for structured output. verbosity controls response length: "short" | "medium" (default) | "long".
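For illustration, a request body asking for structured JSON output might be assembled like this. The text.format shape follows the description above; the fields inside the format object beyond type (name, schema) are assumptions, so check the OpenAPI spec below for the exact envelope:

```javascript
// Example request body for structured output via text.format.
// The "json_schema" format type is documented above; the `name` and
// `schema` fields here are assumed, and the schema itself is illustrative.
const body = {
  model: "skytells-3",
  input: "Extract the city and temperature from: 'It is 18°C in Paris.'",
  text: {
    format: {
      type: "json_schema",
      name: "weather_report", // assumed field; verify against the OpenAPI spec
      schema: {
        type: "object",
        properties: {
          city: { type: "string" },
          temperature_c: { type: "number" },
        },
        required: ["city", "temperature_c"],
      },
    },
    verbosity: "short",
  },
};
```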

Create a response

REST
curl https://api.skytells.ai/v1/responses \
-H "Content-Type: application/json" \
-H "x-api-key: $SKYTELLS_API_KEY" \
-d '{
  "model": "skytells-3",
  "input": "Write a haiku about the ocean.",
  "store": true
}'

Response

200 OK
{
"id": "resp_063a6790fb3935fe0069bdc025784081939b4eaa76b9b0c2e0",
"object": "response",
"created_at": 1774043173,
"completed_at": 1774043176,
"status": "completed",
"background": false,
"model": "skytells-3",
"instructions": null,
"store": true,
"service_tier": "default",
"temperature": 1,
"top_p": 0.98,
"frequency_penalty": 0,
"presence_penalty": 0,
"top_logprobs": 0,
"max_output_tokens": null,
"max_tool_calls": null,
"parallel_tool_calls": true,
"tool_choice": "auto",
"tools": [],
"truncation": "disabled",
"text": { "format": { "type": "text" }, "verbosity": "medium" },
"reasoning": { "effort": "none", "summary": null },
"previous_response_id": null,
"prompt_cache_key": null,
"prompt_cache_retention": null,
"safety_identifier": null,
"incomplete_details": null,
"error": null,
"user": null,
"metadata": {},
"output": [
  {
    "id": "msg_063a6790fb3935fe0069bdc027ebac8193922457a01eeaec82",
    "type": "message",
    "role": "assistant",
    "status": "completed",
    "phase": "final_answer",
    "content": [
      {
        "type": "output_text",
        "text": "Waves crash endlessly\nSalt and foam meet distant shore\nThe sea never sleeps",
        "annotations": [],
        "logprobs": []
      }
    ]
  }
],
"usage": {
  "input_tokens": 14,
  "input_tokens_details": { "cached_tokens": 0 },
  "output_tokens": 22,
  "output_tokens_details": { "reasoning_tokens": 0 },
  "total_tokens": 36
},
"content_filters": [
  {
    "blocked": false,
    "source_type": "prompt",
    "content_filter_raw": [],
    "content_filter_results": {
      "hate": { "filtered": false, "severity": "safe" },
      "violence": { "filtered": false, "severity": "safe" },
      "sexual": { "filtered": false, "severity": "safe" },
      "self_harm": { "filtered": false, "severity": "safe" },
      "jailbreak": { "detected": false, "filtered": false }
    },
    "content_filter_offsets": { "start_offset": 0, "end_offset": 50, "check_offset": 0 }
  },
  {
    "blocked": false,
    "source_type": "completion",
    "content_filter_raw": [],
    "content_filter_results": {
      "hate": { "filtered": false, "severity": "safe" },
      "violence": { "filtered": false, "severity": "safe" },
      "sexual": { "filtered": false, "severity": "safe" },
      "self_harm": { "filtered": false, "severity": "safe" },
      "protected_material_code": { "detected": false, "filtered": false },
      "protected_material_text": { "detected": false, "filtered": false }
    },
    "content_filter_offsets": { "start_offset": 0, "end_offset": 22, "check_offset": 0 }
  }
]
}
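Given the content_filters array in the response above, a caller usually just needs to know whether anything was blocked, and which categories triggered. A minimal sketch in plain JavaScript (the helper names are illustrative, not part of the SDK):

```javascript
// Check whether any content filter blocked the prompt or the completion.
// `response.content_filters` is the array shown in the example above.
function wasBlocked(response) {
  return (response.content_filters ?? []).some((filter) => filter.blocked === true);
}

// Collect the filter categories that actually triggered, if any.
function flaggedCategories(response) {
  const flagged = [];
  for (const filter of response.content_filters ?? []) {
    for (const [category, result] of Object.entries(filter.content_filter_results ?? {})) {
      if (result.filtered || result.detected) flagged.push(category);
    }
  }
  return flagged;
}
```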

Streaming

Set stream: true to receive output as a sequence of Server-Sent Events. The Skytells SDK wraps the raw SSE stream in an AsyncIterable<ResponsesStreamEvent>.

The stream emits these event types in order:

  • response.created: Server acknowledges the request and creates a Response stub
  • response.in_progress: Generation is running
  • response.output_item.added: A new OutputItem starts (e.g., a new message)
  • response.content_part.added: A new content part starts within an output item
  • response.output_text.delta: Incremental text token(s)
  • response.output_text.annotation.added: A citation/annotation was added
  • response.output_text.done: The full text of a content part is finalized
  • response.content_part.done: A content part is fully emitted
  • response.output_item.done: An OutputItem is complete
  • response.completed: The full response is done; the final Response object is included
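A minimal consumer only needs response.output_text.delta events. The sketch below parses raw SSE text and concatenates the deltas into the final answer; it is pure string handling with no network, and the event shapes match the sample events later in this section:

```javascript
// Accumulate streamed text from raw SSE lines.
// Each event arrives as a line starting with "data: " carrying a JSON payload.
function collectStreamedText(sseText) {
  let text = "";
  for (const line of sseText.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const event = JSON.parse(line.slice("data: ".length));
    if (event.type === "response.output_text.delta") {
      text += event.delta;
    }
  }
  return text;
}

const sample = [
  'data: {"type":"response.output_text.delta","delta":"The Moon"}',
  'data: {"type":"response.output_text.delta","delta":" orbits Earth"}',
].join("\n");
console.log(collectStreamedText(sample)); // "The Moon orbits Earth"
```

In production, prefer the SDK's AsyncIterable wrapper over hand-parsing SSE; this helper only illustrates the event flow.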

Streaming a response

REST
curl https://api.skytells.ai/v1/responses \
-H "Content-Type: application/json" \
-H "x-api-key: $SKYTELLS_API_KEY" \
-d '{
  "model": "skytells-3",
  "input": "Tell me about the moon.",
  "stream": true,
  "store": true
}'

SSE events

Stream events
event: response.created
data: {"type":"response.created","response":{"id":"resp_08d695...","object":"response","created_at":1774043999,"status":"in_progress","background":false,"completed_at":null,"content_filters":null,"error":null,"frequency_penalty":0,"incomplete_details":null,"instructions":null,"max_output_tokens":null,"model":"skytells-3","output":[],"parallel_tool_calls":true,"presence_penalty":0,"previous_response_id":null,"reasoning":{"effort":"none","summary":null},"service_tier":"auto","store":true,"temperature":1,"text":{"format":{"type":"text"},"verbosity":"medium"},"tool_choice":"auto","tools":[],"top_p":0.98,"truncation":"disabled","usage":null,"user":null,"metadata":{}},"sequence_number":0}

event: response.in_progress
data: {"type":"response.in_progress","response":{"id":"resp_08d695...","status":"in_progress"},"sequence_number":1}

event: response.output_item.added
data: {"type":"response.output_item.added","item":{"id":"msg_08d695...","type":"message","status":"in_progress","content":[],"phase":"final_answer","role":"assistant"},"output_index":0,"sequence_number":2}

event: response.content_part.added
data: {"type":"response.content_part.added","content_index":0,"item_id":"msg_08d695...","output_index":0,"part":{"type":"output_text","annotations":[],"logprobs":[],"text":""},"sequence_number":3}

event: response.output_text.delta
data: {"type":"response.output_text.delta","content_index":0,"delta":"The Moon","item_id":"msg_08d695...","logprobs":[],"output_index":0,"sequence_number":4}

event: response.output_text.delta
data: {"type":"response.output_text.delta","content_index":0,"delta":" orbits Earth","item_id":"msg_08d695...","logprobs":[],"output_index":0,"sequence_number":5}

event: response.output_text.done
data: {"type":"response.output_text.done","content_index":0,"item_id":"msg_08d695...","logprobs":[],"output_index":0,"sequence_number":6,"text":"The Moon orbits Earth at an average distance of 384,400 km."}

event: response.content_part.done
data: {"type":"response.content_part.done","content_index":0,"item_id":"msg_08d695...","output_index":0,"part":{"type":"output_text","annotations":[],"logprobs":[],"text":"The Moon orbits Earth at an average distance of 384,400 km."},"sequence_number":7}

event: response.output_item.done
data: {"type":"response.output_item.done","item":{"id":"msg_08d695...","type":"message","status":"completed","content":[{"type":"output_text","annotations":[],"logprobs":[],"text":"The Moon orbits Earth at an average distance of 384,400 km."}],"phase":"final_answer","role":"assistant"},"output_index":0,"sequence_number":8}

event: response.completed
data: {"type":"response.completed","response":{"id":"resp_08d695...","object":"response","status":"completed","completed_at":1774044000,"model":"skytells-3","output":[{"id":"msg_08d695...","type":"message","status":"completed","content":[{"type":"output_text","annotations":[],"logprobs":[],"text":"The Moon orbits Earth at an average distance of 384,400 km."}],"phase":"final_answer","role":"assistant"}],"usage":{"input_tokens":19,"input_tokens_details":{"cached_tokens":0},"output_tokens":27,"output_tokens_details":{"reasoning_tokens":0},"total_tokens":46},"content_filters":[{"blocked":false,"source_type":"prompt","content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"jailbreak":{"detected":false,"filtered":false}}},{"blocked":false,"source_type":"completion","content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"protected_material_text":{"detected":false,"filtered":false}}}]},"sequence_number":9}

Multi-Turn Conversations

The Responses API stores conversation context server-side. Rather than resending the full message history on every request, pass previous_response_id to continue seamlessly from where the last turn ended.

Requirements:

  • Set store: true on every turn you want to chain from
  • previous_response_id must be the id of a stored Response from the same model

The server reconstructs the full conversation history automatically — only new tokens from the current turn are billed.
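In code, chaining reduces to carrying the previous response's id forward. A sketch of the request-body construction (no network call; the field names are as documented above, the helper name is illustrative):

```javascript
// Build the request body for the next turn of a stored conversation.
function nextTurn(previousResponse, input) {
  return {
    model: previousResponse.model, // must be the same model as the stored turn
    input,
    previous_response_id: previousResponse.id,
    store: true, // keep this turn chainable as well
  };
}

const turn2 = nextTurn(
  { id: "resp_abc", model: "skytells-3" },
  "Can you show me a practical example?"
);
// turn2.previous_response_id === "resp_abc"
```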

Return type

Each turn returns a Response object. Key fields to read:

  • id: Store this as previous_response_id for the next turn
  • previous_response_id: Echoed back; confirms which turn was chained
  • output[0].content[0].text: The assistant's reply text for this turn
  • output[0].status: "completed" on success; "incomplete" if max_output_tokens was hit
  • usage.input_tokens: Only tokens for the current turn's input; stored history is not re-billed
  • status: Top-level response status: "completed" | "incomplete" | "failed"

The usage.input_tokens on turn 2+ reflects only the new input you sent — the server-stored history does not count toward token billing again.

Possible errors

  • INVALID_PARAMETER: previous_response_id references a response that does not exist or was not stored with store: true
  • INFERENCE_ERROR: the provider does not support stateful context for this model
  • INFERENCE_TIMEOUT: reduce max_output_tokens or split into fewer turns

Multi-turn conversation

REST
# Turn 1
R1=$(curl -s https://api.skytells.ai/v1/responses \
-H "Content-Type: application/json" \
-H "x-api-key: $SKYTELLS_API_KEY" \
-d '{
  "model": "skytells-3",
  "instructions": "You are a helpful coding assistant.",
  "input": "What is a closure in JavaScript?",
  "store": true
}')
PREV_ID=$(echo "$R1" | jq -r '.id')

# Turn 2 — pass the previous response ID
BODY=$(jq -n --arg prev "$PREV_ID" '{
model: "skytells-3",
input: "Can you show me a practical example?",
previous_response_id: $prev,
store: true
}')

curl https://api.skytells.ai/v1/responses \
-H "Content-Type: application/json" \
-H "x-api-key: $SKYTELLS_API_KEY" \
-d "$BODY"

Read the response

Skytells SDK
// r2 is a Response object
console.log(r2.id);                          // save for next turn
console.log(r2.previous_response_id);        // confirms chaining
console.log(r2.output[0].content[0].text);   // assistant reply
console.log(r2.usage.input_tokens);          // only this turn's tokens

Tool Calling

Always submit tool results back using previous_response_id + store: true. Without it, the model loses the function call context and cannot generate a final answer.

Attach a tools array to the request. When the model wants to call a tool, it returns a function_call OutputItem instead of a text message. Run the function locally, then submit the result as a function_call_output input item referencing call_id.

The agentic loop:

  1. Call responses.create() with tools and store: true
  2. Model returns a function_call output item with name + arguments
  3. Execute the function locally
  4. Submit result via function_call_output, referencing call_id and previous_response_id
  5. Model replies with the final answer
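The five steps above can be sketched as a loop. Here send is an injected stand-in for the SDK or HTTP call (so the control flow is visible without a live API), and handlers maps tool names to your local functions; send(body) is assumed to resolve to a Response-shaped object as documented in this guide:

```javascript
// Run the tool-calling loop against an injected `send` function.
async function runToolLoop(send, body, handlers) {
  // Step 1: initial request with tools; store: true keeps context server-side.
  let response = await send({ ...body, store: true });

  while (response.output.some((item) => item.type === "function_call")) {
    const outputs = [];
    for (const item of response.output) {
      if (item.type !== "function_call") continue;
      // Steps 2-3: arguments is a JSON string; parse and execute locally.
      const args = JSON.parse(item.arguments);
      const result = await handlers[item.name](args);
      outputs.push({
        type: "function_call_output",
        call_id: item.call_id, // echo the opaque call ID back
        output: JSON.stringify(result),
      });
    }
    // Step 4: submit results, chained via previous_response_id.
    response = await send({
      model: body.model,
      input: outputs,
      previous_response_id: response.id,
      store: true,
    });
  }
  // Step 5: the model's final answer.
  return response;
}
```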

Request fields

  • tools — array of function definitions, each with type: "function", name, description, and parameters (JSON Schema)
  • tool_choice: "auto" (default), "none", "required", or { type: "function", name } to force a specific call
  • store: true — required on every turn to preserve tool call history server-side
  • previous_response_id — set this on the follow-up turn (Step 2) to chain context

Response fields (Step 1)

  • output[].type — will be "function_call" when the model wants to invoke a tool
  • output[].name — the function name the model chose to call
  • output[].arguments — JSON string of parameters to pass to your function
  • output[].call_id — opaque ID you must echo back in function_call_output

Response fields (Step 2)

  • Returns a normal Response with the final answer in output[0].content[0].text

Step 1 — request with tools

REST
curl https://api.skytells.ai/v1/responses \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "model": "skytells-3",
  "input": "Weather in Paris?",
  "tools": [{
    "type": "function",
    "name": "get_weather",
    "description": "Get weather for a city",
    "parameters": {
      "type": "object",
      "properties": {"city": {"type": "string"}},
      "required": ["city"]
    }
  }],
  "store": true
}'

The model returns a Response with a function_call OutputItem in output instead of a text message.

Step 2 — submit tool result

REST
curl https://api.skytells.ai/v1/responses \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "model": "skytells-3",
  "input": [{
    "type": "function_call_output",
    "call_id": "call_abc123",
    "output": "{\"temp\":\"18°C\"}"
  
  }],
  "previous_response_id": "resp_toolcall_abc...",
  "store": true
}'
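Note that the output string must itself be JSON-escaped, which is easy to get wrong in hand-written shell. Building the body programmatically sidesteps the escaping entirely; a sketch with field names as documented above (the helper name is illustrative):

```javascript
// Build the Step 2 body; JSON.stringify handles the nested escaping
// that the raw curl example requires by hand.
function toolResultBody(model, previousResponseId, callId, result) {
  return {
    model,
    input: [{
      type: "function_call_output",
      call_id: callId,            // echo the call_id from Step 1
      output: JSON.stringify(result), // serialized tool result
    }],
    previous_response_id: previousResponseId,
    store: true,
  };
}

const step2 = toolResultBody("skytells-3", "resp_toolcall_abc", "call_abc123", { temp: "18°C" });
// step2.input[0].output === '{"temp":"18°C"}'
```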

OpenAPI Reference

The interactive spec below is generated directly from the Skytells OpenAPI schema. It covers every accepted parameter with its type, constraints, and default value. Use the Try it panel to send a live request — set your x-api-key in the auth field first.

POST
/responses
x-api-key: <token>

Your Skytells API key. Obtain one from Dashboard → API Keys.

In: header

Request Body

application/json


Response Body

application/json


curl -X POST "https://api.skytells.ai/v1/responses" \
-H "Content-Type: application/json" \
-d '{
  "model": "deepbrain-router",
  "input": "What is machine learning?"
}'
{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1748000000,
  "model": "deepbrain-router",
  "status": "completed",
  "output_text": "Machine learning is a subset of AI...",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Machine learning is a subset of AI..."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 12,
    "output_tokens": 41,
    "total_tokens": 53
  }
}
{
  "error": {
    "message": "The model 'unknown-model' was not found.",
    "type": "server_error",
    "code": "model_not_found",
    "error_id": "MODEL_NOT_FOUND",
    "status": 404,
    "param": "model",
    "request_id": "req_abc123xyz",
    "details": {
      "category": "request"
    }
  }
}

Expected Errors

You may encounter errors when calling this endpoint. Refer to the Inference Errors documentation for a comprehensive list of error codes, their meanings, and recommended handling strategies, along with retry guidance and error-handling code examples.
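As a rough sketch, client code can branch on the error envelope shown in the OpenAPI examples above (error.status, error.error_id). Which codes are safely retryable is defined by the Inference Errors documentation, so the retryable set below is an assumption for illustration only:

```javascript
// Decide whether to retry based on the error envelope.
// The RETRYABLE set is illustrative; consult Inference Errors for the real list.
const RETRYABLE = new Set(["INFERENCE_TIMEOUT", "INFERENCE_ERROR"]);

function shouldRetry(errorBody) {
  const err = errorBody.error ?? {};
  // Rate limits and server-side failures are conventionally retryable.
  if (err.status === 429 || (err.status >= 500 && err.status < 600)) return true;
  return RETRYABLE.has(err.error_id);
}
```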
