Responses

Create Response

POST /v1/responses — full parameter reference with code examples, streaming events, tool calling, and OpenAPI spec.

Creates a model response. Provide text or image inputs to generate text or JSON outputs. Have the model call your own custom code, or use built-in tools like web search or file search to bring your own data into the model's response. The Responses API stores context server-side: set store: true on a turn, then pass its id as previous_response_id on the next request to chain turns without resending history.

Returns: a Response object, or an async stream of ResponsesStreamEvent objects when stream: true.

This endpoint is part of the Inference API; its request and response schemas are OpenAI-compatible, with Skytells-specific safety additions.

See the Inference API overview for architecture and model catalog.


Request Body

model (string, required)

The model ID to use. See Models for available identifiers. Example: "skytells-3".

input (string | InputItem[], required)

The input for this turn. Pass a plain string for a single user message, or a structured InputItem[] array for multi-modal or multi-part inputs (text + images).

// String shorthand
"What is in this image?"

// Structured array
[
  { "role": "user", "content": [
      { "type": "input_text", "text": "What is in this image?" },
      { "type": "input_image", "image_url": "https://example.com/photo.jpg" }
  ]}
]
previous_response_id (string)

The id of a previous Response to continue from. When set, the server appends this request as the next turn — you do not need to resend earlier messages. Requires store: true on the prior response.

store (boolean)

Whether to persist this response in the server-side conversation store. Must be true for previous_response_id chaining in future turns. Defaults to false.

stream (boolean)

Whether to stream the response as Server-Sent Events. Defaults to false. When true, returns a stream of ResponsesStreamEvent objects instead of a single Response.

instructions (string)

A system-level instruction prepended to the conversation. Equivalent to a system role message. Ignored if previous_response_id is set (the previous response's instructions persist).

max_output_tokens (integer)

Maximum number of output tokens to generate. If the model reaches this limit, the response is returned with status "incomplete" rather than failing. Defaults to the model's maximum.

temperature (number)

Sampling temperature, between 0 and 2. Values closer to 0 produce more deterministic output; higher values increase randomness. Defaults to 1.

top_p (number)

Nucleus sampling — only tokens within the top top_p probability mass are sampled. Values between 0 and 1. Do not set both temperature and top_p simultaneously. Defaults to 0.98.

tools (Tool[])

A list of tools the model may call. Enabling tools allows the model to emit function_call output items.

[
  {
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a location.",
    "parameters": {
      "type": "object",
      "properties": {
        "location": { "type": "string" }
      },
      "required": ["location"]
    }
  }
]
tool_choice (string | object)

Controls which tool (if any) the model must call. Values: "none", "auto", "required", or { "type": "function", "name": "tool_name" } to force a specific tool. Defaults to "auto" when tools are present.

user (string)

An opaque identifier for the end-user. Used for abuse monitoring and rate-limit attribution. Does not affect model behavior.

truncation (string)

Truncation strategy for the context window. Values: "auto" (server truncates oldest turns when the context window fills) or "disabled" (request fails if context limit is exceeded). Defaults to "disabled".

include (string[])

Additional response fields to include. Supported values: "usage", "reasoning.encrypted_content" (reasoning models). Defaults to [].

metadata (object)

Up to 16 key-value string pairs attached to the response. Useful for logging and retrieval. Keys ≤ 64 characters, values ≤ 512 characters.
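Since requests with oversized metadata are rejected, it can be worth validating client-side before sending. A minimal sketch in plain JavaScript, assuming only the limits stated above (the helper name is illustrative, not part of the SDK):

```javascript
// Validate a metadata object against the documented limits:
// at most 16 key-value pairs, keys <= 64 chars, string values <= 512 chars.
function validateMetadata(metadata) {
  const entries = Object.entries(metadata);
  if (entries.length > 16) {
    throw new Error("metadata: at most 16 key-value pairs allowed");
  }
  for (const [key, value] of entries) {
    if (key.length > 64) {
      throw new Error(`metadata key too long: ${key}`);
    }
    if (typeof value !== "string" || value.length > 512) {
      throw new Error(`metadata value for "${key}" must be a string of <= 512 chars`);
    }
  }
  return metadata;
}
```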

frequency_penalty (number)

Penalize tokens proportional to how often they have already appeared in the output. Values between -2.0 and 2.0. Positive values reduce repetition. Defaults to 0.

presence_penalty (number)

Penalize any token that has appeared at all in the output so far. Values between -2.0 and 2.0. Positive values encourage topic diversity. Defaults to 0.

parallel_tool_calls (boolean)

Whether the model may invoke multiple tools in a single turn. Defaults to true.

service_tier (string)

Inference tier to use. "auto" lets Skytells select the optimal tier; "default" uses the standard inference layer. Defaults to "auto".

reasoning (object)

Reasoning configuration for thinking models. Set effort to "none" | "low" | "medium" | "high" to control how deeply the model thinks before responding. Set summary to "auto" to include a reasoning summary in the output. Defaults to { "effort": "none", "summary": null }.

text (object)

Text output format configuration. format.type accepts "text" (default) or "json_schema" for structured output. verbosity controls response length: "short" | "medium" (default) | "long".
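For illustration, a request body asking for structured JSON output might be assembled like this. The text.format shape follows the description above; the fields inside the format object beyond type (name, schema) are assumptions, so check the OpenAPI spec below for the exact envelope:

```javascript
// Example request body for structured output via text.format.
// The "json_schema" format type is documented above; the `name` and
// `schema` fields here are assumed, and the schema itself is illustrative.
const body = {
  model: "skytells-3",
  input: "Extract the city and temperature from: 'It is 18°C in Paris.'",
  text: {
    format: {
      type: "json_schema",
      name: "weather_report", // assumed field; verify against the OpenAPI spec
      schema: {
        type: "object",
        properties: {
          city: { type: "string" },
          temperature_c: { type: "number" },
        },
        required: ["city", "temperature_c"],
      },
    },
    verbosity: "short",
  },
};
```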

Create a response

REST
curl https://api.skytells.ai/v1/responses \
-H "Content-Type: application/json" \
-H "x-api-key: $SKYTELLS_API_KEY" \
-d '{
  "model": "skytells-3",
  "input": "Write a haiku about the ocean.",
  "store": true
}'

Response

200 OK
{
"id": "resp_063a6790fb3935fe0069bdc025784081939b4eaa76b9b0c2e0",
"object": "response",
"created_at": 1774043173,
"completed_at": 1774043176,
"status": "completed",
"background": false,
"model": "skytells-3",
"instructions": null,
"store": true,
"service_tier": "default",
"temperature": 1,
"top_p": 0.98,
"frequency_penalty": 0,
"presence_penalty": 0,
"top_logprobs": 0,
"max_output_tokens": null,
"max_tool_calls": null,
"parallel_tool_calls": true,
"tool_choice": "auto",
"tools": [],
"truncation": "disabled",
"text": { "format": { "type": "text" }, "verbosity": "medium" },
"reasoning": { "effort": "none", "summary": null },
"previous_response_id": null,
"prompt_cache_key": null,
"prompt_cache_retention": null,
"safety_identifier": null,
"incomplete_details": null,
"error": null,
"user": null,
"metadata": {},
"output": [
  {
    "id": "msg_063a6790fb3935fe0069bdc027ebac8193922457a01eeaec82",
    "type": "message",
    "role": "assistant",
    "status": "completed",
    "phase": "final_answer",
    "content": [
      {
        "type": "output_text",
        "text": "Waves crash endlessly\nSalt and foam meet distant shore\nThe sea never sleeps",
        "annotations": [],
        "logprobs": []
      }
    ]
  }
],
"usage": {
  "input_tokens": 14,
  "input_tokens_details": { "cached_tokens": 0 },
  "output_tokens": 22,
  "output_tokens_details": { "reasoning_tokens": 0 },
  "total_tokens": 36
},
"content_filters": [
  {
    "blocked": false,
    "source_type": "prompt",
    "content_filter_raw": [],
    "content_filter_results": {
      "hate": { "filtered": false, "severity": "safe" },
      "violence": { "filtered": false, "severity": "safe" },
      "sexual": { "filtered": false, "severity": "safe" },
      "self_harm": { "filtered": false, "severity": "safe" },
      "jailbreak": { "detected": false, "filtered": false }
    },
    "content_filter_offsets": { "start_offset": 0, "end_offset": 50, "check_offset": 0 }
  },
  {
    "blocked": false,
    "source_type": "completion",
    "content_filter_raw": [],
    "content_filter_results": {
      "hate": { "filtered": false, "severity": "safe" },
      "violence": { "filtered": false, "severity": "safe" },
      "sexual": { "filtered": false, "severity": "safe" },
      "self_harm": { "filtered": false, "severity": "safe" },
      "protected_material_code": { "detected": false, "filtered": false },
      "protected_material_text": { "detected": false, "filtered": false }
    },
    "content_filter_offsets": { "start_offset": 0, "end_offset": 22, "check_offset": 0 }
  }
]
}
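Given the content_filters array in the response above, a caller usually just needs to know whether anything was blocked, and which categories triggered. A minimal sketch in plain JavaScript (the helper names are illustrative, not part of the SDK):

```javascript
// Check whether any content filter blocked the prompt or the completion.
// `response.content_filters` is the array shown in the example above.
function wasBlocked(response) {
  return (response.content_filters ?? []).some((filter) => filter.blocked === true);
}

// Collect the filter categories that actually triggered, if any.
function flaggedCategories(response) {
  const flagged = [];
  for (const filter of response.content_filters ?? []) {
    for (const [category, result] of Object.entries(filter.content_filter_results ?? {})) {
      if (result.filtered || result.detected) flagged.push(category);
    }
  }
  return flagged;
}
```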

Streaming

Set stream: true to receive output as a sequence of Server-Sent Events. The Skytells SDK wraps the raw SSE stream in an AsyncIterable<ResponsesStreamEvent>.

The stream emits these event types in order:

  • response.created: Server acknowledges the request and creates a Response stub
  • response.in_progress: Generation is running
  • response.output_item.added: A new OutputItem starts (e.g., a new message)
  • response.content_part.added: A new content part starts within an output item
  • response.output_text.delta: Incremental text token(s)
  • response.output_text.annotation.added: A citation/annotation was added
  • response.output_text.done: The full text of a content part is finalized
  • response.content_part.done: A content part is fully emitted
  • response.output_item.done: An OutputItem is complete
  • response.completed: The full response is done; the final Response object is included
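A minimal consumer only needs response.output_text.delta events. The sketch below parses raw SSE text and concatenates the deltas into the final answer; it is pure string handling with no network, and the event shapes match the sample events later in this section:

```javascript
// Accumulate streamed text from raw SSE lines.
// Each event arrives as a line starting with "data: " carrying a JSON payload.
function collectStreamedText(sseText) {
  let text = "";
  for (const line of sseText.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const event = JSON.parse(line.slice("data: ".length));
    if (event.type === "response.output_text.delta") {
      text += event.delta;
    }
  }
  return text;
}

const sample = [
  'data: {"type":"response.output_text.delta","delta":"The Moon"}',
  'data: {"type":"response.output_text.delta","delta":" orbits Earth"}',
].join("\n");
console.log(collectStreamedText(sample)); // "The Moon orbits Earth"
```

In production, prefer the SDK's AsyncIterable wrapper over hand-parsing SSE; this helper only illustrates the event flow.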

Streaming a response

REST
curl https://api.skytells.ai/v1/responses \
-H "Content-Type: application/json" \
-H "x-api-key: $SKYTELLS_API_KEY" \
-d '{
  "model": "skytells-3",
  "input": "Tell me about the moon.",
  "stream": true,
  "store": true
}'

SSE events

Stream events
event: response.created
data: {"type":"response.created","response":{"id":"resp_08d695...","object":"response","created_at":1774043999,"status":"in_progress","background":false,"completed_at":null,"content_filters":null,"error":null,"frequency_penalty":0,"incomplete_details":null,"instructions":null,"max_output_tokens":null,"model":"skytells-3","output":[],"parallel_tool_calls":true,"presence_penalty":0,"previous_response_id":null,"reasoning":{"effort":"none","summary":null},"service_tier":"auto","store":true,"temperature":1,"text":{"format":{"type":"text"},"verbosity":"medium"},"tool_choice":"auto","tools":[],"top_p":0.98,"truncation":"disabled","usage":null,"user":null,"metadata":{}},"sequence_number":0}

event: response.in_progress
data: {"type":"response.in_progress","response":{"id":"resp_08d695...","status":"in_progress"},"sequence_number":1}

event: response.output_item.added
data: {"type":"response.output_item.added","item":{"id":"msg_08d695...","type":"message","status":"in_progress","content":[],"phase":"final_answer","role":"assistant"},"output_index":0,"sequence_number":2}

event: response.content_part.added
data: {"type":"response.content_part.added","content_index":0,"item_id":"msg_08d695...","output_index":0,"part":{"type":"output_text","annotations":[],"logprobs":[],"text":""},"sequence_number":3}

event: response.output_text.delta
data: {"type":"response.output_text.delta","content_index":0,"delta":"The Moon","item_id":"msg_08d695...","logprobs":[],"output_index":0,"sequence_number":4}

event: response.output_text.delta
data: {"type":"response.output_text.delta","content_index":0,"delta":" orbits Earth","item_id":"msg_08d695...","logprobs":[],"output_index":0,"sequence_number":5}

event: response.output_text.done
data: {"type":"response.output_text.done","content_index":0,"item_id":"msg_08d695...","logprobs":[],"output_index":0,"sequence_number":6,"text":"The Moon orbits Earth at an average distance of 384,400 km."}

event: response.content_part.done
data: {"type":"response.content_part.done","content_index":0,"item_id":"msg_08d695...","output_index":0,"part":{"type":"output_text","annotations":[],"logprobs":[],"text":"The Moon orbits Earth at an average distance of 384,400 km."},"sequence_number":7}

event: response.output_item.done
data: {"type":"response.output_item.done","item":{"id":"msg_08d695...","type":"message","status":"completed","content":[{"type":"output_text","annotations":[],"logprobs":[],"text":"The Moon orbits Earth at an average distance of 384,400 km."}],"phase":"final_answer","role":"assistant"},"output_index":0,"sequence_number":8}

event: response.completed
data: {"type":"response.completed","response":{"id":"resp_08d695...","object":"response","status":"completed","completed_at":1774044000,"model":"skytells-3","output":[{"id":"msg_08d695...","type":"message","status":"completed","content":[{"type":"output_text","annotations":[],"logprobs":[],"text":"The Moon orbits Earth at an average distance of 384,400 km."}],"phase":"final_answer","role":"assistant"}],"usage":{"input_tokens":19,"input_tokens_details":{"cached_tokens":0},"output_tokens":27,"output_tokens_details":{"reasoning_tokens":0},"total_tokens":46},"content_filters":[{"blocked":false,"source_type":"prompt","content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"jailbreak":{"detected":false,"filtered":false}}},{"blocked":false,"source_type":"completion","content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"protected_material_text":{"detected":false,"filtered":false}}}]},"sequence_number":9}

Multi-Turn Conversations

The Responses API stores conversation context server-side. Rather than resending the full message history on every request, pass previous_response_id to continue seamlessly from where the last turn ended.

Requirements:

  • Set store: true on every turn you want to chain from
  • previous_response_id must be the id of a stored Response from the same model

The server reconstructs the full conversation history automatically — only new tokens from the current turn are billed.
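In code, chaining reduces to carrying the previous response's id forward. A sketch of the request-body construction (no network call; the field names are as documented above, the helper name is illustrative):

```javascript
// Build the request body for the next turn of a stored conversation.
function nextTurn(previousResponse, input) {
  return {
    model: previousResponse.model, // must be the same model as the stored turn
    input,
    previous_response_id: previousResponse.id,
    store: true, // keep this turn chainable as well
  };
}

const turn2 = nextTurn(
  { id: "resp_abc", model: "skytells-3" },
  "Can you show me a practical example?"
);
// turn2.previous_response_id === "resp_abc"
```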

Return type

Each turn returns a Response object. Key fields to read:

  • id: Store this as previous_response_id for the next turn
  • previous_response_id: Echoed back; confirms which turn was chained
  • output[0].content[0].text: The assistant's reply text for this turn
  • output[0].status: "completed" on success; "incomplete" if max_output_tokens was hit
  • usage.input_tokens: Only tokens for the current turn's input; stored history is not re-billed
  • status: Top-level response status: "completed" | "incomplete" | "failed"

The usage.input_tokens on turn 2+ reflects only the new input you sent — the server-stored history does not count toward token billing again.

Possible errors

  • INVALID_PARAMETER: previous_response_id references a response that does not exist or was not stored with store: true
  • INFERENCE_ERROR: the provider does not support stateful context for this model
  • INFERENCE_TIMEOUT: reduce max_output_tokens or split into fewer turns

Multi-turn conversation

REST
# Turn 1
R1=$(curl -s https://api.skytells.ai/v1/responses \
-H "Content-Type: application/json" \
-H "x-api-key: $SKYTELLS_API_KEY" \
-d '{
  "model": "skytells-3",
  "instructions": "You are a helpful coding assistant.",
  "input": "What is a closure in JavaScript?",
  "store": true
}')
PREV_ID=$(echo "$R1" | jq -r '.id')

# Turn 2 — pass the previous response ID
BODY=$(jq -n --arg prev "$PREV_ID" '{
model: "skytells-3",
input: "Can you show me a practical example?",
previous_response_id: $prev,
store: true
}')

curl https://api.skytells.ai/v1/responses \
-H "Content-Type: application/json" \
-H "x-api-key: $SKYTELLS_API_KEY" \
-d "$BODY"

Read the response

Skytells SDK
// r2 is a Response object
console.log(r2.id);                          // save for next turn
console.log(r2.previous_response_id);        // confirms chaining
console.log(r2.output[0].content[0].text);   // assistant reply
console.log(r2.usage.input_tokens);          // only this turn's tokens

Tool Calling

Always submit tool results back using previous_response_id + store: true. Without it, the model loses the function call context and cannot generate a final answer.

Attach a tools array to the request. When the model wants to call a tool, it returns a function_call OutputItem instead of a text message. Run the function locally, then submit the result as a function_call_output input item referencing call_id.

The agentic loop:

  1. Call responses.create() with tools and store: true
  2. Model returns a function_call output item with name + arguments
  3. Execute the function locally
  4. Submit result via function_call_output, referencing call_id and previous_response_id
  5. Model replies with the final answer
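The five steps above can be sketched as a loop. Here send is an injected stand-in for the SDK or HTTP call (so the control flow is visible without a live API), and handlers maps tool names to your local functions; send(body) is assumed to resolve to a Response-shaped object as documented in this guide:

```javascript
// Run the tool-calling loop against an injected `send` function.
async function runToolLoop(send, body, handlers) {
  // Step 1: initial request with tools; store: true keeps context server-side.
  let response = await send({ ...body, store: true });

  while (response.output.some((item) => item.type === "function_call")) {
    const outputs = [];
    for (const item of response.output) {
      if (item.type !== "function_call") continue;
      // Steps 2-3: arguments is a JSON string; parse and execute locally.
      const args = JSON.parse(item.arguments);
      const result = await handlers[item.name](args);
      outputs.push({
        type: "function_call_output",
        call_id: item.call_id, // echo the opaque call ID back
        output: JSON.stringify(result),
      });
    }
    // Step 4: submit results, chained via previous_response_id.
    response = await send({
      model: body.model,
      input: outputs,
      previous_response_id: response.id,
      store: true,
    });
  }
  // Step 5: the model's final answer.
  return response;
}
```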

Request fields

  • tools — array of function definitions, each with type: "function", name, description, and parameters (JSON Schema)
  • tool_choice: "auto" (default), "none", "required", or { type: "function", name } to force a specific call
  • store: true — required on every turn to preserve tool call history server-side
  • previous_response_id — set this on the follow-up turn (Step 2) to chain context

Response fields (Step 1)

  • output[].type — will be "function_call" when the model wants to invoke a tool
  • output[].name — the function name the model chose to call
  • output[].arguments — JSON string of parameters to pass to your function
  • output[].call_id — opaque ID you must echo back in function_call_output

Response fields (Step 2)

  • Returns a normal Response with the final answer in output[0].content[0].text

Step 1 — request with tools

REST
curl https://api.skytells.ai/v1/responses \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "model": "skytells-3",
  "input": "Weather in Paris?",
  "tools": [{
    "type": "function",
    "name": "get_weather",
    "description": "Get weather for a city",
    "parameters": {
      "type": "object",
      "properties": {"city": {"type": "string"}},
      "required": ["city"]
    }
  }],
  "store": true
}'

The model returns a Response with a function_call OutputItem in output instead of a text message.

Step 2 — submit tool result

REST
curl https://api.skytells.ai/v1/responses \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "model": "skytells-3",
  "input": [{
    "type": "function_call_output",
    "call_id": "call_abc123",
    "output": "{\"temp\":\"18°C\"}"
  
  }],
  "previous_response_id": "resp_toolcall_abc...",
  "store": true
}'
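Note that the output string must itself be JSON-escaped, which is easy to get wrong in hand-written shell. Building the body programmatically sidesteps the escaping entirely; a sketch with field names as documented above (the helper name is illustrative):

```javascript
// Build the Step 2 body; JSON.stringify handles the nested escaping
// that the raw curl example requires by hand.
function toolResultBody(model, previousResponseId, callId, result) {
  return {
    model,
    input: [{
      type: "function_call_output",
      call_id: callId,            // echo the call_id from Step 1
      output: JSON.stringify(result), // serialized tool result
    }],
    previous_response_id: previousResponseId,
    store: true,
  };
}

const step2 = toolResultBody("skytells-3", "resp_toolcall_abc", "call_abc123", { temp: "18°C" });
// step2.input[0].output === '{"temp":"18°C"}'
```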

OpenAPI Reference

The interactive spec below is generated directly from the Skytells OpenAPI schema. It covers every accepted parameter with its type, constraints, and default value. Use the Try it panel to send a live request — set your x-api-key in the auth field first.

POST
/responses
x-api-key: <token>

Your Skytells API key. Obtain one from Dashboard → API Keys.

In: header

Request Body

application/json


Response Body

application/json


curl -X POST "https://api.skytells.ai/v1/responses" \
-H "Content-Type: application/json" \
-d '{
  "model": "deepbrain-router",
  "input": "What is machine learning?"
}'
{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1748000000,
  "model": "deepbrain-router",
  "status": "completed",
  "output_text": "Machine learning is a subset of AI...",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Machine learning is a subset of AI..."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 12,
    "output_tokens": 41,
    "total_tokens": 53
  }
}
{
  "error": {
    "message": "The model 'unknown-model' was not found.",
    "type": "server_error",
    "code": "model_not_found",
    "error_id": "MODEL_NOT_FOUND",
    "status": 404,
    "param": "model",
    "request_id": "req_abc123xyz",
    "details": {
      "category": "request"
    }
  }
}

Expected Errors

You may encounter errors when calling this endpoint. Refer to the Inference Errors documentation for a comprehensive list of error codes, their meanings, and recommended handling strategies, along with retry guidance and error-handling code examples.
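As a rough sketch, client code can branch on the error envelope shown in the OpenAPI examples above (error.status, error.error_id). Which codes are safely retryable is defined by the Inference Errors documentation, so the retryable set below is an assumption for illustration only:

```javascript
// Decide whether to retry based on the error envelope.
// The RETRYABLE set is illustrative; consult Inference Errors for the real list.
const RETRYABLE = new Set(["INFERENCE_TIMEOUT", "INFERENCE_ERROR"]);

function shouldRetry(errorBody) {
  const err = errorBody.error ?? {};
  // Rate limits and server-side failures are conventionally retryable.
  if (err.status === 429 || (err.status >= 500 && err.status < 600)) return true;
  return RETRYABLE.has(err.error_id);
}
```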
