Create Response
POST /v1/responses — full parameter reference with code examples, streaming events, tool calling, and OpenAPI spec.
Creates a model response. Provide text or image inputs to generate text or JSON outputs. Have the model call your own custom code, or use built-in tools like web search or file search to bring your own data into the model's response. The Responses API stores context server-side: pass previous_response_id to chain turns without resending history, and set store: true to enable multi-turn continuity.
Returns a Response object, or a stream of ResponsesStreamEvent objects when stream: true.
This endpoint is part of the Inference API. Its request and response schemas are OpenAI-compatible, with Skytells-specific safety additions.
See the Inference API overview for architecture and model catalog.
Request Body
model (string, required)
The model ID to use. See Models for available identifiers. Example: "skytells-3".
input (string | InputItem[], required)
The input for this turn. Pass a plain string for a single user message, or a structured InputItem[] array for multi-modal or multi-part inputs (text + images).
// String shorthand
"What is in this image?"
// Structured array
[
{ "role": "user", "content": [
{ "type": "input_text", "text": "What is in this image?" },
{ "type": "input_image", "image_url": "https://example.com/photo.jpg" }
]}
]
previous_response_id (string)
The id of a previous Response to continue from. When set, the server appends this request as the next turn — you do not need to resend earlier messages. Requires store: true on the prior response.
store (boolean)
Whether to persist this response in the server-side conversation store. Must be true for previous_response_id chaining in future turns. Defaults to false.
stream (boolean)
Whether to stream the response as Server-Sent Events. Defaults to false. When true, returns a stream of ResponsesStreamEvent objects instead of a single Response.
instructions (string)
A system-level instruction prepended to the conversation. Equivalent to a system role message. Ignored if previous_response_id is set (the previous response's instructions persist).
max_output_tokens (integer)
Maximum number of output tokens to generate. If the model reaches this limit, the response is returned with status "incomplete". Defaults to the model's maximum.
temperature (number)
Sampling temperature, between 0 and 2. Values closer to 0 produce more deterministic output; higher values increase randomness. Defaults to 1.
top_p (number)
Nucleus sampling — only tokens within the top top_p probability mass are sampled. Values between 0 and 1. Do not set both temperature and top_p simultaneously. Defaults to 0.98.
tools (Tool[])
A list of tools the model may call. Enabling tools allows the model to emit function_call output items.
[
{
"type": "function",
"name": "get_weather",
"description": "Get the current weather for a location.",
"parameters": {
"type": "object",
"properties": {
"location": { "type": "string" }
},
"required": ["location"]
}
}
]
tool_choice (string | object)
Controls which tool (if any) the model must call. Values: "none", "auto", "required", or { "type": "function", "name": "tool_name" } to force a specific tool. Defaults to "auto" when tools are present.
user (string)
An opaque identifier for the end-user. Used for abuse monitoring and rate-limit attribution. Does not affect model behavior.
truncation (string)
Truncation strategy for the context window. Values: "auto" (server truncates oldest turns when the context window fills) or "disabled" (request fails if context limit is exceeded). Defaults to "disabled".
include (string[])
Additional response fields to include. Supported values: "usage", "reasoning.encrypted_content" (reasoning models). Defaults to [].
metadata (object)
Up to 16 key-value string pairs attached to the response. Useful for logging and retrieval. Keys ≤ 64 characters, values ≤ 512 characters.
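The metadata limits can be checked client-side before sending a request. The helper below is a hypothetical sketch (not part of any Skytells SDK) that enforces the documented constraints:

```javascript
// Sketch: validate the documented metadata limits before sending a request
// (at most 16 pairs, keys <= 64 chars, string values <= 512 chars).
// This helper is illustrative, not an SDK API.
function validateMetadata(metadata) {
  const entries = Object.entries(metadata);
  if (entries.length > 16) return false;
  return entries.every(
    ([key, value]) =>
      typeof value === "string" && key.length <= 64 && value.length <= 512
  );
}

console.log(validateMetadata({ session: "abc" })); // true
console.log(validateMetadata({ session: 123 }));   // false (value not a string)
```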
frequency_penalty (number)
Penalize tokens proportional to how often they have already appeared in the output. Values between -2.0 and 2.0. Positive values reduce repetition. Defaults to 0.
presence_penalty (number)
Penalize any token that has appeared at all in the output so far. Values between -2.0 and 2.0. Positive values encourage topic diversity. Defaults to 0.
parallel_tool_calls (boolean)
Whether the model may invoke multiple tools in a single turn. Defaults to true.
service_tier (string)
Inference tier to use. "auto" lets Skytells select the optimal tier; "default" uses the standard inference layer. Defaults to "auto".
reasoning (object)
Reasoning configuration for thinking models. Set effort to "none" | "low" | "medium" | "high" to control how deeply the model thinks before responding. Set summary to "auto" to include a reasoning summary in the output. Defaults to { "effort": "none", "summary": null }.
text (object)
Text output format configuration. format.type accepts "text" (default) or "json_schema" for structured output. verbosity controls response length: "short" | "medium" (default) | "long".
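As an illustrative sketch of a structured-output request body: the placement of the name and schema fields under format is an assumption modeled on common OpenAI-compatible APIs, so verify the exact shape against the OpenAPI spec below.

```json
{
  "model": "skytells-3",
  "input": "Extract the city and country from: I live in Paris, France.",
  "text": {
    "format": {
      "type": "json_schema",
      "name": "location",
      "schema": {
        "type": "object",
        "properties": {
          "city": { "type": "string" },
          "country": { "type": "string" }
        },
        "required": ["city", "country"]
      }
    }
  }
}
```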
Create a response
curl https://api.skytells.ai/v1/responses \
-H "Content-Type: application/json" \
-H "x-api-key: $SKYTELLS_API_KEY" \
-d '{
"model": "skytells-3",
"input": "Write a haiku about the ocean.",
"store": true
}'
Response
{
"id": "resp_063a6790fb3935fe0069bdc025784081939b4eaa76b9b0c2e0",
"object": "response",
"created_at": 1774043173,
"completed_at": 1774043176,
"status": "completed",
"background": false,
"model": "skytells-3",
"instructions": null,
"store": true,
"service_tier": "default",
"temperature": 1,
"top_p": 0.98,
"frequency_penalty": 0,
"presence_penalty": 0,
"top_logprobs": 0,
"max_output_tokens": null,
"max_tool_calls": null,
"parallel_tool_calls": true,
"tool_choice": "auto",
"tools": [],
"truncation": "disabled",
"text": { "format": { "type": "text" }, "verbosity": "medium" },
"reasoning": { "effort": "none", "summary": null },
"previous_response_id": null,
"prompt_cache_key": null,
"prompt_cache_retention": null,
"safety_identifier": null,
"incomplete_details": null,
"error": null,
"user": null,
"metadata": {},
"output": [
{
"id": "msg_063a6790fb3935fe0069bdc027ebac8193922457a01eeaec82",
"type": "message",
"role": "assistant",
"status": "completed",
"phase": "final_answer",
"content": [
{
"type": "output_text",
"text": "Waves crash endlessly\nSalt and foam meet distant shore\nThe sea never sleeps",
"annotations": [],
"logprobs": []
}
]
}
],
"usage": {
"input_tokens": 14,
"input_tokens_details": { "cached_tokens": 0 },
"output_tokens": 22,
"output_tokens_details": { "reasoning_tokens": 0 },
"total_tokens": 36
},
"content_filters": [
{
"blocked": false,
"source_type": "prompt",
"content_filter_raw": [],
"content_filter_results": {
"hate": { "filtered": false, "severity": "safe" },
"violence": { "filtered": false, "severity": "safe" },
"sexual": { "filtered": false, "severity": "safe" },
"self_harm": { "filtered": false, "severity": "safe" },
"jailbreak": { "detected": false, "filtered": false }
},
"content_filter_offsets": { "start_offset": 0, "end_offset": 50, "check_offset": 0 }
},
{
"blocked": false,
"source_type": "completion",
"content_filter_raw": [],
"content_filter_results": {
"hate": { "filtered": false, "severity": "safe" },
"violence": { "filtered": false, "severity": "safe" },
"sexual": { "filtered": false, "severity": "safe" },
"self_harm": { "filtered": false, "severity": "safe" },
"protected_material_code": { "detected": false, "filtered": false },
"protected_material_text": { "detected": false, "filtered": false }
},
"content_filter_offsets": { "start_offset": 0, "end_offset": 22, "check_offset": 0 }
}
]
}
Streaming
Set stream: true to receive output as a sequence of Server-Sent Events. The Skytells SDK wraps the raw SSE stream in an AsyncIterable<ResponsesStreamEvent>.
The stream emits these event types in order:
| Event | When |
|---|---|
response.created | Server acknowledges the request and creates a Response stub |
response.in_progress | Generation is running |
response.output_item.added | A new OutputItem starts (e.g., a new message) |
response.content_part.added | A new content part starts within an output item |
response.output_text.delta | Incremental text token(s) |
response.output_text.annotation.added | A citation/annotation was added |
response.output_text.done | The full text of a content part is finalized |
response.content_part.done | A content part is fully emitted |
response.output_item.done | An OutputItem is complete |
response.completed | The full response is done — final Response object included |
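The delta events above can be accumulated into the final text with a few lines of client code. The parser below is an illustrative sketch over the raw text/event-stream format (it is not an SDK API); the event shapes follow the SSE example in this section.

```javascript
// Sketch: accumulate output_text deltas from a raw SSE stream body.
// Only "data:" lines carry JSON payloads; "event:" lines are skipped.
function collectText(sseText) {
  let text = "";
  for (const line of sseText.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const event = JSON.parse(line.slice(6));
    if (event.type === "response.output_text.delta") {
      text += event.delta; // incremental token(s)
    }
  }
  return text;
}

// Mock stream with two delta events:
const mock = [
  "event: response.output_text.delta",
  'data: {"type":"response.output_text.delta","delta":"The Moon","sequence_number":4}',
  "event: response.output_text.delta",
  'data: {"type":"response.output_text.delta","delta":" orbits Earth","sequence_number":5}',
].join("\n");

console.log(collectText(mock)); // "The Moon orbits Earth"
```

In production, prefer reading the final text from the response.output_text.done event, which carries the fully assembled string.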
Streaming a response
curl https://api.skytells.ai/v1/responses \
-H "Content-Type: application/json" \
-H "x-api-key: $SKYTELLS_API_KEY" \
-d '{
"model": "skytells-3",
"input": "Tell me about the moon.",
"stream": true,
"store": true
}'
SSE events
event: response.created
data: {"type":"response.created","response":{"id":"resp_08d695...","object":"response","created_at":1774043999,"status":"in_progress","background":false,"completed_at":null,"content_filters":null,"error":null,"frequency_penalty":0,"incomplete_details":null,"instructions":null,"max_output_tokens":null,"model":"skytells-3","output":[],"parallel_tool_calls":true,"presence_penalty":0,"previous_response_id":null,"reasoning":{"effort":"none","summary":null},"service_tier":"auto","store":true,"temperature":1,"text":{"format":{"type":"text"},"verbosity":"medium"},"tool_choice":"auto","tools":[],"top_p":0.98,"truncation":"disabled","usage":null,"user":null,"metadata":{}},"sequence_number":0}
event: response.in_progress
data: {"type":"response.in_progress","response":{"id":"resp_08d695...","status":"in_progress"},"sequence_number":1}
event: response.output_item.added
data: {"type":"response.output_item.added","item":{"id":"msg_08d695...","type":"message","status":"in_progress","content":[],"phase":"final_answer","role":"assistant"},"output_index":0,"sequence_number":2}
event: response.content_part.added
data: {"type":"response.content_part.added","content_index":0,"item_id":"msg_08d695...","output_index":0,"part":{"type":"output_text","annotations":[],"logprobs":[],"text":""},"sequence_number":3}
event: response.output_text.delta
data: {"type":"response.output_text.delta","content_index":0,"delta":"The Moon","item_id":"msg_08d695...","logprobs":[],"output_index":0,"sequence_number":4}
event: response.output_text.delta
data: {"type":"response.output_text.delta","content_index":0,"delta":" orbits Earth","item_id":"msg_08d695...","logprobs":[],"output_index":0,"sequence_number":5}
event: response.output_text.done
data: {"type":"response.output_text.done","content_index":0,"item_id":"msg_08d695...","logprobs":[],"output_index":0,"sequence_number":6,"text":"The Moon orbits Earth at an average distance of 384,400 km."}
event: response.content_part.done
data: {"type":"response.content_part.done","content_index":0,"item_id":"msg_08d695...","output_index":0,"part":{"type":"output_text","annotations":[],"logprobs":[],"text":"The Moon orbits Earth at an average distance of 384,400 km."},"sequence_number":7}
event: response.output_item.done
data: {"type":"response.output_item.done","item":{"id":"msg_08d695...","type":"message","status":"completed","content":[{"type":"output_text","annotations":[],"logprobs":[],"text":"The Moon orbits Earth at an average distance of 384,400 km."}],"phase":"final_answer","role":"assistant"},"output_index":0,"sequence_number":8}
event: response.completed
data: {"type":"response.completed","response":{"id":"resp_08d695...","object":"response","status":"completed","completed_at":1774044000,"model":"skytells-3","output":[{"id":"msg_08d695...","type":"message","status":"completed","content":[{"type":"output_text","annotations":[],"logprobs":[],"text":"The Moon orbits Earth at an average distance of 384,400 km."}],"phase":"final_answer","role":"assistant"}],"usage":{"input_tokens":19,"input_tokens_details":{"cached_tokens":0},"output_tokens":27,"output_tokens_details":{"reasoning_tokens":0},"total_tokens":46},"content_filters":[{"blocked":false,"source_type":"prompt","content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"jailbreak":{"detected":false,"filtered":false}}},{"blocked":false,"source_type":"completion","content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"protected_material_text":{"detected":false,"filtered":false}}}]},"sequence_number":9}
Multi-Turn Conversations
The Responses API stores conversation context server-side. Rather than resending the full message history on every request, pass previous_response_id to continue seamlessly from where the last turn ended.
Requirements:
- Set store: true on every turn you want to chain from
- previous_response_id must be the id of a stored Response from the same model
The server reconstructs the full conversation history automatically — only new tokens from the current turn are billed.
Return type
Each turn returns a Response object. Key fields to read:
| Field | Description |
|---|---|
id | Store this as previous_response_id for the next turn |
previous_response_id | Echoed back — confirms which turn was chained |
output[0].content[0].text | The assistant's reply text for this turn |
output[0].status | "completed" on success; "incomplete" if max_output_tokens was hit |
usage.input_tokens | Only tokens for the current turn's input — history is free |
status | Top-level response status: "completed" | "incomplete" | "failed" |
The usage.input_tokens on turn 2+ reflects only the new input you sent — the server-stored history does not count toward token billing again.
Possible errors
- INVALID_PARAMETER — previous_response_id references a response that does not exist or was not stored with store: true
- INFERENCE_ERROR — provider does not support stateful context for this model
- INFERENCE_TIMEOUT — reduce max_output_tokens or split into fewer turns
Multi-turn conversation
# Turn 1
R1=$(curl -s https://api.skytells.ai/v1/responses \
-H "Content-Type: application/json" \
-H "x-api-key: $SKYTELLS_API_KEY" \
-d '{
"model": "skytells-3",
"instructions": "You are a helpful coding assistant.",
"input": "What is a closure in JavaScript?",
"store": true
}')
PREV_ID=$(echo "$R1" | jq -r '.id')
# Turn 2 — pass the previous response ID
BODY=$(jq -n --arg prev "$PREV_ID" '{
model: "skytells-3",
input: "Can you show me a practical example?",
previous_response_id: $prev,
store: true
}')
curl https://api.skytells.ai/v1/responses \
-H "Content-Type: application/json" \
-H "x-api-key: $SKYTELLS_API_KEY" \
-d "$BODY"
Read the response
// r2 is a Response object
console.log(r2.id); // save for next turn
console.log(r2.previous_response_id); // confirms chaining
console.log(r2.output[0].content[0].text); // assistant reply
console.log(r2.usage.input_tokens); // only this turn's tokens
Tool Calling
Always submit tool results back using previous_response_id + store: true. Without it, the model loses the function call context and cannot generate a final answer.
Attach a tools array to the request. When the model wants to call a tool, it returns a function_call OutputItem instead of a text message. Run the function locally, then submit the result as a function_call_output input item referencing call_id.
The agentic loop:
1. Call responses.create() with tools and store: true
2. The model returns a function_call output item with name + arguments
3. Execute the function locally
4. Submit the result via function_call_output, referencing call_id and previous_response_id
5. The model replies with the final answer
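The middle of that loop (receive the call, run it locally, build the follow-up input) can be sketched as below. The item shape mirrors the function_call fields documented in this section; the dispatch helper and the localTools registry are illustrative, not part of any SDK.

```javascript
// Hypothetical registry of local tool implementations, keyed by function name.
const localTools = {
  get_weather: ({ city }) => ({ temp: "18°C", city }),
};

// Sketch: turn a function_call output item into the function_call_output
// input item for the next turn.
function handleFunctionCall(item) {
  const args = JSON.parse(item.arguments);    // model-provided JSON string
  const result = localTools[item.name](args); // run the function locally
  return {
    type: "function_call_output",
    call_id: item.call_id,                    // echo the opaque call ID
    output: JSON.stringify(result),
  };
}

// Example function_call item as the model might return it:
const item = {
  type: "function_call",
  name: "get_weather",
  arguments: '{"city":"Paris"}',
  call_id: "call_abc123",
};
console.log(handleFunctionCall(item).output); // {"temp":"18°C","city":"Paris"}
```

Send the returned object in the input array of the next request, together with previous_response_id, to let the model produce its final answer.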
Request fields
- tools — array of function definitions, each with type: "function", name, description, and parameters (JSON Schema)
- tool_choice — "auto" (default), "none", "required", or { type: "function", name } to force a specific call
- store: true — required on every turn to preserve tool call history server-side
- previous_response_id — set this on the follow-up turn (Step 2) to chain context
Response fields (Step 1)
- output[].type — will be "function_call" when the model wants to invoke a tool
- output[].name — the function name the model chose to call
- output[].arguments — JSON string of parameters to pass to your function
- output[].call_id — opaque ID you must echo back in function_call_output
Response fields (Step 2)
Returns a normal Response with the final answer in output[0].content[0].text.
Possible errors
- INVALID_PARAMETER — malformed tools array or invalid JSON Schema in parameters
- CONTENT_POLICY_VIOLATION — tool call or result blocked by safety policy
- INFERENCE_ERROR — provider does not support tool calling for this model
- INFERENCE_TIMEOUT — multi-turn agentic loops can exceed the time limit; reduce turns or max_output_tokens
Step 1 — request with tools
curl https://api.skytells.ai/v1/responses \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "skytells-3",
"input": "Weather in Paris?",
"tools": [{
"type": "function",
"name": "get_weather",
"description": "Get weather for a city",
"parameters": {
"type": "object",
"properties": {"city": {"type": "string"}},
"required": ["city"]
}
}],
"store": true
}'
The model returns a Response with a function_call OutputItem in output instead of a text message.
Step 2 — submit tool result
curl https://api.skytells.ai/v1/responses \
-H "x-api-key: $SKYTELLS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "skytells-3",
"input": [{
"type": "function_call_output",
"call_id": "call_abc123",
"output": "{\"temp\":\"18°C\"}"
}],
"previous_response_id": "resp_toolcall_abc...",
"store": true
}'
OpenAPI Reference
The interactive spec below is generated directly from the Skytells OpenAPI schema. It covers every accepted parameter with its type, constraints, and default value. Use the Try it panel to send a live request — set your x-api-key in the auth field first.
Authorization
apiKey Your Skytells API key. Obtain one from Dashboard → API Keys.
In: header
Request Body
application/json
TypeScript Definitions
Use the request body type in TypeScript.
Response Body
application/json
curl -X POST "https://api.skytells.ai/v1/responses" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $SKYTELLS_API_KEY" \
  -d '{
    "model": "deepbrain-router",
    "input": "What is machine learning?"
  }'
{
"id": "resp_abc123",
"object": "response",
"created_at": 1748000000,
"model": "deepbrain-router",
"status": "completed",
"output_text": "Machine learning is a subset of AI...",
"output": [
{
"type": "message",
"role": "assistant",
"content": [
{
"type": "output_text",
"text": "Machine learning is a subset of AI..."
}
]
}
],
"usage": {
"input_tokens": 12,
"output_tokens": 41,
"total_tokens": 53
}
}
{
"error": {
"message": "The model 'unknown-model' was not found.",
"type": "server_error",
"code": "model_not_found",
"error_id": "MODEL_NOT_FOUND",
"status": 404,
"param": "model",
"request_id": "req_abc123xyz",
"details": {
"category": "request"
}
}
}
Expected Errors
You may encounter errors when calling this endpoint. See the Inference Errors documentation for a comprehensive list of error codes, their meanings, recommended handling strategies, retry guidance, and error-handling code examples.