Responses API
Reference overview for the Responses sub-API — POST /v1/responses.
The Responses API is a stateful, multi-turn conversation API with persistent memory between turns. It is a modern alternative to Chat Completions for agentic workflows: you chain requests with previous_response_id and store context server-side, enabling multi-turn agents, tool use, and persistent context across calls.
It's fully compatible with Skytells models and offers a rich streaming event protocol. For stateless or OpenAI-compatible chat, see the Chat API. For safety features, see Safety and Responsible AI.
- Endpoint: `POST /v1/responses`
- SDK access: `client.responses.create(params)`
- Streaming: send `"stream": true`; returns `AsyncIterable<ResponsesStreamEvent>` (SDK) or SSE (REST)
- OpenAI-compatible: partial; use the Chat API with OpenAI SDKs
How it works
The Responses API is designed for stateful, multi-turn conversations. Each call creates a new response object, which can be referenced in future calls via previous_response_id. This allows the API to maintain memory and context between turns, unlike stateless chat completions. You can store context server-side (store: true), chain responses, and receive output as a complete Response or as a stream of ResponsesStreamEvent objects for real-time applications.
As part of Skytells’ commitment to safety, responsible AI, and the proper use of AI technologies, the API includes built-in content safety evaluations for both input and generated outputs, helping you detect, monitor, and filter harmful content.
When to use the Responses API
| Scenario | Why Use Responses API? |
|---|---|
| Building multi-turn, stateful agents | Server-side memory with previous_response_id enables persistent context and long-running conversations |
| Chaining tool calls and outputs | Each response can store structured outputs, tool traces, and be referenced in future calls |
| Integrating with workflows that require audit trails | Every turn is stored as a unique object, making it easy to track, audit, and analyze conversations |
| Streaming rich event data | Supports advanced streaming with incremental events and output types beyond plain text |
| Fine-grained control over context storage | Choose when to persist or discard context with the store parameter |
| Need for advanced output formats | Supports multimodal, tool, and custom output items, not just text completions |
Create a Response
Full endpoint reference — every request parameter, response shape, streaming format, and multi-client code examples.
Response Objects
Named type definitions: Response, OutputItem, ContentFilter, ResponsesStreamEvent, and more.
Quick Example
Create a response
```typescript
import Skytells from 'skytells';

const client = new Skytells(process.env.SKYTELLS_API_KEY);

const response = await client.responses.create({
  model: 'skytells-3',
  input: 'Explain quantum entanglement in simple terms.',
  store: true,
});

console.log(response.output[0].content[0].text);
// "Quantum entanglement is a phenomenon where..."

// Continue the conversation by referencing the previous response
const followUp = await client.responses.create({
  model: 'skytells-3',
  input: 'Can you give me an analogy?',
  previous_response_id: response.id,
  store: true,
});

console.log(followUp.output[0].content[0].text);
```

Returns a `Response` object, or a stream of `ResponsesStreamEvent` objects when `stream: true`.
Responses API FAQs
Which models can I use with the Responses API?
The Responses API supports all Skytells language models, including general-purpose and specialized models. See the Model Catalog for the latest list. Specify the desired model in the `model` parameter of your request.
Can I use the Responses API for real-time applications?
Yes! By setting `"stream": true` in your request, you receive the model's response as a stream of SSE chunks, allowing you to display the response incrementally as it is generated. The Skytells SDK provides an async iterable for easy streaming.
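The async-iteration pattern can be sketched with a small helper that accumulates text deltas from the event stream. The event shape used here (a `type` field, with text chunks arriving as `output_text.delta` events carrying a `delta` string) is an illustrative assumption, not the confirmed `ResponsesStreamEvent` schema, and the simulated generator stands in for a real `stream: true` call:

```typescript
// Accumulate the text deltas from a stream of response events.
// The event shape (type / delta fields) is assumed for illustration.
interface StreamEvent {
  type: string;
  delta?: string;
}

async function collectText(stream: AsyncIterable<StreamEvent>): Promise<string> {
  let text = '';
  for await (const event of stream) {
    if (event.type === 'output_text.delta' && event.delta) {
      text += event.delta; // append each incremental chunk as it arrives
    }
  }
  return text;
}

// Simulated stream standing in for client.responses.create({ ..., stream: true })
async function* fakeStream(): AsyncIterable<StreamEvent> {
  yield { type: 'output_text.delta', delta: 'Hello, ' };
  yield { type: 'output_text.delta', delta: 'world!' };
  yield { type: 'response.completed' };
}

collectText(fakeStream()).then((text) => console.log(text)); // prints "Hello, world!"
```

In a real application you would pass the async iterable returned by the SDK instead of the simulated stream, rendering each delta as it arrives rather than waiting for the full text.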
How does the Responses API handle content safety?
The Responses API includes built-in content safety evaluations for both input and generated outputs. Each response includes `content_filter_results` and related fields, which categorize and rate the severity of any potentially harmful content. You can use this information to implement your own filtering logic or to monitor the safety of interactions. For more details, see the Safety Types and Responsible AI documentation.
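One way to act on these evaluations is a small helper that flags any category that was filtered or rated at or above a chosen severity. The category names and severity scale below are illustrative assumptions about the shape of `content_filter_results`, not a confirmed schema:

```typescript
// Flag content-filter categories that were filtered or meet a severity
// threshold. Category names and severity levels here are assumptions
// for illustration, not the documented Skytells schema.
type Severity = 'safe' | 'low' | 'medium' | 'high';

interface ContentFilterResults {
  [category: string]: { filtered: boolean; severity: Severity };
}

const SEVERITY_RANK: Record<Severity, number> = {
  safe: 0,
  low: 1,
  medium: 2,
  high: 3,
};

function flaggedCategories(
  results: ContentFilterResults,
  threshold: Severity = 'medium',
): string[] {
  return Object.entries(results)
    .filter(
      ([, r]) => r.filtered || SEVERITY_RANK[r.severity] >= SEVERITY_RANK[threshold],
    )
    .map(([category]) => category);
}

// Hypothetical results object, modeled on the severity-rated categories
// described above
const example: ContentFilterResults = {
  hate: { filtered: false, severity: 'safe' },
  violence: { filtered: true, severity: 'medium' },
};

console.log(flaggedCategories(example)); // prints [ 'violence' ]
```

A helper like this keeps your moderation policy (which categories and severities to act on) in one place, separate from the request code.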
Is the Responses API compatible with the OpenAI SDK?
Not directly. The OpenAI SDK does not natively support the Responses API. For OpenAI-compatible access, use the Chat API instead. The Skytells SDK provides full support for the Responses API.
How do I debug or monitor Responses API usage?
Skytells provides detailed response objects that include usage information (token counts), content safety evaluations, and error messages when applicable. You can log these details in your application for monitoring and debugging purposes. Additionally, Skytells' dashboard offers analytics and insights into your API usage, including breakdowns of which models you're using, how many tokens are being processed, and any safety filter triggers. For more information on the response schema and safety features, see the Response Objects reference and the Safety Types documentation.
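As a sketch of application-side logging, the helper below formats a single log line from a response's usage data. The field names (`usage.input_tokens`, `usage.output_tokens`) are assumptions about the Response schema, used here for illustration only:

```typescript
// Format a monitoring log line from a response's usage metadata.
// The usage field names are assumed for illustration, not confirmed.
interface ResponseUsage {
  input_tokens: number;
  output_tokens: number;
}

interface MonitoredResponse {
  id: string;
  model: string;
  usage: ResponseUsage;
}

function usageLogLine(response: MonitoredResponse): string {
  const total = response.usage.input_tokens + response.usage.output_tokens;
  return (
    `[${response.id}] model=${response.model} ` +
    `in=${response.usage.input_tokens} out=${response.usage.output_tokens} total=${total}`
  );
}

// Hypothetical response, standing in for a real API result
const sample: MonitoredResponse = {
  id: 'resp_123',
  model: 'skytells-3',
  usage: { input_tokens: 42, output_tokens: 128 },
};

console.log(usageLogLine(sample));
// prints "[resp_123] model=skytells-3 in=42 out=128 total=170"
```

Emitting one structured line per turn makes it easy to aggregate token counts per model in whatever log pipeline you already run, complementing the dashboard analytics.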
Chat Objects (REF)
Type definitions for every object the Chat Completions API emits: ChatCompletion, ChatCompletionChunk, ChatMessage, ContentFilterResults, PromptFilterResults, ChatCompletionUsage.
Create Response (POST)
POST /v1/responses: full parameter reference with code examples, streaming events, tool calling, and OpenAPI spec.