Responses API
Reference overview for the Responses sub-API — POST /v1/responses.
The Responses API is a stateful, multi-turn conversation API with persistent memory between turns. It is a modern alternative to Chat Completions for agentic workflows: you chain requests with previous_response_id and store context server-side, enabling multi-turn agents, tool use, and persistent context across calls.
It's fully compatible with Skytells models and offers a rich streaming event protocol. For stateless or OpenAI-compatible chat, see the Chat API. For safety features, see Safety and Responsible AI.
- Endpoint: `POST /v1/responses`
- SDK access: `client.responses.create(params)`
- Streaming: send `"stream": true`; returns `AsyncIterable<ResponsesStreamEvent>` (SDK) or SSE (REST)
- OpenAI-compatible: partial; use the Chat API with OpenAI SDKs
How it works
The Responses API is designed for stateful, multi-turn conversations. Each call creates a new response object, which can be referenced in future calls via previous_response_id. This allows the API to maintain memory and context between turns, unlike stateless chat completions. You can store context server-side (store: true), chain responses, and receive output as a complete Response or as a stream of ResponsesStreamEvent objects for real-time applications.
As part of Skytells’ commitment to safety, responsible AI, and the proper use of AI technologies, the API includes built-in content safety evaluations for both input and generated outputs, helping you detect, monitor, and filter harmful content.
When to use the Responses API
| Scenario | Why Use Responses API? |
|---|---|
| Building multi-turn, stateful agents | Server-side memory with previous_response_id enables persistent context and long-running conversations |
| Chaining tool calls and outputs | Each response can store structured outputs, tool traces, and be referenced in future calls |
| Integrating with workflows that require audit trails | Every turn is stored as a unique object, making it easy to track, audit, and analyze conversations |
| Streaming rich event data | Supports advanced streaming with incremental events and output types beyond plain text |
| Fine-grained control over context storage | Choose when to persist or discard context with the store parameter |
| Need for advanced output formats | Supports multimodal, tool, and custom output items, not just text completions |
Create a Response
Full endpoint reference — every request parameter, response shape, streaming format, and multi-client code examples.
Response Objects
Named type definitions: Response, OutputItem, ContentFilter, ResponsesStreamEvent, and more.
Quick Example
Create a response
```typescript
import Skytells from 'skytells';

const client = new Skytells(process.env.SKYTELLS_API_KEY);

const response = await client.responses.create({
  model: 'skytells-3',
  input: 'Explain quantum entanglement in simple terms.',
  store: true,
});

console.log(response.output[0].content[0].text);
// "Quantum entanglement is a phenomenon where..."

// Continue the conversation by referencing the previous response
const followUp = await client.responses.create({
  model: 'skytells-3',
  input: 'Can you give me an analogy?',
  previous_response_id: response.id,
  store: true,
});

console.log(followUp.output[0].content[0].text);
```

Returns a `Response` object, or a stream of `ResponsesStreamEvent` objects when `stream: true`.
Responses API FAQs
Which models can I use with the Responses API?
The Responses API supports all Skytells language models, including general-purpose and specialized models. See the Model Catalog for the latest list. Specify the desired model in the `model` parameter of your request.
Can I use the Responses API for real-time applications?
Yes! By setting `"stream": true` in your request, you receive the model's response as a stream of SSE chunks, allowing you to display the response incrementally as it is generated. The Skytells SDK provides an async iterable for easy streaming.
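The async-iteration pattern can be sketched with a small helper that accumulates text deltas from the event stream. The event shape used here (a `type` field, with text chunks arriving as `output_text.delta` events carrying a `delta` string) is an illustrative assumption, not the confirmed `ResponsesStreamEvent` schema, and the simulated generator stands in for a real `stream: true` call:

```typescript
// Accumulate the text deltas from a stream of response events.
// The event shape (type / delta fields) is assumed for illustration.
interface StreamEvent {
  type: string;
  delta?: string;
}

async function collectText(stream: AsyncIterable<StreamEvent>): Promise<string> {
  let text = '';
  for await (const event of stream) {
    if (event.type === 'output_text.delta' && event.delta) {
      text += event.delta; // append each incremental chunk as it arrives
    }
  }
  return text;
}

// Simulated stream standing in for client.responses.create({ ..., stream: true })
async function* fakeStream(): AsyncIterable<StreamEvent> {
  yield { type: 'output_text.delta', delta: 'Hello, ' };
  yield { type: 'output_text.delta', delta: 'world!' };
  yield { type: 'response.completed' };
}

collectText(fakeStream()).then((text) => console.log(text)); // prints "Hello, world!"
```

In a real application you would pass the async iterable returned by the SDK instead of the simulated stream, rendering each delta as it arrives rather than waiting for the full text.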
How does the Responses API handle content safety?
The Responses API includes built-in content safety evaluations for both input and generated outputs. Each response includes `content_filter_results` and related fields, which categorize and rate the severity of any potentially harmful content. You can use this information to implement your own filtering logic or to monitor the safety of interactions. For more details, see the Safety Types and Responsible AI documentation.
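One way to act on these evaluations is a small helper that flags any category that was filtered or rated at or above a chosen severity. The category names and severity scale below are illustrative assumptions about the shape of `content_filter_results`, not a confirmed schema:

```typescript
// Flag content-filter categories that were filtered or meet a severity
// threshold. Category names and severity levels here are assumptions
// for illustration, not the documented Skytells schema.
type Severity = 'safe' | 'low' | 'medium' | 'high';

interface ContentFilterResults {
  [category: string]: { filtered: boolean; severity: Severity };
}

const SEVERITY_RANK: Record<Severity, number> = {
  safe: 0,
  low: 1,
  medium: 2,
  high: 3,
};

function flaggedCategories(
  results: ContentFilterResults,
  threshold: Severity = 'medium',
): string[] {
  return Object.entries(results)
    .filter(
      ([, r]) => r.filtered || SEVERITY_RANK[r.severity] >= SEVERITY_RANK[threshold],
    )
    .map(([category]) => category);
}

// Hypothetical results object, modeled on the severity-rated categories
// described above
const example: ContentFilterResults = {
  hate: { filtered: false, severity: 'safe' },
  violence: { filtered: true, severity: 'medium' },
};

console.log(flaggedCategories(example)); // prints [ 'violence' ]
```

A helper like this keeps your moderation policy (which categories and severities to act on) in one place, separate from the request code.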
Is the Responses API compatible with the OpenAI SDK?
Not directly. The OpenAI SDK does not natively support the Responses API. For OpenAI-compatible access, use the Chat API instead. The Skytells SDK provides full support for the Responses API.
How do I debug or monitor Responses API usage?
Skytells provides detailed response objects that include usage information (token counts), content safety evaluations, and error messages when applicable. You can log these details in your application for monitoring and debugging purposes. Additionally, Skytells' dashboard offers analytics and insights into your API usage, including breakdowns of which models you're using, how many tokens are being processed, and any safety filter triggers. For more information on the response schema and safety features, see the Response Objects reference and the Safety Types documentation.
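As a sketch of application-side logging, the helper below formats a single log line from a response's usage data. The field names (`usage.input_tokens`, `usage.output_tokens`) are assumptions about the Response schema, used here for illustration only:

```typescript
// Format a monitoring log line from a response's usage metadata.
// The usage field names are assumed for illustration, not confirmed.
interface ResponseUsage {
  input_tokens: number;
  output_tokens: number;
}

interface MonitoredResponse {
  id: string;
  model: string;
  usage: ResponseUsage;
}

function usageLogLine(response: MonitoredResponse): string {
  const total = response.usage.input_tokens + response.usage.output_tokens;
  return (
    `[${response.id}] model=${response.model} ` +
    `in=${response.usage.input_tokens} out=${response.usage.output_tokens} total=${total}`
  );
}

// Hypothetical response, standing in for a real API result
const sample: MonitoredResponse = {
  id: 'resp_123',
  model: 'skytells-3',
  usage: { input_tokens: 42, output_tokens: 128 },
};

console.log(usageLogLine(sample));
// prints "[resp_123] model=skytells-3 in=42 out=128 total=170"
```

Emitting one structured line per turn makes it easy to aggregate token counts per model in whatever log pipeline you already run, complementing the dashboard analytics.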
Chat Objects (REF)
Type definitions for every object the Chat Completions API emits: ChatCompletion, ChatCompletionChunk, ChatMessage, ContentFilterResults, PromptFilterResults, ChatCompletionUsage.
Create Response (POST)
POST /v1/responses: full parameter reference with code examples, streaming events, tool calling, and OpenAPI spec.