Conversations API
Reference overview for the Chat Completions sub-API — POST /v1/chat/completions.
The Chat API, also known as the Conversations API, is part of the Inference APIs and provides a turn-based conversation model. It accepts a history of messages and returns the model's next reply, either synchronously or as a stream of SSE chunks, with built-in safety evaluations. The Chat API is ideal for traditional conversational use cases where you maintain a message history and receive the model's response in a chat format. It is fully compatible with the OpenAI Chat Completions API, with Skytells-specific additions for content safety filtering and jailbreak detection. For a modern alternative to chat completions with persistent memory and richer streaming events, see the Responses API; for safety features, see Safety and Responsible AI.
- Endpoint: `POST /v1/chat/completions`
- SDK access: `client.chat.completions.create(params)`
- Streaming: send `"stream": true` — returns `AsyncIterable<ChatCompletionChunk>` (SDK) or SSE (REST)
- OpenAI-compatible: yes — same schema, augmented with Skytells `content_filter_results` and `prompt_filter_results`
How it works
The Conversations API (also known as the Chat API) is designed for turn-based conversations. You send a list of messages representing the conversation history, and the model generates the next reply. Each message has a role (system, user, or assistant) and content (text or multimodal). The API processes these messages in order and produces a response that continues the conversation. You can receive the response as a complete ChatCompletion JSON object, or as a stream of incremental updates (SSE) for real-time applications.
As part of Skytells’ commitment to safety, responsible AI, and the proper use of AI technologies, the API includes built-in content safety evaluations for both input prompts and generated completions, helping you detect, monitor, and filter harmful content.
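The turn-based model above is plain data: keep an array of messages, send the whole array each turn, and append the assistant's reply before the next turn. A minimal sketch in TypeScript — the `client.chat.completions.create` call is shown only as a comment since it needs a live API key, and the reply here is a stand-in value, not real model output:

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// The full conversation history is resent on every turn.
const messages: ChatMessage[] = [
  { role: 'system', content: 'You are a helpful assistant.' },
];

// Add the user's turn, then (in a real app) call the API:
//   const completion = await client.chat.completions.create({ model: 'deepbrain-router', messages });
//   const reply = completion.choices[0].message;
messages.push({ role: 'user', content: 'What is the capital of France?' });
const reply: ChatMessage = { role: 'assistant', content: 'Paris' }; // stand-in for the model's answer

// Append the reply so the next turn sees the whole history.
messages.push(reply);
console.log(messages.length); // 3 messages now form the history for turn two
```

Because the API is stateless, this client-side array is the only conversation state there is; nothing is stored server-side between calls.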
When to use Chat API vs Responses API
| Use Case | Recommendation |
|---|---|
| Simple Q&A, stateless tasks | Chat API — send message history each turn |
| Multi-turn agents, memory across calls | Responses API — use previous_response_id |
| Tool-using agents with persistent context | Responses API — store: true required |
| Low-latency, token-efficient | Chat API (no server storage overhead) |
Create a Chat Completion
Full endpoint reference — every request parameter, response shape, streaming format, and multi-client code examples.
Chat Objects
Named type definitions: ChatCompletion, ChatCompletionChunk, ChatMessage, ContentFilterResults, PromptFilterResults, ChatCompletionUsage.
Quick Example
Create a chat completion
```typescript
import Skytells from 'skytells';

const client = Skytells(process.env.SKYTELLS_API_KEY);

const completion = await client.chat.completions.create({
  model: 'deepbrain-router',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of France?' },
  ],
});

console.log(completion.choices[0].message.content); // "Paris"
console.log(completion.usage.total_tokens);
```

Returns a ChatCompletion object, or a stream of ChatCompletionChunk objects when `stream: true`.
Conversation API FAQs
Which models can I use with the Chat API?
The Chat API supports all of Skytells' language models along with models from our partners, including general-purpose models like `deepbrain-router` and `gpt-5-nano`, as well as specialized models for tasks like coding, reasoning, or multimodal input. Refer to the Model Catalog for the latest list of available models and their capabilities. When creating a chat completion, specify the desired model in the `model` parameter of your request.
Can I use the Chat API for real-time applications?
Yes! By setting `"stream": true` in your request, you can receive the model's response as a stream of SSE chunks, allowing you to display the response incrementally as it's generated. For even faster responses (under 2 seconds), consider using fast models like `deepbrain-router` or `gpt-5-nano`.
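As a sketch of how a client consumes that stream — assuming the OpenAI-compatible chunk shape, where each chunk carries a `choices[0].delta.content` fragment — the helper below concatenates deltas into the full reply; the mock generator stands in for the `AsyncIterable` the SDK returns:

```typescript
type ChatCompletionChunk = {
  choices: { delta: { content?: string } }[];
};

// Accumulate streamed delta fragments into the complete reply text.
async function collectStream(stream: AsyncIterable<ChatCompletionChunk>): Promise<string> {
  let text = '';
  for await (const chunk of stream) {
    text += chunk.choices[0]?.delta.content ?? '';
  }
  return text;
}

// Mock standing in for: await client.chat.completions.create({ ..., stream: true })
async function* mockStream(): AsyncGenerator<ChatCompletionChunk> {
  for (const piece of ['The capital ', 'of France ', 'is Paris.']) {
    yield { choices: [{ delta: { content: piece } }] };
  }
}

const streamedReply = await collectStream(mockStream());
console.log(streamedReply); // "The capital of France is Paris."
```

In a UI you would typically render each delta as it arrives rather than waiting for the accumulated string.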
How does the Chat API handle content safety?
The Chat API includes built-in content safety evaluations for both input prompts and generated completions. Each response includes `content_filter_results` for the generated completion and `prompt_filter_results` for the input messages, which categorize and rate the severity of any potentially harmful content. You can use this information to implement your own filtering logic or to monitor the safety of interactions. For more details, see the Safety Types and Responsible AI documentation.
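The exact shape of `content_filter_results` is documented in the Chat Objects reference; assuming a category-to-rating map (the category names and severity scale below are illustrative, not confirmed field names), a custom filtering pass might look like:

```typescript
type FilterRating = { filtered: boolean; severity: 'safe' | 'low' | 'medium' | 'high' };
type ContentFilterResults = Record<string, FilterRating>;

// Collect the categories whose rating crosses our own threshold.
function flaggedCategories(results: ContentFilterResults): string[] {
  const blocked: FilterRating['severity'][] = ['medium', 'high'];
  return Object.entries(results)
    .filter(([, rating]) => rating.filtered || blocked.includes(rating.severity))
    .map(([category]) => category);
}

// Illustrative sample; real category names come from the API response.
const sample: ContentFilterResults = {
  hate: { filtered: false, severity: 'safe' },
  violence: { filtered: true, severity: 'medium' },
};
console.log(flaggedCategories(sample)); // ["violence"]
```

The same helper applies unchanged to `prompt_filter_results` if it shares this per-category structure.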
Is the Chat API compatible with the OpenAI Chat Completions API?
Yes! The Chat API follows the same schema as OpenAI's Chat Completions API, with Skytells-specific additions for content safety filtering. You can use the OpenAI SDK with Skytells by simply changing the `baseURL` to `https://api.skytells.ai/v1`. All standard parameters and response formats are supported, along with Skytells' enhanced safety features, which are exposed through Skytells-specific fields in the response. These fields can be safely ignored when using the OpenAI SDK, but you can take full advantage of them with the Skytells SDKs. For more details on the schema and safety features, see the Chat Objects reference.
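A minimal configuration sketch of that swap, assuming the official `openai` npm package — only the `baseURL` and API key change; everything else is standard OpenAI SDK usage:

```typescript
import OpenAI from 'openai';

// Point the official OpenAI SDK at the Skytells-compatible endpoint.
const client = new OpenAI({
  apiKey: process.env.SKYTELLS_API_KEY,
  baseURL: 'https://api.skytells.ai/v1',
});

// From here on, client.chat.completions.create(...) works as usual;
// Skytells-specific fields appear as extra properties on the response.
```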
Which Skytells API handles requests to the Chat API?
The Chat API is part of the Skytells Inference APIs, which also include the Responses API and the Embeddings API.
How do I debug or monitor Chat API usage?
Skytells provides detailed response objects that include usage information (token counts), content safety evaluations, and error messages when applicable. You can log these details in your application for monitoring and debugging purposes. Additionally, Skytells' dashboard offers analytics and insights into your API usage, including breakdowns of which models you're using, how many tokens are being processed, and any safety filter triggers. For more information on the response schema and safety features, see the Chat Objects reference and the Safety Types documentation.
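As a sketch of that logging, assuming the OpenAI-compatible `usage` shape (`prompt_tokens`, `completion_tokens`, `total_tokens`), a small helper can summarize each response for your application logs:

```typescript
type ChatCompletionUsage = {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
};

// One-line summary suitable for structured application logs.
function summarizeUsage(model: string, usage: ChatCompletionUsage): string {
  return `${model}: ${usage.prompt_tokens} prompt + ${usage.completion_tokens} completion = ${usage.total_tokens} tokens`;
}

// Illustrative values; in practice pass completion.model and completion.usage.
const logLine = summarizeUsage('deepbrain-router', {
  prompt_tokens: 21,
  completion_tokens: 5,
  total_tokens: 26,
});
console.log(logLine); // "deepbrain-router: 21 prompt + 5 completion = 26 tokens"
```

Emitting this per request gives you the same per-model token breakdown the dashboard aggregates, but inside your own logging pipeline.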