Embeddings API
Reference overview for the Embeddings sub-API — POST /v1/embeddings.
The Embeddings API is part of the Inference APIs. It converts text into dense vector representations (float arrays) that capture semantic meaning. Use embeddings for semantic search, nearest-neighbor retrieval, clustering, classification, and retrieval-augmented generation (RAG). Pass a single string or a batch of strings; the API returns one vector per input, synchronously in a single response.
It follows the same request and response shape as the OpenAI Embeddings API, so you can point the OpenAI SDK at Skytells with a baseURL override. Pair embeddings with the Chat API or Responses API when you need to generate answers from retrieved context. For safety features on generative APIs, see Safety and Responsible AI.
- Endpoint: `POST /v1/embeddings`
- SDK access: `client.embeddings.create(params)`
- Input: one string or `string[]` — multiple inputs are embedded in one request (batching)
- OpenAI-compatible: yes — same schema; use `baseURL: 'https://api.skytells.ai/v1'` with the OpenAI client
How it works
You send a model identifier and an input (string or list of strings). The service returns an EmbeddingResponse whose data array contains one Embedding per input, each with a vector of floats. You can request "encoding_format": "float" (default) or "base64" for compact transfer, and use dimensions on supported models to truncate vectors. Compare vectors with cosine similarity or store them in a vector database for search.
Embeddings turn text into coordinates in a semantic space: texts with similar meaning tend to land closer together, which powers search and retrieval without keyword matching alone.
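As a sketch of the shapes described above: the field names below follow the OpenAI-compatible schema the section describes, but the authoritative TypeScript definitions live in Embedding Objects.

```typescript
// Minimal sketch of the response shapes (assumed from the OpenAI-compatible schema).
interface Embedding {
  object: 'embedding';
  index: number;        // position of the matching input
  embedding: number[];  // the vector, with "encoding_format": "float"
}

interface EmbeddingUsage {
  prompt_tokens: number;
  total_tokens: number;
}

interface EmbeddingResponse {
  object: 'list';
  model: string;
  data: Embedding[];    // one Embedding per input
  usage: EmbeddingUsage;
}

// Example payload for a single input:
const sample: EmbeddingResponse = {
  object: 'list',
  model: 'skytells-embed-3-large',
  data: [{ object: 'embedding', index: 0, embedding: [0.1, 0.2, 0.3] }],
  usage: { prompt_tokens: 10, total_tokens: 10 },
};
```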
When to use Embeddings API vs Chat or Responses API
| Use Case | Recommendation |
|---|---|
| Semantic search, duplicate detection, clustering | Embeddings API — index and compare vectors |
| RAG: find documents, then generate an answer | Embeddings for retrieval + Chat API or Responses API for generation |
| Conversation, Q&A, tool use | Chat API or Responses API — not embeddings alone |
| Stateless chat with OpenAI SDKs | Chat API — embeddings are for vectors, not dialogue |
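The RAG row above splits into two steps: embed and rank for retrieval, then generate. A minimal sketch of the ranking step, using toy 2-D vectors in place of real embeddings (in practice each vector would come from the Embeddings API):

```typescript
// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const norm = (x: number[]) => Math.sqrt(x.reduce((sum, v) => sum + v * v, 0));
  return dot / (norm(a) * norm(b));
}

// Rank stored document vectors against a query vector, keep the top k.
function topK(
  query: number[],
  docs: { id: string; vector: number[] }[],
  k: number,
) {
  return docs
    .map((d) => ({ id: d.id, score: cosine(query, d.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Toy example: the query points in roughly the same direction as doc "a".
const hits = topK(
  [1, 0],
  [
    { id: 'a', vector: [0.9, 0.1] },
    { id: 'b', vector: [0.1, 0.9] },
  ],
  1,
);
```

The retrieved ids would then be used to fetch document text and build the prompt for the Chat API or Responses API.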
Create Embeddings
Full endpoint reference — every request parameter, response shape, encoding options, and multi-client code examples.
Embedding Objects
Named type definitions: EmbeddingResponse, Embedding, EmbeddingUsage, and related fields.
Quick Example
Create embeddings
import Skytells from 'skytells';
const client = Skytells(process.env.SKYTELLS_API_KEY);
const result = await client.embeddings.create({
model: 'skytells-embed-3-large',
input: 'The quick brown fox jumps over the lazy dog.',
});
const vector = result.data[0].embedding;
// An array of 3072 floats (number[] with the default float encoding)
console.log(vector.length); // 3072
console.log(vector[0]); // e.g. 0.0023064255
// For semantic similarity: compute cosine similarity between two vectors
function cosineSimilarity(a: number[], b: number[]) {
const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
const magA = Math.sqrt(a.reduce((sum, v) => sum + v * v, 0));
const magB = Math.sqrt(b.reduce((sum, v) => sum + v * v, 0));
return dot / (magA * magB);
}

Returns an EmbeddingResponse containing a list of Embedding items and EmbeddingUsage token counts.
Embeddings API FAQs
Which models can I use with the Embeddings API?
The Embeddings API supports Skytells embedding models (for example `skytells-embed-3-large`) and partner models where offered. See the Model Catalog for dimensions, limits, and availability. Set the `model` parameter on each request.
Can I embed multiple texts in one request?
Yes. Pass `input` as an array of strings to embed several texts in a single call. The response `data` array aligns with your inputs by `index`.
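A sketch of reading a batched response back out by `index`; the response object here is mocked, and in a real call it would come from `client.embeddings.create({ model, input: texts })`:

```typescript
const texts = ['first text', 'second text'];

// Mocked response body, deliberately out of order to show why the
// index field, not array position, is the reliable key.
const mockResponse = {
  data: [
    { index: 1, embedding: [0.3, 0.4] },
    { index: 0, embedding: [0.1, 0.2] },
  ],
};

// Map each input string to its vector via the index field.
const byInput = new Map<string, number[]>();
for (const item of mockResponse.data) {
  byInput.set(texts[item.index], item.embedding);
}
```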
What is the difference between float and base64 encoding?
`"encoding_format": "float"` (default) returns each embedding as a JSON array of numbers. `"base64"` returns the same bytes encoded as Base64, which can reduce payload size for large batches. Decode the Base64 on the client to recover the float buffer if needed.
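A round-trip sketch of the Base64 decode step, assuming the buffer holds little-endian 32-bit floats (the common convention for OpenAI-compatible embeddings APIs). Node.js `Buffer` is used here; in a browser you would use `atob` and a `DataView` instead:

```typescript
// Simulate the server side: a float32 vector serialized to Base64.
const floats = new Float32Array([0.25, -0.5, 0.125]);
const asBase64 = Buffer.from(floats.buffer).toString('base64');

// Client side: decode the Base64 string back into floats.
const bytes = Buffer.from(asBase64, 'base64');
const decoded = new Float32Array(
  bytes.buffer,
  bytes.byteOffset,          // Buffers may sit inside a pooled ArrayBuffer
  bytes.byteLength / 4,      // 4 bytes per float32
);
```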
Is the Embeddings API compatible with the OpenAI Embeddings API?
Yes. The endpoint and fields match OpenAI’s embeddings shape. Use the OpenAI SDK with `baseURL` set to `https://api.skytells.ai/v1` and your Skytells API key. Behavior and model IDs follow Skytells’ catalog.
Which Skytells API handles requests to the Embeddings API?
The Embeddings API is part of the Skytells Inference APIs, alongside the Chat API, Responses API, and related inference endpoints.
How do I debug or monitor Embeddings API usage?
Responses include `usage` with token counts you can log. Use your application logs for per-request debugging and the Skytells dashboard for aggregate usage and model breakdowns. For field-level detail, see Embedding Objects.
Responses Objects (REF)
Type definitions for every object the Responses API emits — Response, OutputItem, ContentPart, ContentFilter, ResponsesStreamEvent, ResponsesUsage.
Create Embeddings (POST)
POST /v1/embeddings — full parameter reference with code examples for the Skytells SDK, REST, and OpenAI client.