Enterprise Deployments
Unlike standard endpoints, Enterprise customers get fully dedicated endpoints secured and managed by Skytells—reachable privately via Skytells Private Network or over the internet, per org settings. Contact Sales for a custom deployment or endpoint.
What this is
Most customers use the shared Skytells API: one public surface, shared capacity.
Enterprise inference means Skytells runs your models—or models we set up for you—on infrastructure reserved for your organization. You still use the normal Inference APIs (Chat, Responses, Embeddings). The difference is where the request runs and who else shares it: your traffic goes to your environment, not the general pool.
For CPU and GPU tiers, billing, and hardware IDs used with dedicated inference, see Enterprise compute.
Standard vs Enterprise
Unlike the standard inference endpoints—the shared surface used by most customers—Enterprise customers receive fully dedicated endpoints: inference URLs and routing reserved for your organization, not shared with other tenants.
Those dedicated endpoints are secured and managed by Skytells. Skytells operates the stack, applies platform security controls, and keeps the service current; you integrate against the endpoint, not raw infrastructure.
Network access is configured per user or organization settings:
- Private — The endpoint is reachable only on Skytells Private Network. Traffic does not traverse the public internet, which supports stricter isolation and compliance requirements.
- Internet — The same dedicated endpoint can be exposed on the public internet when your policy and contract allow it, so client applications can call it from outside Skytells’s private network.
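One way this choice can surface in client code is a small selector keyed off org settings. A sketch only — the environment variable name and both base URLs below are hypothetical; real dedicated URLs come from your contract:

```python
import os

# Hypothetical base URLs for illustration; real dedicated URLs come from your contract.
PRIVATE_BASE = "https://inference.internal.skytells.private"  # reachable only on Skytells Private Network
INTERNET_BASE = "https://your-org.dedicated.skytells.ai"      # public internet, if your org enables it

def select_base_url(network_mode: str) -> str:
    """Pick the dedicated endpoint to call based on org network settings."""
    if network_mode == "private":
        return PRIVATE_BASE
    if network_mode == "internet":
        return INTERNET_BASE
    raise ValueError(f"unknown network mode: {network_mode}")

# Default to the private path unless the org has opted into internet exposure.
base_url = select_base_url(os.environ.get("SKYTELLS_NETWORK_MODE", "private"))
```

Keeping the selection in one place makes it easy to flip a deployment between private-only and internet-exposed without touching call sites.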
How private connectivity differs from the internet
Across major cloud private link designs (for example AWS PrivateLink, Azure Private Link), the pattern is the same in spirit: a private endpoint appears in your network as a private IP (an interface your workloads call). Traffic to the service rides the provider’s private network—not the public internet. At the provider edge, flows are often source-mapped through a NAT or managed address pool so the backend service sees stable, provider-scoped addresses while your internal IPs stay inside your network. Skytells Private Network follows this idea: no public internet hop for the private path; routing and address mapping are operated by Skytells on your behalf.
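The source-mapping idea can be illustrated with a toy NAT table — purely conceptual, with made-up addresses; the real mapping is operated by Skytells and is invisible to clients:

```python
# Toy illustration of source NAT at the provider edge: the backend service
# sees stable, provider-scoped addresses, never your internal IPs.
nat_pool = ["100.64.0.10", "100.64.0.11"]  # provider-managed address pool (made-up)
nat_table: dict[str, str] = {}

def source_map(internal_ip: str) -> str:
    """Map an internal source IP to a stable provider-scoped address."""
    if internal_ip not in nat_table:
        # Assign the next pool address; real NAT also tracks ports and flows.
        nat_table[internal_ip] = nat_pool[len(nat_table) % len(nat_pool)]
    return nat_table[internal_ip]

# Repeated requests from the same workload reuse the same mapped address,
# so the backend sees a stable source while internal IPs stay private.
first = source_map("10.0.1.5")
second = source_map("10.0.1.5")
```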
Private path — from your app, requests travel over Skytells Private Network to your dedicated inference endpoint; no hop touches the public internet.
Internet path — the same dedicated inference surface is reachable over the public internet when your org enables it; clients call the dedicated base URL as they would any HTTPS API.
| | Shared (default) | Enterprise |
|---|---|---|
| Endpoint model | Shared inference surface | Dedicated endpoints for your org |
| Operations | Skytells-managed shared pool | Secured and managed by Skytells — dedicated to you |
| Network | Public internet | Skytells Private Network only, internet, or both — per org / user settings |
| Base URL | Standard api.skytells.ai surface | Dedicated base URL for your deployment |
Models and API surface
Your dedicated base URL exposes the same Inference API as the shared platform—Chat, Responses, and Embeddings—with the same paths, schemas, streaming behavior, and error model; only the host (and thus base_url in clients) changes. For operations, references, and SDK usage, see Inference and Authentication. Runtime hardware is covered under Enterprise compute.
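Because only the host changes, moving to a dedicated deployment is typically a one-line configuration change. A minimal sketch — the dedicated hostname and the API path shown are hypothetical; use the values from your contract and the SDK docs:

```python
def chat_url(base_url: str, path: str = "/v1/chat/completions") -> str:
    """Same API path on either surface; only the host differs.

    The path here is illustrative, not a confirmed Skytells route.
    """
    return base_url.rstrip("/") + path

shared = chat_url("https://api.skytells.ai")
dedicated = chat_url("https://your-org.dedicated.skytells.ai")  # hypothetical dedicated host
```

Everything else — request schemas, streaming, error handling — stays identical, so existing client code continues to work against the dedicated base URL.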
Any supported model in our catalog can be deployed on Enterprise infrastructure for your organization. Beyond that, an enterprise or business may request a custom deployment—bring-your-own weights, fine-tunes, or other Skytells-provisioned stacks—scoped to your account. What appears on your endpoint is reflected in the Models API for keys authorized to use it, and is agreed in your contract.
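Conceptually, the Models API filters the catalog down to what a given key is authorized for: public catalog models plus any custom deployments scoped to that key's organization. A toy sketch — the field names and model IDs below are invented for illustration:

```python
def visible_models(catalog: list[dict], key_org: str) -> list[str]:
    """Return model IDs a key can see: public models plus org-scoped custom deployments."""
    return [
        m["id"]
        for m in catalog
        if m.get("public") or m.get("org") == key_org
    ]

# Hypothetical catalog mixing a public model with an org-scoped custom deployment.
catalog = [
    {"id": "skytells-chat-1", "public": True},
    {"id": "acme-finetune-7b", "public": False, "org": "acme"},
]
```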
Request a deployment or endpoint
Custom deployments and dedicated endpoints are not available through self-serve signup. To request a custom model deployment or a dedicated endpoint, contact Sales. The team will walk through your model, networking (private vs internet), hardware, and compliance requirements.
Related
- Enterprise compute — hardware tiers, GPU options, and billing context for dedicated inference
- Inference API — Chat, Responses, and Embeddings overview
- Models API — which models your key can see