# Running Custom Models on Skytells
Skytells offers CPU and GPU hardware tiers for dedicated Enterprise inference, with per-second billing and both standard and multi-GPU options. See Enterprise inference for endpoints and networking.
## Overview
This page lists supported hardware for workloads that run on Skytells-managed compute—including Enterprise dedicated inference deployments. Dedicated inference uses the same hardware catalog as the rest of the platform; billing is per second of active processing (idle time is not charged). For list pricing, see Hardware Pricing.
If you are setting up endpoints, configuring private networking, or requesting a deployment, start with Enterprise inference.
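Because billing covers only active processing, the cost of a run can be estimated directly from a tier's per-second list rate. A minimal sketch, using hypothetical rates (the real rates are on Hardware Pricing):

```python
# Estimate the cost of a dedicated-inference run billed per second
# of active processing (idle time is not charged).

# Hypothetical per-second rates in USD, keyed by hardware ID;
# real rates are listed on the Hardware Pricing page.
RATES = {
    "cpu": 0.000100,
    "gpu-t4": 0.000225,
    "gpu-a100-large": 0.001400,
}

def estimate_cost(hardware_id: str, active_seconds: float) -> float:
    """Return the estimated charge for `active_seconds` of processing."""
    return RATES[hardware_id] * active_seconds

# One hour of active processing on a single A100 (80 GB):
print(round(estimate_cost("gpu-a100-large", 3600), 2))  # 5.04 at the rate above
```

The same calculation applies to any tier; only the rate changes.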
## Standard hardware
These tiers are generally available. The ID is how hardware is referenced in billing and in the Console.
| Hardware | ID | GPUs | vCPUs | GPU RAM | RAM |
|---|---|---|---|---|---|
| CPU (Small) | cpu-small | — | 1x | — | 2 GB |
| CPU | cpu | — | 4x | — | 8 GB |
| Nvidia T4 GPU | gpu-t4 | 1x | 4x | 16 GB | 16 GB |
| Nvidia L40S GPU | gpu-l40s | 1x | 10x | 48 GB | 65 GB |
| 2x Nvidia L40S GPU | gpu-l40s-2x | 2x | 20x | 96 GB | 144 GB |
| Nvidia A100 (80 GB) GPU | gpu-a100-large | 1x | 10x | 80 GB | 144 GB |
| 2x Nvidia A100 (80 GB) GPU | gpu-a100-large-2x | 2x | 20x | 160 GB | 288 GB |
| Nvidia H100 GPU | gpu-h100 | 1x | 13x | 80 GB | 72 GB |
## Additional multi-GPU hardware
Larger multi-GPU shapes are available under committed-spend contracts. Contact Support for availability and custom pricing.
| Hardware | ID |
|---|---|
| 4x Nvidia A100 (80 GB) GPU | gpu-a100-large-4x |
| 8x Nvidia A100 (80 GB) GPU | gpu-a100-large-8x |
| 2x Nvidia H100 GPU | gpu-h100-2x |
| 4x Nvidia H100 GPU | gpu-h100-4x |
| 8x Nvidia H100 GPU | gpu-h100-8x |
| 4x Nvidia L40S GPU | gpu-l40s-4x |
| 8x Nvidia L40S GPU | gpu-l40s-8x |
Per-second list pricing for these tiers (where applicable) is on Hardware Pricing. Enterprise deployments may use a subset of tiers depending on model size, latency targets, and contract terms; Sales can align hardware to your workload.
## Related
- Enterprise inference — dedicated endpoints, Skytells Private Network vs internet, requesting a deployment
- Inference API — Chat, Responses, and Embeddings overview
- Hardware Pricing — per-second rates and full pricing tables