Plans and pricing
Pay for what you use. You are billed only for the time your model is actively deploying, scaling up or down, or making predictions. Tune your autoscaling settings to save even more on compute resources.
Select an instance type
T4x4x16
1 T4 GPU, 16 GiB VRAM, 4 vCPUs, 16 GiB RAM
$0.01052/min
T4x16x64
1 T4 GPU, 16 GiB VRAM, 16 vCPUs, 64 GiB RAM
$0.02408/min
A10Gx4x16
1 A10 GPU, 24 GiB VRAM, 4 vCPUs, 16 GiB RAM
$0.02012/min
A10Gx16x64
1 A10 GPU, 24 GiB VRAM, 16 vCPUs, 64 GiB RAM
$0.03248/min
A10G:2x24x96
2 A10 GPUs, 48 GiB VRAM, 24 vCPUs, 96 GiB RAM
$0.05672/min
A100x12x144
1 A100 GPU, 80 GiB VRAM, 12 vCPUs, 144 GiB RAM
$0.10240/min
1x2
1 vCPU, 2 GiB RAM
$0.00058/min
1x4
1 vCPU, 4 GiB RAM
$0.00080/min
2x8
2 vCPUs, 8 GiB RAM
$0.00173/min
4x16
4 vCPUs, 16 GiB RAM
$0.00346/min
8x32
8 vCPUs, 32 GiB RAM
$0.00691/min
16x64
16 vCPUs, 64 GiB RAM
$0.01382/min
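Because rates are quoted per minute, it can help to translate them into an estimate for your own workload. The snippet below is a minimal sketch, not an official calculator: the `estimate_cost` helper and the example workloads are hypothetical, the rates are copied from a few of the instance types listed above, and an actual invoice may differ (for example, through rounding or discounts).

```python
from decimal import Decimal

# Per-minute rates in USD, copied from the instance list above.
# Only a few instance types are included to keep the sketch short.
RATES_PER_MINUTE = {
    "T4x4x16": Decimal("0.01052"),
    "A10Gx4x16": Decimal("0.02012"),
    "A100x12x144": Decimal("0.10240"),
    "1x2": Decimal("0.00058"),
    "4x16": Decimal("0.00346"),
}


def estimate_cost(instance_type: str, active_minutes: int) -> Decimal:
    """Estimate compute cost in USD for the minutes a model is active.

    Hypothetical helper: it multiplies the listed per-minute rate by the
    number of active minutes and applies no rounding or discounts.
    """
    return RATES_PER_MINUTE[instance_type] * active_minutes


# Hypothetical workload 1: a T4x4x16 model active around the clock
# for a 30-day month (30 * 24 * 60 = 43,200 minutes).
always_on = estimate_cost("T4x4x16", 30 * 24 * 60)

# Hypothetical workload 2: the same model active 8 hours a day
# (30 * 8 * 60 = 14,400 minutes).
eight_hours = estimate_cost("T4x4x16", 30 * 8 * 60)

print(always_on)    # 454.46400
print(eight_hours)  # 151.48800
```

In this sketch, the model that is active only 8 hours a day costs one third as much as the always-on model, which is the saving that per-minute billing and tuned autoscaling are meant to make possible.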
Choose the plan that's right for you
Startup
$0 per month, just pay for compute
Included in Startup:
Unlimited models and versions
All Baseten features enabled
HIPAA and SOC 2 compliance
Up to 5 workspace users
Pro
Get a custom quote
Everything in Startup plus:
Discounted model resources
Data privacy agreements
Dedicated engineering support
Unlimited workspace users
Self Hosted
Get a custom quote
Everything in Pro plus:
Self-hosted models on your cloud
Multi-stage proof of concept
Live engineering support
Commonly asked questions
- Baseten is the simplest way to put a model behind an API or webapp hosted on fully managed, scalable infrastructure.
- You have control over which GPUs your models use. We currently offer NVIDIA T4, A10, V100, and A100 GPUs. Contact us to learn more or to request additional GPU types.
- Our servers are located in AWS data centers on the U.S. West Coast. More regions are being added to reduce global latency.
- We bill by the minute for the time your model is active. You control when each model is active, which instance type it uses, and its autoscaling settings. After you use up your free credits, you’ll be asked to add a credit card to your account. At the end of each month, we’ll charge the card on file for your total usage during that month.
- Yes. We offer on-premises deployments on our Enterprise plan. Contact us to learn more.
- Data and workloads are hosted in AWS. All user workloads run in isolated environments, with isolation at both the hardware and network levels.
- Yes, we offer significant volume discounts on model resources. Reach out to us at hi@baseten.co to find out more.
- Yes, we are happy to support ML efforts for education and non-profit organizations. Contact us at hi@baseten.co to learn more.