Plans and pricing 

Pay for what you use. Only pay for the time your model is actively deploying, scaling up or down, or making predictions. Further calibrate autoscaling settings to save even more on compute resources. See all available instance types.

Select an instance type

Instance      GPU      VRAM     vCPUs  RAM       Price
T4x4x16       1 T4     16 GiB   4      16 GiB    $0.01052/min
L4x4x16       1 L4     24 GiB   4      16 GiB    $0.01414/min
A10Gx4x16     1 A10G   24 GiB   4      16 GiB    $0.02012/min
A100x12x144   1 A100   80 GiB   12     144 GiB   $0.10240/min
H100x26x234   1 H100   80 GiB   26     234 GiB   $0.16640/min
1x2           -        -        1      2 GiB     $0.00058/min
1x4           -        -        1      4 GiB     $0.00080/min
2x8           -        -        2      8 GiB     $0.00173/min
4x16          -        -        4      16 GiB    $0.00346/min
8x32          -        -        8      32 GiB    $0.00691/min
16x64         -        -        16     64 GiB    $0.01382/min
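
Because rates are quoted per minute, a small calculation helps put them in monthly terms. The sketch below is illustrative only: the rates are copied from the instance list above, while the `estimate_cost` helper and the usage patterns (always on versus four active hours a day over a 30-day month) are assumptions made for the example, not part of any Baseten tooling.

```python
# Per-minute rates in USD, copied from the instance list above.
RATES_PER_MIN = {
    "T4x4x16": 0.01052,
    "L4x4x16": 0.01414,
    "A10Gx4x16": 0.02012,
    "A100x12x144": 0.10240,
    "H100x26x234": 0.16640,
    "1x2": 0.00058,
    "1x4": 0.0008,
    "2x8": 0.00173,
    "4x16": 0.00346,
    "8x32": 0.00691,
    "16x64": 0.01382,
}

MINUTES_PER_DAY = 24 * 60


def estimate_cost(instance_type: str, active_minutes: float) -> float:
    """Hypothetical helper: cost in USD for a number of active minutes."""
    if active_minutes < 0:
        raise ValueError("active_minutes must be non-negative")
    return RATES_PER_MIN[instance_type] * active_minutes


# Assumed usage patterns over a 30-day month on a single T4 instance.
always_on = estimate_cost("T4x4x16", 30 * MINUTES_PER_DAY)
four_hours_a_day = estimate_cost("T4x4x16", 30 * 4 * 60)

print(f"T4 always on:          ${always_on:.2f}")
print(f"T4 active 4 hours/day: ${four_hours_a_day:.2f}")
print(f"A10G for one hour:     ${estimate_cost('A10Gx4x16', 60):.4f}")
```

The gap between the first two figures is the saving available from autoscaling settings that keep the model inactive when there is no traffic.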

Choose the plan that's right for you

Startup

$0 per month, just pay for compute

Included in Startup:

Unlimited models and versions
All Baseten features enabled
HIPAA and SOC 2 compliance
Up to 5 workspace users

Pro

Get a custom quote

Everything in Startup plus:

Discounted model resources
Data privacy agreements
Dedicated engineering support
Unlimited workspace users

Self Hosted

Get a custom quote

Everything in Pro plus:

Self-hosted models on your cloud
Multi-stage proof of concept
Live engineering support

Commonly asked questions

  • Baseten is the simplest way to put a model behind an API or webapp hosted on fully managed, scalable infrastructure.
  • You control which GPUs your models use. We currently offer NVIDIA T4, A10, V100, and A100 GPUs. Contact us to learn more or to request additional GPU types.
  • Our servers are located on the U.S. west coast in AWS data centers. More regions are being added to reduce global latency.
  • We bill for the time your model is active, by the minute. You have control over when each model is active, resource instance type, and autoscaling settings. After you use up your free credits, you’ll be asked to add a credit card to your account. At the end of each month, we’ll charge the card on file for your total usage throughout that month.
  • Yes. We offer on-premise deployments on our Enterprise plan. Contact us to learn more.
  • Data and workloads are hosted in AWS. All user workloads run in isolated environments, with isolation at both the hardware and network levels.
  • Yes, we offer significant volume discounts on model resources. Reach out to us at hi@baseten.co to find out more.
  • Yes, we are happy to support ML efforts for education and non-profit organizations. Contact us at hi@baseten.co to learn more.
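
The billing answer above (per-minute charges for active time, free credits first, then a single monthly charge to the card on file) can be sketched as a short calculation. This is a minimal illustration under stated assumptions: the `ModelUsage` type, the `monthly_charge` function, the active-minute figures, and the credit amounts are all hypothetical, and it assumes free credits are simply subtracted from the month's total usage.

```python
from dataclasses import dataclass


@dataclass
class ModelUsage:
    """Hypothetical record of one model's active time in a month."""
    model: str
    rate_per_min: float    # USD per minute for the chosen instance type
    active_minutes: float  # minutes the model was active this month


def monthly_charge(usages: list[ModelUsage], free_credits: float = 0.0) -> float:
    """Total usage for the month minus remaining free credits, never below zero."""
    total = sum(u.rate_per_min * u.active_minutes for u in usages)
    return max(0.0, total - free_credits)


# Example month: one GPU model and one CPU model (minutes are made up).
usages = [
    ModelUsage("image-model", rate_per_min=0.01052, active_minutes=1200),  # T4x4x16
    ModelUsage("tabular-model", rate_per_min=0.00346, active_minutes=3000),  # 4x16
]

print(f"Usage this month:       ${monthly_charge(usages):.2f}")
print(f"Charged after credits:  ${monthly_charge(usages, free_credits=10.0):.2f}")
```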