Plans and pricing

Pay for what you use

You pay only for the time your model is actively deploying, scaling up or down, or making predictions. Tune your autoscaling settings to reduce compute costs further.

Compute costs

Start with $30 of free credit!
Instance (GPU memory), vCPUs, RAM: price per minute
CPU only: $0.00096 /min
T4 (16 GiB), 4 vCPU, 16 GiB: $0.01753 /min
A10 (24 GiB), 4 vCPU, 16 GiB: $0.03353 /min
V100 (16 GiB), 8 vCPU, 61 GiB: $0.10200 /min
A100 (80 GiB), 12 vCPU, 144 GiB: $0.17083 /min

Pay-per-minute pricing

Estimate your monthly cost from your model's anticipated uptime, its autoscaling settings, and whether you plan to add a GPU.
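As a rough illustration of per-minute pricing, the sketch below multiplies a per-minute rate by the number of active minutes in a month. The rates are the ones listed under Compute costs; the function name, its parameters, and the idea that each running replica is billed at the full per-minute rate are assumptions made for this example, not a description of the actual billing system.

```python
# Per-minute rates in USD, copied from the Compute costs list on this page.
RATES_PER_MIN = {
    "CPU only": 0.00096,
    "T4": 0.01753,
    "A10": 0.03353,
    "V100": 0.10200,
    "A100": 0.17083,
}


def estimate_monthly_cost(instance: str,
                          active_hours_per_day: float,
                          days: int = 30,
                          avg_replicas: float = 1.0) -> float:
    """Estimate a month's compute cost in USD (hypothetical helper).

    Assumption: every replica that is running is billed at the listed
    per-minute rate, and a model scaled to zero costs nothing.
    """
    active_minutes = active_hours_per_day * 60 * days * avg_replicas
    return round(RATES_PER_MIN[instance] * active_minutes, 2)


# A T4 model that is active 8 hours a day for 30 days:
# 8 * 60 * 30 = 14,400 minutes at $0.01753 per minute.
print(estimate_monthly_cost("T4", active_hours_per_day=8))  # 252.43
```

Under the same assumptions, an always-on CPU-only model (24 hours a day for 30 days) comes to about $41.47, which shows how much of the total depends on uptime rather than on the rate alone.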


Choose the plan that's right for you


For starting up and scaling up

plus compute
per month
Included in Startup
Unlimited models and versions
Flexible model resource options
Model performance metrics
Draft models
Autoscaling with scale to zero
Up to 5 users
And more
Get started with $30 of free credit

For custom engagements and support at scale

Get a custom quote
Everything in Startup, plus:
Volume discounts on model resources
Multi-tenant and data segregation
Self-hosted Baseten
Data privacy agreements
Custom proof-of-concept
Unlimited users
And more
Contact sales


Commonly asked questions

What does Baseten do?

Baseten is the simplest way to put a model behind an API or webapp hosted on fully managed, scalable infrastructure.

What GPUs are available?

You control which GPUs your models use. We currently offer NVIDIA T4, A10, V100, and A100 GPUs. Contact us to learn more or to request additional GPU types.

What is the latency?

Our servers are located in AWS data centers on the U.S. West Coast. We are adding more regions to reduce global latency.

How does billing work?

We bill by the minute for the time your model is active. You control when each model is active, which instance type it runs on, and its autoscaling settings. After you use up your free credits, you’ll be asked to add a credit card to your account. At the end of each month, we’ll charge the card on file for your total usage throughout that month.
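To make the billing flow concrete, here is a minimal sketch of how a month-end charge could be worked out once free credit is taken into account. The $30 figure comes from this page; the function name and the assumption that credit is applied to usage before the card is charged, with any remainder carrying over, are illustrative only.

```python
FREE_CREDIT_USD = 30.00  # free credit amount stated on this page


def month_end_charge(usage_usd: float,
                     credit_remaining: float = FREE_CREDIT_USD) -> tuple[float, float]:
    """Return (amount charged to the card, credit left over) in USD.

    Assumption: free credit is consumed first, and only usage beyond
    the remaining credit is charged to the card on file.
    """
    credit_applied = min(usage_usd, credit_remaining)
    charge = round(usage_usd - credit_applied, 2)
    credit_left = round(credit_remaining - credit_applied, 2)
    return charge, credit_left


# $41.47 of usage against the full $30 credit leaves $11.47 to charge.
print(month_end_charge(41.47))  # (11.47, 0.0)

# $12.50 of usage is fully covered, with $17.50 of credit left.
print(month_end_charge(12.50))  # (0.0, 17.5)
```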

Can I run my models on my own infrastructure?

Yes. We offer on-premises deployments on our Enterprise plan. Contact us to learn more.

Is Baseten secure? Where are my models hosted?

Data and workloads are hosted in AWS. All user workloads run in isolated environments, with isolation at both the hardware and network levels.

Do you offer volume discounts?

Yes, we offer significant volume discounts on model resources. Contact our sales team (link above) to find out more.

Do you offer education and non-profit discounts?

Yes, we are happy to support ML efforts for education and non-profit organizations. Contact us to learn more.