Plans and pricing

Pay for what you use

Pay only for the time your model is actively deploying, scaling up or down, or making predictions. Tune your autoscaling settings to reduce compute costs even further.

Compute costs

Start with $30 of free credit!
starting at $0.00096/min
Tesla T4: starting at $0.01753/min
NVIDIA A10G: starting at $0.03353/min
Tesla V100: starting at $0.10200/min

Choose the plan that's right for you


Startup

For starting up and scaling up

Priced per month, plus compute

Included in Startup:
Unlimited models and versions
Flexible model resource options
Model performance metrics
Draft models
Up to 5 users
And more

Enterprise

For custom engagements and support at scale

Get a custom quote
Everything in Startup, plus:
Volume discounts on model resources
Multi-tenancy and data segregation
Self-hosted Baseten
Data privacy agreements
Custom proof-of-concept
Unlimited users
And more
Contact sales

Commonly asked questions

What does Baseten do?

Baseten is the simplest way to put a model behind an API or web app, hosted on fully managed, scalable infrastructure.

What GPUs are available?

You control which GPUs your models use. We currently offer NVIDIA T4, NVIDIA A10G, and NVIDIA V100 GPUs. Contact us to learn more.

What is the latency?

Our servers are located in AWS data centers on the U.S. West Coast. We are adding more regions to reduce latency for users elsewhere in the world.

How does billing work?

We bill by the minute for the time your model is active. You control when each model is active, which instance type it runs on, and its autoscaling settings. When you first start using model resources, you’ll be asked to add a credit card to your account. At the end of each month, we’ll charge the card on file for your total usage during that month.
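As a rough illustration of per-minute billing, the sketch below estimates a monthly charge from a model's active minutes. It is not an official calculator. The rates are the "starting at" prices from the Compute costs list, assuming each GPU name is paired with the price listed directly after it. The function name `estimate_monthly_cost` and the rounding to whole cents are illustrative assumptions, and the free credit is not applied.

```python
from decimal import Decimal, ROUND_HALF_UP

# "Starting at" per-minute rates from the Compute costs list.
# Assumption: each GPU name is paired with the price listed right after it.
RATE_PER_MINUTE = {
    "T4": Decimal("0.01753"),
    "A10G": Decimal("0.03353"),
    "V100": Decimal("0.10200"),
}


def estimate_monthly_cost(gpu: str, active_minutes: int) -> Decimal:
    """Estimate the charge in dollars for one model, rounded to cents.

    Hypothetical helper: actual invoices may round or aggregate differently.
    """
    if active_minutes < 0:
        raise ValueError("active_minutes must be non-negative")
    cost = RATE_PER_MINUTE[gpu] * active_minutes
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Example: a model on a T4 that is active 2 hours a day for 30 days.
minutes = 2 * 60 * 30  # 3600 active minutes
print(estimate_monthly_cost("T4", minutes))  # 63.11
```

Because billing covers only active time, lowering the number of active minutes (for example, through autoscaling settings) reduces the estimate proportionally.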

Can I run my models on my own infrastructure?

Yes. We offer on-premises deployments on our Enterprise plan. Contact us to learn more.

Is Baseten secure? Where are my models hosted?

Data and workloads are hosted in AWS. All user workloads run in isolated environments, with isolation at both the hardware and network levels.

Do you offer yearly discounts?

Yes. Yearly plans are 10% off. Contact us to set one up.

Do you offer education and non-profit discounts?

Yes. We are happy to support ML efforts at educational and non-profit organizations. Contact us to learn more.