Pricing built for growth
Production inference that won't break your product or your bank.
Deploy custom, fine-tuned, and open-source models
Basic
$0 per month, pay as you go
Included in Basic:
- Deployment options

Pro
Everything in Basic plus:
- Unlimited autoscaling and priority compute access
- Deployment options
- Volume discounts available

Enterprise
Everything in Pro plus:
- Full control in your cloud and ours
Model APIs
Instant access to pre-optimized models running on the Baseten Inference Stack.
Pricing is per 1M tokens, with separate input and output rates for each model.
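As a rough sketch of how per-token billing works in practice: Model APIs expose an OpenAI-compatible interface, and a request is billed on its input (prompt) and output (completion) token counts at that model's per-1M-token rates. The base URL and model slug below are illustrative assumptions, not quoted values; check the model library page for the exact endpoint and slug.

```python
# Hypothetical call to a Model API via an OpenAI-compatible client.
# The base_url and model slug are assumptions for illustration; billing applies
# to usage.prompt_tokens and usage.completion_tokens at the listed per-1M rates.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["BASETEN_API_KEY"],        # your Baseten API key
    base_url="https://inference.baseten.co/v1",   # assumed Model APIs endpoint
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",              # placeholder model slug
    messages=[{"role": "user", "content": "Summarize Baseten in one sentence."}],
    max_tokens=128,
)

print(response.choices[0].message.content)
print(response.usage)  # token counts that the input/output rates apply to
```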
Dedicated Deployments
Only pay for the compute you use, down to the minute.
GPU instance | Specs | Price per minute
T4 | 16 GiB VRAM, 4 vCPUs, 16 GiB RAM | $0.01052
L4 | 24 GiB VRAM, 4 vCPUs, 16 GiB RAM | $0.01414
A10G | 24 GiB VRAM, 4 vCPUs, 16 GiB RAM | $0.02012
A100 | 80 GiB VRAM, 12 vCPUs, 144 GiB RAM | $0.06667
H100 MIG | 40 GiB VRAM, 13 vCPUs, 117 GiB RAM | $0.0625
H100 | 80 GiB VRAM, 26 vCPUs, 234 GiB RAM | $0.10833
B200 | 180 GiB VRAM, 28 vCPUs, 384 GiB RAM | $0.16633
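To make per-minute billing concrete, here is a back-of-the-envelope sketch. The rate comes from the table above; the traffic pattern and replica count are made-up assumptions, not quoted prices.

```python
# Rough cost sketch for a dedicated deployment billed per minute of active compute.
# Rate is from the pricing table; the usage pattern is an assumption for illustration.
H100_PER_MINUTE = 0.10833    # $/minute for one H100 instance

active_hours_per_day = 8     # assumed hours the replica is actively serving traffic
replicas = 1                 # assumed replica count

active_minutes = active_hours_per_day * 60
daily_cost = H100_PER_MINUTE * active_minutes * replicas
monthly_cost = daily_cost * 30

print(f"Daily:   ${daily_cost:,.2f}")    # ~$52.00
print(f"Monthly: ${monthly_cost:,.2f}")  # ~$1,560; idle time is not billed
```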
Training
Get 20% of Training spend back as credits for Dedicated Deployments.
GPU instance | Specs | Price per minute
T4 | 16 GiB VRAM, 4 vCPUs, 16 GiB RAM | $0.01052
L4 | 24 GiB VRAM, 4 vCPUs, 16 GiB RAM | $0.01414
A10G | 24 GiB VRAM, 4 vCPUs, 16 GiB RAM | $0.02012
A100 | 80 GiB VRAM, 12 vCPUs, 144 GiB RAM | $0.06667
H100 MIG | 40 GiB VRAM, 13 vCPUs, 117 GiB RAM | $0.0625
H100 | 80 GiB VRAM, 26 vCPUs, 234 GiB RAM | $0.10833
B200 | 180 GiB VRAM, 28 vCPUs, 384 GiB RAM | $0.16633
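As a worked example of the training credit, consider a hypothetical fine-tuning job (the job size here is an assumption, not a quote): 4 H100s for 10 hours costs roughly $260 at the per-minute rate above, which would return about $52 in Dedicated Deployments credits.

```python
# Sketch of the Training credit: 20% of training spend comes back as
# Dedicated Deployments credits. The job shape below is an illustrative assumption.
H100_PER_MINUTE = 0.10833    # $/minute per H100, from the table above

gpus = 4                     # assumed number of GPUs in the training job
hours = 10                   # assumed wall-clock duration

training_spend = gpus * hours * 60 * H100_PER_MINUTE
deployment_credit = 0.20 * training_spend

print(f"Training spend:     ${training_spend:,.2f}")     # ~$260.00
print(f"Deployment credits: ${deployment_credit:,.2f}")  # ~$52.00
```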
Common questions
You can deploy open-source and custom models on Baseten. Start with an off-the-shelf model from our model library, or deploy any model using Truss, our open-source standard for packaging and serving models built in any framework.
You have control over what GPUs your models use. See our instance type reference for a full list of the GPUs currently available on Baseten. Reach out to us to request additional GPU types.
Yes, new Baseten accounts come with credits so you can get to know the UI and experiment with deployments for free.
Yes, Baseten is SOC 2 Type II certified and HIPAA compliant. You can read more about our SOC 2 Type II certification here and about our HIPAA compliance here.
No, you do not pay for idle time – you only pay for the time your model is using compute on Baseten. This includes the time your model is actively deploying, scaling up or down, or making predictions. And you have full control over how your model scales up or down.
Customer support levels vary by plan. We offer email, in-app chat, Slack, and Zoom support. We also offer dedicated forward-deployed engineering support. Reach out to our team to figure out a customer support level that works for your needs.
Yes, discounts on compute can be negotiated as part of our Pro and Enterprise plans. Reach out to our team to learn more.
Yes, you can self-host Baseten in order to manage security and use your own cloud commitments. Talk to our engineers to learn more.