Pricing built for growth
Production inference that won't break your product or your bank.
Basic
Deploy custom, fine-tuned, and open-source models
Included in Basic:
Dedicated deployments
Model APIs
Fast cold starts
SOC 2 Type II and HIPAA compliant
Email and in-app chat support
Deployment options
$0 per month, pay as you go
Pro
Unlimited autoscaling and priority compute access
Everything in Basic plus:
Priority access to high-demand GPUs
Dedicated compute
Higher Model API rate limits
Hands-on engineering expertise
Dedicated support on Slack and Zoom
Deployment options
Volume discounts available
Enterprise
Full control in your cloud and ours
Everything in Pro plus:
Custom SLAs
Training (Beta)
Self-host deployments
On-demand flex compute
Use existing cloud commitments
Full control over data residency
Advanced security and compliance
Custom global regions
Model APIs
Instant access to pre-optimized models running on the Baseten Inference Stack.
Price per 1M tokens
Model | Input | Output
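As a rough illustration of how per-1M-token pricing turns into a bill, the sketch below multiplies input and output token counts by their respective rates. The function name `model_api_cost` and the prices in the example call are hypothetical placeholders, not actual rates for any model.

```python
from decimal import Decimal

TOKENS_PER_PRICING_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def model_api_cost(input_tokens, output_tokens, input_price_per_1m, output_price_per_1m):
    """Estimate the dollar cost of a workload billed per 1M input and output tokens."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_PRICING_UNIT * Decimal(input_price_per_1m)
    output_cost = Decimal(output_tokens) / TOKENS_PER_PRICING_UNIT * Decimal(output_price_per_1m)
    return input_cost + output_cost


# Placeholder prices for illustration only; substitute the listed rates for a real model.
estimate = model_api_cost(
    input_tokens=2_500_000,
    output_tokens=500_000,
    input_price_per_1m="1.00",
    output_price_per_1m="2.00",
)
print(f"Estimated cost: ${estimate:.2f}")
```

Input and output tokens are priced separately, so workloads that generate long responses can cost noticeably more than workloads with the same total token count but shorter outputs.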
Dedicated Deployments
Only pay for the compute you use, down to the minute.
GPU Instances | Price
Talk to Sales about compute in other countries and regions.
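To make per-minute billing for Dedicated Deployments concrete, here is a minimal sketch that estimates cost from active time. The function name `dedicated_cost`, the per-minute rate in the example, and the choice to round partial minutes up are all assumptions for illustration; the page states only that billing is "down to the minute".

```python
import math
from decimal import Decimal


def dedicated_cost(active_seconds, price_per_minute):
    """Estimate the cost of an instance billed per minute of active compute."""
    # Assumption: a partial minute is billed as a whole minute.
    billed_minutes = math.ceil(active_seconds / 60)
    return billed_minutes * Decimal(price_per_minute)


# Placeholder rate for illustration only; substitute the listed GPU instance price.
estimate = dedicated_cost(active_seconds=125, price_per_minute="0.10")
print(f"Estimated cost: ${estimate:.2f}")
```

Because the cost depends only on active time, an instance that has scaled down to zero contributes nothing to the estimate.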