Pricing built for growth
Production inference that won't break your product or your bank.
Basic
$0 per month, pay as you go
Deploy custom, fine-tuned, and open-source models
Included in Basic:
Deployment options

Pro
Volume discounts available
Unlimited autoscaling and priority compute access
Everything in Basic plus:
Deployment options

Everything in Pro plus:
Full control in your cloud and ours
Model APIs
Instant access to pre-optimized models running on the Baseten Inference Stack.
Prices are listed per 1M tokens, with separate input and output rates for each model.
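As a back-of-the-envelope illustration of how per-1M-token pricing adds up, the sketch below estimates a single request's cost from its input and output token counts. The rates used are made-up placeholders, not actual Model API prices.

```python
# Sketch: estimating Model API cost from token usage.
# The rates below are hypothetical placeholders; real per-1M-token
# prices vary by model.

def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Return the dollar cost of one request, billed per 1M tokens."""
    return (input_tokens / 1_000_000) * input_price_per_1m + \
           (output_tokens / 1_000_000) * output_price_per_1m

# Example: 2,000 input tokens and 500 output tokens at
# $0.50 / 1M input tokens and $1.50 / 1M output tokens (made-up rates).
print(f"${request_cost(2_000, 500, 0.50, 1.50):.6f}")  # -> $0.001750
```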
Dedicated Deployments
Only pay for the compute you use, down to the minute.
GPU instance prices (per minute):
T4 (16 GiB VRAM, 4 vCPUs, 16 GiB RAM): $0.01052
L4 (24 GiB VRAM, 4 vCPUs, 16 GiB RAM): $0.01414
A10G (24 GiB VRAM, 4 vCPUs, 16 GiB RAM): $0.02012
A100 (80 GiB VRAM, 12 vCPUs, 144 GiB RAM): $0.06667
H100 MIG (40 GiB VRAM, 13 vCPUs, 117 GiB RAM): $0.0625
H100 (80 GiB VRAM, 26 vCPUs, 234 GiB RAM): $0.10833
B200 (180 GiB VRAM, 28 vCPUs, 384 GiB RAM): $0.16633
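To make the per-minute billing concrete, here is a small sketch that estimates a dedicated deployment's compute cost from an instance's per-minute rate and its active minutes. It assumes cost is simply rate × active minutes × replicas; the workload numbers (hours per day, replica count) are illustrative, not defaults.

```python
# Sketch: estimating dedicated deployment cost from per-minute GPU pricing.
# Rates come from the table above; the workload numbers are illustrative.

GPU_PRICE_PER_MIN = {
    "T4": 0.01052,
    "L4": 0.01414,
    "A10G": 0.02012,
    "A100": 0.06667,
    "H100 MIG": 0.0625,
    "H100": 0.10833,
    "B200": 0.16633,
}

def compute_cost(instance: str, active_minutes: float, replicas: int = 1) -> float:
    """Cost of running `replicas` instances for `active_minutes` each."""
    return GPU_PRICE_PER_MIN[instance] * active_minutes * replicas

# Example: one H100 replica active 8 hours a day for a 30-day month.
minutes = 8 * 60 * 30
print(f"${compute_cost('H100', minutes):,.2f}")  # -> $1,559.95
```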