Pricing built for growth

Production inference that won't break your product or your bank.

Start building

Talk to an engineer

Trusted by top engineering and machine learning teams

Basic

Deploy custom, fine-tuned, and open-source models

Included in Basic:

Dedicated deployments

Model APIs

Fast cold starts

SOC 2 Type II and HIPAA compliant

Email and in-app chat support

Deployment options

Baseten

$0 per month, pay as you go

Get started

Pro

Unlimited autoscaling and priority compute access

Everything in Basic plus:

Priority access to high-demand GPUs

Dedicated compute

Higher Model API rate limits

Hands-on engineering expertise

Dedicated support on Slack and Zoom

Deployment options

Baseten

Volume discounts available

Get a quote

Enterprise

Full control in your cloud and ours

Everything in Pro plus:

Custom SLAs

Training (Beta)

Self-host deployments

On-demand flex compute

Use existing cloud commitments

Full control over data residency

Advanced security and compliance

Custom global regions

Deployment options

Baseten, Your VPC, Hybrid

Volume discounts available

Get a quote

Pricing

Best-in-class model performance, effortless autoscaling, and fast cold starts help you get the most out of each GPU, which lowers your cost.

Model APIs

Instant access to pre-optimized models running on the Baseten Inference Stack.

Price per 1M tokens

Model: Input / Output

DeepSeek-R1: $2.55 / $5.95

DeepSeek-V3: $0.77

Llama 4 Maverick: $0.19 / $0.72

Llama 4 Scout: $0.13 / $0.50
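As a rough illustration of how per-token pricing adds up, the sketch below estimates the cost of a workload from the input and output prices listed above. The function name `token_cost` and the dictionary layout are hypothetical, not part of any Baseten SDK, and DeepSeek-V3 is left out because only one price is listed for it.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the Model APIs pricing above.
# DeepSeek-V3 is omitted: the listing shows a single price for it.
PRICES_PER_1M = {
    "DeepSeek-R1": (Decimal("2.55"), Decimal("5.95")),
    "Llama 4 Maverick": (Decimal("0.19"), Decimal("0.72")),
    "Llama 4 Scout": (Decimal("0.13"), Decimal("0.50")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def token_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload (hypothetical helper, not an SDK call)."""
    input_price, output_price = PRICES_PER_1M[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 2M input tokens and 500K output tokens on Llama 4 Scout
    print(token_cost("Llama 4 Scout", 2_000_000, 500_000))
```

Using `Decimal` rather than floats keeps the arithmetic exact for currency values. Actual invoices may differ, since rounding and any volume discounts are not modeled here.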

Dedicated Deployments

Only pay for the compute you use, down to the minute.

Price per minute

GPU Instances

16 GiB VRAM, 4 vCPUs, 16 GiB RAM: $0.01052

24 GiB VRAM, 4 vCPUs, 16 GiB RAM: $0.01414

A10G (24 GiB VRAM, 4 vCPUs, 16 GiB RAM): $0.02012

A100 (80 GiB VRAM, 12 vCPUs, 144 GiB RAM): $0.06667

H100 MIG (40 GiB VRAM, 13 vCPUs, 117 GiB RAM): $0.0625

H100 (80 GiB VRAM, 26 vCPUs, 234 GiB RAM): $0.10833

B200 (180 GiB VRAM, 28 vCPUs, 384 GiB RAM): $0.16633

CPU Instances

1x2 (1 vCPU, 2 GiB RAM): $0.00058

1x4 (1 vCPU, 4 GiB RAM): $0.00086

2x8 (2 vCPUs, 8 GiB RAM): $0.00173

4x16 (4 vCPUs, 16 GiB RAM): $0.00346

8x32 (8 vCPUs, 32 GiB RAM): $0.00691

16x64 (16 vCPUs, 64 GiB RAM): $0.01382
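To make the usage-based billing concrete, here is a minimal sketch that turns the listed instance prices into an hourly rate and a cost estimate. It assumes the prices are per minute of active compute, as "down to the minute" suggests. The helper names and the `replicas` parameter are hypothetical, and the two GPU rows without an instance name are omitted.

```python
from decimal import Decimal

# Assumed USD per minute, copied from the Dedicated Deployments pricing above.
# The first two GPU rows carry no instance name in the listing, so they are left out.
PRICE_PER_MINUTE = {
    "A10G": Decimal("0.02012"),
    "A100": Decimal("0.06667"),
    "H100 MIG": Decimal("0.0625"),
    "H100": Decimal("0.10833"),
    "B200": Decimal("0.16633"),
    "1x2": Decimal("0.00058"),
    "1x4": Decimal("0.00086"),
    "2x8": Decimal("0.00173"),
    "4x16": Decimal("0.00346"),
    "8x32": Decimal("0.00691"),
    "16x64": Decimal("0.01382"),
}


def hourly_rate(instance: str) -> Decimal:
    """Convert the per-minute price to an hourly rate."""
    return PRICE_PER_MINUTE[instance] * 60


def compute_cost(instance: str, minutes: int, replicas: int = 1) -> Decimal:
    """Estimate USD cost for `replicas` instances each active for `minutes`.

    `replicas` is a hypothetical parameter for illustration only.
    """
    return PRICE_PER_MINUTE[instance] * minutes * replicas


if __name__ == "__main__":
    print(hourly_rate("H100 MIG"))      # hourly equivalent of the per-minute price
    print(compute_cost("A10G", 90))     # one A10G active for 90 minutes
```

This covers only active compute time at list price. Volume discounts and any rounding applied at billing time are not modeled.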

Talk to Sales about compute in other countries and regions.

Common questions

Which models can I run on Baseten?

Which GPUs are available on Baseten?

Do you offer free credits to get started?

Is Baseten secure?

Do I pay for idle time on Baseten?

What level of customer support do you offer?

Do you offer discounts on compute?

Can I deploy Baseten on my own infrastructure?

Explore Baseten today