Pricing built for growth

Production inference that won't break your product or your bank.

Basic

Deploy custom, fine-tuned, and open-source models

Included in Basic:

Dedicated deployments

Model APIs

Fast cold starts

SOC 2 Type II and HIPAA compliant

Email and in-app chat support

Deployment options

$0 per month, pay as you go

Pro

Unlimited autoscaling and priority compute access

Everything in Basic plus:

Priority access to high-demand GPUs

Dedicated compute

Higher Model API rate limits

Hands-on engineering expertise

Dedicated support on Slack and Zoom

Deployment options

Volume discounts available

Enterprise

Full control in your cloud and ours

Everything in Pro plus:

Custom SLAs

Training (Beta)

Self-host deployments

On-demand flex compute

Use existing cloud commitments

Full control over data residency

Advanced security and compliance

Custom global regions

Deployment options: Baseten, Your VPC, Hybrid

Volume discounts available

Pricing

Best-in-class model performance, effortless autoscaling, and blazing-fast cold starts mean you get the most out of each GPU and save money along the way.

Model APIs

Instant access to pre-optimized models running on the Baseten Inference Stack.

Price per 1M tokens

Model               Input    Output
DeepSeek R1 0528    $2.55    $5.95
DeepSeek V3 0324    $0.77    $0.77
Llama 4 Maverick    $0.19    $0.72
Llama 4 Scout       $0.13    $0.50
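
As a rough illustration of how the per-1M-token prices translate into a bill, the sketch below multiplies token counts by the listed input and output prices. The function name `estimate_cost` and the example token counts are hypothetical, and the calculation assumes billing is strictly linear in tokens with no minimums, rounding, or discounts.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the Model APIs table.
PRICES_PER_1M = {
    "DeepSeek R1 0528": (Decimal("2.55"), Decimal("5.95")),
    "DeepSeek V3 0324": (Decimal("0.77"), Decimal("0.77")),
    "Llama 4 Maverick": (Decimal("0.19"), Decimal("0.72")),
    "Llama 4 Scout": (Decimal("0.13"), Decimal("0.50")),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate USD cost, assuming purely linear per-token billing."""
    input_price, output_price = PRICES_PER_1M[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / Decimal(1_000_000)

# Hypothetical workload: 2M input tokens and 500k output tokens on Llama 4 Scout.
scout_cost = estimate_cost("Llama 4 Scout", 2_000_000, 500_000)
print(f"Llama 4 Scout: ${scout_cost}")
```

Under these assumptions, the same workload can be priced against each model by swapping the model name.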

Dedicated Deployments

Only pay for the compute you use, down to the minute.

Price per minute

GPU Instances   Specs                                  Price
T4              16 GiB VRAM, 4 vCPUs, 16 GiB RAM       $0.01052
L4              24 GiB VRAM, 4 vCPUs, 16 GiB RAM       $0.01414
A10G            24 GiB VRAM, 4 vCPUs, 16 GiB RAM       $0.02012
A100            80 GiB VRAM, 12 vCPUs, 144 GiB RAM     $0.06667
H100 MIG        40 GiB VRAM, 13 vCPUs, 117 GiB RAM     $0.0625
H100            80 GiB VRAM, 26 vCPUs, 234 GiB RAM     $0.10833
B200            180 GiB VRAM, 28 vCPUs, 384 GiB RAM    $0.16633

CPU Instances   Specs                                  Price
1x2             1 vCPU, 2 GiB RAM                      $0.00058
1x4             1 vCPU, 4 GiB RAM                      $0.00086
2x8             2 vCPUs, 8 GiB RAM                     $0.00173
4x16            4 vCPUs, 16 GiB RAM                    $0.00346
8x32            8 vCPUs, 32 GiB RAM                    $0.00691
16x64           16 vCPUs, 64 GiB RAM                   $0.01382

Talk to Sales about compute in other countries and regions.
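
To make the instance prices concrete, here is a minimal sketch that estimates the cost of a dedicated deployment. It assumes the listed prices are per minute (consistent with "down to the minute" above) and that cost scales linearly with minutes and replica count. The function `deployment_cost` and the example durations are hypothetical.

```python
from decimal import Decimal

# USD per instance, assumed to be per minute, copied from the price lists above.
GPU_PRICE_PER_MINUTE = {
    "T4": Decimal("0.01052"),
    "L4": Decimal("0.01414"),
    "A10G": Decimal("0.02012"),
    "A100": Decimal("0.06667"),
    "H100 MIG": Decimal("0.0625"),
    "H100": Decimal("0.10833"),
    "B200": Decimal("0.16633"),
}
CPU_PRICE_PER_MINUTE = {
    "1x2": Decimal("0.00058"),
    "1x4": Decimal("0.00086"),
    "2x8": Decimal("0.00173"),
    "4x16": Decimal("0.00346"),
    "8x32": Decimal("0.00691"),
    "16x64": Decimal("0.01382"),
}

def deployment_cost(instance: str, minutes: int, replicas: int = 1) -> Decimal:
    """Estimate USD cost, assuming linear per-minute, per-replica billing."""
    prices = {**GPU_PRICE_PER_MINUTE, **CPU_PRICE_PER_MINUTE}
    return prices[instance] * minutes * replicas

# Hypothetical usage: one H100 replica for an hour, two A10G replicas for 90 minutes.
print(f"H100, 60 min: ${deployment_cost('H100', minutes=60)}")
print(f"2x A10G, 90 min: ${deployment_cost('A10G', minutes=90, replicas=2)}")
```

Because autoscaling changes the replica count over time, a real bill would sum this calculation over each interval with a constant replica count.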

Training

Get 20% of Training spend back as credits for Dedicated Deployments.

Price per minute

GPU Instances   Specs                                  Price
T4              16 GiB VRAM, 4 vCPUs, 16 GiB RAM       $0.01052
L4              24 GiB VRAM, 4 vCPUs, 16 GiB RAM       $0.01414
A10G            24 GiB VRAM, 4 vCPUs, 16 GiB RAM       $0.02012
A100            80 GiB VRAM, 12 vCPUs, 144 GiB RAM     $0.06667
H100 MIG        40 GiB VRAM, 13 vCPUs, 117 GiB RAM     $0.0625
H100            80 GiB VRAM, 26 vCPUs, 234 GiB RAM     $0.10833
B200            180 GiB VRAM, 28 vCPUs, 384 GiB RAM    $0.16633

Talk to Sales about compute in other countries and regions.
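
The 20% credit on Training spend can be worked through with a short sketch. It assumes the credit is exactly 20% of the training bill with no caps, rounding, or expiry, and that the H100 price is per minute. The function names and the 10-hour job are hypothetical.

```python
from decimal import Decimal

# H100 price from the list above, assumed to be USD per minute.
H100_PRICE_PER_MINUTE = Decimal("0.10833")
# "Get 20% of Training spend back as credits for Dedicated Deployments."
CREDIT_RATE = Decimal("0.20")

def training_spend(price_per_minute: Decimal, minutes: int) -> Decimal:
    """USD spent on a training job, assuming linear per-minute billing."""
    return price_per_minute * minutes

def deployment_credit(spend: Decimal) -> Decimal:
    """Credits earned, assuming a flat 20% of spend with no caps or rounding."""
    return spend * CREDIT_RATE

# Hypothetical job: 10 hours (600 minutes) of training on one H100.
spend = training_spend(H100_PRICE_PER_MINUTE, minutes=600)
credit = deployment_credit(spend)
# Minutes of H100 dedicated deployment the credit would cover at the same price.
covered_minutes = credit / H100_PRICE_PER_MINUTE

print(f"Training spend: ${spend}")
print(f"Credit earned:  ${credit}")
print(f"H100 deployment minutes covered: {covered_minutes}")
```

Since Training and Dedicated Deployments list the same per-instance prices, a credit earned on a given GPU covers one fifth of the training duration on that same GPU under these assumptions.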
