Deployment options

Baseten Hybrid: control and flexibility in your cloud and ours

Get the performance of a managed service in your own VPC, with seamless overflow to Baseten Cloud.

High-performance inference with seamless overflow

Flex your cloud

Maintain SLAs during traffic spikes, avoid vendor lock-in, and leverage existing cloud credits with our effortless multi-cloud routing.

Cut latency

With rapid cold starts and tailored model performance, our customers achieve lower overall latency and faster time to first token.

Designed for compliance

Keep sensitive workloads in your VPC, and lean on the SOC 2 Type II, HIPAA, and GDPR compliance of Baseten Cloud.

Choosing Baseten Hybrid, Self-hosted, or Cloud

Feature	Baseten Hybrid	Baseten Self-hosted	Baseten Cloud
Data control	Full data control in your VPC; managed data security on Baseten Cloud	Full data control	Managed data security; we never store model inputs or outputs
Data residency requirements	Region-locked data and deployments with multi-region support	Region-locked data and deployments	Multi-region support with global deployment options
Compute capacity	Leverage existing resources or Baseten compute for overflow	Leverage existing in-house resources	Leverage on-demand compute with SOTA GPUs
Cost efficiency	Use in-house compute whenever available for optimized costs	Utilize dedicated resources without extra spend on hardware	Gain cost-effective, on-demand compute
Integration with internal systems	Custom or out-of-the-box integrations	Custom or out-of-the-box integrations	Easy integration via Baseten's ecosystem
Performance optimization	SOTA on-chip model performance and low network latency	SOTA on-chip model performance and low network latency	SOTA on-chip model performance and low network latency
Scalability	High, tailored scalability with flex capacity on Baseten Cloud	High, tailored scalability	High, flexible scaling options
Security and compliance	Adhere to custom policies and our SOC 2 Type II, HIPAA, and GDPR compliance	Adhere to custom organizational policies	SOC 2 Type II certified, HIPAA compliant, and GDPR compliant by default
Support and maintenance	Comprehensive support and managed services	Comprehensive support and managed services	Comprehensive support and managed services
Utilization of existing cloud commits	Use credits or commits	Use credits or commits	Spend down existing cloud commits

Get the best of Self-hosted and Cloud deployments

Flex on-demand

Use internal resources whenever they’re available, and transition seamlessly to Baseten Cloud whenever necessary.

Control data residency

Host in your VPC, or use a dedicated deployment on Baseten Cloud. We never store model inputs or outputs.

Auto-scale to peak demand

Future-proof your product against traffic bursts with our optimized autoscaling and blazing-fast cold starts.

Meet compliance

Store data where you need it, and lean on the SOC 2 Type II, HIPAA, and GDPR compliance of Baseten Cloud.

Optimize costs

Fully use existing hardware or cloud commits, and take advantage of Baseten’s on-demand pricing for overflow.

Ship faster

Save time with out-of-the-box performance optimizations and engineers dedicated to hitting your performance targets.

Baseten supports billions of custom, fine-tuned LLM calls per week from OpenEvidence, serving high-stakes medical information to healthcare providers in every major healthcare facility in the country. If you see a doctor today, chances are that they are leveraging OpenEvidence for trustworthy, up-to-date medical information at their fingertips. Baseten's tireless dedication to reliability and deep support at scale has proven up to the task of supporting this at times literally life-or-death mission.

Speed is critical for Gamma. We're a PLG company: the faster we can deliver something great to our users, the happier they are with the product. That's why we partner with Baseten to serve our open-source image generation models. We generate millions of images a day on Baseten for our 70+ million users with ultra-low latency and high throughput.

Jon Noronha, Co-founder and CPO, Gamma