Customer stories

We're creating a platform for forward-thinking AI companies to build their products on the fastest, most performant infrastructure available.

Trusted by top engineering and machine learning teams

With Baseten, we gained a lot of control over our entire inference pipeline and worked with Baseten’s team to optimize each step.


Sahaj Garg, Co-Founder and CTO


Baseten has saved us countless hours of experimentation and eliminated the stress of worrying about inference reliability. Beyond the phenomenal product experience, Baseten has far and away the best people in the industry. It is incredibly rare to find a team that approaches your problems with the same care and dedication as you would yourself. People like Abu, Utsav, and Tuhin make research worth doing.


Allan Bishop, Head of Engineering


Customer Stories


Latent delivers pharmaceutical search with 99.999% uptime on Baseten

Latent Health uses Baseten to power fast, reliable clinical AI.


Praktika delivers ultra-low-latency transcription for global language education with Baseten

With Baseten, Praktika delivers sub-300-millisecond latency, empowering language learners worldwide with a seamless conversational learning experience.


Zed Industries serves 2x faster code completions with the Baseten Inference Stack

By partnering with Baseten, Zed achieved 45% lower latency, 3.6x higher throughput, and 100% uptime for their Edit Prediction feature.


Wispr Flow creates effortless voice dictation with Llama on Baseten

Wispr Flow runs fine-tuned Llama models with Baseten and AWS to provide seamless dictation across every application.


Rime serves its speech synthesis API with stellar uptime using Baseten

Rime AI chose Baseten to serve its custom speech synthesis generative AI model and achieved state-of-the-art p99 latencies with 100% uptime in 2024.


Bland AI breaks latency barriers with record-setting speed using Baseten

Bland AI leveraged Baseten’s state-of-the-art ML infrastructure to achieve real-time, seamless voice interactions at scale.


Baseten powers real-time translation tool toby to Product Hunt podium

The founders of toby worked with Baseten to deploy an optimized Whisper model on autoscaling hardware just one week ahead of their Product Hunt launch, earning a top-three finish with zero downtime.


Custom medical and financial LLMs from Writer see 60% higher tokens per second with Baseten

Writer, the leading full-stack generative AI platform, launched new industry-specific LLMs for medicine and finance. Using TensorRT-LLM on Baseten, they increased their tokens per second by 60%.


Patreon saves nearly $600k/year in ML resources with Baseten

With Baseten, Patreon deployed and scaled the open-source foundation model Whisper at record speed without hiring an in-house ML infra team.