Baseten Blog | Page 7

GPU guides

Choosing the right horizontal scaling setup for high-traffic models

Horizontal scaling via replicas with load balancing is an important technique for handling high traffic to an ML model.

GPU guides

How to choose the right instance size for your ML models

This post simplifies instance sizing with heuristics to choose an optimal size for your model, balancing performance and compute cost.


New in December 2022

ML advanced so rapidly in 2022 that the year felt like a decade. Looking ahead to 2023, we anticipate that foundational models will further empower scientists and developers building ML apps.

Hacks & projects

Serving four million Riffusion requests in two days

Riffusion is a fine-tuned version of Stable Diffusion. Baseten served Riffusion over four million times in a couple of days, handling top-of-Hacker-News traffic.


Accelerating model deployment: 100X faster dev loops with development deployments

Baseten's development deployments speed up ML model dev loops, replacing slow workflows with a live reload system that lets you test updates in seconds.


New in October: Find community with The DSC

October was a big month for the ML industry, with more momentum than ever behind spooky-good models and novel applications.

ML models

Build with OpenAI’s Whisper model in five minutes

Deploy OpenAI Whisper for free on Baseten instantly from our model library. Or stick around to learn how to package and deploy Whisper with Truss.


New in September: Increasing flexibility and robustness

The state of cutting-edge open-source ML models, a more flexible interface for invoking models, and robust application development workflows

ML models

How to deploy Stable Diffusion using Truss

Explore deploying the open-source Stable Diffusion model by Stability AI on Baseten. This walkthrough details the deployment process.