Stories, updates, and other resources from Baseten.
Technical deep dive: Truss live reload
Truss’ key feature for iterative development is live reload. Without live reload, the upload-build-deploy loop for publishing models to production can take anywhere from 3 to 30 minutes. With live reload, it’s practically instantaneous. In this post, we describe how we implemented live reload.
Choosing the right horizontal scaling setup for high-traffic models
Horizontal scaling via replicas with load balancing is an important technique for handling high traffic to an ML model. Let’s examine three tips for understanding how to properly replicate your instances to save users time without wasting your money.
Serving four million Riffusion requests in two days
Seth and Hayk partnered with Baseten to host and serve Riffusion, which uses a tuned version of Stable Diffusion to generate audio spectrograms and interpret those images as music. Riffusion climbed to the top of Hacker News and handled over 4 million song requests in two days.
Send Slack messages with classified leads with Baseten
Many business processes benefit from integration with existing interfaces. Baseten’s Slack integration abstracts away all of the boilerplate code required to send a message to a Slack channel. In this tutorial, we’ll build a lead classifier with a pre-trained zero-shot classification model.
Store data in S3 with a data connection
While Baseten provides PostgreSQL tables for storing information from model runs, you may want to store your data in your own databases. Plus, for data ill-suited for relational databases, like large files and key-value pairs, you’ll want to use more appropriate technologies.
How Baseten is using "docs as code" to build best-in-class documentation
Docs as code is a philosophy of writing documentation with the same tools and practices as writing code. Adopting docs as code has lowered friction for engineers to contribute documentation, improving docs quality and saving time.