New in January 2023

Prompt: A solarpunk train in the mountains

Prediction logs show warnings and errors from model invocation.

Diagnose OOMs and other common issues faster with an improved logs experience

The only thing more frustrating than your model breaking is not knowing why it’s breaking.

To make it easier to root-cause issues during model deployment and invocation, we separated build logs from deployment and prediction logs. This matches the two-step process of deploying a model on Baseten:

The model serving environment is built as a Docker image
The model image is run on Baseten’s infrastructure

The improved logs surface common issues like out-of-memory (OOM) errors, which occur when the provisioned instance isn’t big enough to run the model. And if you do run into an OOM error, the very next section of this email has everything you need to fix it!

Understand model resources and instance sizing

You just need an instance that’s a bit larger than your model. If the instance is too small, you’ll run into out-of-memory errors, but if your instance is oversized, you’re paying more for no benefit.
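To make the trade-off concrete, here is a minimal sketch of the sizing rule: pick the cheapest instance whose memory covers the model plus some headroom. The instance names, memory sizes, prices, and the 20% headroom figure are all hypothetical placeholders, not Baseten’s actual instance catalog.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Instance:
    name: str
    memory_gib: float
    hourly_cost: float


# Hypothetical catalog for illustration only; not real instance types or prices.
CATALOG = [
    Instance("small", 4, 0.10),
    Instance("medium", 8, 0.20),
    Instance("large", 16, 0.40),
]


def pick_instance(
    model_memory_gib: float,
    instances: Sequence[Instance],
    headroom: float = 0.2,  # assumed safety margin on top of the model's footprint
) -> Optional[Instance]:
    """Return the cheapest instance that fits the model plus headroom, or None."""
    required = model_memory_gib * (1 + headroom)
    candidates = [i for i in instances if i.memory_gib >= required]
    # Too small risks OOM errors; anything bigger than the cheapest fit is wasted spend.
    return min(candidates, key=lambda i: i.hourly_cost, default=None)


choice = pick_instance(5, CATALOG)
print(choice.name if choice else "no instance is large enough")
```

A `None` result means nothing in the catalog fits, which is the situation that would show up as an OOM error at deployment or invocation time.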

Last month, we launched the ability to set your model resources to handle more demanding models and higher traffic.

Here are two guides to setting the right resources, so your model can handle whatever you throw at it without wasting money on unnecessary overhead:

Truss 0.2.0: Improved developer experience

Truss’ core functions

Creating an ML model is hard work, and the last thing anyone wants after finishing the next awesome model is to slog through hours of packaging it with complex tooling. Truss is an open-source library designed to be the most convenient way to package your ML model for deployment and, because we made it, it’s deeply integrated with Baseten.

Using Truss usually goes something like this:

Train your model
Create a Truss in your development environment from an in-memory model object
Edit your Truss to package environment variables, processing functions, secrets, and more
Load your modified Truss and deploy it to Baseten

In the latest minor version release, 0.2.0, we simplified the names of several core functions in Truss to create a cleaner interface. When you want to create a Truss, it’s now just truss.create(). Loading a Truss from a local directory? You guessed it: truss.load(). Truss is all about removing friction from model packaging, and its cleaned-up developer experience reflects that vision.
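As a rough sketch of how the two renamed functions fit together, assuming Truss 0.2.x is installed: the function names `truss.create()` and `truss.load()` come from the release described above, while the helper `package_and_reload`, the `target_directory` argument, and the directory name are assumptions for illustration. Check the Truss release notes for the exact signatures in your version.

```python
def package_and_reload(model, target_directory):
    """Create a Truss from an in-memory model, then load it back from disk.

    Hypothetical helper for illustration; not part of the Truss library.
    """
    import truss  # assumed installed, e.g. `pip install truss` (0.2.x)

    # Assumption: create() takes the in-memory model object and a directory
    # to write the packaged Truss into.
    truss.create(model, target_directory=str(target_directory))

    # After editing the files in that directory (environment variables,
    # processing functions, and so on), load the Truss from the local folder.
    return truss.load(str(target_directory))
```

Calling `package_and_reload(my_model, "my_truss")` with a trained model object would cover the create, edit, and load steps of the workflow; deploying the loaded Truss to Baseten is a separate step not shown here.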

For a complete list of Truss changes version-to-version, consult the Truss release notes.

Manage model versions

A screenshot of the new version-first model management UI

Making an ML model isn’t a one-and-done endeavor. As your data and business needs change, you need to create new models. And the best way to manage these updated models is with model versions.

Three of the many use cases for model versions:

Deploy multiple different models to test them head-to-head on live data
Deploy an updated model trained on new data
Deploy an updated model to combat model drift

The refreshed user interface for models centers model versions. Previously in their own tab, model versions now have a dedicated sidebar to help you navigate different deployments of your model and review version statuses at a glance. And model health information and logs are mapped to the model version, so you always know exactly how much traffic each version is getting.

Thanks for reading! Before we go, three fun links posted in our #random Slack channel during January, presented without context:

See you next month!

— The team at Baseten
