Changelog | Page 2

See our latest feature releases, product improvements and bug fixes

Jan 8, 2024

Give names to model deployments

When deploying with Truss via truss push, you can now assign meaningful names to your deployments using the --deployment-name argument, making them easier to identify and manage. Here's an example:...
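The original example is truncated above, but a minimal sketch of the flag looks like this (the deployment name shown is a made-up placeholder, and this assumes the truss CLI is installed and you are authenticated to Baseten):

```shell
# Deploy a Truss and give the deployment a human-readable name.
# "summarizer-v2" is a hypothetical name; any string identifying
# the deployment works here.
truss push --deployment-name summarizer-v2
```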

Dec 15, 2023

Updated defaults and language for autoscaling settings

Autoscaling lets your deployed models handle variable traffic while making efficient use of model resources. We’ve updated some language and default settings to make using autoscaling more intuitive....

Nov 10, 2023

Retry failed builds and deploys

You can now retry failed model builds and deploys directly from the model dashboard in your Baseten workspace. Model builds and deploys can fail due to temporary issues, like a network error while...

Oct 31, 2023

Overhauled model management experience

We've made some big changes to the model management experience to clarify the model lifecycle and better follow concepts you're already familiar with as a developer. These changes aren't breaking...

Oct 27, 2023

Add workspace API keys for more granular permissions

We added workspace API keys to give you more control over how you call models, especially in production environments. There are now two types of API keys on Baseten: Personal keys are tied to your...
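As a sketch of calling a model with one of these keys, the request below assumes Baseten's standard model invocation URL and Api-Key authorization header; MODEL_ID and the JSON payload are placeholders:

```shell
# Invoke a deployed model using a workspace (or personal) API key.
# MODEL_ID is a placeholder; export BASETEN_API_KEY before running.
curl -X POST "https://app.baseten.co/models/MODEL_ID/predict" \
  -H "Authorization: Api-Key $BASETEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "hello"}'
```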

Oct 16, 2023

New model IDs for deployed models

The model IDs for some models deployed on Baseten have been changed. This is not a breaking change. All existing model invocations using the old model IDs will continue to be supported. You do not...

Oct 9, 2023

Measure end-to-end response time vs inference time

On the model metrics tab, you can now use the dropdown menu to toggle between two different views for model inference time: End-to-end response time includes time for cold starts, queuing, and...

Sep 29, 2023

Track both active and starting up replicas

The replica count chart on the model metrics page is now broken out into “active” and “starting up” replicas. An active replica has loaded the model for inference and is actively responding to...

Sep 25, 2023

Wake scaled to zero models

ML models deployed on Baseten can automatically scale to zero when not in use so that you’re not paying for unnecessary idle GPU time. When a scaled-to-zero model is invoked, it spins up on a new...

Sep 7, 2023

Get 500 response code on model invocation error

Models deployed on Baseten using Truss 0.7.1 or later now return a 500 response code when an error occurs during model invocation. This change only affects newly deployed models. Any exception...
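A minimal sketch of branching on the new status code, assuming the same standard invocation endpoint as above (MODEL_ID and the payload are placeholders):

```shell
# Capture only the HTTP status code of a model invocation.
# MODEL_ID is a placeholder; export BASETEN_API_KEY before running.
status=$(curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Api-Key $BASETEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "hello"}' \
  "https://app.baseten.co/models/MODEL_ID/predict")

# With Truss 0.7.1+, a 500 means the model raised an exception
# during invocation.
if [ "$status" -eq 500 ]; then
  echo "model invocation error"
fi
```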
