Weekly round-up: Deploy models with any interface

By default, a model deployed to Baseten, as in this XGBoost classifier example, expects to receive a dictionary with the key inputs and a list of input values, and returns a dictionary with the key predictions and a list of model results.

model_input = {"inputs": [[0, 0, 0, 0, 0, 0]]}
model_output = {'predictions': [0.21339938044548035]}

Until now, that default behavior could not be changed: every model deployed to Baseten had to follow that spec. That interface was too inflexible, so as of version 0.2.7 of the Baseten Python package, you can set your own interface for your models.

Setting your model interface

Baseten uses Truss under the hood for model deployment. You can customize your model’s interface by editing the predict function in models/model.py in Truss. The auto-generated predict function for the aforementioned XGBoost example looks like this:

def predict(self, request: Dict) -> Dict[str, List]:
    response = {}
    inputs = request["inputs"]
    dmatrix_inputs = xgb.DMatrix(inputs)
    result = self._model.predict(dmatrix_inputs)
    response["predictions"] = result
    return response

Modify this function to parse whatever request and response you want, and remember that Truss also supports pre- and post-processing functions that can further modify input and output when more complicated parsing is needed.
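As a rough sketch of what a custom interface might look like, here is a model class that accepts a single named record instead of a list of rows and returns a score with a label. Everything here is illustrative: StubClassifier stands in for a real trained model, FEATURE_ORDER and the record, score, and label keys are made up for this example, and the preprocess and postprocess method names are our assumption about how the pre- and post-processing hooks are spelled, so check the Truss docs before relying on them.

```python
from typing import Any, Dict, List

# Hypothetical feature names; a real model would define its own.
FEATURE_ORDER = ["age", "income", "tenure"]


class StubClassifier:
    """Stand-in for a trained model: scores each row by its mean value."""

    def predict(self, rows: List[List[float]]) -> List[float]:
        return [sum(row) / len(row) for row in rows]


class Model:
    def __init__(self) -> None:
        self._model = StubClassifier()

    def preprocess(self, request: Dict[str, Any]) -> Dict[str, Any]:
        # Turn a named record into the ordered row the model expects.
        record = request["record"]
        row = [float(record[name]) for name in FEATURE_ORDER]
        return {"rows": [row]}

    def predict(self, request: Dict[str, Any]) -> Dict[str, Any]:
        # Custom interface: no "inputs" or "predictions" keys required.
        scores = self._model.predict(request["rows"])
        return {"scores": scores}

    def postprocess(self, response: Dict[str, Any]) -> Dict[str, Any]:
        # Reduce the raw output to a single score plus a readable label.
        score = response["scores"][0]
        label = "positive" if score >= 0.5 else "negative"
        return {"score": score, "label": label}


# Chain the three steps by hand to see the full request/response shape.
model = Model()
request = {"record": {"age": 0.2, "income": 0.9, "tenure": 0.7}}
result = model.postprocess(model.predict(model.preprocess(request)))
```

Keeping the parsing in the pre- and post-processing steps leaves predict itself small, which can make it easier to change the interface later without touching the model call.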

Backwards compatibility

This change does not modify the behavior of existing deployed models, nor the default behavior of future models. However, it does change how deployed models are invoked through the Baseten Python client.

Previously, the predict() function wrapped its argument in a dictionary with the inputs key. Now that said key is not required, the predict() function passes its inputs as-is, which means you have to enter the entire model input yourself.

Before (1.0 spec):

baseten.predict([[0,0,0,0,0,0]])

Now (2.0 spec):

baseten.predict({"inputs": [[0,0,0,0,0,0]]})

The syntax for invoking a model via an API call has not changed.

To avoid breaking existing scripts that use baseten.predict, Truss now includes a spec_version flag. It is set to 2.0 by default for all new models, so they use the new input spec, while existing models continue to work exactly as before on the 1.0 spec. You can upgrade a model to the latest interface spec by changing the flag in the config.yaml file in its Truss.
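To make the difference between the two specs concrete, here is a minimal sketch of the wrapping behavior described above. The build_request_body helper is hypothetical and is not part of the Baseten client; it only mirrors the rule that the 1.0 spec wraps the argument under the inputs key while the 2.0 spec passes it through unchanged.

```python
from typing import Any


def build_request_body(model_input: Any, spec_version: str = "2.0") -> Any:
    """Hypothetical helper showing how each spec treats the predict argument."""
    if spec_version == "1.0":
        # 1.0 spec: the client wraps the argument for you.
        return {"inputs": model_input}
    # 2.0 spec: the argument is sent as-is, so you supply the full input.
    return model_input


# The same default-interface model receives the same body either way,
# but under 2.0 the caller writes out the "inputs" key explicitly.
legacy_body = build_request_body([[0, 0, 0, 0, 0, 0]], spec_version="1.0")
current_body = build_request_body({"inputs": [[0, 0, 0, 0, 0, 0]]})
assert legacy_body == current_body
```

The practical upshot is that a script written for a 1.0 model needs its argument wrapped by hand once that model is moved to 2.0.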

Enjoy this unrestricted interface by installing the newest versions of Truss and the Baseten client.

pip install --upgrade baseten truss

Pass a Truss to baseten.deploy()

We also cleaned up the deployment experience in the Baseten Python client. You no longer have to use different functions to deploy an in-memory model versus a model packaged as a Truss. Whatever you have, toss it into baseten.deploy() and we’ll take care of it.

Plus, when you deploy an in-memory model, the deploy function now gives you insight into what is happening to package that model, including the path to where the auto-generated Truss folder lives. This is useful if you want to change something about your deployed model’s behavior (like updating the interface) after deploying the model. Just edit the Truss and pass it into baseten.deploy() to ship a new version!

INFO Autogenerating Truss for your model,
find more about Truss at https://truss.baseten.co/
INFO You can find your auto generated Truss at

🎃 The pumpkin patch

This week’s small-but-mighty changes to bring more magic to your models!

Tab complete bindings: In the view builder, you can create a binding tag by typing “{{” and you can now close the binding by pressing “Tab” after selecting the data you want to use in the binding.

Model building banners: When you deploy a new version of a model to your Baseten workspace, you’ll see a banner on other versions of the model letting you know that the new version is building.

A screenshot of the model deployment banner showing a new version of the bert-base-uncased model building