Baseten makes it easy to build complex user-facing applications with machine learning models. Often, step one of building these applications is deploying a machine learning model. Baseten supports deploying most types of machine learning models. In this post, we’ll go over deploying a simple fast.ai model and end up with a REST API to call the model.
Training and serializing models
To get started, let’s create a fast.ai model to deploy. This step is independent of Baseten; we’re using a simple training script just to have a model to deploy. Our model is a simple CNN that predicts the type of animal in a photo. We’ll train it on pet images using the create_model method, then serialize it with joblib. We now have a serialized model stored in a binary file, ready to upload to Baseten.
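As a rough sketch of what such a training script might look like, the following uses fastai's high-level vision API and then dumps the trained learner with joblib. The Oxford-IIIT Pet dataset, the `vision_learner` call (named `cnn_learner` in older fastai releases), the filename pattern, and the `model.joblib` output path are all assumptions made for illustration, not necessarily what the original script used.

```python
def train_and_serialize(output_path="model.joblib", epochs=1):
    """Fine-tune a small CNN on pet photos and serialize the learner (illustrative)."""
    # Imported lazily so the function can be defined without fastai installed.
    import joblib
    from fastai.vision.all import (
        ImageDataLoaders, Resize, URLs, error_rate,
        get_image_files, resnet34, untar_data, vision_learner,
    )

    # Assumption: the Oxford-IIIT Pet dataset, with filenames like "Abyssinian_1.jpg".
    path = untar_data(URLs.PETS) / "images"
    dls = ImageDataLoaders.from_name_re(
        path,
        get_image_files(path),
        pat=r"(.+)_\d+\.jpg$",  # the captured group (the breed) becomes the label
        valid_pct=0.2,
        seed=42,
        item_tfms=Resize(224),
    )

    # Pretrained ResNet-34 backbone, fine-tuned for a small number of epochs.
    learn = vision_learner(dls, resnet34, metrics=error_rate)
    learn.fine_tune(epochs)

    # Serialize the whole learner to a single binary file.
    joblib.dump(learn, output_path)
    return output_path
```

Calling `train_and_serialize()` downloads the dataset on first use, so it is best run once in a notebook with a GPU available.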
Creating a requirements.txt
Next, we’ll define the requirements needed to load and run our model. In this case, we only explicitly depend on fast.ai (published on PyPI as fastai), joblib, and Pillow, so we add those to a file called requirements.txt.
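For illustration, the file can be written straight from a notebook cell. The package names below are the PyPI names for the three dependencies; leaving versions unpinned is an assumption made for brevity, and pinning the versions used during training is usually safer because the model is pickled.

```python
from pathlib import Path

# PyPI names for the three direct dependencies.
REQUIREMENTS = ["fastai", "joblib", "Pillow"]


def write_requirements(path="requirements.txt"):
    """Write one requirement per line and return the path written."""
    Path(path).write_text("\n".join(REQUIREMENTS) + "\n")
    return path


if __name__ == "__main__":
    write_requirements()
```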
Writing a Python class for inference
Fast.ai models are deployed to Baseten using our custom model interface, which takes the form of a Python class. The most important parts of this class are the load and predict methods: load is called once when the model’s container starts, and predict is served by the Baseten inference API.
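A minimal sketch of such a class is shown here. Only the `load` and `predict` method names come from the description above; the class name `PetClassifier`, the `model.joblib` path, the base64 input format, and the shape of the returned dictionaries are hypothetical choices for this example.

```python
import base64


class PetClassifier:
    """Hypothetical inference class exposing load() and predict()."""

    def __init__(self, model_path="model.joblib"):
        self._model_path = model_path
        self._learner = None

    def load(self):
        # Runs once at container start: deserialize the trained learner.
        import joblib  # lazy import so the class can be defined without joblib

        self._learner = joblib.load(self._model_path)

    def predict(self, inputs):
        # Assumption: each input is a base64-encoded image string.
        results = []
        for encoded in inputs:
            image_bytes = base64.b64decode(encoded)
            # A fastai Learner.predict call returns (label, label index, probabilities).
            label, _, probs = self._learner.predict(image_bytes)
            results.append(
                {"label": str(label), "confidence": max(float(p) for p in probs)}
            )
        return results
```

Keeping deserialization in `load` rather than in `predict` means the cost of reading the model file is paid once per container rather than once per request.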
Putting it all together
With these pieces in place, we can deploy the model to Baseten's infrastructure using the Baseten client. The client library is pip-installable and can be installed wherever you do your work (e.g., a local Jupyter notebook, Google Colab, Databricks, or SageMaker Studio).
Once the Baseten client is installed, we simply import it and deploy the model, passing its files (including the serialized model) along with requirements.txt.
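The deploy step might be wrapped roughly as follows. The `deploy_custom` function name comes from this post, but the keyword argument names used here (`model_name`, `model_class`, `model_files`, `requirements_file`) are assumptions, as are the helper names `build_deploy_args` and `deploy`; check the client documentation for the actual signature.

```python
from pathlib import Path


def build_deploy_args(model_name, model_class, model_files, requirements_file):
    """Check that every file exists, then collect the deploy arguments."""
    missing = [f for f in [*model_files, requirements_file] if not Path(f).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing files: {missing}")
    # Assumption: these keyword names are illustrative, not confirmed client API.
    return {
        "model_name": model_name,
        "model_class": model_class,
        "model_files": list(model_files),
        "requirements_file": requirements_file,
    }


def deploy(client, **kwargs):
    """Call deploy_custom on the given client (e.g. the imported baseten module)."""
    return client.deploy_custom(**build_deploy_args(**kwargs))
```

In a notebook, after `import baseten`, this could be invoked as `deploy(baseten, model_name="pet_classifier", model_class="PetClassifier", model_files=["model.py", "model.joblib"], requirements_file="requirements.txt")`, where the model, class, and file names are hypothetical. Checking for missing files first avoids starting an upload that would fail partway through.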
The deploy_custom function uploads the model to Baseten, where a container is built and deployed within our infrastructure. After a few moments, the model is ready to use behind a REST API or within a Baseten application.
We’d love to hear about your experience deploying models with Baseten. Drop us a line!
Want to deploy your own models? Sign up for Baseten for free!