A checklist for switching to open source ML models
Moving from a closed-source ecosystem, where you consume ML models through API endpoints, to open source ML models can seem intimidating. This checklist gives you the resources you need to make the leap.
Pick an open source model
The biggest advantage of the open source ecosystem in ML is the sheer number and variety of models to choose from. But that amount of choice can be overwhelming. Here are some alternatives to closed-source models to get you started:
Large language models (LLMs):
Text embedding models:
Closed source: ada-002
Open source: jina-embeddings-v2
Speech to text (audio transcription) models:
Closed source: Whisper from the Audio API
Open source: Whisper on your own infra
Text to speech (audio generation) models:
Closed source: Audio API text to speech endpoint
Open source: Bark
Choose a GPU for model inference
Inference for most generative models like LLMs requires GPUs. Picking the right GPU is essential: you want the least expensive GPU powerful enough to run the model with acceptable performance.
For a 7 billion parameter LLM like Mistral 7B, you usually want an A10: the weights take about 14 GB at 16-bit precision, which fits in the A10’s 24 GB of memory with room left over for inference. A10s also give great performance for Whisper and Bark, but these smaller models also fit on the less expensive T4, at the cost of longer generation times. Text embedding models don’t need a GPU at all, though a T4 can accelerate inference.
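As a rough illustration of the sizing logic, here is a minimal sketch that estimates the memory needed for model weights and checks it against a GPU's memory. The 20% headroom for activations and cache is an assumption chosen for illustration, not a measured figure, and the function names are hypothetical. Real requirements depend on batch size, sequence length, and the serving framework.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

# Memory per card in GB for the two GPUs discussed above.
GPU_MEMORY_GB = {"T4": 16, "A10": 24}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Memory needed just to hold the model weights, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def fits(num_params: float, gpu_memory_gb: float,
         dtype: str = "fp16", headroom: float = 0.2) -> bool:
    """True if the weights fit while leaving a `headroom` fraction of memory free.

    The default headroom is an illustrative assumption, not a rule.
    """
    return weight_memory_gb(num_params, dtype) <= gpu_memory_gb * (1 - headroom)


if __name__ == "__main__":
    params = 7e9  # a 7 billion parameter LLM
    print(f"fp16 weights: {weight_memory_gb(params):.1f} GB")
    for gpu, memory in GPU_MEMORY_GB.items():
        print(gpu, "fits" if fits(params, memory) else "too small")
```

Under these assumptions, a 7 billion parameter model in 16-bit precision passes the check on an A10 but not on a T4, which matches the recommendation above.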
Here are some buyer’s guides to GPUs:
Find optimizations relevant to your use case
If you’re just experimenting with open source models, or you need to get something into production yesterday, you can skip this step. But one of the most powerful things about switching to open source models is that you can tune the balance of latency, throughput, quality, and cost to fit your use case.
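To make the trade-off concrete, here is a small sketch that compares deployment configurations by cost per 1,000 requests, subject to a latency limit. Every number in it (prices, throughput, latency) is a made-up placeholder, and the `Config` class and `cheapest_within_latency` helper are hypothetical names. Substitute figures you measure on your own workload.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Config:
    name: str
    gpu_price_per_hour: float   # USD per hour; placeholder value
    requests_per_second: float  # sustained throughput you measured
    p50_latency_s: float        # median latency you measured

    def cost_per_1k_requests(self) -> float:
        requests_per_hour = self.requests_per_second * 3600
        return self.gpu_price_per_hour / requests_per_hour * 1000


def cheapest_within_latency(configs: Iterable[Config],
                            max_latency_s: float) -> Optional[Config]:
    """Return the lowest-cost config that meets the latency limit, or None."""
    eligible = [c for c in configs if c.p50_latency_s <= max_latency_s]
    return min(eligible, key=lambda c: c.cost_per_1k_requests(), default=None)


# Placeholder numbers for illustration only.
candidates = [
    Config("smaller-gpu", gpu_price_per_hour=0.60,
           requests_per_second=2.0, p50_latency_s=1.5),
    Config("larger-gpu", gpu_price_per_hour=1.20,
           requests_per_second=3.0, p50_latency_s=0.6),
]

if __name__ == "__main__":
    for limit in (2.0, 1.0):
        best = cheapest_within_latency(candidates, max_latency_s=limit)
        print(f"latency limit {limit}s ->", best.name if best else "no config fits")
```

With these placeholder values, a loose latency limit favors the smaller GPU on cost, while a tight limit leaves only the larger one. The same comparison could be extended with a quality score if you have one.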
Get started with:
Deploy your model
Once you have your model and hardware configuration, it’s time to deploy. You can deploy a curated selection of models from our model library in just a couple of clicks or use Truss, our open source model packaging framework, to get any model up and running behind an API endpoint.
Dive into deployment with:
Open source models in the Baseten model library.
A quickstart guide for Truss, an open source model packaging framework.
Integrate your new model endpoint
Once you’ve deployed your model, call its endpoint from your application code to integrate it.
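As a rough sketch of what that call can look like, the example below builds and sends an HTTP POST request using only the Python standard library. The endpoint URL, the `Bearer` authorization scheme, and the `{"prompt": ...}` payload shape are all assumptions for illustration. Check your provider's documentation for the actual URL format, header, and request schema.

```python
import json
import urllib.request


def build_request(endpoint_url: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build a JSON POST request for a model endpoint.

    The header scheme and payload shape here are assumptions, not a real API.
    """
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def call_model(request: urllib.request.Request, timeout_s: float = 30.0) -> dict:
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical URL and key; nothing is sent until call_model() is invoked.
    request = build_request(
        "https://example.com/my-model/predict", "YOUR_API_KEY", "Hello!"
    )
    print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the integration easy to unit test without network access.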
Baseten has guides for:
Another great way to build with LLMs is to use a tool like LangChain as an abstraction on top of your model endpoint, which helps with switching between models, APIs, and providers.
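The idea behind such an abstraction can be sketched in plain Python. This is not LangChain's API. It is a hypothetical minimal interface (`TextModel`, `EndpointModel`, and `summarize` are names invented for this example) showing how application code can depend on one method while the model behind it is swapped out.

```python
from typing import Callable, Protocol


class TextModel(Protocol):
    """Anything that can turn a prompt into text."""

    def complete(self, prompt: str) -> str: ...


class EndpointModel:
    """Wraps any callable that sends a prompt to a model and returns text."""

    def __init__(self, name: str, send: Callable[[str], str]) -> None:
        self.name = name
        self._send = send

    def complete(self, prompt: str) -> str:
        return self._send(prompt)


def summarize(model: TextModel, text: str) -> str:
    """Application logic that does not care which model is behind `model`."""
    return model.complete(f"Summarize in one sentence:\n{text}")


# Stand-in transports for illustration; real ones would call an HTTP endpoint.
closed_model = EndpointModel("closed-api", lambda prompt: f"[closed] {prompt[:20]}")
open_model = EndpointModel("open-endpoint", lambda prompt: f"[open] {prompt[:20]}")

if __name__ == "__main__":
    for model in (closed_model, open_model):
        print(model.name, "->", summarize(model, "Open source models are flexible."))
```

Because `summarize` only relies on `complete`, moving from a closed-source API to your own endpoint means changing which object you pass in, not the application logic.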
If you want to dive deeper, check out our guide to open source alternatives for ML models. Wherever you are in your journey from evaluating to adopting open source ML models, we’re here to help at support@baseten.co.