New in July: A seamless bridge from model development to deployment

This month, we launched Truss, an open-source Python package for model serving and deployment. As a Baseten user, you’ve actually been using Truss behind the scenes for a while now without knowing it, and this open-source launch will unlock new capabilities within Baseten for you.

Check out our launch post in Towards Data Science!

Truss, a new model serving technology

Truss is open source on GitHub under the MIT license. It offers a bunch of features to integrate model deployment into your development loop:

🏎 Turns your Python model into a microservice with a production-ready API endpoint, no need for Flask or Django.

🎚 Includes automatic model serialization and deserialization for the most popular frameworks.

🛍 Freezes dependencies via Docker to make your training environment portable.

🕰 Enables rapid iteration with local development that matches your production environment.

🗃 Encourages shipping parsing and even business logic alongside your model with integrated pre- and post-processing functions.

🤖 Supports running predictions on GPUs (currently limited to certain hardware; more coming soon).

🙉 Bundles secret management to securely give your model access to API keys.

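To make the pre/post-processing and secrets ideas above concrete, here’s a minimal sketch of the pattern. The class, method, and secret names are illustrative, not the exact Truss interface:

```python
class SentimentModel:
    """Hypothetical model wrapper with integrated pre/post-processing
    and secret access, in the spirit of what Truss encourages."""

    def __init__(self, secrets=None):
        # Secrets (e.g. API keys) are handed to the model rather than hardcoded.
        self._api_key = (secrets or {}).get("my_service_api_key")

    def preprocess(self, request):
        # Parse and normalize raw request input before inference.
        return [text.strip().lower() for text in request["inputs"]]

    def predict(self, inputs):
        # Stand-in for a real model: a toy keyword-based scorer.
        return [1.0 if "great" in text else 0.0 for text in inputs]

    def postprocess(self, scores):
        # Business logic ships with the model: map scores to API-friendly labels.
        return {
            "predictions": ["positive" if s > 0.5 else "negative" for s in scores]
        }


model = SentimentModel(secrets={"my_service_api_key": "dummy-key"})
result = model.postprocess(
    model.predict(model.preprocess({"inputs": ["  GREAT product  ", "meh"]}))
)
# result: {"predictions": ["positive", "negative"]}
```

Because parsing and response shaping live next to the model, the same code path runs locally and in production.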
For Baseten users, Truss is even more powerful. Platform features like GPU support, secret management, and model pulling and sharing all run through Truss, giving you more control over your deployed models. Our roadmap includes ideas for where we want to take Truss, but this is a collaborative project: we’ll build what you need!

Truss is a foundational technology for Baseten that will power the next sets of model deployment features and grow with your use cases. We’re excited to see what models you build and share with Truss!

Model deployment improvements with Truss

Thanks to Truss, you can now deploy models from more frameworks with the out-of-the-box baseten.deploy() command. This doesn’t require using Truss explicitly; it’s a free win from Baseten switching to Truss under the hood. New model frameworks supported include:

  • Hugging Face Transformers (built with the pipeline command), a flexible library of pretrained state-of-the-art models
  • XGBoost, an optimized distributed gradient boosting library
  • LightGBM, a gradient boosting framework that uses tree based learning algorithms

Here’s how simple it is to create, invoke, and deploy a Hugging Face model with Truss: just eight lines of code, counting imports.

from transformers import pipeline
from truss import mk_truss
import baseten

# Build a fill-mask pipeline with a pretrained BERT model
model = pipeline('fill-mask', model='bert-base-uncased')

# Package the model as a Truss in a local directory
tr = mk_truss(model, target_directory="bert_truss")

# Invoke the model locally in a Docker container that matches production
tr.docker_predict({"inputs": ["Donatello is a teenage mutant [MASK] turtle"]})

# Deploy the model to Baseten
baseten.login("my-api-key")
baseten.deploy(model, "Bert Base Uncased")
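Once a Truss is running, it’s just an HTTP microservice, so any client that can send JSON can invoke it. Here’s a hedged sketch of what a request looks like; the localhost URL and route below are assumptions for illustration, so check the docs for your setup:

```python
import json

# The request body mirrors the dict passed to docker_predict above.
payload = json.dumps({"inputs": ["Donatello is a teenage mutant [MASK] turtle"]})

# Sending it to a locally running Truss server (assumed URL and route):
# import requests
# response = requests.post(
#     "http://localhost:8080/v1/models/model:predict",  # hypothetical route
#     data=payload,
#     headers={"Content-Type": "application/json"},
# )
# print(response.json())
```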

This month, we joined forces with Brittany, a product designer, and Nish, a software engineer. We might be looking for you, too! We’re hiring for a lead infrastructure engineer and a lead ML engineer.

We’ll venture back toward your inbox next month with another update!

Thanks all,

The team at Baseten