Go from machine learning models to full-stack applications

Building a full-stack application for an ML model in just a few minutes.

Machine learning has matured at mind-boggling speed. The decrease in cost of compute has enabled innovative model architectures that in turn have resulted in powerful state-of-the-art models with real utility. What machine learning can technically achieve is no longer the limiting factor.

But in my experience, there’s still a huge gap between being able to solve a business problem with ML and actually implementing an ML solution. We saw this first-hand at Yelp, Gumroad, and Clover Health, and we’ve heard it repeatedly from practitioners at organizations of every shape and size.

In 2022, the job of a Data Scientist is like being a Product Owner, Developer, Designer, and Data Scientist all in one. First you need to find use cases. So you partner with stakeholders, gather requirements, and prioritize accordingly. Then comes the model building — preparing data, training a model, and iterating to maximize performance. More than enough to keep your hands full, right?

The modern Data Scientist wears many hats… maybe too many hats.

But models sitting on your local machine aren’t enough. They need to be integrated back into the business: served behind scalable APIs, given read and write access to databases and, if you’re building for business users, delivered alongside a UI that lets them see and interact with your predictions.

This staggering amount of work is best summarized in this diagram by Lj Miranda, an ML Engineer at spaCy:

The ML lifecycle. Image by Lj Miranda.

Shipping an ML solution from start to finish involves two loops: model development (the right loop) and model delivery (the left loop). Model development is typically well-understood by Data Scientists. But as we’ve mentioned, a model alone isn’t sufficient. It needs to be packaged and integrated as a software component. Enter model delivery, a completely distinct set of tasks that requires software engineering resources or know-how.

In short, being a Data Scientist is a hard job. That’s why best-in-class teams at well-resourced organizations have built entire ML platform teams to spread the burden (a recent conversation with an ML lead at a FAANG company revealed they had over 100 engineers to support their 100 scientists within just one product area). But for the majority of data science teams, this approach is prohibitively expensive, and the roles are hard to hire for.

Instead, data science teams take it upon themselves to act as both scientist and engineer. For these folks, we found that model development could be done relatively quickly — less than 4 weeks in many cases. But model delivery could take an additional 8 to 16 weeks.

The timeline for a typical ML project from start to finish

Although ML has matured rapidly, it’s clear that the supporting infrastructure and tooling needed to actually implement it are still abysmally immature.

So, I joined forces with my friends and former teammates Amir and Philip and began to wonder: what if we could productize all the infrastructure, server-side, and front-end work involved in model delivery so data scientists can focus on doing what they love: training models and solving business problems?

Introducing Baseten

With an amazing group of early customers, including Data Scientists and ML Engineers at Patreon, Pipe, SIL, and Primer, we spent the past two years building Baseten, an ML Application Builder for Data Scientists. Baseten makes it easy to deploy machine learning models and integrate them into new and existing business processes with scalable APIs and interactive applications, without needing to learn anything about containers, Flask, or React.

Baseten streamlines the model delivery loop in the ML lifecycle

How does it work?

1. Serve models easily and quickly

Deploy a model in Baseten right from your Jupyter notebook with a few lines of Python

We believe Data Scientists shouldn’t have to spend countless hours wrangling Docker and AWS to get their models behind scalable, robust APIs. We’ve built simple APIs that allow you to get models served and an intuitive UI to manage, monitor, and configure infrastructure when needed.

Baseten supports most major modeling libraries, including scikit-learn, PyTorch, and TensorFlow. If you’re building something with a custom framework, our custom model deployment mechanism lets you deploy arbitrarily complex models.
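To make concrete what putting a model behind an API involves, here is a minimal sketch in plain Python of the steps a serving layer has to handle: persisting a trained model, loading it in a server process, and turning a JSON request into a JSON response. This is not Baseten's client API. `LinearModel`, `save_model`, `load_model`, and `handle_request` are hypothetical names invented for illustration, and the model is a toy stand-in for whatever you trained in your notebook.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class LinearModel:
    """Hypothetical stand-in for a trained model (not a Baseten class)."""
    weights: list
    bias: float

    def predict(self, rows):
        # One score per input row: dot(weights, row) + bias.
        return [
            sum(w * x for w, x in zip(self.weights, row)) + self.bias
            for row in rows
        ]


def save_model(model):
    """Serialize the model parameters so another process can load them."""
    return json.dumps(asdict(model))


def load_model(payload):
    """Rebuild the model from its serialized parameters."""
    return LinearModel(**json.loads(payload))


def handle_request(model, body):
    """Map a JSON request body to (status code, JSON response body)."""
    try:
        inputs = json.loads(body)["inputs"]
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, or JSON without an "inputs" field.
        return 400, json.dumps({"error": "expected a JSON object with an 'inputs' field"})
    return 200, json.dumps({"predictions": model.predict(inputs)})


if __name__ == "__main__":
    served = load_model(save_model(LinearModel(weights=[1.0, 2.0], bias=0.5)))
    print(handle_request(served, '{"inputs": [[1, 1]]}'))
```

A real deployment adds everything this sketch leaves out: an HTTP server, containers, scaling, and monitoring. That remaining work is what a model-serving platform is meant to take off your hands.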

2. Integrate with other services and data stores

Integrate your model with business logic and serve it behind custom, scalable APIs

Most models need some pre-processing and post-processing at inference time, and teams often end up building additional backend services to request predictions and push them to other data stores and tools. All this code can now be written, tested, and run with Baseten without worrying about standing up servers and setting up RESTful APIs. Users have control over both the system and Python environments, and the endpoints are set up to scale horizontally.
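The shape of that glue code can be sketched as a small pipeline: clean the raw input, call the model, turn the raw score into something a business user can read, and record the result. The sketch below is plain Python under stated assumptions. `preprocess`, `predict`, `postprocess`, and `run_inference` are hypothetical names, the keyword-counting "model" is a toy stand-in, and the in-memory list stands in for a real data store.

```python
def preprocess(raw):
    """Normalize free text into lowercase tokens before inference."""
    return raw.strip().lower().split()


def predict(tokens):
    """Hypothetical stand-in for a deployed model: fraction of soup-like tokens."""
    soup_words = {"bisque", "broth", "chowder", "soup"}
    if not tokens:
        return 0.0
    return sum(token in soup_words for token in tokens) / len(tokens)


def postprocess(score, threshold=0.5):
    """Convert the raw score into a labeled, rounded result."""
    label = "soup" if score >= threshold else "not soup"
    return {"label": label, "score": round(score, 2)}


def run_inference(raw, store):
    """Run the full pipeline and append the result to a data store."""
    result = postprocess(predict(preprocess(raw)))
    store.append({"input": raw, **result})
    return result


if __name__ == "__main__":
    predictions_table = []  # stand-in for a database table
    print(run_inference("  Lobster Bisque ", predictions_table))
    print(predictions_table)
```

Keeping each stage a separate function makes it possible to test the pre- and post-processing without a live model, and to swap the storage target without touching the inference logic.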

3. Design interactive views for business users

Is lobster bisque a soup or a cocktail? Baseten’s drag-and-drop builder lets you build an interactive UI in minutes, so your stakeholders never have to be left wondering again.

The Baseten drag-and-drop view builder allows teams to build interactive, stateful interfaces for model evaluation (and showing off :P), data labeling, and internal tools, without having to learn HTML, CSS, JavaScript, or React. UI components natively integrate with models deployed on Baseten and backends built with it, and can be shared publicly or within an organization with the click of a button.

4. Ship full-stack applications

An application for any ML model that needs to interact with people or other services can be built in Baseten, including this photo restoration app, shown here restoring a picture of Muhammad Ali.

Ultimately, our goal is to empower Data Scientists to ship faster. Lowering the barriers to getting models out of notebooks and into the hands of users sharply increases the rate of iteration, which in turn helps the business derive more value from machine learning.

Our early customers are already leveraging Baseten for a wide variety of applications. Just to name a few:

Data labeller for user-generated content: Patreon’s Trust & Safety team label user-generated content using a web app built on Baseten.
User verification application: Primer built a full-stack application so their team could review flagged sign-ups and prevent bad actors from joining.
Diagnostic suite to assess translation quality: SIL’s data science team use Baseten to serve their impressive set of models that assess factors like readability, comprehensibility, and similarity in a translation.

The above are just a few examples. Really, any ML model that needs to interact with people or other services is a fantastic fit for Baseten.

Try it out today

After months of iterating with early users, we’re excited to finally share Baseten’s public beta to allow all data scientists to deploy models and build full-stack applications.

We hope you try it out — sign up today!

