How Pipe’s data science team keeps up with rapid growth

Background

Pipe, a trading platform for recurring revenues, connects companies that have recurring revenue streams with institutional investors, giving those companies access to up-front capital without diluting ownership or taking out loans. Last valued at $2 billion, Pipe is growing rapidly, and so are its machine learning needs.

Leading the charge is Faaez Ul Haq, Pipe’s Head of Data Science, and his team of four data scientists. The team uses a combination of analytics, modeling, and optimization to power several key areas of Pipe’s platform.

The Challenge

Ul Haq and his team build parts of their models in Python, but Pipe’s backend infrastructure is written in Go. To get models into production, the data science team would have needed to spin up entirely new infrastructure around Python: configuring Docker on VMs or Kubernetes, then handling the day-to-day maintenance and DevOps that come with self-hosting.

Leading a small team that needed to move quickly, Ul Haq wanted to avoid that overhead at all costs. “I want my team to focus on business outcomes,” said Ul Haq. “We are always looking for ways to minimize doing work that is not to our comparative advantage.”

The Solution

Seeking an alternative, Ul Haq came across Baseten. With Baseten, the team could serve its core models with just a few lines of Python. Baseten handles containerization and deployment, and its Kubernetes-based architecture meant the team didn’t need to worry about model performance as traffic increased and models became more demanding.

In addition, the team found Baseten’s flexibility and thoughtful details in developer ergonomics, like being able to store API keys securely via the UI, to be a step above the competition. “Baseten feels like a modern tool that is designed with the data scientist in mind,” said Ul Haq.

Using Baseten’s Custom Models, the Pipe team added logic to integrate its model into Pipe’s Go stack. When called, the Go service sends a request to Baseten’s API and gets a prediction back immediately. Each prediction is also stored in a Postgres database so the team can debug and improve the model. It took only a few days to move from an offline model in a Jupyter notebook to one deployed in production.
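To make that call path concrete, here is a minimal Go sketch of a service that posts features to a hosted model endpoint and logs each prediction. It is an illustration under stated assumptions, not Pipe’s code or Baseten’s documented API: the endpoint URL, the Bearer authorization header, the {"inputs": ...} and {"prediction": ...} JSON shapes, and the ModelClient and PredictionStore names are all hypothetical.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// PredictionStore abstracts where predictions are logged. Hypothetical
// interface; the article says Pipe stores predictions in Postgres.
type PredictionStore interface {
	Save(ctx context.Context, features map[string]float64, prediction float64) error
}

// ModelClient calls a hosted model over HTTP. The endpoint, auth header,
// and JSON shapes used here are assumptions for illustration only.
type ModelClient struct {
	Endpoint string
	APIKey   string
	HTTP     *http.Client
	Store    PredictionStore // optional; nil disables logging
}

// encodeFeatures builds the assumed request body: {"inputs": {...}}.
func encodeFeatures(features map[string]float64) ([]byte, error) {
	return json.Marshal(map[string]map[string]float64{"inputs": features})
}

// decodePrediction parses the assumed response body: {"prediction": <number>}.
func decodePrediction(body []byte) (float64, error) {
	var out struct {
		Prediction *float64 `json:"prediction"`
	}
	if err := json.Unmarshal(body, &out); err != nil {
		return 0, fmt.Errorf("decode prediction: %w", err)
	}
	if out.Prediction == nil {
		return 0, fmt.Errorf("decode prediction: missing prediction field")
	}
	return *out.Prediction, nil
}

// Predict sends the features to the model endpoint, then records the
// result in the store so it can be inspected later.
func (c *ModelClient) Predict(ctx context.Context, features map[string]float64) (float64, error) {
	payload, err := encodeFeatures(features)
	if err != nil {
		return 0, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, c.Endpoint, bytes.NewReader(payload))
	if err != nil {
		return 0, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+c.APIKey) // assumed auth scheme

	resp, err := c.HTTP.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("model endpoint returned %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, err
	}
	prediction, err := decodePrediction(body)
	if err != nil {
		return 0, err
	}
	if c.Store != nil {
		if err := c.Store.Save(ctx, features, prediction); err != nil {
			return prediction, fmt.Errorf("save prediction: %w", err)
		}
	}
	return prediction, nil
}
```

Keeping the store behind an interface means the logging backend (Postgres in Pipe’s case) can be replaced with an in-memory fake in tests, and the encoding and decoding helpers can be checked without any network access.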

“Baseten is the ideal solution,” said Ul Haq. “It provides an easy way for us to host our models, iterate on them, and experiment without worrying about any of the DevOps involved.” 

The Results

With their first Baseten model in production, Pipe’s data science team is already looking to add value in additional areas.

For Ul Haq, this is just the beginning. From “the point that a customer plugs in their data sources all the way to matching them with investors on the buy side,” there are many modeling and optimization problems he believes his team can solve.

“Data scientists feel empowered when they own the end-to-end lifecycle of their models,” said Ul Haq. “And with Baseten, my team can self-serve their models into production very, very quickly.”