Data + Curiosity: All About that Bayes with Chelsea Parlett-Pelleriti

In this episode of Data + Curiosity, Baseten Developer Advocate Jesse Mostipak talks with Chelsea Parlett-Pelleriti about Bayesian statistics, the machine learning competition #SLICED, and statscomm with memes. 

This post is an excerpt from the full interview and has been edited for length and clarity. You can watch the full episode below:

Could you give us an “explain it like I’m five” overview of what exactly Bayesian statistics is?

CHELSEA PARLETT-PELLERITI: People ask me this all the time and I never feel like I give a good answer, because I am a practitioner of Bayesian statistics, but I am not an expert. My favorite way to describe Bayesian statistics actually comes from Michael Betancourt (@betanalpha on Twitter). His pinned tweet says, "What makes you Bayesian is quantifying uncertainty using probability."

What makes you Bayesian is quantifying uncertainty using probability

And so I could give you a bunch of different math definitions of how we do tests and things like that, where Bayesian statistics is a little different, but I think that quote really encapsulates it. In Bayesian statistics you always hear about updating your beliefs: you start from a prior, which is what you believe before seeing the data, and then through math you update that prior with the information from your data to get your posterior, which is basically your updated belief.

So now that we've combined the evidence with what you thought before, what do you think now?
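That prior-to-posterior update can be made concrete with a tiny worked example. This sketch is not from the interview; it's a minimal Beta-Binomial model (a standard conjugate-prior setup) for estimating a coin's probability of heads:

```python
# Beta-Binomial update: start with a Beta(alpha, beta) prior on a coin's
# heads probability; after observing flips, the posterior is simply
# Beta(alpha + heads, beta + tails).

def update_beta(alpha, beta, heads, tails):
    """Return posterior Beta parameters after observing coin flips."""
    return alpha + heads, beta + tails

# Weak prior belief that the coin is roughly fair: Beta(2, 2).
prior_alpha, prior_beta = 2, 2

# Observe 8 heads and 2 tails.
post_alpha, post_beta = update_beta(prior_alpha, prior_beta, heads=8, tails=2)

# Posterior mean = alpha / (alpha + beta): the updated belief.
posterior_mean = post_alpha / (post_alpha + post_beta)
print(post_alpha, post_beta, round(posterior_mean, 3))  # 10 4 0.714
```

The posterior mean (about 0.71) sits between the prior's 0.5 and the data's 0.8, which is exactly the "combine what you believed with what you saw" idea described above.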

You’ve mentioned in the past that participating in #SLICED would be challenging as a Bayesian. Could you say more about why that is?

CHELSEA PARLETT-PELLERITI: I think that if you're really doing machine learning, you're not going to be using an inferential Bayesian model. But the ideas of Bayesian statistics really permeate through the field of machine learning. 

For instance, optimization, like doing hyperparameter tuning. There are a lot of Bayesian ways of doing that, where you're basically incorporating prior information about what hyperparameters might be good, or about how to find those hyperparameters.

The ideas of Bayesian statistics really permeate through the field of machine learning. 

And that can speed things up because even when you're doing machine learning, like your XGBoost or whatever, it can take a long time to train your models and tune the hyperparameters. And so using these ideas of bringing in prior information can be really, really helpful. And so those ideas, I think, come through more than like the type of Bayesian modeling, like inferential Bayesian modeling that researchers often use.
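As a toy illustration of that "bring in prior information" idea (this is not from the interview, and the objective function below is hypothetical, standing in for a real validation loss), compare sampling a learning rate uniformly versus sampling it from a prior centered where you believe good values live:

```python
import math
import random

random.seed(0)

# Hypothetical stand-in for a validation loss as a function of one
# hyperparameter (the learning rate); in practice you would train a model
# (e.g. XGBoost) and measure held-out error. Here the loss is minimized
# near a learning rate of 1e-2.
def validation_loss(learning_rate):
    return (math.log10(learning_rate) + 2) ** 2

# Uninformed search: sample log10(lr) uniformly over a wide range.
def uniform_sample():
    return 10 ** random.uniform(-6, 0)

# Prior-informed search: we believe good learning rates cluster near 1e-2,
# so sample log10(lr) from a normal distribution centered there.
def prior_sample():
    return 10 ** random.gauss(-2, 0.5)

def best_loss(sampler, n_trials=20):
    """Best loss found across n_trials random draws from the sampler."""
    return min(validation_loss(sampler()) for _ in range(n_trials))

print(best_loss(uniform_sample), best_loss(prior_sample))
```

With the same trial budget, the prior-informed sampler concentrates its guesses where the optimum is likely to be. Full Bayesian optimization goes further by also updating a surrogate model of the loss after every trial, but the prior-shapes-the-search intuition is the same.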

You’re known for making approachable and relatable stats memes that resonate with all kinds of people. What drives you to create this kind of content?

CHELSEA PARLETT-PELLERITI: I'm not like going into it being calculated. I think I am just weird and I like expressing the things that I'm learning or thinking about through memes, but it's like in the background, in my mind, is this idea that I want to get people excited about statistics, motivate them to learn and help them feel like they belong.

It's like in the background, in my mind, is this idea that I want to get people excited about statistics, motivate them to learn and help them feel like they belong.

And I think that memes are a really good way to do that. And like I've had a couple of people say, like, my goal is to learn enough statistics to understand your meme, and that just makes me so happy.

In closing

We’d love to hear your thoughts on this interview, learn what your favorite Bayesian statistics meme is, and find out what you’re curious about! Let us know in the video comments – we can’t wait to hear from you!