Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!
Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!
Visit our Patreon page to unlock exclusive Bayesian swag 😉
Takeaways:
- Bayesian statistics is a powerful framework for updating beliefs and making predictions based on prior knowledge and observed data, excelling with complex problems and limited data.
- Bayesian methods allow for the explicit incorporation of prior assumptions, which can provide structure and improve the reliability of the analysis.
- There are several Bayesian frameworks available, such as PyMC, Stan, and Bambi, each with its own strengths and features.
- PyMC is a powerful library for Bayesian modeling that allows for flexible and efficient computation.
- For beginners, it is recommended to start with introductory courses or resources that provide a step-by-step approach to learning Bayesian statistics.
- PyTensor leverages GPU acceleration and complex graph optimizations to improve the performance and scalability of Bayesian models.
- ArviZ is a library for post-modeling workflows in Bayesian statistics, providing tools for model diagnostics and result visualization.
- Gaussian processes are versatile non-parametric models that can be used for spatial and temporal data analysis in Bayesian statistics.
Chapters:
00:00 Introduction to Bayesian Statistics
07:32 Advantages of Bayesian Methods
16:22 Incorporating Priors in Models
23:26 Modeling Causal Relationships
30:03 Introduction to PyMC, Stan, and Bambi
34:30 Choosing the Right Bayesian Framework
39:20 Getting Started with Bayesian Statistics
44:39 Understanding Bayesian Statistics and PyMC
49:01 Leveraging PyTensor for Improved Performance and Scalability
01:02:37 Exploring Post-Modeling Workflows with ArviZ
01:08:30 The Power of Gaussian Processes in Bayesian Modeling
Thank you to my Patrons for making this episode possible!
Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan and Francesco Madrisotti.
Links from the show:
- Original episode on the Super Data Science podcast: https://www.superdatascience.com/podcast/bayesian-methods-and-applications-with-alexandre-andorra
- Advanced Regression with Bambi and PyMC: https://www.intuitivebayes.com/advanced-regression
- Gaussian Processes: HSGP Reference & First Steps: https://www.pymc.io/projects/examples/en/latest/gaussian_processes/HSGP-Basic.html
- Modeling Webinar – Fast & Efficient Gaussian Processes: https://www.youtube.com/watch?v=9tDMouGue8g
- Modeling spatial data with Gaussian processes in PyMC: https://www.pymc-labs.com/blog-posts/spatial-gaussian-process-01/
- Hierarchical Bayesian Modeling of Survey Data with Post-stratification: https://www.pymc-labs.com/blog-posts/2022-12-08-Salk/
- PyMC docs: https://www.pymc.io/welcome.html
- Bambi docs: https://bambinos.github.io/bambi/
- PyMC Labs: https://www.pymc-labs.com/
- LBS #50 Ta(l)king Risks & Embracing Uncertainty, with David Spiegelhalter: https://learnbayesstats.com/episode/50-talking-risks-embracing-uncertainty-david-spiegelhalter/
- LBS #51 Bernoulli’s Fallacy & the Crisis of Modern Science, with Aubrey Clayton: https://learnbayesstats.com/episode/51-bernoullis-fallacy-crisis-modern-science-aubrey-clayton/
- LBS #63 Media Mix Models & Bayes for Marketing, with Luciano Paz: https://learnbayesstats.com/episode/63-media-mix-models-bayes-marketing-luciano-paz/
- LBS #83 Multilevel Regression, Post-Stratification & Electoral Dynamics, with Tarmo Jüristo: https://learnbayesstats.com/episode/83-multilevel-regression-post-stratification-electoral-dynamics-tarmo-juristo/
- Jon Krohn on YouTube: https://www.youtube.com/JonKrohnLearns
- Jon Krohn on Linkedin: https://www.linkedin.com/in/jonkrohn/
- Jon Krohn on Twitter: https://x.com/JonKrohnLearns
Transcript
This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.
In this special episode, the roles are reversed, as I step into the guest seat to explore the intriguing world of Bayesian stats. Originally aired as episode 793 of the fantastic Super Data Science podcast hosted by Jon Krohn, this conversation is too good not to share with all of you here on Learning Bayesian Statistics.

So join us as we delve into how Bayesian methods elegantly handle complex problems, make efficient use of prior knowledge, and excel with limited data. We cover the foundational concepts of Bayesian statistics, highlighting their distinct advantages over traditional methods, particularly in scenarios fraught with uncertainty and sparse data.

A highlight of our discussion is the application of Gaussian processes, where I explain their versatility in modeling complex, non-linear relationships in data. I share a fascinating case study involving an NGO in Estonia, illustrating how Bayesian approaches can transform limited polling data into profound insights.

So whether you're a seasoned statistician or just starting out, this episode is packed with practical advice on embracing Bayesian stats. And of course, I strongly recommend you follow the Super Data Science podcast. It's really a can't-miss resource for anyone passionate about the power of data.

This is Learning Bayesian Statistics, episode 113, originally aired on the Super Data Science podcast.
14
Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods,
the projects, and the people who make it possible.
15
I'm your host, Alex Andorra.
16
You can follow me on Twitter at alex -underscore -andorra.
17
like the country.
18
For any info about the show, learnbasedats .com is Laplace to be.
19
Show notes, becoming a corporate sponsor, unlocking Bayesian Merge, supporting the show on
Patreon, everything is in there.
20
That's learnbasedats .com.
21
If you're interested in one -on -one mentorship, online courses, or statistical
consulting, feel free to reach out and book a call at topmate .io slash alex underscore
22
and dora.
23
See you around, folks.
24
and best patient wishes to you all.
25
And if today's discussion sparked ideas for your business, well, our team at PIMC Labs can
help bring them to life.
26
Check us out at pimc -labs .com.
27
Hello my dear Vagans!
28
A quick note before today's episode, STANCON 2024 is approaching!
29
It's in Oxford, UK this year from September 9 to 13 and it's shaping up to be an
incredible event for anybody interested in statistical modeling and vaginal inference.
30
Actually, we're currently looking for sponsors to help us offer more scholarships and make
STANCON more accessible to everyone and we also encourage you
31
to buy your tickets as soon as possible.
32
Not only will this help with making a better conference, but this will also support our
scholarship fund.
33
For more details on tickets, sponsorships, or community involvement, you'll find the
Stencon website in the show notes.
34
We're counting on you.
35
Okay, on to the show now.
36
Alex, welcome to the Super Data Science podcast.
37
I'm delighted to have you here.
38
Such an experienced podcaster.
39
It's going to be probably fun for you to get to be the guest on the show today.
40
Yeah.
41
Thank you, John.
42
First, thanks a lot for having me on.
43
I knew about your podcast.
44
I was both honored and delighted when I got your email to come on the show.
45
I know you have had very...
46
Honorable guests before like Thomas Vicky.
47
so I will try to, to, to be on board, but, I know that it's going to be hard.
48
Yeah.
49
Thomas, your co -founder at, Pi MC labs is, was indeed a guest.
50
was on episode number 585.
51
but that is not what brought you here.
52
Interestingly, the connection.
53
So you asked me before we started recording how I knew about you.
54
And so a listener actually suggested to you as a guest.
55
So.
56
Doug McLean.
57
Thank you for the suggestion.
58
Doug is lead data scientist at Tesco bank in the UK.
59
And he reached out to me and said, can I make a suggestion for a guest?
60
Alex Andora, like the country, I guess you say that you say that.
61
Cause he put it in quotes.
62
He's like, Andora, like the country hosts the learning patient statistics podcast.
63
It's my other all time favorite podcast.
64
So there you go.
65
my God.
66
Doug, I'm blushing.
67
says he'd be a fab guest for your show and not least because he moans from time to time
about not getting invited onto other podcasts.
68
Did I?
69
my God.
70
I don't remember.
71
But maybe that was part of a secret plan, Maybe a secret marketing LBS plan and well.
72
That works perfectly.
73
When I read that, I immediately reached out to you to see if you'd want to go, but that
was so funny.
74
And he does say, says, seriously though, he'd make a fab guest for his wealth of knowledge
on data science and on Bayesian statistics.
75
And so, yes, we will be digging deep into Bayesian statistics with you today. You're the co-founder and principal data scientist of the popular Bayesian statistical modeling platform PyMC, as we already talked about with your co-founder Thomas Wiecki. That is an excellent episode, if you want to go back to it and get a different perspective (obviously different questions, we've made sure). So if you're really interested in Bayesian statistics, that is a great one to go back to. In addition to that, you obviously also have the Learning Bayesian Stats podcast, which we just talked about, and you're an instructor on the educational site Intuitive Bayes. So, tons of Bayesian experience. Alex, through this work, tell us what Bayesian methods are and what makes them so powerful and versatile.
Yeah, so first, thanks a lot. Thanks a lot, Doug, for the recommendation and for listening to the show. I am absolutely honored. And yeah, go and listen again to Thomas's episode. Thomas is always a great guest, so I definitely recommend anybody to go and listen to him.

Now, what about Bayes? You know, it's been a long time since someone has asked me that, because I have a Bayesian podcast; usually it's quite clear what I'm doing, and people are almost afraid to ask at some point. There are two avenues here: usually, I could give you the philosophical answer, and why, epistemologically, Bayesian stats makes more sense. But I'm not going to do that.

That sounds so interesting.

Yeah, it is; we can go into that. But I think a better introduction is just a practical one. And that's the one that most people get to know at some point, which is: you're working on something, you're interested in uncertainty estimation and not only in the point estimates, and your data are crap; you don't have a lot of them and they are not reliable. What do you do?
And that happens to a lot of PhD students. It happened to me when I started trying to do electoral forecasting. I was at the time working at the French central bank, doing something completely different from what I'm doing today. But I was writing a book about the US at the time (2016 it was), and it was a pretty consequential election for the US. I was following it really, really closely. And I remember it was July 2016 when I discovered FiveThirtyEight's models. And then the nerd in me was awoken. It was like, oh my God, this is what I need to do. You know, that's my way of putting more science into political science, which was my background at the time.

And when you do electoral forecasting, polls are extremely noisy. They are not a good representation of what people think, but they are the best ones we have. There are not a lot of them, at least in France; in the US, many more. It's limited. It's not a reliable source of data, basically. And you also have a lot of domain knowledge, which in the Bayesian realm we call prior information. And so that's a perfect setup for Bayesian stats.

So that's basically, I would say, what Bayesian stats is, and that's the power of it. You don't have to rely only on the data, because, sure, you can let the data speak for themselves, but what if the data are unreliable? Then you need something to guard against that, and Bayesian stats are a great way of doing that.

And the cool thing is that it's a method: you can apply it to any topic you want, any field you want. That's what I've done at PyMC Labs for a few years now, with all the brilliant guys who are over there. You can do that for marketing; for electoral forecasting, of course; agriculture, which was quite ironic when we got some agricultural clients, because historically, agriculture is the field of frequentist statistics. That's how Ronald Fisher developed the p-value, the famous one. So when we had that, we were like, yes, we got our revenge. And of course, it's also used a lot in sports, sports modeling, things like that. So yeah, that's the practical introduction.
Nice. Yeah. A little bit of interesting history there is that Bayesian statistics is an older approach than the frequentist statistics that is so common, and that is the standard taught in college, so much so that it is just called statistics. You can do an entire undergrad in statistics and not even hear the word Bayesian, because Fisher so decidedly created a monopoly for this one kind of approach. For me, learning frequentist statistics, I guess it was in first-year undergrad in science, and in that first-year course the idea of a p-value always seemed odd to me. How is it that there's this arbitrary threshold of significance, that this is a one-in-20 chance or less that this would be observed by chance alone, and that means we should rely on it? Especially as we are in this era of larger and larger data sets. With very large data sets like we typically deal with today, you're always going to get a significant p-value, because of the slightest tiny change; if you take web-scale data, everything is going to be statistically significant. Nothing won't be. So it's such a weird paradigm.
And seeing how
154
Those areas didn't have P values interested me in both of those things.
155
it's a, yeah, Fisher.
156
It's interesting.
157
mean, I guess with small data sets, eight, 16, that kind of scale, guess it kind of made
some sense.
158
And you know, you pointed out there, I think it's this prior that makes Bayesian
statistics so powerful being able to incorporate prior knowledge, but simultaneously
159
that's also what makes for Quentus uncomfortable.
160
They they're like, we want only the data.
161
As though, you know, the particular data that you collect and the experimental design,
there are so many ways that you as the human are influencing, you know, there's no purity
162
of data anyway.
163
And so priors are a really elegant way to be able to adjust the model in order to point it
in the right direction.
164
And so a really good example that I like to come to with Bayesian statistics is that you
can
165
You can allow some of your variables in the model to tend towards wider variance or
narrower variance.
166
So if there are some attributes of your model where you're very confident, where you know
this is like, you know, this is like a physical fact of the universe.
167
Let's just have a really narrow variance on this and the model won't be able to diverge
much there.
168
But that then gives a strong focal point within the model.
169
around which the other data can make more sense, the other features can make more sense,
and you can allow those other features to have wider variance.
170
And so, I don't know, this is just one example that I try to give people when they're not
sure about being able to incorporate prior knowledge into a model.
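In PyMC terms, that narrow-versus-wide intuition is just a choice of prior sigmas. Here is a minimal, hypothetical sketch (the data are simulated and the coefficients and scales are made up for illustration):

```python
# A hypothetical sketch of narrow vs. wide priors in PyMC.
# All names and numbers are illustrative, not from the episode.
import numpy as np
import pymc as pm

rng = np.random.default_rng(1)
x1, x2 = rng.normal(size=(2, 100))
y_obs = 2.5 * x1 + 0.7 * x2 + rng.normal(scale=0.5, size=100)

with pm.Model() as model:
    # Well-established effect: a tight prior keeps it near 2.5.
    beta1 = pm.Normal("beta1", mu=2.5, sigma=0.05)
    # Poorly understood effect: a wide prior lets the data dominate.
    beta2 = pm.Normal("beta2", mu=0.0, sigma=10.0)
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    pm.Normal("y", mu=beta1 * x1 + beta2 * x2, sigma=sigma, observed=y_obs)
    idata = pm.sample()
```

The tight prior on beta1 acts exactly as the "strong focal point" described above, while beta2 is free to go wherever the data pull it.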
Yeah, yeah, no, these are fantastic points, Jon. So, yeah, to build on that: of course, I'm a nerd, so I love the history of science, and I love the epistemological side. A very good book on that is Bernoulli's Fallacy by Aubrey Clayton. I definitely recommend his book. He was on my podcast, episode 51, if people want to give that a listen.
Did you just pull that 51 out from memory?

Yeah, yeah, I kind of know them. But I have fewer episodes than you, so each episode is kind of my baby. So I'm like, yeah, 51 is Aubrey Clayton.

Oh my goodness. That's crazy. That's also how my brain works with numbers. But yeah.

And actually, episode 50 was with Sir David Spiegelhalter, I think the only knight we've had on the podcast. And David Spiegelhalter is an exceptional guest, very, very good pedagogically. I definitely recommend listening to that episode too, which is very epistemologically heavy for people who like that: the history of science, how we got there. Because, as you were saying, Bayes is actually older than frequentist stats, but people discovered it later. So it's not because it's older that it's better, right? But it is way older, actually, by a few centuries. So yeah, fun stories there.
198
could talk about that still, but to get back to what you were saying, also as you were
very eloquently saying, data can definitely be biased.
199
Because that idea of like, no, we only want the data to speak for themselves.
200
as I was saying, yeah, what if the data are unreliable?
201
But as you were saying, what if the data are biased?
202
And that happens all the time.
203
And worse.
204
I would say these biases are most of the time implicit in the sense that either they are
hidden or most of the time they just like you don't even know you are biased in some
205
direction most of the time because it's a result of your education and your environment.
206
So the good thing of priors is that it forces your assumptions, your hidden assumptions to
be explicit.
207
And that I think is very interesting also, especially when you work on models which are
supposed to have a causal explanation and which are not physical models, but more social
208
models or political scientific models.
209
Well, then it's really interesting to see how two people can have different conclusions
based on the same data.
210
It's because they have different priors.
211
And if you force them to explicit these priors in their models, they would definitely have
different priors.
212
then...
213
then you can have a more interesting discussion actually, think.
214
So there's that. And then I think the last point that's interesting, in terms of why you would be interested in this framework, is that causes are not in the data. Causes are outside of the data. The causal relation between X and Y: you're not going to see it in the data. Because if you do a regression of education on income, you're going to see an effect of education on income. But you, as a human, know that if you're looking at one person, the effect has to be that education has an impact on income. But the computer might as well just do the other regression, regress income on education, and tell you that income causes education. But no, it's not going that way. The statistical relationship goes both ways, but the causal one only goes one direction. And that's a hidden reference to my favorite music band. But yeah, it only goes one direction, and it's not in the data. You have to have a model for that. And a model is just a simplification of reality. We try to get a simple enough model; it's usually not simple, but it's a simplification. If you say it's a construction and a simplification, that's already a prior, in a way. So you might as well just go all the way and make all your priors explicit.
Well said. Very interesting discussion there. You've used a term a number of times already in today's podcast which maybe is not known to all of our listeners. What is epistemology? What does that mean?

Right, yeah, very good question. So epistemology is, in a sense, the science of science. It's understanding how we know what we say we know. So, for instance, how do we know the Earth is round? How do we know about relativity? Things like that. It's a scientific discipline that's actually very close to philosophy; I think it's actually a branch of philosophy. And it's trying to come up with methods to understand how we can come up with new scientific knowledge. And by scientific here, we usually mean reliable and reproducible, but also falsifiable, because for a hypothesis to be scientific, it has to be falsifiable. There are lots of extremely interesting things here, but basically that's it: how do we know what we know? It's the whole enterprise of trying to define the scientific method and things like that.
Going off on a little bit of a tangent here, but it's interesting to me how, I think among non-scientists, lay people in the public, science is often seen to be infallible: as though science is real, science is the truth. Since that 2016 election, lots of people have lawn signs in the US that basically have a list of liberal values, most of which I'm a huge fan of. And of course I like the sentiment, the idea that they're supporting science on this sign as well. But the way they phrase it is "science is real". And the implication there for me, every time I see the sign... I think that could be, for example, related to vaccines; there was a lot of conflict around vaccines and what their real purpose is. And so the lay liberal person is like, you know, this is science, trust science, it's real. Whereas from the inside, and you pointed it out already, there's this interesting irony that the whole point of science is that we're saying: I'm never confident of anything. I'm always open to this being wrong.
I'm always open to this being wrong.
267
Yeah.
268
Yeah.
269
No, exactly.
270
and I think that's, that's kind of the distinction.
271
That's often made in epistemology actually between science on one hand and research on the
other end, where research is science in the making.
272
Science is like the collective knowledge that we've accumulated since basically the
beginning of modern science, at least in the Western hemisphere, so more or less during
273
the Renaissance.
274
Then research is people making that science because...
275
people have to do that and how do we come up with that?
276
so, yeah, like definitely I'm one who always emphasizes the fact that, yeah, now we know
the Earth is round.
277
We know how to fly planes, but there was a moment we didn't.
278
And so how do we come up with that?
279
And actually, maybe one day we'll discover that we were doing it kind of the wrong way,
you know, flying planes, but it's just like, for now it works.
280
We have the
281
best model that we can have right now with our knowledge.
282
But maybe one day we'll discover that there is a way better way to fly.
283
And it was just there staring at us and it took years for us to understand how to do that.
284
yeah, like as you were saying, but that's really hard line to walk because you have to
say, yeah.
285
Like these knowledge, these facts are really trustworthy, but you can never trust
something 100 % because otherwise mathematically, if you go back to base formula, you
286
actually cannot update your knowledge.
287
you, if you have a 0 % prior or 1 % prior, like mathematically, you cannot apply base
formula, which tells you, well, based on new data that you just observed the most
288
rational way of updating your belief is to believe that with that certainty.
289
But if you have zero or 100%, it's never going to be updated.
290
So you can say 99 .9999 % that what we're doing right now by flying is really good.
291
But maybe, like, you never know.
292
There is something that will appear.
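Here is that point in a few lines of Python (a minimal sketch; the probabilities are made-up inputs, not from the episode): a prior of exactly 0 or 1 survives any amount of evidence untouched.

```python
# A minimal sketch of Bayes' rule for a single hypothesis H.
# The numbers are hypothetical; the point is what happens at prior = 0 or 1.
def bayes_update(prior, p_data_given_h, p_data_given_not_h):
    """Posterior P(H | data) from P(H), P(data | H), P(data | not H)."""
    numerator = prior * p_data_given_h
    evidence = numerator + (1 - prior) * p_data_given_not_h
    return numerator / evidence

# A strong but open prior still moves when the data disagree with H:
print(bayes_update(0.999999, p_data_given_h=0.01, p_data_given_not_h=0.99))
# Dogmatic priors never move, no matter the data:
print(bayes_update(0.0, 0.01, 0.99))  # -> 0.0
print(bayes_update(1.0, 0.01, 0.99))  # -> 1.0
```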
And physics is a real...

We've all seen UFOs, Alex. We know that there are better ways to fly.

Yeah, exactly. But yeah, I think physics is actually a really good field for that, because it's always evolving, and it's always coming up with completely crazy, paradigm-shifting explanations, like relativity: special relativity, then general relativity. Just a century ago, that didn't exist. And now we're starting to understand it a bit better, but even now we don't really understand how to blend relativity and gravity. So that's extremely interesting to me.
But yeah, I understand that politically, from a marketing standpoint, it's hard to sell. But I think it's shooting yourself in the foot if you're saying, yeah, science is always... Science works, I agree, science works, but it doesn't have to be 100 percent true and certain for it to work. That's why placebos work, right? A placebo is something that works even though there's no actual concrete evidence that it's adding anything; but it works. So yeah, I think it's really shooting yourself in the foot to say, no, that's 100 percent; if you question science, then you're anti-science. No. Actually, the whole scientific method is about being able to ask questions all the time. The question is: how do you do that? Do you apply the scientific method to your questions, or do you just question anything, without any method, just because you fancy questioning it, because it goes against your beliefs to begin with?
So yeah, that's one thing. And then I think another thing you said that's very interesting is that, unfortunately, the way of teaching science, and communicating around it, is not very incarnated. It's quite dry. You just learn equations, and you just learn that stuff. Whereas science was made by people, and is made by people, who have their biases, who have extremely violent conflicts. Like you were saying, Fisher was just a huge jerk to everybody around him. I think it would be interesting to get back to a bit of that human side, to make science less dry and also less intimidating. Because most of the time, when I tell people what I do for a living, they get super intimidated, and they're like, oh my God, I hate math, I hate stats and stuff. But it's just numbers. It's just a language. It's a bit dry.
329
For instance, if there is someone who is into movies, who does movies in your audience.
330
I want to know why there is no movie about Albert Einstein.
331
There has to be a movie about Albert Einstein.
332
Like not only huge genius, but like extremely interesting life.
333
Like honestly, it makes for great movie.
334
was working in a a dramatized biopic.
335
mean?
336
Yeah.
337
Yeah.
338
I mean, it's like his life is super interesting.
339
Like he revolutionized the field of two fields of physics and actually chemistry.
340
In 1905, it's like his big year, and he came up with the ideas for relativity while
working at the patent bureau in Bern in Switzerland, which was an extremely boring job.
341
In his words, it was an extremely boring job.
342
Basically, having that boring job allowed him to do that being completely outside of the
academic circles and so on.
343
It's like he makes for a perfect movie.
344
I don't understand why it's not there.
345
And then I sing on the cake.
346
He had a lot of women in his life.
347
So it's like, you know, it's perfect.
348
Like you have you have the sex you have, you have the drama, you have revolutionizing the
field, you have Nobel prizes.
349
And he and then he became a like a pop icon.
350
I don't know where the movies.
351
Yeah, it is wild.
352
Actually, now that you pointed out, it's kind of surprising that there aren't movies about
him all the time.
353
Like Spider -Man.
354
Yeah, I agree.
355
Well, there was one about Oppenheimer last year.
356
Maybe that started to trend.
357
see.
358
Yeah. So, in addition to the podcast, and I mentioned this at the outset, you're the co-founder and principal data scientist of the popular Bayesian stats modeling platform PyMC. Like many things in data science, it's uppercase P, lowercase y, for Python. What's the MC? PyMC is one word, and the M and C are capitalized.

Yeah. So it's very confusing, because it stands for Python, and then MC is Monte Carlo. But why Monte Carlo? Because it comes from Markov chain Monte Carlo. So actually it should be PyMCMC, or PyMC squared, which is what I've been saying since the beginning. But anyways, yeah, it's actually PyMC squared. So, Markov chain Monte Carlo: there are other algorithms now, newer ones, but MCMC is the blockbuster algorithm to run a Bayesian model.

Yeah. So in the same way that stochastic gradient descent is the de facto standard for finding your model weights in machine learning, Markov chain Monte Carlo is kind of the standard way of doing it with a Bayesian network.

Yeah, yeah, yeah.
And so now there are newer versions, more efficient versions. That's basically the name of the game, right? Making the algorithms more and more efficient. But the first algorithm dates way back; I think it was actually invented during the Manhattan Project, during World War Two.

Yeah. And lots of physicists, actually; statistical physics is a field that's contributed a lot to MCMC. So yeah, physicists came to the field of statistics and tried to make the algorithms more efficient for their models. The field of physics has contributed a lot of big names, and great leaps, to the realm of more efficient algorithms. I don't know who your audience is, but that may sound boring. The algorithm is the workhorse, and it's extremely powerful.
394
And that's also one of the main reasons why patients' statistics are
395
increasing in popularity lately because
396
I'm going to argue that it's always been the best framework to do statistics, to do
science, but it was hard to do with pen and paper because the problem is that you have a
397
huge nasty integral on the numerator, on the denominator, sorry.
398
And this integral is not computable by pen and paper.
399
So for a long, long time, patient statistics combined to features, you know, like
campaigns.
400
PR campaigns, patients S6 was relegated to the margins because it was just super hard to
do.
401
so for other problems, other than very trivial ones, it was not very applicable.
402
But now with the advent of personal computing, you have these incredible algorithms like,
so now most of time it's HMC, Hamiltonian Monte Carlo.
403
That's what we use under the hood with PIMC.
404
But if you use Stan, if you use NumPyro, it's the same.
405
And thanks to these algorithms, now we can make extremely powerful models because we can
approximate the posterior distributions thanks to, well, computing power.
406
A computer is very good at computing.
407
I think that's why it's called that.
408
Yes.
409
And so that reminds me of deep learning.
410
It's a similar kind of thing where the applications we have today, like your chat GPT or
whatever your favorite large language model is these amazing video generation like Sora,
411
all of this is happening thanks to deep learning, which is an approach we've had since the
fifties, certainly not as old as Bayesian statistics, but similarly it has been able to
412
take off with much larger data sets and much more compute.
413
Yeah.
414
Yeah.
415
Yeah.
416
Yeah, very good point.
417
And I think that's even more the point in deep learning.
418
for sure.
419
Because Beijing stats doesn't need the scale, but the way we're doing deep learning for
now definitely need the scale.
420
Yeah, yeah.
421
Scale of data.
422
Yeah, exactly.
423
Yeah, sorry.
424
Yeah, the scale.
425
Because there two scales, data and...
426
Yeah, you're right.
427
Yeah, and for like model parameters.
428
And so that has actually, I mean, tying back to something you said near the beginning of
this episode is that actually one of the advantages of Beijing statistics is that you can
429
do it with very few data.
430
Yeah.
431
maybe fewer data than with a frequentist approach or machine learning approach.
432
Because you can bake in your prior assumptions and those prior assumptions give some kind
of structure, some kind of framework for your data to make an impact through.
433
Yeah, completely.
434
So for our listeners who are listening right now, if they are keen to try out Bayesian statistics for the first time, why should they reach for PyMC? Which, as far as I know, is the most used Bayesian framework, period, and certainly in Python. And then the second, I'm sure, is Stan. So yeah, why should somebody use PyMC? And maybe even more generally, how can they get started if they haven't done any Bayesian statistics before at all?

Yeah, fantastic question. It's a very good one, because that can also be very intimidating. And actually there can be a paradox of choice, you know, where now we're lucky to live in a world where we actually have a lot of probabilistic programming languages. So you'll see that sometimes that's called a PPL. And what's a PPL? It's basically PyMC: a piece of software that enables you to write down Bayesian models and sample from them. Okay. So it's just a fancy word to say that.
Yeah, my main advice is: don't overthink it. If you're proficient in R, then I would definitely recommend trying brms first, because it's built on top of Stan, and Stan is extremely good. It's built by extremely good modelers and statisticians; lots of them have been on my podcast. So if you're curious, just go on the website, search for Stan, and you'll find a lot of them. The best one is, most of the time, Andrew Gelman; absolutely amazing to have him on the show, and he always explains stuff extremely clearly. But I also had Bob Carpenter, for instance, and Matt Hoffman. So, anyways...
Have you ever had Rob Trangucci on the show, or do you know who he is?

I know of him, but I have never had him on the show. I'd be happy to, though.

Yeah, if you know him, I'll make an introduction for you. He was on our show in episode number 507, and that was our first-ever Bayesian episode. And it was the most popular episode of that year, 2021. And it was interesting, because up until that time, and 2021 was my first year hosting the show, it was by far our longest episode. That was kind of concerning for me: this was a super technical episode, super long; how is this going to resonate? It turns out that's what our audience loves. And that's something we've been leaning into a bit in 2024: more technical, longer episodes.

Well, that's good to know. Yeah, I'll make an intro for Rob.

Anyway, you were saying... I could do an intro for you. Yeah, I know. But yeah, great interruption, for sure. I'm happy to have that introduction made. Thanks a lot.
Yeah, so I was saying: if you're proficient in R, definitely give brms a try. It's built on top of Stan. Then, when you outgrow brms, go to Stan. If you love Stan but you're using Python, there is PyStan. I've never used it personally, but I'm pretty sure it's good. But I would say, if you're proficient in Python and don't really want to go to R, then you probably want to give a try to PyMC or to NumPyro. Give them a try, and see which API resonates most with you, because if you're going to make models like that, you're going to spend a lot of time on your code and on your models. And as most of your audience probably knows, the models always fail, unless it's the last one. So you really have to love the framework you're using and find it intuitive; otherwise, it's going to be hard to keep going.
If you're really, really a beginner, I would also recommend, in the Python realm, giving Bambi a try; it's the equivalent of brms, but in Python. Bambi is built on top of PyMC, and what it does is make a lot of the choices for you under the hood: priors, stuff like that, which can be a bit overwhelming for beginners at the start. Then, when you outgrow Bambi and want to make more complicated models, go to PyMC.
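As a quick taste of what that looks like (a hypothetical sketch; the data frame and column names are made up, not from the episode), Bambi lets you specify a Bayesian regression with a one-line formula and picks default priors for you:

```python
# A minimal, hypothetical Bambi example; column names are illustrative.
import bambi as bmb
import pandas as pd

df = pd.DataFrame({
    "flipper_length": [180, 195, 210, 220, 230, 205, 190, 215],
    "body_mass": [3500, 3800, 4300, 4800, 5200, 4100, 3700, 4500],
})

# Bambi parses the formula and chooses weakly informative priors under the hood.
model = bmb.Model("body_mass ~ flipper_length", df)
idata = model.fit()  # runs MCMC via PyMC, returns an ArviZ InferenceData
```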
Bambi, that's a really cute name for a model that just drops out of its mother and can barely stand up straight.

Yeah. And the guys working on Bambi, Tommy Capretto and Osvaldo Martin, are really great guys. Both Argentinians, actually. And yeah, they are fun guys. I think the website for Bambi is bambinos.github.io. These guys are fun. But yeah, it's definitely a great framework.

And actually, this week, with Tommy Capretto and Ravin Kumar, we released an online course, our second online course, which we've been working on for two years. So we are very happy to have released it, and we're also very happy with the course; that's why it took so long. It's a very big course. And that's exactly what we do: we take you from beginner, we teach you Bambi, we teach you PyMC, and you go all the way up to advanced. It's called Advanced Regression, so we teach you all things regression.

What's the course called?

Advanced Regression. Yeah, Advanced Regression, on the Intuitive Bayes platform that you were kind enough to mention at the beginning.

Nice. Yeah, I'll be sure to include that in the show notes.
And so even though it's called Advanced Regression, you start us off with an introduction to Bayesian statistics, and we start getting our feet wet with Bambi before moving on to PyMC, yeah?

Yeah, yeah, yeah. So you have a regression refresher at the beginning. If you're a complete, complete beginner, then I would recommend taking our intro course first, which really starts from the ground up. The Advanced Regression course, ideally, you would take after the intro course; but if you're already there in your learning curve, then you can start directly with Advanced Regression. It makes a few more assumptions on the student's part: that they have heard about Bayesian stats, and that they are aware of the ideas of priors, likelihood, posteriors. But we give you a refresher on classic regression, where you have a normal likelihood. And then we teach you how to generalize that framework to data that's not normally distributed. We start with Bambi, and we show you how to do the equivalent models in PyMC. And then at the end, when the models become much more complicated, we just show them in PyMC.
550
Nice.
551
That is super, super cool.
552
I hope to be able to find time to dig into that myself soon.
553
It's one of those things.
554
yeah.
555
You and I were lamenting this before the show, podcasting of itself can take up so much
time on top of, in both of our cases, we have full -time jobs.
556
This is something that we're doing as a hobby.
557
And it means that I'm constantly talking to amazingly interesting people like you who have
developed fascinating courses that I want to be able to study.
558
And it's like, when am going to do that?
559
Like book recommendations alone.
560
Like I barely get to read books anymore.
561
That was something like since basically the pandemic hit.
562
I, and it's, it's so embarrassing for me because I, I identify in my mind as a book
reader.
563
And sometimes I even splurge.
564
I'm like, wow, I've got to get like these books that I absolutely must read.
565
And they just collect in stacks around my apartment.
566
Like, yeah.
567
Yeah.
568
Yeah.
569
Yeah, I mean, that's hard, for sure. It's something I've also been trying to get under control a bit. So, a guy who I find does good work on that is Cal Newport.

Yes, Cal Newport, of course. I've been collecting his books too.

Yeah, that's the irony. So, he's got a podcast. I don't know about you, but I listen to tons of podcasts; the audio format is really something I love. So: podcasts and audiobooks. That can be your entrance here. Maybe you can listen to more books if you don't have time to read.
582
don't really have a commute.
583
and I often, use like, you know, when I'm traveling to the airport or something, I use
that as an opportunity to like do catch up calls and that kind of thing.
584
So it's interesting.
585
I, I, I almost listened to no other podcasts.
586
The only show I listened to is last week in AI.
587
I don't know if you know that show.
588
Yeah.
589
Yeah.
590
Yeah.
591
Great show.
592
I like them a lot.
593
put a lot of work into Jeremy and Andre do a lot of work to get.
594
Kind of all of the last week's news constant in there.
595
so it's impressive.
596
It allowed me to flip from being this person where prior to finding that show and I found
it cause Jeremy was a guest on my show.
597
was an amazing guest by the way.
598
don't know if he'd have much to say about Bayesian statistics, but he's an incredibly
brilliant person is so enjoyable to listen to.
599
and, and someone else that I'd love to make an intro for you.
600
He's, he's become a friend over the years.
601
Yeah, for sure.
602
But yeah, last week in AI, they, I don't know why I'm talking about it so much, but they,
I went from being somebody who would kind of have this attitude when somebody would say,
603
if you heard about this release or that, or, I'd say, you know, just because I work in AI,
I can't stay on top of every little thing that comes out.
604
And now since I started listening to last week in AI about a year ago, I don't think
anybody's caught me off guard with some, with some new release.
605
I'm like, I know.
606
Yeah, well done.
607
Yeah, that's good.
608
Yeah, but that makes your life hard.
609
Yeah, for sure.
610
If you don't have a commute, come on.
611
But I'd love to be able to completely submerge myself in Bayesian statistics.
612
is a life goal of mine, is to be able to completely, because while I have done some
Bayesian stuff and in my PhD, I did some Markov chain Monte Carlo work.
613
And there's just obviously so much flexibility and nuance to this space.
614
can do such beautiful things.
615
I have a huge fan of Bayesian stats.
616
And so yeah, it's really great to have you on the show talking about it.
617
So, Pi MC, which we've been talking about now, kind of going back to our, back to our
thread.
618
Pi MC uses something called Pi tensor to leverage GPU acceleration and complex graph
optimizations.
619
Tell us about PyTensor and how this impacts the performance and scalability of Bayesian
models.
620
Yeah.
621
Great question.
622
Basically, the way PyMC is built, we need a backend. And historically this has been a complicated topic, because the backend then has to do the computation; otherwise you have to do the computations in Python, and that's slower than doing it in C, for instance. And so we still have that C backend; that's kind of a historical remnant, but more and more we're using... When I say "we", I don't do a lot of PyTensor code, to be honest, I mean contributions to PyTensor; I mainly contribute to PyMC. PyTensor is spearheaded a lot by Ricardo Vieira. Great guy, extremely good modeler.

Basically, the idea of PyTensor is to outsource the computation that PyMC is doing. Especially when you're doing the sampling, PyTensor is going to delegate that to some other backend. And so now, instead of having just the C backend, you can actually sample your PyMC models with the Numba backend. How do you do that? You use another package called nutpie, which has been built by Adrian Seyboldt, an extremely brilliant guy, again. I'm surrounded by guys who are much more brilliant than me. And that's how I learn, basically: I just ask them questions.

That's what I feel like in my day job at Nebula, my software company. It's just... Yeah, sorry, I'm completely interrupting you.

Yeah, no, same. And so, yeah: Adrian basically re-implemented HMC in nutpie, using Numba and Rust. And that goes way faster than just using Python, or even just using C. And then you can also sample your models with two other backends. That's enabled by PyTensor, which basically compiles the graph of the model and then delegates the computational operations to the sampler. And the sampler, as I was saying, can be the one from nutpie, which is in Rust and Numba; otherwise, it can be the one from NumPyro. So actually, you can call the NumPyro sampler on a PyMC model. And it's just super simple: in pm.sample, there's a keyword argument called nuts_sampler, and you just say "nutpie" or "numpyro".
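Concretely, the switch looks like this (a minimal toy model, made up for illustration; the alternative samplers need the nutpie or numpyro packages installed):

```python
# A minimal toy model; the point is the nuts_sampler keyword in pm.sample.
import numpy as np
import pymc as pm

y_obs = np.random.default_rng(2).normal(size=50)

with pm.Model() as model:
    mu = pm.Normal("mu", 0.0, 1.0)
    pm.Normal("y", mu=mu, sigma=1.0, observed=y_obs)

    idata = pm.sample()                                # default PyMC NUTS sampler
    idata_nutpie = pm.sample(nuts_sampler="nutpie")    # Rust/Numba backend
    idata_numpyro = pm.sample(nuts_sampler="numpyro")  # JAX backend
```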
And I tend to use NumPyro a lot when I'm doing Gaussian processes. I don't know exactly why, but, so, most of the time I'm using nutpie; when I'm doing Gaussian processes somewhere in the model, though, I tend to use NumPyro, because for some reason there is some efficiency in their algorithm, in the way they compute the matrices. And GPs are basically huge matrices and dot products. So yeah, NumPyro is usually very efficient for that. And you can also use JAX now to sample your model.
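For readers who want to try a GP themselves, here is a minimal, hypothetical sketch using PyMC's HSGP approximation (the approach covered in the HSGP links in the show notes); the data and hyperparameter choices are made up for illustration:

```python
# A hypothetical GP regression sketch with PyMC's HSGP approximation.
# HSGP replaces the full covariance matrix with a cheap basis expansion.
import numpy as np
import pymc as pm

rng = np.random.default_rng(3)
X = np.linspace(0, 10, 100)[:, None]
y_obs = np.sin(X).ravel() + rng.normal(scale=0.3, size=100)

with pm.Model() as model:
    ell = pm.InverseGamma("ell", alpha=5, beta=5)  # lengthscale prior
    eta = pm.HalfNormal("eta", sigma=1.0)          # amplitude prior
    cov = eta**2 * pm.gp.cov.ExpQuad(1, ls=ell)

    gp = pm.gp.HSGP(m=[30], c=1.5, cov_func=cov)   # basis approximation
    f = gp.prior("f", X=X)

    sigma = pm.HalfNormal("sigma", sigma=0.5)
    pm.Normal("y", mu=f, sigma=sigma, observed=y_obs)
    idata = pm.sample(nuts_sampler="numpyro")  # assumes numpyro is installed
```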
So we have these different backends, and that's enabled because PyTensor is the backend that nobody sees. Most of the time, you're not implementing a PyTensor operation in your models. Sometimes we do that at PyMC Labs, when we're working on a very custom operation, but usually it's done under the hood for you. PyTensor compiles the symbolic graph of the model and can then dispatch it to whatever the best way of computing the posterior distribution is.
Nice. You alluded there to something I've been meaning to ask you about, which is the PyMC Labs team. So you have PyMC, the open-source library that anybody listening can download (and we'll have links in the show notes so people can get rolling on their Bayesian stats right now, whether it's already something they have expertise in or not). PyMC Labs: it sounds like, and just fill us in, but I'm gathering that the team there is responsible both for developing PyMC and for consulting, because you mentioned that sometimes you might do some kind of custom implementation. So first of all, tell us a little bit about PyMC Labs. And then it'd be really interesting to hear one or more interesting examples of how Bayesian statistics allows some client, some use case, to do something they wouldn't be able to do with another approach.
679
Yeah.
680
So yeah, first, go install PyMC on GitHub and open PRs and stuff like that.
681
We always love that.
682
And second, yeah, exactly.
683
PyMC is kind of an offspring of PyMC in the sense that everybody on the team is a PyMC
developer.
684
So we contribute to PyMC.
685
This is open source.
686
This is
687
free.
688
This is free and always will be as it goes.
689
But then on top of that, we do consulting.
690
what's that about?
691
Well, most of the time, these are clients who want to do something with PMC or even more
general with patient statistics.
692
And they know we do that and they do not know how to do that either because they don't
have the time or to
693
train themselves or they don't want to, or they don't have the money to hire a Bayesian
modeler full time, various reasons.
694
But basically, yeah, like they are stuck in at some point in the modeling workflow, they
are stuck.
695
It can be at the very beginning.
696
It can be, well, I've tried a bunch of stuff.
697
I can't make the model converge and I don't know why.
698
So it can be like a very wide array of situations.
699
Most of the time people know.
700
us because like me for the podcast or for PMC, most of the other guys for PMC or for other
technical writing that they do around.
701
So basically that's like, that's not really a real company, but just a bunch of nerds if
you want.
702
But no, that's a real company, but we like to define us as a bunch of nerds because like
that's how it really started.
703
And it, in a sense of
704
you guys actually consulting with companies and making an impact in that sense, it is
certainly a company.
705
Yeah.
706
So yeah.
707
So tell us a bit about projects.
708
mean, you don't need to go into detail with client names or whatever, if that's
inappropriate, but it would be interesting to hear some examples of use cases, use cases
709
of Bayesian statistics in the wild, enabling capabilities that other kinds of modeling
approaches wouldn't.
710
Yeah.
711
Yeah.
712
Yeah.
713
No, definitely.
714
Yeah, so of course I cannot go into all the details, but I can definitely give you some ideas. Where I can actually go into the details is a project we did for an NGO in Estonia, where they were getting polling data. Every month they run a poll of Estonian citizens on various questions. These can be horse-race polls, but they can also be, you know, news questions like: do you think Estonia should ramp up the number of soldiers at the border with Russia? Do you think same-sex marriage should be legal? Things like that.

I hear an Overton window coming on.

No, that's what I thought. I thought we might go there.
Yeah, so now I'm completely taking you off on a sidetrack, but Serg Masís, our researcher, came up with a great question for you, because you had Allen Downey on your show. He's an incredible guest; I absolutely loved having him on our program. He was on here in episode number 715. And in that episode, we talked about the Overton window, which is related to what you were just talking about. So, you know, how does society think about, say, same-sex marriage? If you looked a hundred years ago, or a thousand years ago, or 10,000 years ago, or a thousand years into the future, or 10 years into the future, at each of those different time points there's a completely, well, maybe not completely different, but a varying range of what people think is acceptable or not acceptable.

And we were talking earlier in the episode about bias, so it ties into this. You might have your idea, as a listener to the show; you might be a scientist or an engineer, and you think, I am unbiased, I know the real thing. But you don't, because you are a product of your times. And the Overton window is kind of a way of describing this: on any given issue, there is some range, and it would fit a probability distribution, where there are some people on a far extreme one way and some people on a far extreme the other way. But in general, all of society is moving in one direction, typically in a liberal direction on a given social issue. And this varies by region; it varies by age. Anyway, I think Overton windows are really fascinating. So, I've completely derailed your conversation, but I have a feeling you're going to have something interesting to say.
Yeah, no, I mean, that's related to that, for sure. And yeah, basically... I also had Allen Downey on the show for his latest book, and that was also definitely about that. Probably Overthinking It was the book. Yeah, great, great book.
And so basically, the NGO had this survey data, right? But their clients have questions, and their clients are usually media or politicians. It's like: yeah, but I'd like to know, on a geographical basis, in this electoral district, what do people think about that? Or, in this electoral district, educated women of that age, what do they think about same-sex marriage? That's hard to do, because polling at that scale is almost impossible; it costs a ton of money. Also, polling is getting harder and harder, because people answer polls less and less. So at the same time that polling data becomes less available and less reliable, you have people who get more and more interested in what the polls have to say. It's hard.
There is a great method to do that.
771
What we did for them is come up with a hierarchical model of the population because
hierarchical models allow you to share information between groups.
772
Here the groups could be the age groups, for instance.
773
Basically, knowing something what a hierarchical model says,
774
is, well, age groups are different, but they are not infinitely different.
775
So learning about what someone aged 16 to 24 thinks about same -sex marriage actually
already tells you something about what someone aged 25 to 34 thinks about that.
776
And the degree of similarity between these responses is estimated by the model.
So these models are extremely powerful. I love them, I teach them a lot. And actually, in the advanced regression course, the last lesson is all about hierarchical models. I walk you through a simplified version of the model we did at PyMC Labs for that NGO, called SALK, in Estonia. So it's a model that's used in industry for real, and you learn that. It's a hard model, but it's a real model.
Then, once you've done that, you do something that's called post-stratification. Post-stratification is basically a way of debiasing your estimates, the predictions from the model, and you use census data to do that. So you need good data; you need census data. But if you have good census data, then you're going to be able to reweight the predictions from your model. And that way, if you combine post-stratification and a hierarchical model, you're going to be able to give actually good estimates of what, say, educated women aged 25 to 34 in this electoral district think about that issue.
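As a toy illustration of that reweighting step (continuing the sketch above, with invented census counts), you weight each group's posterior estimate by how many such people actually live in the district:

```python
import numpy as np

# Posterior mean log-odds per age group, from the hierarchical model above.
group_logodds = idata.posterior["alpha"].mean(dim=("chain", "draw")).values
group_p = 1.0 / (1.0 + np.exp(-group_logodds))   # back to probabilities

# Hypothetical census counts for the same six age groups in one district.
census_counts = np.array([12_000, 15_000, 18_000, 16_000, 14_000, 9_000])
weights = census_counts / census_counts.sum()

# Post-stratified estimate: the model's predictions, reweighted to match
# the district's actual demographic composition.
district_estimate = float((weights * group_p).sum())
print(f"Estimated support in this district: {district_estimate:.1%}")
```

In practice you would reweight every posterior draw rather than just the posterior mean, so the final estimate keeps its full uncertainty.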
And when I say good, I mean that the confidence intervals are not going to be ridiculous. It's not going to tell you: well, this population is opposed to gay marriage with a probability of 20 to 80%, which covers basically everything, so that's not very actionable. No, the model is more uncertain than that, of course, but it has a really good way of giving you something actually actionable. So that was a big project. I can dive into some others if you want, but that...
That takes some time, and I don't want to derail the interview. That's great and highly illustrative. It gives that sense of how, with a Bayesian model, you can be so specific about how different parts of the data interrelate. So in this case, for example, you're describing having different demographic groups that have some commonality, like all the women, but with different age groups of women as sub-nodes of women in general. That way you're able to use the data from each of the subgroups to inform your higher-level group.
And actually, something that might be interesting to you, Alex, is that my introduction to both R programming and, I guess, to hierarchical modeling was Gelman and Hill's book. Obviously, Andrew Gelman you've already talked about on the show. Jennifer Hill is also a brilliant causal modeler and has also been on the Super Data Science podcast; that was episode number 607. Anyway, there's lots of listening for people to do out there between your show and mine, based on guests that we've talked about on the program. Hopefully lots of people have long commutes. So yeah, fantastic. That's a great example, Alex.
Another open-source library, in addition to PyMC, that you've developed is ArviZ, which has nothing to do with the programming language R. So it's A-R-V-I-Z, or Zed: ArviZ. And this is for post-modeling workflows in Bayesian stats. So tell us: what are post-modeling workflows? Why do they matter? And how does ArviZ solve problems for us there?
Yeah, yeah, great questions. And first, related to your previous question, I'll make sure to send you some links to other projects that could be interesting to people, like media mix models. I've interviewed Luciano Paz on the show; we've worked with HelloFresh, for instance, to come up with a media mix marketing model for them, and Luciano talks about that in that episode. I'll also send you a blog post about spatial data with Gaussian processes; that's something we've done for an agricultural client. And I already sent you a link to a video webinar we did with that NGO, that client in Estonia, where we go a bit deeper into the project. And of course I'll also send you the Learning Bayesian Statistics episode, because Tarmo, the president of that NGO, was on the show.
Nice. I'll be sure, of course, to include all of those links in the show notes.

Yeah, because people come from different backgrounds, so someone is going to be more interested in marketing, another one more in social science, another one more in spatial data. That way people can pick and choose what they are most curious about.

So, ArviZ. Yeah, what is it?
ArviZ is basically your friend for any post-model, post-sampling graph. And why is that important? Because models steal the show; they are the star of the show. But a model is just one part of what we call the Bayesian workflow. The Bayesian workflow has only one step that is the modeling itself; all the other steps are about everything around the model. There are a lot of steps before sampling the model, and then there are a lot of steps afterwards. And I would argue that these steps afterwards are almost as important as the model. Why? Because they're what's going to face the customer of the model. Your model is going to be consumed by people who most of the time don't know about models, and also often don't care about models. That's a shame, because I love models, but a lot of the time they don't really care about the model; they care about the results. And so a big part of your job as the modeler is to be able to convey that information in a way that someone who is not a stats person or a math person can understand and use in their work.
Whether that's a football coach, or a data scientist, or someone working in HelloFresh's marketing department, you have to adapt the way you talk to those people and the way you present the results of the model. And the way you do that is with amazing graphs. So a lot of your time as a modeler is spent figuring out what the model can tell you and, also very important, what the model cannot tell you, and with which confidence. And since we're humans, we use our eyes a lot, so the way to convey that is with plots. You spend a lot of time plotting stuff as a Bayesian modeler, especially because Bayesian models don't give you one point estimate; they give you full distributions for all the parameters. So you get distributions all the way down. That's a bit more complex to wrap your head around at the beginning, but once your brain is used to that gym, it's really cool, because it gives you opportunities for amazing plots.
And yeah, ArviZ is here for you for that. It has a lot of the plots that we use all the time in the Bayesian workflow. One, to diagnose your model, to understand if there is any red flag in the convergence of the model. And then, once you're sure about the quality of your results, how do you present them to the customer of the model? ArviZ also has a lot of plots for you there. And the cool thing about ArviZ is that it's platform-agnostic. What do I mean by that? You can run your model in PyMC, in Pyro, in Stan, and then use ArviZ, because ArviZ expects a special format of data that all these PPLs can give you, which is called the InferenceData object. Once you have that, ArviZ doesn't care where the model was run. And that's super cool. It's also available in Julia: ArviZ is a Python package, but there is a Julia equivalent for people who use Julia. So yeah, it's a very good way of starting that part of the workflow, which is extremely important.
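For instance, picking up the idata object returned by the hierarchical model sketched earlier, a minimal diagnose-then-present pass could look like this (the plot choices here are just common defaults, not a prescription):

```python
import arviz as az

# Diagnostics: look for convergence red flags.
az.plot_trace(idata)                                  # chains should look well mixed
print(az.summary(idata, var_names=["mu", "sigma"]))   # r_hat near 1, healthy ESS

# Presentation: plots for the customer of the model.
az.plot_posterior(idata, var_names=["mu"])            # a full distribution, not a point
az.plot_forest(idata, var_names=["alpha"])            # compare the age groups at a glance
```

Because these functions only see the InferenceData object, the same four lines work whether the model was sampled with PyMC, Stan, or another PPL.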
Nice, that was a great tour. And of course, I will again have a link to ArviZ in the show notes for people who want to use it for their post-modeling needs with their Bayesian models, including diagnostics like looking for red flags, and being able to visualize results and pass them off to whoever the end client is. I think it might be in the same panel discussion with the head of that NGO, Tarmo Jüristo.

Yes.
Sí. That's my Spanish. I'm in Argentina right now, so the Spanish is automatic.

Actually, I'm relieved to know that you're in Argentina, because I was worried that I was keeping you up way too late.

No, no, no, not at all.

Nice. So yeah, in that interview, Tarmo talks about adding components like Gaussian processes to make models, Bayesian models, time-aware. What does that mean? And what are the advantages and potential pitfalls of incorporating advanced features like time-awareness into Bayesian models?
Yeah, great research, I can see that. Great research from Serge Masís, really.

Yeah. No, that's impressive.
I had a call with the people from Google Gemini today. They're very much at the cutting edge, developing Google Gemini alongside Claude 3 from Anthropic and, of course, GPT-4, GPT-4o, whatever, from OpenAI. These are the frontier of LLMs. So I'm on a call with half a dozen people from the Google Gemini team, and near the end they were hinting at some of the new capabilities they have. And there are some cool things in there which I need to spend more time playing around with, like Gems. I don't know if you've seen this, but Gems in Google Gemini allow you to have a context for different kinds of tasks. For example, there are some parts of my podcast production workflow where I have different contexts, different needs, at each of those steps. And so it's very helpful with these Google Gemini Gems to be able to just click on one and be like: okay, now I'm in this kind of context, and I'm expecting the LLM to output in this particular way. And the Google Gemini people said, well, maybe you'll be able to use these Gems to replace parts of the workflow of the other people working on your podcast. They gave the example of research, and I was like: I hope that our researcher, for example, is using generative AI tools to assist his work. But I think, even with all of the amazing things that LLMs can do, we're still quite a ways away from the kind of quality of research that Serge Masís can do for this show.
Yeah, yeah. We're still a ways away.

Yeah, no, for sure. But that sounds like fun.

Yeah. Anyway, sorry, I derailed you again. Time awareness.

Yeah, indeed. And I love that question, because I love GPs. So thanks a lot.
And that was not at all a setup for the audience: Gaussian processes, GPs. Yeah, I love Gaussian processes. And I actually just sent you a blog post we have on the PyMC Labs website, by Luciano Paz, about how to use Gaussian processes with spatial data. So why am I telling you that? Because Gaussian processes are awesome, because they are extremely versatile. A GP is what's called a non-parametric method; it allows you to do non-parametric models.
What does that mean? It means that instead of having, for instance, a linear regression, where you have a functional form that you're telling the model, like: I expect the relationship between x and y to be of a linear form, y equals a plus b times x, what the Gaussian process is saying is: I don't know the functional form between x and y; I want you to discover it for me. So that's one level up in the abstraction, if you want. It's saying y equals f of x: find which f it is. Now, you don't want to do that all the time, because that's very hard.
And actually, you need to use quite a lot of domain knowledge to set some of the parameters of the GPs. I won't go into the details here, but I'll give you some links for the show notes.
But something that's very interesting to apply GPs to is, well, spatial data, as I just mentioned. Because in a plot, for instance, and I mean a field plot, not a plot as in a graph, there are some interactions between where you are in the plot and the crops that you're going to plant there. But you don't really know what those interactions are: location interacts with the weather and with a lot of other things, and you don't really know what the functional form of that is. And so that's where a GP is going to be extremely interesting, because it's going to allow you to work in 2D, try to find out what the correlations across the X and Y coordinates are, and take that into account in your model.
That's very abstract, but I'm going to link afterwards to a tutorial. We actually just released today, on PyMC, a tutorial that I've been working on with Bill Engels, who is also a GP expert. I've been working with him on this tutorial for an approximation, a new approximation of GPs, and I'll get back to that in a few minutes.
But first, why GPs in time? So you can apply GPs to spatial data, to space, but you can also apply GPs to time. Time is 1D most of the time, one-dimensional; space is usually 2D. And you can actually do GPs in 3D: you can do spatio-temporal GPs. That's even more complicated. But 1D GPs, that's really awesome, because most of the time, when you have a time dependency, it's non-linear. For instance, that could be the way the performance of a baseball player evolves within the season. You can definitely see the performance of a baseball player fluctuate with time during the season, and that would very probably be nonlinear. The thing is, you don't know what the form of that function is. And so that's what the GP is here for: it's going to come and try to discover what the functional form is for you.
And that's why I find GPs extremely... like, they are really magical mathematical beasts. First, they're really beautiful mathematically, and a lot of things are actually special cases of GPs. Neural networks, for instance: infinitely wide neural networks are actually Gaussian processes, a special case of Gaussian processes. Gaussian random walks are a special case of Gaussian processes. So they are a very beautiful mathematical object, but also very practical.
Now, as Uncle Ben said, with great power comes great responsibility. And GPs are hard to wield. It's a powerful weapon, but it's hard to wield. It's like Excalibur: you have to be worthy to wield it. And so it takes training and time to use them, but it's worth it.
990
And so we use that with Thermo, Juristo from that Estonian NGO.
991
But I use that almost all the time.
992
Right now I'm working more and more on sports data and yeah, I'm actually working on some
football data right now.
993
And well, you want to take into account
994
these wheezing season effects from players.
995
I don't know what the linear form is.
996
And right now, the first model I did taking the time into account was just a linear trend.
997
So it's just saying as time passes, you expect a linear change.
998
So the change from one to two is going to be the same one than the one from nine to 10.
999
But usually it's not the case with time.
Speaker:
It's very non -linear.
Speaker:
And so here, you definitely want to apply a GP on that.
Speaker:
You could apply other stuff like random walks.
Speaker:
autoregressive stuff and so on.
Speaker:
I don't personally don't really like those models.
Speaker:
find them like, it's like you have to apply that structure to the model, but at the same
time, they're not that easier to use than GPs.
Speaker:
So, know, might as well use a GP.
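As a sketch of that idea, here is a minimal 1D latent GP in PyMC on made-up within-season data, where the model is asked to discover f rather than being handed a linear trend. The priors and kernel are illustrative choices; this is exactly where the domain knowledge mentioned above comes in:

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 60)[:, None]                      # time within the season, shape (n, 1)
perf = np.sin(6 * t).ravel() + rng.normal(0, 0.2, 60)   # fake nonlinear performance signal

with pm.Model() as gp_model:
    # Kernel hyperparameters: domain knowledge about how fast
    # performance can plausibly change goes in here.
    ell = pm.Gamma("ell", alpha=2, beta=5)       # lengthscale
    eta = pm.HalfNormal("eta", 1.0)              # amplitude
    cov = eta**2 * pm.gp.cov.ExpQuad(1, ls=ell)

    gp = pm.gp.Latent(cov_func=cov)
    f = gp.prior("f", X=t)                       # the unknown function: perf = f(t)

    sigma = pm.HalfNormal("sigma", 0.5)
    pm.Normal("y", mu=f, sigma=sigma, observed=perf)
    idata_gp = pm.sample()
```

Note that sampling this exact, non-approximated GP scales badly with the number of data points, which is what motivates the approximation discussed next.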
And I'll end this very long answer with a third point, which is that now it's actually easier to use GPs, because there is this new decomposition of GPs, the Hilbert space decomposition, so HSGP. And that's basically a decomposition of the GP that's like a dot product, so kind of a linear regression, but one that gives you a GP. And that's amazing, because GPs are known to be extremely slow to sample, since, as I was saying at some point, it's a lot of matrix multiplication. But with HSGP, it becomes way, way faster and way more efficient. Now, you cannot always use HSGP; there are caveats and so on.
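In PyMC that approximation is exposed as pm.gp.HSGP. A minimal sketch, swapping it into the latent GP above, with the caveat that m (the number of basis functions) and c (the boundary factor) are approximation knobs you have to choose for your data:

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 200)[:, None]
y = np.sin(6 * t).ravel() + rng.normal(0, 0.2, 200)

with pm.Model() as hsgp_model:
    ell = pm.Gamma("ell", alpha=2, beta=5)
    eta = pm.HalfNormal("eta", 1.0)
    cov = eta**2 * pm.gp.cov.Matern52(1, ls=ell)

    # Hilbert-space approximation: the GP becomes a weighted sum of m fixed
    # basis functions, essentially a linear regression, hence the speed-up.
    gp = pm.gp.HSGP(m=[25], c=1.5, cov_func=cov)
    f = gp.prior("f", X=t)

    sigma = pm.HalfNormal("sigma", 0.5)
    pm.Normal("y", mu=f, sigma=sigma, observed=y)
    idata_hsgp = pm.sample()
```

The tutorial mentioned below goes into when the approximation is valid and how to pick m and c.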
But Bill and I have been working on this tutorial. It's going to be in two parts; the first part was out today, and I'm going to send you the links for the show notes here in the chat we have. That's up on the PyMC website: it's the HSGP First Steps and Reference tutorial. We go through why you would use HSGP, how you would use it in PyMC, and the basic use cases. The second part is going to cover the more advanced use cases. Bill and I have started working on that, but it always takes time to develop good content. And yeah, it's open source, so we're doing that in our free time, unpaid, so that always takes a bit more time. But we'll get there.
And finally, another resource that I think your listeners are going to appreciate: I'm doing a webinar series on HSGPs, where a modeler comes on the show, shares their screen, and does live coding. The first part is out already, and I'm going to send you that for the show notes. I had Juan Orduz on the show, and he went into the first part of how to do HSGPs, and what HSGPs even are from a mathematical point of view, because Juan is a mathematician. So yeah, I'll end my very long, passionate rant about GPs here, but long story short: GPs are amazing, and becoming skillful with GPs is a good investment of your time.
Fantastic. Another area that I would love to be able to dig deep into. And so our lucky listeners out there who have the time will now be able to dig into that resource and many of the others that you have suggested in this episode, which we've all got for you in the show notes. Thank you so much. Alex, this has been an amazing episode. Before I let my guests go, I always ask for a book recommendation, and you've had some already for us in this episode, but I wonder if there's anything else. The recommendation you already had was Bernoulli... something about Bernoulli.
Right. Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science, I think. Yeah. I'll actually send you that: the episodes with Aubrey and with David Spiegelhalter, because these are really good, especially for less technical people who are curious about science and how it works. I think it's a very good entry point. Yeah, so this book is amazing.
Oh my God, this is an extremely hard question. I love so many books and I read so many books that I'm taken aback. So, books... I would say books which I find extremely good and which have influenced me. Because a book is also, like, it's not only the book, right? It's also the moment when you read the book. If you like a book and you come back to it later, you'll have a different experience, because you are a different person and you have different skills. And yeah, so I'm going to cheat and give you several recommendations, because I have too many of them. So, technical books, I would say:
Probability Theory: The Logic of Science, by E. T. Jaynes. E. T. Jaynes was an old-school mathematician and scientist; in the Bayesian world, E. T. Jaynes is like a rock star. I definitely recommend his masterpiece, The Logic of Science. That's a technical book, but it's actually a very readable book, and it's also very epistemological. So that one is awesome. Much more applied, if you want to learn Bayesian stats:
A great book to do that is Statistical Rethinking, by Richard McElreath. Really great book; I've read it several times. And any book by Andrew Gelman, as you were saying, I definitely recommend. They tend to be a bit more advanced. If you want a really beginner-friendly one, his latest, Active Statistics, is a really good one; I just had him on the show, episode 106. So that's for people who like numbers, let's say.
And I remember that when I was studying political science, Barack Obama's book from before he was president... I don't remember the name. I think it's The Audacity of Hope, but I'm not sure. His first book, before he became president, that was actually a very interesting one.

Dreams from My Father?

Yes, this one: Dreams from My Father. A very interesting one. The other ones were a bit more political, and I found them a bit less interesting, but this one was really interesting to me.
And another one for people who are very nerdy. So, I'm a very nerdy person; I love going to the gym, for instance, and I can do my own training plan, my own nutrition plan. I've dug into that research. I love that, because I also love sports. Another very good book I definitely recommend, to develop good habits, is Katy Milkman's How to Change: The Science of Getting from Where You Are to Where You Want to Be. An extremely good book, full of very practical tips. Yeah, that's an extremely good one.
And then a last one that I read recently... no, actually, two last ones. All right, one of the last two; penultimate, I think you say. How Minds Change, by David McRaney, for people who are interested in how beliefs are formed. Extremely interesting. He's a journalist, and he's got a fantastic podcast called You Are Not So Smart, and I definitely recommend that one. It's about how people change their minds, basically, because I'm very interested in that. And in the end, this book is a trove of wisdom.
And the very last one, promise. I'm also extremely passionate about Stoicism, Stoic philosophy. That's a philosophy I find extremely helpful for living my life and navigating the difficulties that we all have in life. And a very iconic book there is Meditations, by Marcus Aurelius: reading the thoughts of a Roman emperor, one of the best Roman emperors there was. It's really fascinating, because he didn't write it to be published; it was his journal, basically. It's absolutely fascinating to read it and to see that they kind of had the same issues we still have. So yeah, fantastic. I read it very often.
Yeah, I haven't actually read Meditations, but I've read Ryan Holiday's The Daily Stoic.

Yeah, and that's really good.

It's, yeah, 366 daily meditations on wisdom, perseverance, and the art of living, based on Stoic philosophy. And there's a lot from Marcus Aurelius in there; he's probably the plurality of the content. And wow, it is mind-blowing to me how somebody two millennia ago is the same as me. I mean, not to hold myself up: I'm not a Roman emperor, and the things I write will not be studied 2,000 years from now. But nevertheless, the connection you feel with this individual from 2,000 years ago, and how similar the problems he was facing are to the problems that I face every day... it's staggering.
Yeah. No, that's incredible. For me, something that really spoke to me, that I remember, is that at some point he's telling himself that it's no use going to the countryside to escape everything, because the real retreat is in yourself. It's like: if you're not able to be calm and find equanimity in your daily life, it's not because you get away from the city (and Rome was the megalopolis of the time) that you're going to find tranquility over there. You have to find tranquility inside; then, yeah, going to the countryside is going to be even more awesome. But it's not because you go there that you find tranquility. And that was super interesting to me, because I definitely feel that when I'm in a big, big metropolis: at some point, I want to get away. But then I was like, wait, they were already living that at a time when they didn't have the internet, they didn't have cars and so on. But for them, it was already too many people, too much noise. I found that super interesting.
For sure. Wild. Well, this has been an amazing episode, Alex. I really am glad that Doug suggested you for the show, because this has been fantastic. I've really enjoyed every minute of it. I wish it could go on forever, but sadly all good things must come to an end. And so before I let you go, the very last thing: do you have other places where we should be following you? We're going to have a library of links in the show notes for this episode, and of course we know about your podcast, Learning Bayesian Statistics. We've got the Intuitive Bayes educational platform and open-source libraries like PyMC and ArviZ. In addition to those, is there any other social media platform or other way that people should be following you or getting in touch with you after the program?
Well, yeah, thanks for mentioning those. So yeah, Intuitive Bayes, Learning Bayesian Statistics, PyMC Labs, you mentioned them. And I'm always available on Twitter: alex_andorra, like the country, that's where it comes from. It has two Rs, not only one, and when I say it in a language other than Spanish, people write it with just one R. Otherwise, I'm also on LinkedIn, so you can always reach out to me over there, on LinkedIn or Twitter. And also, yes, send me podcast suggestions, stuff like that; I'm always on the lookout for something cool. So again, yeah, thanks a lot for having me on, and thanks a lot, Doug, for the recommendation. Yeah, that was a blast; I enjoyed it a lot. So thank you so much.
This has been another episode of Learning Bayesian Statistics. Be sure to rate, review, and follow the show on your favorite podcatcher, and visit learnbayesstats.com for more resources about today's topics, as well as access to more episodes to help you reach a true Bayesian state of mind. That's learnbayesstats.com. Our theme music is Good Bayesian, by Baba Brinkman, featuring MC Lars and Mega Ran. Check out his awesome work at bababrinkman.com. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. You can support the show and unlock exclusive benefits by visiting patreon.com/LearnBayesStats. Thank you so much for listening and for your support. You're truly a good Bayesian. Change your predictions after taking information in. And if you're thinking I'll be less than amazing, let's adjust those expectations. Let me show you how to be a good Bayesian. Change calculations after taking fresh data in. Those predictions that your brain is making? Let's get them on a solid foundation.