Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!
Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!
Visit our Patreon page to unlock exclusive Bayesian swag 😉
Takeaways:
- Bayesian statistics is a powerful framework for updating beliefs and making predictions based on prior knowledge and observed data, excelling with complex problems and limited data.
- Bayesian methods allow for the explicit incorporation of prior assumptions, which can provide structure and improve the reliability of the analysis.
- There are several Bayesian frameworks available, such as PyMC, Stan, and Bambi, each with its own strengths and features.
- PyMC is a powerful library for Bayesian modeling that allows for flexible and efficient computation.
- For beginners, it is recommended to start with introductory courses or resources that provide a step-by-step approach to learning Bayesian statistics.
- PyTensor leverages GPU acceleration and complex graph optimizations to improve the performance and scalability of Bayesian models.
- ArviZ is a library for post-modeling workflows in Bayesian statistics, providing tools for model diagnostics and result visualization.
- Gaussian processes are versatile non-parametric models that can be used for spatial and temporal data analysis in Bayesian statistics.
Chapters:
00:00 Introduction to Bayesian Statistics
07:32 Advantages of Bayesian Methods
16:22 Incorporating Priors in Models
23:26 Modeling Causal Relationships
30:03 Introduction to PyMC, Stan, and Bambi
34:30 Choosing the Right Bayesian Framework
39:20 Getting Started with Bayesian Statistics
44:39 Understanding Bayesian Statistics and PyMC
49:01 Leveraging PyTensor for Improved Performance and Scalability
01:02:37 Exploring Post-Modeling Workflows with ArviZ
01:08:30 The Power of Gaussian Processes in Bayesian Modeling
Thank you to my Patrons for making this episode possible!
Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan and Francesco Madrisotti.
Links from the show:
- Original episode on the Super Data Science podcast: https://www.superdatascience.com/podcast/bayesian-methods-and-applications-with-alexandre-andorra
- Advanced Regression with Bambi and PyMC: https://www.intuitivebayes.com/advanced-regression
- Gaussian Processes: HSGP Reference & First Steps: https://www.pymc.io/projects/examples/en/latest/gaussian_processes/HSGP-Basic.html
- Modeling Webinar – Fast & Efficient Gaussian Processes: https://www.youtube.com/watch?v=9tDMouGue8g
- Modeling spatial data with Gaussian processes in PyMC: https://www.pymc-labs.com/blog-posts/spatial-gaussian-process-01/
- Hierarchical Bayesian Modeling of Survey Data with Post-stratification: https://www.pymc-labs.com/blog-posts/2022-12-08-Salk/
- PyMC docs: https://www.pymc.io/welcome.html
- Bambi docs: https://bambinos.github.io/bambi/
- PyMC Labs: https://www.pymc-labs.com/
- LBS #50 Ta(l)king Risks & Embracing Uncertainty, with David Spiegelhalter: https://learnbayesstats.com/episode/50-talking-risks-embracing-uncertainty-david-spiegelhalter/
- LBS #51 Bernoulli’s Fallacy & the Crisis of Modern Science, with Aubrey Clayton: https://learnbayesstats.com/episode/51-bernoullis-fallacy-crisis-modern-science-aubrey-clayton/
- LBS #63 Media Mix Models & Bayes for Marketing, with Luciano Paz: https://learnbayesstats.com/episode/63-media-mix-models-bayes-marketing-luciano-paz/
- LBS #83 Multilevel Regression, Post-Stratification & Electoral Dynamics, with Tarmo Jüristo: https://learnbayesstats.com/episode/83-multilevel-regression-post-stratification-electoral-dynamics-tarmo-juristo/
- Jon Krohn on YouTube: https://www.youtube.com/JonKrohnLearns
- Jon Krohn on Linkedin: https://www.linkedin.com/in/jonkrohn/
- Jon Krohn on Twitter: https://x.com/JonKrohnLearns
Transcript
This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.
In this special episode, the roles are reversed, as I step into the guest seat to explore the intriguing world of Bayesian stats. Originally aired as episode 793 of the fantastic Super Data Science podcast hosted by Jon Krohn, this conversation is too good not to share with all of you here on Learning Bayesian Statistics.

So join us as we delve into how Bayesian methods elegantly handle complex problems, make efficient use of prior knowledge, and excel with limited data. We cover the foundational concepts of Bayesian statistics, highlighting their distinct advantages over traditional methods, particularly in scenarios fraught with uncertainty and sparse data.

A highlight of our discussion is the application of Gaussian processes, where I explain their versatility in modeling complex, non-linear relationships in data. I share a fascinating case study involving an NGO in Estonia, illustrating how Bayesian approaches can transform limited polling data into profound insights.

So whether you're a seasoned statistician or just starting out, this episode is packed with practical advice on embracing Bayesian stats. And of course, I strongly recommend you follow the Super Data Science podcast. It's really a can't-miss resource for anyone passionate about the power of data.

This is Learning Bayesian Statistics, episode 113, originally aired on the Super Data Science podcast.
14
Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods,
the projects, and the people who make it possible.
15
I'm your host, Alex Andorra.
16
You can follow me on Twitter at alex -underscore -andorra.
17
like the country.
18
For any info about the show, learnbasedats .com is Laplace to be.
19
Show notes, becoming a corporate sponsor, unlocking Bayesian Merge, supporting the show on
Patreon, everything is in there.
20
That's learnbasedats .com.
21
If you're interested in one -on -one mentorship, online courses, or statistical
consulting, feel free to reach out and book a call at topmate .io slash alex underscore
22
and dora.
23
See you around, folks.
24
and best patient wishes to you all.
25
And if today's discussion sparked ideas for your business, well, our team at PIMC Labs can
help bring them to life.
26
Check us out at pimc -labs .com.
27
Hello my dear Vagans!
28
A quick note before today's episode, STANCON 2024 is approaching!
29
It's in Oxford, UK this year from September 9 to 13 and it's shaping up to be an
incredible event for anybody interested in statistical modeling and vaginal inference.
30
Actually, we're currently looking for sponsors to help us offer more scholarships and make
STANCON more accessible to everyone and we also encourage you
31
to buy your tickets as soon as possible.
32
Not only will this help with making a better conference, but this will also support our
scholarship fund.
33
For more details on tickets, sponsorships, or community involvement, you'll find the
Stencon website in the show notes.
34
We're counting on you.
35
Okay, on to the show now.
36
Alex, welcome to the Super Data Science podcast.
37
I'm delighted to have you here.
38
Such an experienced podcaster.
39
It's going to be probably fun for you to get to be the guest on the show today.
40
Yeah.
41
Thank you, John.
42
First, thanks a lot for having me on.
43
I knew about your podcast.
44
I was both honored and delighted when I got your email to come on the show.
45
I know you have had very...
46
Honorable guests before like Thomas Vicky.
47
so I will try to, to, to be on board, but, I know that it's going to be hard.
48
Yeah.
49
Thomas, your co -founder at, Pi MC labs is, was indeed a guest.
50
was on episode number 585.
51
but that is not what brought you here.
52
Interestingly, the connection.
53
So you asked me before we started recording how I knew about you.
54
And so a listener actually suggested to you as a guest.
55
So.
56
Doug McLean.
57
Thank you for the suggestion.
58
Doug is lead data scientist at Tesco bank in the UK.
59
And he reached out to me and said, can I make a suggestion for a guest?
60
Alex Andora, like the country, I guess you say that you say that.
61
Cause he put it in quotes.
62
He's like, Andora, like the country hosts the learning patient statistics podcast.
63
It's my other all time favorite podcast.
64
So there you go.
65
my God.
66
Doug, I'm blushing.
67
says he'd be a fab guest for your show and not least because he moans from time to time
about not getting invited onto other podcasts.
68
Did I?
69
my God.
70
I don't remember.
71
But maybe that was part of a secret plan, Maybe a secret marketing LBS plan and well.
72
That works perfectly.
73
When I read that, I immediately reached out to you to see if you'd want to go, but that
was so funny.
74
And he does say, says, seriously though, he'd make a fab guest for his wealth of knowledge
on data science and on Bayesian statistics.
75
And so, yes, we will be digging deep into Bayesian statistics with you today. You're the co-founder and principal data scientist of the popular Bayesian statistical modeling platform PyMC, as we already talked about with your co-founder Thomas Wiecki. That is an excellent episode, if you want to go back to it and get a different perspective (obviously different questions, we've made sure). So if you're really interested in Bayesian statistics, that is a great one to go back to. In addition to that, you obviously also have the Learning Bayesian Stats podcast, which we just talked about, and you're an instructor on the educational site Intuitive Bayes. So, tons of Bayesian experience. Alex, through this work, tell us what Bayesian methods are and what makes them so powerful and versatile.
Yeah, so first, thanks a lot. Thanks a lot, Doug, for the recommendation and for listening to the show. I am absolutely honored. And yeah, go and listen again to Thomas's episode. Thomas is always a great guest, so I definitely recommend anybody to go and listen to him.

Now, what about Bayes? You know, it's been a long time since someone has asked me that, because I have a Bayesian podcast; usually it's quite clear what I'm doing, and people are almost afraid to ask at some point. There are two avenues here: usually, I could give you the philosophical answer, and why, epistemologically, Bayesian stats makes more sense. But I'm not going to do that.

That sounds so interesting.

Yeah, it is; we can go into that. But I think a better introduction is just a practical one. And that's the one that most people get to know at some point, which is: you're working on something, you're interested in uncertainty estimation and not only in the point estimates, and your data are crap; you don't have a lot of them and they are not reliable. What do you do?
And that happens to a lot of PhD students. It happened to me when I started trying to do electoral forecasting. I was at the time working at the French central bank, doing something completely different from what I'm doing today. But I was writing a book about the US at the time (2016 it was), and it was a pretty consequential election for the US. I was following it really, really closely. And I remember it was July 2016 when I discovered FiveThirtyEight's models. And then the nerd in me was awoken. It was like, oh my God, this is what I need to do. You know, that's my way of putting more science into political science, which was my background at the time.

And when you do electoral forecasting, polls are extremely noisy. They are not a good representation of what people think, but they are the best ones we have. There are not a lot of them, at least in France; in the US, many more. It's limited. It's not a reliable source of data, basically. And you also have a lot of domain knowledge, which in the Bayesian realm we call prior information. And so that's a perfect setup for Bayesian stats.

So that's basically, I would say, what Bayesian stats is, and that's the power of it. You don't have to rely only on the data, because, sure, you can let the data speak for themselves, but what if the data are unreliable? Then you need something to guard against that, and Bayesian stats are a great way of doing that.

And the cool thing is that it's a method: you can apply it to any topic you want, any field you want. That's what I've done at PyMC Labs for a few years now, with all the brilliant guys who are over there. You can do that for marketing; for electoral forecasting, of course; agriculture, which was quite ironic when we got some agricultural clients, because historically, agriculture is the field of frequentist statistics. That's how Ronald Fisher developed the p-value, the famous one. So when we had that, we were like, yes, we got our revenge. And of course, it's also used a lot in sports, sports modeling, things like that. So yeah, that's the practical introduction.
Nice. Yeah. A little bit of interesting history there is that Bayesian statistics is an older approach than the frequentist statistics that is so common, and that is the standard taught in college, so much so that it is just called statistics. You can do an entire undergrad in statistics and not even hear the word Bayesian, because Fisher so decidedly created a monopoly for this one kind of approach. For me, learning frequentist statistics, I guess it was in first-year undergrad in science, and in that first-year course the idea of a p-value always seemed odd to me. How is it that there's this arbitrary threshold of significance, that this is a one-in-20 chance or less that this would be observed by chance alone, and that means we should rely on it? Especially as we are in this era of larger and larger data sets. With very large data sets like we typically deal with today, you're always going to get a significant p-value, because of the slightest tiny change; if you take web-scale data, everything is going to be statistically significant. Nothing won't be. So it's such a weird paradigm.
And seeing how
154
Those areas didn't have P values interested me in both of those things.
155
it's a, yeah, Fisher.
156
It's interesting.
157
mean, I guess with small data sets, eight, 16, that kind of scale, guess it kind of made
some sense.
158
And you know, you pointed out there, I think it's this prior that makes Bayesian
statistics so powerful being able to incorporate prior knowledge, but simultaneously
159
that's also what makes for Quentus uncomfortable.
160
They they're like, we want only the data.
161
As though, you know, the particular data that you collect and the experimental design,
there are so many ways that you as the human are influencing, you know, there's no purity
162
of data anyway.
163
And so priors are a really elegant way to be able to adjust the model in order to point it
in the right direction.
164
And so a really good example that I like to come to with Bayesian statistics is that you
can
165
You can allow some of your variables in the model to tend towards wider variance or
narrower variance.
166
So if there are some attributes of your model where you're very confident, where you know
this is like, you know, this is like a physical fact of the universe.
167
Let's just have a really narrow variance on this and the model won't be able to diverge
much there.
168
But that then gives a strong focal point within the model.
169
around which the other data can make more sense, the other features can make more sense,
and you can allow those other features to have wider variance.
170
And so, I don't know, this is just one example that I try to give people when they're not
sure about being able to incorporate prior knowledge into a model.
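In PyMC terms, that narrow-versus-wide intuition is just a choice of prior sigmas. Here is a minimal, hypothetical sketch (the data are simulated and the coefficients and scales are made up for illustration):

```python
# A hypothetical sketch of narrow vs. wide priors in PyMC.
# All names and numbers are illustrative, not from the episode.
import numpy as np
import pymc as pm

rng = np.random.default_rng(1)
x1, x2 = rng.normal(size=(2, 100))
y_obs = 2.5 * x1 + 0.7 * x2 + rng.normal(scale=0.5, size=100)

with pm.Model() as model:
    # Well-established effect: a tight prior keeps it near 2.5.
    beta1 = pm.Normal("beta1", mu=2.5, sigma=0.05)
    # Poorly understood effect: a wide prior lets the data dominate.
    beta2 = pm.Normal("beta2", mu=0.0, sigma=10.0)
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    pm.Normal("y", mu=beta1 * x1 + beta2 * x2, sigma=sigma, observed=y_obs)
    idata = pm.sample()
```

The tight prior on beta1 acts exactly as the "strong focal point" described above, while beta2 is free to go wherever the data pull it.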
Yeah, yeah, no, these are fantastic points, Jon. So, yeah, to build on that: of course, I'm a nerd, so I love the history of science, and I love the epistemological side. A very good book on that is Bernoulli's Fallacy by Aubrey Clayton. I definitely recommend his book. He was on my podcast, episode 51, if people want to give that a listen.
Did you just pull that 51 out from memory?

Yeah, yeah, I kind of know them. But I have fewer episodes than you, so each episode is kind of my baby. So I'm like, yeah, 51 is Aubrey Clayton.

Oh my goodness. That's crazy. That's also how my brain works with numbers. But yeah.

And actually, episode 50 was with Sir David Spiegelhalter, I think the only knight we've had on the podcast. And David Spiegelhalter is an exceptional guest, very, very good pedagogically. I definitely recommend listening to that episode too, which is very epistemologically heavy for people who like that: the history of science, how we got there. Because, as you were saying, Bayes is actually older than frequentist stats, but people discovered it later. So it's not because it's older that it's better, right? But it is way older, actually, by a few centuries. So yeah, fun stories there.
198
could talk about that still, but to get back to what you were saying, also as you were
very eloquently saying, data can definitely be biased.
199
Because that idea of like, no, we only want the data to speak for themselves.
200
as I was saying, yeah, what if the data are unreliable?
201
But as you were saying, what if the data are biased?
202
And that happens all the time.
203
And worse.
204
I would say these biases are most of the time implicit in the sense that either they are
hidden or most of the time they just like you don't even know you are biased in some
205
direction most of the time because it's a result of your education and your environment.
206
So the good thing of priors is that it forces your assumptions, your hidden assumptions to
be explicit.
207
And that I think is very interesting also, especially when you work on models which are
supposed to have a causal explanation and which are not physical models, but more social
208
models or political scientific models.
209
Well, then it's really interesting to see how two people can have different conclusions
based on the same data.
210
It's because they have different priors.
211
And if you force them to explicit these priors in their models, they would definitely have
different priors.
212
then...
213
then you can have a more interesting discussion actually, think.
214
So there's that. And then I think the last point that's interesting, in terms of why you would be interested in this framework, is that causes are not in the data. Causes are outside of the data. The causal relation between X and Y: you're not going to see it in the data. Because if you do a regression of education on income, you're going to see an effect of education on income. But you, as a human, know that if you're looking at one person, the effect has to be that education has an impact on income. But the computer might as well just do the other regression, regress income on education, and tell you that income causes education. But no, it's not going that way. The statistical relationship goes both ways, but the causal one only goes one direction. And that's a hidden reference to my favorite music band. But yeah, it only goes one direction, and it's not in the data. You have to have a model for that. And a model is just a simplification of reality. We try to get a simple enough model; it's usually not simple, but it's a simplification. If you say it's a construction and a simplification, that's already a prior, in a way. So you might as well just go all the way and make all your priors explicit.
Well said. Very interesting discussion there. You've used a term a number of times already in today's podcast which maybe is not known to all of our listeners. What is epistemology? What does that mean?

Right, yeah, very good question. So epistemology is, in a sense, the science of science. It's understanding how we know what we say we know. So, for instance, how do we know the Earth is round? How do we know about relativity? Things like that. It's a scientific discipline that's actually very close to philosophy; I think it's actually a branch of philosophy. And it's trying to come up with methods to understand how we can come up with new scientific knowledge. And by scientific here, we usually mean reliable and reproducible, but also falsifiable, because for a hypothesis to be scientific, it has to be falsifiable. There are lots of extremely interesting things here, but basically that's it: how do we know what we know? It's the whole enterprise of trying to define the scientific method and things like that.
Going off on a little bit of a tangent here, but it's interesting to me how, I think among non-scientists, lay people in the public, science is often seen to be infallible: as though science is real, science is the truth. Since that 2016 election, lots of people have lawn signs in the US that basically have a list of liberal values, most of which I'm a huge fan of. And of course I like the sentiment, the idea that they're supporting science on this sign as well. But the way they phrase it is "science is real". And the implication there for me, every time I see the sign... I think that could be, for example, related to vaccines; there was a lot of conflict around vaccines and what their real purpose is. And so the lay liberal person is like, you know, this is science, trust science, it's real. Whereas from the inside, and you pointed it out already, there's this interesting irony that the whole point of science is that we're saying: I'm never confident of anything. I'm always open to this being wrong.
I'm always open to this being wrong.
267
Yeah.
268
Yeah.
269
No, exactly.
270
and I think that's, that's kind of the distinction.
271
That's often made in epistemology actually between science on one hand and research on the
other end, where research is science in the making.
272
Science is like the collective knowledge that we've accumulated since basically the
beginning of modern science, at least in the Western hemisphere, so more or less during
273
the Renaissance.
274
Then research is people making that science because...
275
people have to do that and how do we come up with that?
276
so, yeah, like definitely I'm one who always emphasizes the fact that, yeah, now we know
the Earth is round.
277
We know how to fly planes, but there was a moment we didn't.
278
And so how do we come up with that?
279
And actually, maybe one day we'll discover that we were doing it kind of the wrong way,
you know, flying planes, but it's just like, for now it works.
280
We have the
281
best model that we can have right now with our knowledge.
282
But maybe one day we'll discover that there is a way better way to fly.
283
And it was just there staring at us and it took years for us to understand how to do that.
284
yeah, like as you were saying, but that's really hard line to walk because you have to
say, yeah.
285
Like these knowledge, these facts are really trustworthy, but you can never trust
something 100 % because otherwise mathematically, if you go back to base formula, you
286
actually cannot update your knowledge.
287
you, if you have a 0 % prior or 1 % prior, like mathematically, you cannot apply base
formula, which tells you, well, based on new data that you just observed the most
288
rational way of updating your belief is to believe that with that certainty.
289
But if you have zero or 100%, it's never going to be updated.
290
So you can say 99 .9999 % that what we're doing right now by flying is really good.
291
But maybe, like, you never know.
292
There is something that will appear.
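Here is that point in a few lines of Python (a minimal sketch; the probabilities are made-up inputs, not from the episode): a prior of exactly 0 or 1 survives any amount of evidence untouched.

```python
# A minimal sketch of Bayes' rule for a single hypothesis H.
# The numbers are hypothetical; the point is what happens at prior = 0 or 1.
def bayes_update(prior, p_data_given_h, p_data_given_not_h):
    """Posterior P(H | data) from P(H), P(data | H), P(data | not H)."""
    numerator = prior * p_data_given_h
    evidence = numerator + (1 - prior) * p_data_given_not_h
    return numerator / evidence

# A strong but open prior still moves when the data disagree with H:
print(bayes_update(0.999999, p_data_given_h=0.01, p_data_given_not_h=0.99))
# Dogmatic priors never move, no matter the data:
print(bayes_update(0.0, 0.01, 0.99))  # -> 0.0
print(bayes_update(1.0, 0.01, 0.99))  # -> 1.0
```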
And physics is a real...

We've all seen UFOs, Alex. We know that there are better ways to fly.

Yeah, exactly. But yeah, I think physics is actually a really good field for that, because it's always evolving, and it's always coming up with completely crazy, paradigm-shifting explanations, like relativity: special relativity, then general relativity. Just a century ago, that didn't exist. And now we're starting to understand it a bit better, but even now we don't really understand how to blend relativity and gravity. So that's extremely interesting to me.
But yeah, I understand that politically, from a marketing standpoint, it's hard to sell. But I think it's shooting yourself in the foot if you're saying, yeah, science is always... Science works, I agree, science works, but it doesn't have to be 100 percent true and certain for it to work. That's why placebos work, right? A placebo is something that works even though there's no actual concrete evidence that it's adding anything; but it works. So yeah, I think it's really shooting yourself in the foot to say, no, that's 100 percent; if you question science, then you're anti-science. No. Actually, the whole scientific method is about being able to ask questions all the time. The question is: how do you do that? Do you apply the scientific method to your questions, or do you just question anything, without any method, just because you fancy questioning it, because it goes against your beliefs to begin with?
So yeah, that's one thing. And then I think another thing you said that's very interesting is that, unfortunately, the way of teaching science, and communicating around it, is not very incarnated. It's quite dry. You just learn equations, and you just learn that stuff. Whereas science was made by people, and is made by people, who have their biases, who have extremely violent conflicts. Like you were saying, Fisher was just a huge jerk to everybody around him. I think it would be interesting to get back to a bit of that human side, to make science less dry and also less intimidating. Because most of the time, when I tell people what I do for a living, they get super intimidated, and they're like, oh my God, I hate math, I hate stats and stuff. But it's just numbers. It's just a language. It's a bit dry.
329
For instance, if there is someone who is into movies, who does movies in your audience.
330
I want to know why there is no movie about Albert Einstein.
331
There has to be a movie about Albert Einstein.
332
Like not only huge genius, but like extremely interesting life.
333
Like honestly, it makes for great movie.
334
was working in a a dramatized biopic.
335
mean?
336
Yeah.
337
Yeah.
338
I mean, it's like his life is super interesting.
339
Like he revolutionized the field of two fields of physics and actually chemistry.
340
In 1905, it's like his big year, and he came up with the ideas for relativity while
working at the patent bureau in Bern in Switzerland, which was an extremely boring job.
341
In his words, it was an extremely boring job.
342
Basically, having that boring job allowed him to do that being completely outside of the
academic circles and so on.
343
It's like he makes for a perfect movie.
344
I don't understand why it's not there.
345
And then I sing on the cake.
346
He had a lot of women in his life.
347
So it's like, you know, it's perfect.
348
Like you have you have the sex you have, you have the drama, you have revolutionizing the
field, you have Nobel prizes.
349
And he and then he became a like a pop icon.
350
I don't know where the movies.
351
Yeah, it is wild.
352
Actually, now that you pointed out, it's kind of surprising that there aren't movies about
him all the time.
353
Like Spider -Man.
354
Yeah, I agree.
355
Well, there was one about Oppenheimer last year.
356
Maybe that started to trend.
357
see.
358
Yeah. So, in addition to the podcast, and I mentioned this at the outset, you're the co-founder and principal data scientist of the popular Bayesian stats modeling platform PyMC. Like many things in data science, it's uppercase P, lowercase y, for Python. What's the MC? PyMC is one word, and the M and C are capitalized.

Yeah. So it's very confusing, because it stands for Python, and then MC is Monte Carlo. But why Monte Carlo? Because it comes from Markov chain Monte Carlo. So actually it should be PyMCMC, or PyMC squared, which is what I've been saying since the beginning. But anyways, yeah, it's actually PyMC squared. So, Markov chain Monte Carlo: there are other algorithms now, newer ones, but MCMC is the blockbuster algorithm to run a Bayesian model.

Yeah. So in the same way that stochastic gradient descent is the de facto standard for finding your model weights in machine learning, Markov chain Monte Carlo is kind of the standard way of doing it with a Bayesian network.

Yeah, yeah, yeah.
And so now there are newer versions, more efficient versions. That's basically the name of the game, right? Making the algorithms more and more efficient. But the first algorithm dates way back; I think it was actually invented during the Manhattan Project, during World War Two.

Yeah. And lots of physicists, actually; statistical physics is a field that's contributed a lot to MCMC. So yeah, physicists came to the field of statistics and tried to make the algorithms more efficient for their models. The field of physics has contributed a lot of big names, and great leaps, to the realm of more efficient algorithms. I don't know who your audience is, but that may sound boring. The algorithm is the workhorse, and it's extremely powerful.
394
And that's also one of the main reasons why patients' statistics are
395
increasing in popularity lately because
396
I'm going to argue that it's always been the best framework to do statistics, to do
science, but it was hard to do with pen and paper because the problem is that you have a
397
huge nasty integral on the numerator, on the denominator, sorry.
398
And this integral is not computable by pen and paper.
399
So for a long, long time, patient statistics combined to features, you know, like
campaigns.
400
PR campaigns, patients S6 was relegated to the margins because it was just super hard to
do.
401
so for other problems, other than very trivial ones, it was not very applicable.
402
But now with the advent of personal computing, you have these incredible algorithms like,
so now most of time it's HMC, Hamiltonian Monte Carlo.
403
That's what we use under the hood with PIMC.
404
But if you use Stan, if you use NumPyro, it's the same.
405
And thanks to these algorithms, now we can make extremely powerful models because we can
approximate the posterior distributions thanks to, well, computing power.
406
A computer is very good at computing.
407
I think that's why it's called that.
408
Yes.
409
And so that reminds me of deep learning.
410
It's a similar kind of thing where the applications we have today, like your chat GPT or
whatever your favorite large language model is these amazing video generation like Sora,
411
all of this is happening thanks to deep learning, which is an approach we've had since the
fifties, certainly not as old as Bayesian statistics, but similarly it has been able to
412
take off with much larger data sets and much more compute.
413
Yeah.
414
Yeah.
415
Yeah.
416
Yeah, very good point.
417
And I think that's even more the point in deep learning.
418
for sure.
419
Because Beijing stats doesn't need the scale, but the way we're doing deep learning for
now definitely need the scale.
420
Yeah, yeah.
421
Scale of data.
422
Yeah, exactly.
423
Yeah, sorry.
424
Yeah, the scale.
425
Because there two scales, data and...
426
Yeah, you're right.
427
Yeah, and for like model parameters.
428
And so that has actually, I mean, tying back to something you said near the beginning of
this episode is that actually one of the advantages of Beijing statistics is that you can
429
do it with very few data.
430
Yeah.
431
maybe fewer data than with a frequentist approach or machine learning approach.
432
Because you can bake in your prior assumptions and those prior assumptions give some kind
of structure, some kind of framework for your data to make an impact through.
433
Yeah, completely.
434
So for our listeners who are listening right now, if they are keen to try out Bayesian statistics for the first time, why should they reach for PyMC? Which, as far as I know, is the most used Bayesian framework, period, and certainly in Python. And then the second, I'm sure, is Stan. So yeah, why should somebody use PyMC? And maybe even more generally, how can they get started if they haven't done any Bayesian statistics before at all?

Yeah, fantastic question. It's a very good one, because that can also be very intimidating. And actually there can be a paradox of choice, you know, where now we're lucky to live in a world where we actually have a lot of probabilistic programming languages. So you'll see that sometimes that's called a PPL. And what's a PPL? It's basically PyMC: a piece of software that enables you to write down Bayesian models and sample from them. Okay. So it's just a fancy word to say that.
Yeah, my main advice is: don't overthink it. If you're proficient in R, then I would definitely recommend trying brms first, because it's built on top of Stan, and Stan is extremely good. It's built by extremely good modelers and statisticians; lots of them have been on my podcast. So if you're curious, just go on the website, search for Stan, and you'll find a lot of them. The best one is, most of the time, Andrew Gelman; absolutely amazing to have him on the show, and he always explains stuff extremely clearly. But I also had Bob Carpenter, for instance, and Matt Hoffman. So, anyways...
Have you ever had Rob Trangucci on the show, or do you know who he is?

I know of him, but I have never had him on the show. I'd be happy to, though.

Yeah, if you know him, I'll make an introduction for you. He was on our show in episode number 507, and that was our first-ever Bayesian episode. And it was the most popular episode of that year, 2021. And it was interesting, because up until that time, and 2021 was my first year hosting the show, it was by far our longest episode. That was kind of concerning for me: this was a super technical episode, super long; how is this going to resonate? It turns out that's what our audience loves. And that's something we've been leaning into a bit in 2024: more technical, longer episodes.

Well, that's good to know. Yeah, I'll make an intro for Rob.

Anyway, you were saying... I could do an intro for you. Yeah, I know. But yeah, great interruption, for sure. I'm happy to have that introduction made. Thanks a lot.
Yeah, so I was saying: if you're proficient in R, definitely give brms a try. It's built on top of Stan. Then, when you outgrow brms, go to Stan. If you love Stan but you're using Python, there is PyStan. I've never used it personally, but I'm pretty sure it's good. But I would say, if you're proficient in Python and don't really want to go to R, then you probably want to give a try to PyMC or to NumPyro. Give them a try, and see which API resonates most with you, because if you're going to make models like that, you're going to spend a lot of time on your code and on your models. And as most of your audience probably knows, the models always fail, unless it's the last one. So you really have to love the framework you're using and find it intuitive; otherwise, it's going to be hard to keep going.
If you're really, really a beginner, I would also recommend, in the Python realm, giving Bambi a try; it's the equivalent of brms, but in Python. Bambi is built on top of PyMC, and what it does is make a lot of the choices for you under the hood: priors, stuff like that, which can be a bit overwhelming for beginners at the start. Then, when you outgrow Bambi and want to make more complicated models, go to PyMC.
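As a quick taste of what that looks like (a hypothetical sketch; the data frame and column names are made up, not from the episode), Bambi lets you specify a Bayesian regression with a one-line formula and picks default priors for you:

```python
# A minimal, hypothetical Bambi example; column names are illustrative.
import bambi as bmb
import pandas as pd

df = pd.DataFrame({
    "flipper_length": [180, 195, 210, 220, 230, 205, 190, 215],
    "body_mass": [3500, 3800, 4300, 4800, 5200, 4100, 3700, 4500],
})

# Bambi parses the formula and chooses weakly informative priors under the hood.
model = bmb.Model("body_mass ~ flipper_length", df)
idata = model.fit()  # runs MCMC via PyMC, returns an ArviZ InferenceData
```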
Bambi, that's a really cute name for a model that just drops out of its mother and can barely stand up straight.

Yeah. And the guys working on Bambi, Tommy Capretto and Osvaldo Martin, are really great guys. Both Argentinians, actually. And yeah, they are fun guys. I think the website for Bambi is bambinos.github.io. These guys are fun. But yeah, it's definitely a great framework.

And actually, this week, with Tommy Capretto and Ravin Kumar, we released an online course, our second online course, which we've been working on for two years. So we are very happy to have released it, and we're also very happy with the course; that's why it took so long. It's a very big course. And that's exactly what we do: we take you from beginner, we teach you Bambi, we teach you PyMC, and you go all the way up to advanced. It's called Advanced Regression, so we teach you all things regression.

What's the course called?

Advanced Regression. Yeah, Advanced Regression, on the Intuitive Bayes platform that you were kind enough to mention at the beginning.

Nice. Yeah, I'll be sure to include that in the show notes.
And so even though it's called Advanced Regression, you start us off with an introduction to Bayesian statistics, and we start getting our feet wet with Bambi before moving on to PyMC, yeah?

Yeah, yeah, yeah. So you have a regression refresher at the beginning. If you're a complete, complete beginner, then I would recommend taking our intro course first, which really starts from the ground up. The Advanced Regression course, ideally, you would take after the intro course; but if you're already there in your learning curve, then you can start directly with Advanced Regression. It makes a few more assumptions on the student's part: that they have heard about Bayesian stats, and that they are aware of the ideas of priors, likelihood, posteriors. But we give you a refresher on classic regression, where you have a normal likelihood. And then we teach you how to generalize that framework to data that's not normally distributed. We start with Bambi, and we show you how to do the equivalent models in PyMC. And then at the end, when the models become much more complicated, we just show them in PyMC.
550
Nice.
551
That is super, super cool.
552
I hope to be able to find time to dig into that myself soon.
553
It's one of those things.
554
yeah.
555
You and I were lamenting this before the show, podcasting of itself can take up so much
time on top of, in both of our cases, we have full -time jobs.
556
This is something that we're doing as a hobby.
557
And it means that I'm constantly talking to amazingly interesting people like you who have
developed fascinating courses that I want to be able to study.
558
And it's like, when am going to do that?
559
Like book recommendations alone.
560
Like I barely get to read books anymore.
561
That was something like since basically the pandemic hit.
562
I, and it's, it's so embarrassing for me because I, I identify in my mind as a book
reader.
563
And sometimes I even splurge.
564
I'm like, wow, I've got to get like these books that I absolutely must read.
565
And they just collect in stacks around my apartment.
566
Like, yeah.
567
Yeah.
568
Yeah.
569
Yeah, I mean, that's hard, for sure. It's something I've also been trying to get under control a bit. So, a guy who I find does good work on that is Cal Newport.

Yes, Cal Newport, of course. I've been collecting his books too.

Yeah, that's the irony. So, he's got a podcast. I don't know about you, but I listen to tons of podcasts; the audio format is really something I love. So: podcasts and audiobooks. That can be your entrance here. Maybe you can listen to more books if you don't have time to read.
582
don't really have a commute.
583
and I often, use like, you know, when I'm traveling to the airport or something, I use
that as an opportunity to like do catch up calls and that kind of thing.
584
So it's interesting.
585
I, I, I almost listened to no other podcasts.
586
The only show I listened to is last week in AI.
587
I don't know if you know that show.
588
Yeah.
589
Yeah.
590
Yeah.
591
Great show.
592
I like them a lot.
593
put a lot of work into Jeremy and Andre do a lot of work to get.
594
Kind of all of the last week's news constant in there.
595
so it's impressive.
596
It allowed me to flip from being this person where prior to finding that show and I found
it cause Jeremy was a guest on my show.
597
was an amazing guest by the way.
598
don't know if he'd have much to say about Bayesian statistics, but he's an incredibly
brilliant person is so enjoyable to listen to.
599
and, and someone else that I'd love to make an intro for you.
600
He's, he's become a friend over the years.
601
Yeah, for sure.
602
But yeah, last week in AI, they, I don't know why I'm talking about it so much, but they,
I went from being somebody who would kind of have this attitude when somebody would say,
603
if you heard about this release or that, or, I'd say, you know, just because I work in AI,
I can't stay on top of every little thing that comes out.
604
And now since I started listening to last week in AI about a year ago, I don't think
anybody's caught me off guard with some, with some new release.
605
I'm like, I know.
606
Yeah, well done.
607
Yeah, that's good.
608
Yeah, but that makes your life hard.
609
Yeah, for sure.
610
If you don't have a commute, come on.
611
But I'd love to be able to completely submerge myself in Bayesian statistics.
612
is a life goal of mine, is to be able to completely, because while I have done some
Bayesian stuff and in my PhD, I did some Markov chain Monte Carlo work.
613
And there's just obviously so much flexibility and nuance to this space.
614
can do such beautiful things.
615
I have a huge fan of Bayesian stats.
616
And so yeah, it's really great to have you on the show talking about it.
617
So, Pi MC, which we've been talking about now, kind of going back to our, back to our
thread.
618
Pi MC uses something called Pi tensor to leverage GPU acceleration and complex graph
optimizations.
619
Tell us about PyTensor and how this impacts the performance and scalability of Bayesian
models.
620
Yeah.
621
Great question.
622
Basically, the way PyMC is built, we need a backend. And historically this has been a complicated topic, because the backend then has to do the computation; otherwise you have to do the computations in Python, and that's slower than doing it in C, for instance. And so we still have that C backend; that's kind of a historical remnant, but more and more we're using... When I say "we", I don't do a lot of PyTensor code, to be honest, I mean contributions to PyTensor; I mainly contribute to PyMC. PyTensor is spearheaded a lot by Ricardo Vieira. Great guy, extremely good modeler.

Basically, the idea of PyTensor is to outsource the computation that PyMC is doing. Especially when you're doing the sampling, PyTensor is going to delegate that to some other backend. And so now, instead of having just the C backend, you can actually sample your PyMC models with the Numba backend. How do you do that? You use another package called nutpie, which has been built by Adrian Seyboldt, an extremely brilliant guy, again. I'm surrounded by guys who are much more brilliant than me. And that's how I learn, basically: I just ask them questions.

That's what I feel like in my day job at Nebula, my software company. It's just... Yeah, sorry, I'm completely interrupting you.

Yeah, no, same. And so, yeah: Adrian basically re-implemented HMC in nutpie, using Numba and Rust. And that goes way faster than just using Python, or even just using C. And then you can also sample your models with two other backends. That's enabled by PyTensor, which basically compiles the graph of the model and then delegates the computational operations to the sampler. And the sampler, as I was saying, can be the one from nutpie, which is in Rust and Numba; otherwise, it can be the one from NumPyro. So actually, you can call the NumPyro sampler on a PyMC model. And it's just super simple: in pm.sample, there's a keyword argument called nuts_sampler, and you just say "nutpie" or "numpyro".
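Concretely, the switch looks like this (a minimal toy model, made up for illustration; the alternative samplers need the nutpie or numpyro packages installed):

```python
# A minimal toy model; the point is the nuts_sampler keyword in pm.sample.
import numpy as np
import pymc as pm

y_obs = np.random.default_rng(2).normal(size=50)

with pm.Model() as model:
    mu = pm.Normal("mu", 0.0, 1.0)
    pm.Normal("y", mu=mu, sigma=1.0, observed=y_obs)

    idata = pm.sample()                                # default PyMC NUTS sampler
    idata_nutpie = pm.sample(nuts_sampler="nutpie")    # Rust/Numba backend
    idata_numpyro = pm.sample(nuts_sampler="numpyro")  # JAX backend
```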
And I tend to use NumPyro a lot when I'm doing Gaussian processes. I don't know exactly why, but, so, most of the time I'm using nutpie; when I'm doing Gaussian processes somewhere in the model, though, I tend to use NumPyro, because for some reason there is some efficiency in their algorithm, in the way they compute the matrices. And GPs are basically huge matrices and dot products. So yeah, NumPyro is usually very efficient for that. And you can also use JAX now to sample your model.
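For readers who want to try a GP themselves, here is a minimal, hypothetical sketch using PyMC's HSGP approximation (the approach covered in the HSGP links in the show notes); the data and hyperparameter choices are made up for illustration:

```python
# A hypothetical GP regression sketch with PyMC's HSGP approximation.
# HSGP replaces the full covariance matrix with a cheap basis expansion.
import numpy as np
import pymc as pm

rng = np.random.default_rng(3)
X = np.linspace(0, 10, 100)[:, None]
y_obs = np.sin(X).ravel() + rng.normal(scale=0.3, size=100)

with pm.Model() as model:
    ell = pm.InverseGamma("ell", alpha=5, beta=5)  # lengthscale prior
    eta = pm.HalfNormal("eta", sigma=1.0)          # amplitude prior
    cov = eta**2 * pm.gp.cov.ExpQuad(1, ls=ell)

    gp = pm.gp.HSGP(m=[30], c=1.5, cov_func=cov)   # basis approximation
    f = gp.prior("f", X=X)

    sigma = pm.HalfNormal("sigma", sigma=0.5)
    pm.Normal("y", mu=f, sigma=sigma, observed=y_obs)
    idata = pm.sample(nuts_sampler="numpyro")  # assumes numpyro is installed
```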
So we have these different backends, and that's enabled because PyTensor is the backend that nobody sees. Most of the time, you're not implementing a PyTensor operation in your models. Sometimes we do that at PyMC Labs, when we're working on a very custom operation, but usually it's done under the hood for you. PyTensor compiles the symbolic graph of the model and can then dispatch it to whatever the best way of computing the posterior distribution is.
Nice. You alluded there to something I've been meaning to ask you about, which is the PyMC Labs team. So you have PyMC, the open-source library that anybody listening can download (and we'll have links in the show notes so people can get rolling on their Bayesian stats right now, whether it's already something they have expertise in or not). PyMC Labs: it sounds like, and just fill us in, but I'm gathering that the team there is responsible both for developing PyMC and for consulting, because you mentioned that sometimes you might do some kind of custom implementation. So first of all, tell us a little bit about PyMC Labs. And then it'd be really interesting to hear one or more interesting examples of how Bayesian statistics allows some client, some use case, to do something they wouldn't be able to do with another approach.
679
Yeah.
680
So yeah, first, go install PyMC on GitHub and open PRs and stuff like that.
681
We always love that.
682
And second, yeah, exactly.
683
PyMC is kind of an offspring of PyMC in the sense that everybody on the team is a PyMC
developer.
684
So we contribute to PyMC.
685
This is open source.
686
This is
687
free.
688
This is free and always will be as it goes.
689
But then on top of that, we do consulting.
690
what's that about?
691
Well, most of the time, these are clients who want to do something with PMC or even more
general with patient statistics.
692
And they know we do that and they do not know how to do that either because they don't
have the time or to
693
train themselves or they don't want to, or they don't have the money to hire a Bayesian
modeler full time, various reasons.
694
But basically, yeah, like they are stuck in at some point in the modeling workflow, they
are stuck.
695
It can be at the very beginning.
696
It can be, well, I've tried a bunch of stuff.
697
I can't make the model converge and I don't know why.
698
So it can be like a very wide array of situations.
699
Most of the time people know.
700
us because like me for the podcast or for PMC, most of the other guys for PMC or for other
technical writing that they do around.
701
So basically that's like, that's not really a real company, but just a bunch of nerds if
you want.
702
But no, that's a real company, but we like to define us as a bunch of nerds because like
that's how it really started.
703
And it, in a sense of
704
you guys actually consulting with companies and making an impact in that sense, it is
certainly a company.
705
Yeah.
706
So yeah.
707
So tell us a bit about projects.
708
mean, you don't need to go into detail with client names or whatever, if that's
inappropriate, but it would be interesting to hear some examples of use cases, use cases
709
of Bayesian statistics in the wild, enabling capabilities that other kinds of modeling
approaches wouldn't.
710
Yeah.
711
Yeah.
712
Yeah.
713
No, definitely.
714
Yeah, so of course I cannot go into all the details, but I can definitely give you some ideas. Where I can actually go into the details is a project we did for an NGO in Estonia, where they were getting polling data. Every month they run a poll of Estonian citizens on various questions. These can be horse-race polls, but they can also be, you know, news questions like: do you think Estonia should ramp up the number of soldiers at the border with Russia? Do you think same-sex marriage should be legal? Things like that.

I hear an Overton window coming on.

No, that's what I thought. I thought we might go there.
Yeah, so now I'm completely taking you off on a sidetrack, but Serg Masís, our researcher, came up with a great question for you, because you had Allen Downey on your show. He's an incredible guest; I absolutely loved having him on our program. He was on here in episode number 715. And in that episode, we talked about the Overton window, which is related to what you were just talking about. So, you know, how does society think about, say, same-sex marriage? If you looked a hundred years ago, or a thousand years ago, or 10,000 years ago, or a thousand years into the future, or 10 years into the future, at each of those different time points there's a completely, well, maybe not completely different, but a varying range of what people think is acceptable or not acceptable.

And we were talking earlier in the episode about bias, so it ties into this. You might have your idea, as a listener to the show; you might be a scientist or an engineer, and you think, I am unbiased, I know the real thing. But you don't, because you are a product of your times. And the Overton window is kind of a way of describing this: on any given issue, there is some range, and it would fit a probability distribution, where there are some people on a far extreme one way and some people on a far extreme the other way. But in general, all of society is moving in one direction, typically in a liberal direction on a given social issue. And this varies by region; it varies by age. Anyway, I think Overton windows are really fascinating. So, I've completely derailed your conversation, but I have a feeling you're going to have something interesting to say.
Yeah, no, I mean, that's related to that, for sure. And yeah, basically... I also had Allen Downey on the show for his latest book, and that was also definitely about that. Probably Overthinking It was the book. Yeah, great, great book.
And so basically, the NGO had this survey data, right? But their clients have questions, and their clients are usually media or politicians. It's like: yeah, but I'd like to know, on a geographical basis, in this electoral district, what do people think about that? Or, in this electoral district, educated women of that age, what do they think about same-sex marriage? That's hard to do, because polling at that scale is almost impossible; it costs a ton of money. Also, polling is getting harder and harder, because people answer polls less and less. So at the same time that polling data becomes less available and less reliable, you have people who get more and more interested in what the polls have to say. It's hard.
There is a great method to do that.
771
What we did for them is come up with a hierarchical model of the population because
hierarchical models allow you to share information between groups.
772
Here the groups could be the age groups, for instance.
773
Basically, knowing something what a hierarchical model says,
774
is, well, age groups are different, but they are not infinitely different.
775
So learning about what someone aged 16 to 24 thinks about same -sex marriage actually
already tells you something about what someone aged 25 to 34 thinks about that.
776
And the degree of similarity between these responses is estimated by the model.
So these models are extremely powerful. I love them, I teach them a lot. And actually, in the advanced regression course, the last lesson is all about hierarchical models. I walk you through a simplified version of the model we did at PyMC Labs for that NGO, called SALK, in Estonia. So it's a model that's used in industry for real, and you learn that. It's a hard model, but it's a real model.
Then, once you've done that, you do something that's called post-stratification. Post-stratification is basically a way of debiasing your estimates, the predictions from the model, and you use census data to do that. So you need good data; you need census data. But if you have good census data, then you're going to be able to reweight the predictions from your model. And that way, if you combine post-stratification and a hierarchical model, you're going to be able to give actually good estimates of what, say, educated women aged 25 to 34 in this electoral district think about that issue.
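As a toy illustration of that reweighting step (continuing the sketch above, with invented census counts), you weight each group's posterior estimate by how many such people actually live in the district:

```python
import numpy as np

# Posterior mean log-odds per age group, from the hierarchical model above.
group_logodds = idata.posterior["alpha"].mean(dim=("chain", "draw")).values
group_p = 1.0 / (1.0 + np.exp(-group_logodds))   # back to probabilities

# Hypothetical census counts for the same six age groups in one district.
census_counts = np.array([12_000, 15_000, 18_000, 16_000, 14_000, 9_000])
weights = census_counts / census_counts.sum()

# Post-stratified estimate: the model's predictions, reweighted to match
# the district's actual demographic composition.
district_estimate = float((weights * group_p).sum())
print(f"Estimated support in this district: {district_estimate:.1%}")
```

In practice you would reweight every posterior draw rather than just the posterior mean, so the final estimate keeps its full uncertainty.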
And when I say good, I mean that the confidence intervals are not going to be ridiculous. It's not going to tell you: well, this population is opposed to gay marriage with a probability of 20 to 80%, which covers basically everything, so that's not very actionable. No, the model is more uncertain than that, of course, but it has a really good way of giving you something actually actionable. So that was a big project. I can dive into some others if you want, but that...
That takes some time, and I don't want to derail the interview. That's great and highly illustrative. It gives that sense of how, with a Bayesian model, you can be so specific about how different parts of the data interrelate. So in this case, for example, you're describing having different demographic groups that have some commonality, like all the women, but with different age groups of women as sub-nodes of women in general. That way you're able to use the data from each of the subgroups to inform your higher-level group.
And actually, something that might be interesting to you, Alex, is that my introduction to both R programming and, I guess, to hierarchical modeling was Gelman and Hill's book. Obviously, Andrew Gelman you've already talked about on the show. Jennifer Hill is also a brilliant causal modeler and has also been on the Super Data Science podcast; that was episode number 607. Anyway, there's lots of listening for people to do out there between your show and mine, based on guests that we've talked about on the program. Hopefully lots of people have long commutes. So yeah, fantastic. That's a great example, Alex.
Another open-source library, in addition to PyMC, that you've developed is ArviZ, which has nothing to do with the programming language R. So it's A-R-V-I-Z, or Zed: ArviZ. And this is for post-modeling workflows in Bayesian stats. So tell us: what are post-modeling workflows? Why do they matter? And how does ArviZ solve problems for us there?
Yeah, yeah, great questions. And first, related to your previous question, I'll make sure to send you some links to other projects that could be interesting to people, like media mix models. I've interviewed Luciano Paz on the show; we've worked with HelloFresh, for instance, to come up with a media mix marketing model for them, and Luciano talks about that in that episode. I'll also send you a blog post about spatial data with Gaussian processes; that's something we've done for an agricultural client. And I already sent you a link to a video webinar we did with that NGO, that client in Estonia, where we go a bit deeper into the project. And of course I'll also send you the Learning Bayesian Statistics episode, because Tarmo, the president of that NGO, was on the show.
Nice. I'll be sure, of course, to include all of those links in the show notes.

Yeah, because people come from different backgrounds, so someone is going to be more interested in marketing, another one more in social science, another one more in spatial data. That way people can pick and choose what they are most curious about.

So, ArviZ. Yeah, what is it?
ArviZ is basically your friend for any post-model, post-sampling graph. And why is that important? Because models steal the show; they are the star of the show. But a model is just one part of what we call the Bayesian workflow. The Bayesian workflow has only one step that is the modeling itself; all the other steps are about everything around the model. There are a lot of steps before sampling the model, and then there are a lot of steps afterwards. And I would argue that these steps afterwards are almost as important as the model. Why? Because they're what's going to face the customer of the model. Your model is going to be consumed by people who most of the time don't know about models, and also often don't care about models. That's a shame, because I love models, but a lot of the time they don't really care about the model; they care about the results. And so a big part of your job as the modeler is to be able to convey that information in a way that someone who is not a stats person or a math person can understand and use in their work.
Whether that's a football coach, or a data scientist, or someone working in HelloFresh's marketing department, you have to adapt the way you talk to those people and the way you present the results of the model. And the way you do that is with amazing graphs. So a lot of your time as a modeler is spent figuring out what the model can tell you and, also very important, what the model cannot tell you, and with which confidence. And since we're humans, we use our eyes a lot, so the way to convey that is with plots. You spend a lot of time plotting stuff as a Bayesian modeler, especially because Bayesian models don't give you one point estimate; they give you full distributions for all the parameters. So you get distributions all the way down. That's a bit more complex to wrap your head around at the beginning, but once your brain is used to that gym, it's really cool, because it gives you opportunities for amazing plots.
And yeah, ArviZ is here for you for that. It has a lot of the plots that we use all the time in the Bayesian workflow. One, to diagnose your model, to understand if there is any red flag in the convergence of the model. And then, once you're sure about the quality of your results, how do you present them to the customer of the model? ArviZ also has a lot of plots for you there. And the cool thing about ArviZ is that it's platform-agnostic. What do I mean by that? You can run your model in PyMC, in Pyro, in Stan, and then use ArviZ, because ArviZ expects a special format of data that all these PPLs can give you, which is called the InferenceData object. Once you have that, ArviZ doesn't care where the model was run. And that's super cool. It's also available in Julia: ArviZ is a Python package, but there is a Julia equivalent for people who use Julia. So yeah, it's a very good way of starting that part of the workflow, which is extremely important.
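For instance, picking up the idata object returned by the hierarchical model sketched earlier, a minimal diagnose-then-present pass could look like this (the plot choices here are just common defaults, not a prescription):

```python
import arviz as az

# Diagnostics: look for convergence red flags.
az.plot_trace(idata)                                  # chains should look well mixed
print(az.summary(idata, var_names=["mu", "sigma"]))   # r_hat near 1, healthy ESS

# Presentation: plots for the customer of the model.
az.plot_posterior(idata, var_names=["mu"])            # a full distribution, not a point
az.plot_forest(idata, var_names=["alpha"])            # compare the age groups at a glance
```

Because these functions only see the InferenceData object, the same four lines work whether the model was sampled with PyMC, Stan, or another PPL.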
Nice, that was a great tour. And of course, I will again have a link to ArviZ in the show notes for people who want to use it for their post-modeling needs with their Bayesian models, including diagnostics like looking for red flags, and being able to visualize results and pass them off to whoever the end client is. I think it might be in the same panel discussion with the head of that NGO, Tarmo Jüristo.

Yes.
Sí. That's my Spanish. I'm in Argentina right now, so the Spanish is automatic.

Actually, I'm relieved to know that you're in Argentina, because I was worried that I was keeping you up way too late.

No, no, no, not at all.

Nice. So yeah, in that interview, Tarmo talks about adding components like Gaussian processes to make models, Bayesian models, time-aware. What does that mean? And what are the advantages and potential pitfalls of incorporating advanced features like time-awareness into Bayesian models?
Yeah, great research, I can see that. Great research from Serge Masís, really.

Yeah. No, that's impressive.
I had a call with the people from Google Gemini today. They're very much at the cutting edge, developing Google Gemini alongside Claude 3 from Anthropic and, of course, GPT-4, GPT-4o, whatever, from OpenAI. These are the frontier of LLMs. So I'm on a call with half a dozen people from the Google Gemini team, and near the end they were hinting at some of the new capabilities they have. And there are some cool things in there which I need to spend more time playing around with, like Gems. I don't know if you've seen this, but Gems in Google Gemini allow you to have a context for different kinds of tasks. For example, there are some parts of my podcast production workflow where I have different contexts, different needs, at each of those steps. And so it's very helpful with these Google Gemini Gems to be able to just click on one and be like: okay, now I'm in this kind of context, and I'm expecting the LLM to output in this particular way. And the Google Gemini people said, well, maybe you'll be able to use these Gems to replace parts of the workflow of the other people working on your podcast. They gave the example of research, and I was like: I hope that our researcher, for example, is using generative AI tools to assist his work. But I think, even with all of the amazing things that LLMs can do, we're still quite a ways away from the kind of quality of research that Serge Masís can do for this show.
Yeah, yeah. We're still a ways away.

Yeah, no, for sure. But that sounds like fun.

Yeah. Anyway, sorry, I derailed you again. Time awareness.

Yeah, indeed. And I love that question, because I love GPs. So thanks a lot.
And that was not at all a setup for the audience: Gaussian processes, GPs. Yeah, I love Gaussian processes. And I actually just sent you a blog post we have on the PyMC Labs website, by Luciano Paz, about how to use Gaussian processes with spatial data. So why am I telling you that? Because Gaussian processes are awesome, because they are extremely versatile. A GP is what's called a non-parametric method; it allows you to do non-parametric models.
What does that mean? It means that instead of having, for instance, a linear regression, where you have a functional form that you're telling the model, like: I expect the relationship between x and y to be of a linear form, y equals a plus b times x, what the Gaussian process is saying is: I don't know the functional form between x and y; I want you to discover it for me. So that's one level up in the abstraction, if you want. It's saying y equals f of x: find which f it is. Now, you don't want to do that all the time, because that's very hard.
And actually, you need to use quite a lot of domain knowledge to set some of the parameters of the GPs. I won't go into the details here, but I'll give you some links for the show notes.
But something that's very interesting to apply GPs to is, well, spatial data, as I just mentioned. Because in a plot, for instance, and I mean a field plot, not a plot as in a graph, there are some interactions between where you are in the plot and the crops that you're going to plant there. But you don't really know what those interactions are: location interacts with the weather and with a lot of other things, and you don't really know what the functional form of that is. And so that's where a GP is going to be extremely interesting, because it's going to allow you to work in 2D, try to find out what the correlations across the X and Y coordinates are, and take that into account in your model.
That's very abstract, but I'm going to link afterwards to a tutorial. We actually just released today, on PyMC, a tutorial that I've been working on with Bill Engels, who is also a GP expert. I've been working with him on this tutorial for an approximation, a new approximation of GPs, and I'll get back to that in a few minutes.
But first, why GPs in time? So you can apply GPs to spatial data, to space, but you can also apply GPs to time. Time is 1D most of the time, one-dimensional; space is usually 2D. And you can actually do GPs in 3D: you can do spatio-temporal GPs. That's even more complicated. But 1D GPs, that's really awesome, because most of the time, when you have a time dependency, it's non-linear. For instance, that could be the way the performance of a baseball player evolves within the season. You can definitely see the performance of a baseball player fluctuate with time during the season, and that would very probably be nonlinear. The thing is, you don't know what the form of that function is. And so that's what the GP is here for: it's going to come and try to discover what the functional form is for you.
And that's why I find GPs extremely... like, they are really magical mathematical beasts. First, they're really beautiful mathematically, and a lot of things are actually special cases of GPs. Neural networks, for instance: infinitely wide neural networks are actually Gaussian processes, a special case of Gaussian processes. Gaussian random walks are a special case of Gaussian processes. So they are a very beautiful mathematical object, but also very practical.
Now, as Uncle Ben said, with great power comes great responsibility. And GPs are hard to wield. It's a powerful weapon, but it's hard to wield. It's like Excalibur: you have to be worthy to wield it. And so it takes training and time to use them, but it's worth it.
990
And so we use that with Thermo, Juristo from that Estonian NGO.
991
But I use that almost all the time.
992
Right now I'm working more and more on sports data and yeah, I'm actually working on some
football data right now.
993
And well, you want to take into account
994
these wheezing season effects from players.
995
I don't know what the linear form is.
996
And right now, the first model I did taking the time into account was just a linear trend.
997
So it's just saying as time passes, you expect a linear change.
998
So the change from one to two is going to be the same one than the one from nine to 10.
999
But usually it's not the case with time.
Speaker:
It's very non -linear.
Speaker:
And so here, you definitely want to apply a GP on that.
Speaker:
You could apply other stuff like random walks.
Speaker:
autoregressive stuff and so on.
Speaker:
I don't personally don't really like those models.
Speaker:
find them like, it's like you have to apply that structure to the model, but at the same
time, they're not that easier to use than GPs.
Speaker:
So, know, might as well use a GP.
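As a sketch of that idea, here is a minimal 1D latent GP in PyMC on made-up within-season data, where the model is asked to discover f rather than being handed a linear trend. The priors and kernel are illustrative choices; this is exactly where the domain knowledge mentioned above comes in:

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 60)[:, None]                      # time within the season, shape (n, 1)
perf = np.sin(6 * t).ravel() + rng.normal(0, 0.2, 60)   # fake nonlinear performance signal

with pm.Model() as gp_model:
    # Kernel hyperparameters: domain knowledge about how fast
    # performance can plausibly change goes in here.
    ell = pm.Gamma("ell", alpha=2, beta=5)       # lengthscale
    eta = pm.HalfNormal("eta", 1.0)              # amplitude
    cov = eta**2 * pm.gp.cov.ExpQuad(1, ls=ell)

    gp = pm.gp.Latent(cov_func=cov)
    f = gp.prior("f", X=t)                       # the unknown function: perf = f(t)

    sigma = pm.HalfNormal("sigma", 0.5)
    pm.Normal("y", mu=f, sigma=sigma, observed=perf)
    idata_gp = pm.sample()
```

Note that sampling this exact, non-approximated GP scales badly with the number of data points, which is what motivates the approximation discussed next.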
And I'll end this very long answer with a third point, which is that now it's actually easier to use GPs, because there is this new decomposition of GPs, the Hilbert space decomposition, so HSGP. And that's basically a decomposition of the GP that's like a dot product, so kind of a linear regression, but one that gives you a GP. And that's amazing, because GPs are known to be extremely slow to sample, since, as I was saying at some point, it's a lot of matrix multiplication. But with HSGP, it becomes way, way faster and way more efficient. Now, you cannot always use HSGP; there are caveats and so on.
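In PyMC that approximation is exposed as pm.gp.HSGP. A minimal sketch, swapping it into the latent GP above, with the caveat that m (the number of basis functions) and c (the boundary factor) are approximation knobs you have to choose for your data:

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 200)[:, None]
y = np.sin(6 * t).ravel() + rng.normal(0, 0.2, 200)

with pm.Model() as hsgp_model:
    ell = pm.Gamma("ell", alpha=2, beta=5)
    eta = pm.HalfNormal("eta", 1.0)
    cov = eta**2 * pm.gp.cov.Matern52(1, ls=ell)

    # Hilbert-space approximation: the GP becomes a weighted sum of m fixed
    # basis functions, essentially a linear regression, hence the speed-up.
    gp = pm.gp.HSGP(m=[25], c=1.5, cov_func=cov)
    f = gp.prior("f", X=t)

    sigma = pm.HalfNormal("sigma", 0.5)
    pm.Normal("y", mu=f, sigma=sigma, observed=y)
    idata_hsgp = pm.sample()
```

The tutorial mentioned below goes into when the approximation is valid and how to pick m and c.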
But Bill and I have been working on this tutorial. It's going to be in two parts; the first part was out today, and I'm going to send you the links for the show notes here in the chat we have. That's up on the PyMC website: it's the HSGP First Steps and Reference tutorial. We go through why you would use HSGP, how you would use it in PyMC, and the basic use cases. The second part is going to cover the more advanced use cases. Bill and I have started working on that, but it always takes time to develop good content. And yeah, it's open source, so we're doing that in our free time, unpaid, so that always takes a bit more time. But we'll get there.
And finally, another resource that I think your listeners are going to appreciate: I'm doing a webinar series on HSGPs, where a modeler comes on the show, shares their screen, and does live coding. The first part is out already, and I'm going to send you that for the show notes. I had Juan Orduz on the show, and he went into the first part of how to do HSGPs, and what HSGPs even are from a mathematical point of view, because Juan is a mathematician. So yeah, I'll end my very long, passionate rant about GPs here, but long story short: GPs are amazing, and becoming skillful with GPs is a good investment of your time.
Fantastic. Another area that I would love to be able to dig deep into. And so our lucky listeners out there who have the time will now be able to dig into that resource and many of the others that you have suggested in this episode, which we've all got for you in the show notes. Thank you so much. Alex, this has been an amazing episode. Before I let my guests go, I always ask for a book recommendation, and you've had some already for us in this episode, but I wonder if there's anything else. The recommendation you already had was Bernoulli... something about Bernoulli.
Right. Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science, I think. Yeah. I'll actually send you that: the episodes with Aubrey and with David Spiegelhalter, because these are really good, especially for less technical people who are curious about science and how it works. I think it's a very good entry point. Yeah, so this book is amazing.
Oh my God, this is an extremely hard question. I love so many books and I read so many books that I'm taken aback. So, books... I would say books which I find extremely good and which have influenced me. Because a book is also, like, it's not only the book, right? It's also the moment when you read the book. If you like a book and you come back to it later, you'll have a different experience, because you are a different person and you have different skills. And yeah, so I'm going to cheat and give you several recommendations, because I have too many of them. So, technical books, I would say:
Probability Theory: The Logic of Science, by E. T. Jaynes. E. T. Jaynes was an old-school mathematician and scientist; in the Bayesian world, E. T. Jaynes is like a rock star. I definitely recommend his masterpiece, The Logic of Science. That's a technical book, but it's actually a very readable book, and it's also very epistemological. So that one is awesome. Much more applied, if you want to learn Bayesian stats:
A great book to do that is Statistical Rethinking, by Richard McElreath. Really great book; I've read it several times. And any book by Andrew Gelman, as you were saying, I definitely recommend. They tend to be a bit more advanced. If you want a really beginner-friendly one, his latest, Active Statistics, is a really good one; I just had him on the show, episode 106. So that's for people who like numbers, let's say.
And I remember that when I was studying political science, Barack Obama's book from before he was president... I don't remember the name. I think it's The Audacity of Hope, but I'm not sure. His first book, before he became president, that was actually a very interesting one.

Dreams from My Father?

Yes, this one: Dreams from My Father. A very interesting one. The other ones were a bit more political, and I found them a bit less interesting, but this one was really interesting to me.
And another one for people who are very nerdy. So, I'm a very nerdy person; I love going to the gym, for instance, and I can do my own training plan, my own nutrition plan. I've dug into that research. I love that, because I also love sports. Another very good book I definitely recommend, to develop good habits, is Katy Milkman's How to Change: The Science of Getting from Where You Are to Where You Want to Be. An extremely good book, full of very practical tips. Yeah, that's an extremely good one.
And then a last one that I read recently... no, actually, two last ones. All right, one of the last two; penultimate, I think you say. How Minds Change, by David McRaney, for people who are interested in how beliefs are formed. Extremely interesting. He's a journalist, and he's got a fantastic podcast called You Are Not So Smart, and I definitely recommend that one. It's about how people change their minds, basically, because I'm very interested in that. And in the end, this book is a trove of wisdom.
And the very last one, promise. I'm also extremely passionate about Stoicism, Stoic philosophy. That's a philosophy I find extremely helpful for living my life and navigating the difficulties that we all have in life. And a very iconic book there is Meditations, by Marcus Aurelius: reading the thoughts of a Roman emperor, one of the best Roman emperors there was. It's really fascinating, because he didn't write it to be published; it was his journal, basically. It's absolutely fascinating to read it and to see that they kind of had the same issues we still have. So yeah, fantastic. I read it very often.
Yeah, I haven't actually read Meditations, but I've read Ryan Holiday's The Daily Stoic.

Yeah, and that's really good.

It's, yeah, 366 daily meditations on wisdom, perseverance, and the art of living, based on Stoic philosophy. And there's a lot from Marcus Aurelius in there; he's probably the plurality of the content. And wow, it is mind-blowing to me how somebody two millennia ago is the same as me. I mean, not to hold myself up: I'm not a Roman emperor, and the things I write will not be studied 2,000 years from now. But nevertheless, the connection you feel with this individual from 2,000 years ago, and how similar the problems he was facing are to the problems that I face every day... it's staggering.
Yeah. No, that's incredible. For me, something that really spoke to me, that I remember, is that at some point he's telling himself that it's no use going to the countryside to escape everything, because the real retreat is in yourself. It's like: if you're not able to be calm and find equanimity in your daily life, it's not because you get away from the city (and Rome was the megalopolis of the time) that you're going to find tranquility over there. You have to find tranquility inside; then, yeah, going to the countryside is going to be even more awesome. But it's not because you go there that you find tranquility. And that was super interesting to me, because I definitely feel that when I'm in a big, big metropolis: at some point, I want to get away. But then I was like, wait, they were already living that at a time when they didn't have the internet, they didn't have cars and so on. But for them, it was already too many people, too much noise. I found that super interesting.
For sure. Wild. Well, this has been an amazing episode, Alex. I really am glad that Doug suggested you for the show, because this has been fantastic. I've really enjoyed every minute of it. I wish it could go on forever, but sadly all good things must come to an end. And so before I let you go, the very last thing: do you have other places where we should be following you? We're going to have a library of links in the show notes for this episode, and of course we know about your podcast, Learning Bayesian Statistics. We've got the Intuitive Bayes educational platform and open-source libraries like PyMC and ArviZ. In addition to those, is there any other social media platform or other way that people should be following you or getting in touch with you after the program?
Well, yeah, thanks for mentioning those. So yeah, Intuitive Bayes, Learning Bayesian Statistics, PyMC Labs, you mentioned them. And I'm always available on Twitter: alex_andorra, like the country, that's where it comes from. It has two Rs, not only one, and when I say it in a language other than Spanish, people write it with just one R. Otherwise, I'm also on LinkedIn, so you can always reach out to me over there, on LinkedIn or Twitter. And also, yes, send me podcast suggestions, stuff like that; I'm always on the lookout for something cool. So again, yeah, thanks a lot for having me on, and thanks a lot, Doug, for the recommendation. Yeah, that was a blast; I enjoyed it a lot. So thank you so much.
This has been another episode of Learning Bayesian Statistics. Be sure to rate, review, and follow the show on your favorite podcatcher, and visit learnbayesstats.com for more resources about today's topics, as well as access to more episodes to help you reach a true Bayesian state of mind. That's learnbayesstats.com. Our theme music is Good Bayesian, by Baba Brinkman, featuring MC Lars and Mega Ran. Check out his awesome work at bababrinkman.com. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. You can support the show and unlock exclusive benefits by visiting patreon.com/LearnBayesStats. Thank you so much for listening and for your support. You're truly a good Bayesian. Change your predictions after taking information in. And if you're thinking I'll be less than amazing, let's adjust those expectations. Let me show you how to be a good Bayesian. Change calculations after taking fresh data in. Those predictions that your brain is making? Let's get them on a solid foundation.