Learning Bayesian Statistics

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

In this episode, I had the pleasure of speaking with Allen Downey, a professor emeritus at Olin College and a curriculum designer at Brilliant.org. Allen is a renowned author in the fields of programming and data science, with books such as “Think Python” and “Think Bayes” to his credit. He also authors the blog “Probably Overthinking It” and has a new book by the same name, which he just released in December 2023.

In this conversation, we tried to help you differentiate between right and wrong ways of looking at statistical data, discussed the Overton paradox and the role of Bayesian thinking in it, and detailed a mysterious Bayesian killer app!

But that’s not all: we even addressed the claim that Bayesian and frequentist methods often yield the same results — and why it’s a false claim. If that doesn’t get you to listen, I don’t know what will!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie and Cory Kiser.

Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag 😉

Links from the show:

Abstract

by Christoph Bamberg

We are happy to welcome Allen Downey back to our show, and he has great news for us: his new book “Probably Overthinking It” is available now.

You might know Allen from his blog by the same name or his previous work. Or maybe you watched some of his educational videos which he produces in his new position at brilliant.org.

We delve right into exciting topics like collider bias and how it can explain the “low birth weight paradox” and other situations that only seem paradoxical at first, until you apply causal thinking to them.

Another classic Allen can demystify for us is Simpson’s paradox. The problem is not the data, but your expectations of the data. We talk about some cases of Simpson’s paradox, for example from statistics on the Covid-19 pandemic, also featured in his book.

We also cover the “Overton paradox” – which Allen named himself – on how people report their ideologies as liberal or conservative over time. 

Next to causal thinking and statistical paradoxes, we return to the common claim that frequentist statistics and Bayesian statistics often give the same results. Allen explains that they are fundamentally different and that Bayesians should not shy away from pointing that out and emphasising the strengths of their methods.

Transcript

This is an automatic transcript and may therefore contain errors. Please get in touch if you’re willing to correct them.

Speaker:

In this episode, I had the pleasure of

speaking with Allen Downey, a professor

2

00:00:09,433 --> 00:00:14,015

emeritus at Olin College and a curriculum

designer at brilliant.org.

3

00:00:14,776 --> 00:00:19,838

Allen is a renowned author in the fields of

programming and data science, with books

4

00:00:19,838 --> 00:00:23,479

such as Think Python and Think Bayes to his

credit.

5

00:00:23,559 --> 00:00:28,601

He also authors the blog Probably

Overthinking It, and has a new book by the

6

00:00:28,601 --> 00:00:29,494

same name,

7

00:00:29,494 --> 00:00:32,895

which he just released in December 2023.

8

00:00:33,355 --> 00:00:38,437

In this conversation, we tried to help you

differentiate between right and wrong ways

9

00:00:38,437 --> 00:00:42,319

of looking at statistical data, we

discussed the Overton paradox and the

10

00:00:42,319 --> 00:00:47,461

role of Bayesian thinking in it, and we

detailed a mysterious Bayesian killer app.

11

00:00:47,521 --> 00:00:49,502

But that is not all.

12

00:00:49,502 --> 00:00:53,824

We even addressed the claim that Bayesian

and frequentist methods often yield the same

13

00:00:53,824 --> 00:00:57,605

results, and why it is a false claim.

14

00:00:57,822 --> 00:01:00,963

If that doesn't get you to listen, I don't

know what will.

15

00:01:00,963 --> 00:01:07,007

This is Learning Bayesian Statistics,

episode 97, recorded October 25, 2023.

16

00:01:09,828 --> 00:01:13,691

Hello, my dear Bayesians!

17

00:01:13,691 --> 00:01:16,252

I have two announcements for you today.

18

00:01:16,252 --> 00:01:21,675

First, congratulations to the 10 patrons

who won a digital copy of Allen's new book.

19

00:01:21,675 --> 00:01:25,666

The publisher will soon get in touch and

send you the link to your free...

20

00:01:25,666 --> 00:01:28,006

digital copy. If you didn't win...

21

00:01:28,006 --> 00:01:33,849

Well, you still won, because you get a 30%

discount if you order with the discount

22

00:01:33,849 --> 00:01:38,431

code UCPNew from the UChicagoPress

website.

23

00:01:38,431 --> 00:01:41,092

I put the link in the show notes, of

course.

24

00:01:41,312 --> 00:01:47,374

Second, a huge thank you to Matt Niccolls,

Maksim Kuznecov, Michael Thomas, Luke

25

00:01:47,374 --> 00:01:51,216

Gorrie and Cory Kiser for supporting the

show on Patreon.

26

00:01:51,216 --> 00:01:54,870

I can assure you, this is the best way to

start the year.

27

00:01:54,870 --> 00:01:56,491

Thank you so much for your support.

28

00:01:56,491 --> 00:02:00,794

It literally makes this show possible and

it made my day.

29

00:02:00,835 --> 00:02:03,697

Now onto the show with Allen Downey.

30

00:02:03,697 --> 00:02:08,341

Show you how to be a good Bayesian and change

your predictions.

31

00:02:08,341 --> 00:02:13,846

Allen Downey, welcome back to Learning

Bayesian Statistics.

32

00:02:13,846 --> 00:02:14,506

Thank you.

33

00:02:14,506 --> 00:02:15,727

It's great to be here.

34

00:02:15,927 --> 00:02:18,489

Yeah, thanks again for taking the time.

35

00:02:19,050 --> 00:02:23,573

And so for people who know you already,

36

00:02:24,014 --> 00:02:25,615

or are getting to know you.

37

00:02:25,615 --> 00:02:30,179

Allen was already on Learning Bayesian Statistics in

episode 41.

38

00:02:30,740 --> 00:02:36,665

And so if you are interested in a bit more

detail about his background and also much

39

00:02:36,665 --> 00:02:45,493

more about his previous book, Think Bayes,

I recommend listening back to the episode

40

00:02:45,493 --> 00:02:47,735

41, which will be in the show notes.

41

00:02:50,946 --> 00:02:53,688

focus on other topics, especially your new

book, Allen.

42

00:02:53,688 --> 00:02:55,890

I don't know how you do that.

43

00:02:56,551 --> 00:03:02,756

But well done, congratulations on, again,

another great book that's getting out.

44

00:03:03,558 --> 00:03:09,744

But first, maybe a bit more generally, how

do you define the work that you're doing

45

00:03:09,744 --> 00:03:13,687

nowadays and the topics that you're

particularly interested in?

46

00:03:16,094 --> 00:03:21,495

It's a little hard to describe now because

I was a professor for more than 20 years.

47

00:03:21,495 --> 00:03:25,036

And then I left higher ed about a year, a

year and a half ago.

48

00:03:25,436 --> 00:03:30,778

And so now my day job, I'm at

brilliant.org and I am writing online

49

00:03:30,778 --> 00:03:35,559

lessons for them in programming and data

science, which is great.

50

00:03:35,559 --> 00:03:36,239

I'm enjoying that.

51

00:03:36,239 --> 00:03:36,339

Yeah.

52

00:03:36,339 --> 00:03:39,060

Sounds like fun.

53

00:03:39,060 --> 00:03:39,860

It is.

54

00:03:39,860 --> 00:03:44,081

And then also working on these books and

blogging.

55

00:03:44,361 --> 00:03:45,061

And

56

00:03:45,330 --> 00:03:49,751

I think of it now as almost being like a

gentleman scientist or an independent

57

00:03:50,011 --> 00:03:50,612

scientist.

58

00:03:50,612 --> 00:03:52,673

I think that's my real aspiration.

59

00:03:52,673 --> 00:03:57,254

I want to be an 18th century gentleman

scientist.

60

00:03:58,775 --> 00:04:01,656

I love that.

61

00:04:01,656 --> 00:04:06,918

Yeah, that sounds like a good objective.

62

00:04:07,419 --> 00:04:10,400

Yeah, it definitely sounds like fun.

63

00:04:11,020 --> 00:04:14,681

It also sounds a bit similar to...

64

00:04:14,782 --> 00:04:20,024

what I'm doing on my end with the podcasts

and also the online courses for Intuitive

65

00:04:20,024 --> 00:04:20,684

Bayes.

66

00:04:20,684 --> 00:04:25,946

And also I teach a lot of the workshops at

PyMC Labs.

67

00:04:25,946 --> 00:04:32,629

So yeah, a lot of teaching and educational

content on my end too, which I really

68

00:04:32,629 --> 00:04:33,349

love.

69

00:04:33,469 --> 00:04:36,871

So that's also why I do it.

70

00:04:36,871 --> 00:04:43,426

And yeah, it's fun because most of the

time, like you start

71

00:04:43,426 --> 00:04:48,968

teaching a topic and that's a very good

incentive to learn it in lots of details.

72

00:04:48,968 --> 00:04:49,268

Right.

73

00:04:49,268 --> 00:04:56,792

So, lately I've been myself diving way

more into Gaussian processes again, because

74

00:04:56,792 --> 00:05:02,535

this is a very fascinating topic, but

quite complex and causal inference also

75

00:05:02,535 --> 00:05:04,816

I've been reading up again on this.

76

00:05:04,816 --> 00:05:06,796

So it's been quite fun.

77

00:05:07,177 --> 00:05:09,177

What has been on your mind recently?

78

00:05:10,030 --> 00:05:14,012

Well, you mentioned causal inference and

that is certainly a hot topic.

79

00:05:14,012 --> 00:05:17,615

It's one where I always feel I'm a little

bit behind.

80

00:05:17,615 --> 00:05:23,879

I've been reading about it and written

about it a little bit, but I still have a

81

00:05:23,879 --> 00:05:24,620

lot to learn.

82

00:05:24,620 --> 00:05:26,961

So it's an interesting topic.

83

00:05:27,482 --> 00:05:28,522

Yeah, yeah, yeah.

84

00:05:28,883 --> 00:05:34,887

And the cool thing is that honestly, when

you're coming from the Bayesian framework,

85

00:05:35,247 --> 00:05:37,508

to me that feels extremely natural.

86

00:05:37,649 --> 00:05:38,969

It's just a way of...

87

00:05:39,950 --> 00:05:42,871

Some concepts are the same, but they're

just named differently.

88

00:05:42,871 --> 00:05:46,812

So all you have to do is make the

connection in your brain.

89

00:05:46,832 --> 00:05:49,253

And some of them are somewhat new.

90

00:05:49,253 --> 00:05:56,336

But if you've been doing generative

modeling for a while, then just coming up

91

00:05:56,336 --> 00:06:01,819

with the directed acyclic graph for your

model and just updating it from a

92

00:06:01,819 --> 00:06:07,921

generative perspective and doing

counterfactual analysis, it's really,

93

00:06:09,442 --> 00:06:11,222

you can do it in the Bayesian workflow.

94

00:06:11,222 --> 00:06:15,623

So that's a really good, that really helps

you.

95

00:06:15,623 --> 00:06:18,004

To me, you already have the foundations.

96

00:06:18,144 --> 00:06:25,146

And you just have to, well, kind of add a

bit of a toolbox to it, you know, like,

97

00:06:25,146 --> 00:06:29,227

OK, so what's regression discontinuity

design?

98

00:06:29,807 --> 00:06:32,328

What's interrupted time series?

99

00:06:32,768 --> 00:06:33,628

Things like that.

100

00:06:33,628 --> 00:06:38,609

But otherwise, what's difference in

differences?

101

00:06:38,686 --> 00:06:46,229

things like that, but these are kind of

just techniques that you add on top of the

102

00:06:46,229 --> 00:06:50,371

foundations, but the concepts are pretty

easy to pick up if you've been a

103

00:06:50,371 --> 00:06:52,152

Bayesian for a while.

104

00:06:52,952 --> 00:06:57,254

I guess that's really the good news for

people who are looking into that.

105

00:06:57,254 --> 00:07:02,637

It's not completely different from what

you've been doing.

106

00:07:03,177 --> 00:07:04,577

No, I think that's right.

107

00:07:04,706 --> 00:07:09,267

And in fact, I have a recommendation for

people if they're coming from Bayes and

108

00:07:09,267 --> 00:07:11,027

getting into causal inference.

109

00:07:11,027 --> 00:07:15,348

Judea Pearl's book, The Book of Why,

follows exactly the progression that you

110

00:07:15,348 --> 00:07:19,790

just described because he starts with

Bayesian nets and then says, well, no,

111

00:07:19,790 --> 00:07:21,490

actually, that's not quite sufficient.

112

00:07:21,490 --> 00:07:25,671

Now for doing causal inference, we need

the next steps.

113

00:07:25,811 --> 00:07:28,712

So that was his professional progression.

114

00:07:28,792 --> 00:07:32,733

And it makes, I think, a good logical

progression for learning these topics.

115

00:07:33,974 --> 00:07:34,674

Yeah, exactly.

116

00:07:34,674 --> 00:07:41,759

And well, funny enough, I've been, I've

started rereading the Book of Why

117

00:07:41,759 --> 00:07:42,220

recently.

118

00:07:42,220 --> 00:07:49,005

I had read it like two, three years ago

and I'm reading it again because surely

119

00:07:49,005 --> 00:07:52,407

there are a lot of things that I didn't

pick up at the time, didn't understand.

120

00:07:52,407 --> 00:07:56,890

And there are some stuff that are going to

resonate with me more now that I have a

121

00:07:56,890 --> 00:08:01,413

bit more background, let's say, or...

122

00:08:02,474 --> 00:08:06,998

Some other people would say more wrinkles

on my forehead, but I don't know why

123

00:08:06,998 --> 00:08:08,419

they would say that.

124

00:08:10,701 --> 00:08:15,524

So, Allen, already getting off topic, but

yeah, I really love that.

125

00:08:15,845 --> 00:08:19,228

The causal inference stuff has been fun.

126

00:08:19,228 --> 00:08:21,189

I'm teaching that next Tuesday.

127

00:08:21,189 --> 00:08:25,192

First time I'm going to teach three hours

of causal inference.

128

00:08:25,213 --> 00:08:26,714

That's going to be very fun.

129

00:08:27,475 --> 00:08:29,095

I can't wait for it.

130

00:08:31,402 --> 00:08:36,584

Like you try to study the topic and there

are all angles to consider and then a

131

00:08:36,584 --> 00:08:41,486

student will come up with a question that

you're like, huh, I did not think about

132

00:08:41,486 --> 00:08:42,246

that.

133

00:08:42,626 --> 00:08:44,066

Let me come back to you.

134

00:08:45,227 --> 00:08:47,568

That's really the fun stuff to me.

135

00:08:47,568 --> 00:08:51,410

As you say, I think every teacher has that

experience that you really learn something

136

00:08:51,410 --> 00:08:52,530

when you teach it.

137

00:08:53,490 --> 00:08:54,271

Oh yeah.

138

00:08:54,271 --> 00:08:54,731

Yeah, yeah.

139

00:08:54,731 --> 00:08:56,512

I mean, definitely.

140

00:08:56,732 --> 00:08:59,833

That's really one of the best ways for me

to learn.

141

00:09:01,298 --> 00:09:04,859

Having a deadline, first, I have to teach that

stuff.

142

00:09:04,859 --> 00:09:09,260

And then having a way of talking about the

topic, whether that's teaching or

143

00:09:09,260 --> 00:09:16,642

presenting, is really one of the most

efficient ways of learning, at least to

144

00:09:16,642 --> 00:09:16,742

me.

145

00:09:16,742 --> 00:09:23,684

Because I don't have the personal

discipline to just learn for the sake of

146

00:09:23,684 --> 00:09:24,584

learning.

147

00:09:25,004 --> 00:09:27,425

That doesn't really happen for me.

148

00:09:29,674 --> 00:09:33,855

Now, we might not be as off topic as you

think, because I do have a little bit of

149

00:09:33,855 --> 00:09:35,935

causal inference in the new book.

150

00:09:36,435 --> 00:09:36,956

Oh, yeah?

151

00:09:36,956 --> 00:09:39,296

I've got a section that is about collider

bias.

152

00:09:39,296 --> 00:09:44,258

And this is an example where if you go

back and read the literature in

153

00:09:44,258 --> 00:09:47,619

epidemiology, there is so much confusion.

154

00:09:47,659 --> 00:09:52,000

The low birth weight paradox was

one of the first examples, and then the

155

00:09:52,000 --> 00:09:55,281

obesity paradox and the twin paradox.

156

00:09:55,281 --> 00:09:57,901

And they're all baffling.

157

00:09:57,990 --> 00:10:03,955

if you think of it in terms of regression

or statistical association, and then once

158

00:10:03,955 --> 00:10:10,060

you draw the causal diagram and figure out

that you have selected a sample based on a

159

00:10:10,060 --> 00:10:15,224

collider, the light bulb goes on and it's,

oh, of course, now I get it.

160

00:10:15,224 --> 00:10:16,765

This is not a paradox at all.

161

00:10:16,765 --> 00:10:19,607

This is just another form of sampling

bias.

162

00:10:26,686 --> 00:10:33,169

What's a collider for the, I was going to

say the students, for the listeners?

163

00:10:33,269 --> 00:10:42,634

And also then what does collider bias mean

and how do you get around that?

164

00:10:42,634 --> 00:10:46,557

Yeah, no, this was really interesting for

me to learn about as I was writing the

165

00:10:46,557 --> 00:10:47,357

book.

166

00:10:47,377 --> 00:10:51,299

And the example that I started with is the

low birth weight paradox.

167

00:10:51,379 --> 00:10:53,160

And this comes from the 1970s.

168

00:10:53,480 --> 00:10:55,586

It was a researcher in California

169

00:10:55,586 --> 00:11:00,848

who was studying low birth weight babies

and the effect of maternal smoking.

170

00:11:01,469 --> 00:11:08,413

And he found out that if the mother of a

newborn baby smoked, it is more likely to

171

00:11:08,413 --> 00:11:09,913

be low birth weight.

172

00:11:10,634 --> 00:11:17,157

And low birth weight babies have health

effects, including higher mortality.

173

00:11:17,838 --> 00:11:23,541

But what he found is that if you zoom in

and you just look at the low birth weight

174

00:11:23,541 --> 00:11:24,541

babies,

175

00:11:24,918 --> 00:11:30,781

you would find that the ones whose mother

smoked had better health outcomes,

176

00:11:31,302 --> 00:11:33,723

including lower mortality.

177

00:11:34,664 --> 00:11:38,547

And this was a time, this was in the 70s,

when people knew that cigarette smoking

178

00:11:38,547 --> 00:11:42,950

was bad for you, but it was still, you

know, public health campaigns were

179

00:11:42,950 --> 00:11:46,252

encouraging people to stop smoking, and

especially mothers.

180

00:11:46,953 --> 00:11:51,696

And then this article came out that said

that smoking appears to have some

181

00:11:51,696 --> 00:11:53,657

protective effect.

182

00:11:53,886 --> 00:11:55,727

for low birth weight babies.

183

00:11:56,228 --> 00:12:02,013

That in the normal range of birth weight,

it appears to be minimally harmful and for

184

00:12:02,013 --> 00:12:03,794

low birth weight babies, it's good.

185

00:12:04,095 --> 00:12:10,420

And so, he didn't quite recommend maternal

smoking but he almost did.

186

00:12:11,401 --> 00:12:14,004

And there was a lot of confusion.

187

00:12:14,004 --> 00:12:19,469

It was, I think it wasn't until the 80s

that somebody explained it in terms of

188

00:12:19,469 --> 00:12:20,709

causal inference.

189

00:12:21,158 --> 00:12:26,960

And then finally in the 90s, someone

was able to show using data that not only

190

00:12:26,960 --> 00:12:30,942

was this a mistake, but you could put the

numbers on it and say, look, this is

191

00:12:30,942 --> 00:12:32,362

exactly what's going on.

192

00:12:32,362 --> 00:12:38,125

If you correct for the bias, you will find

that not surprisingly smoking is bad

193

00:12:38,125 --> 00:12:41,466

across the board, even for low birth

weight babies.

194

00:12:41,926 --> 00:12:47,769

So the explanation is that there's a

collider and a collider in a causal graph

195

00:12:47,769 --> 00:12:49,762

means that there are two arrows

196

00:12:49,762 --> 00:12:55,563

coming into the same box, meaning two

potential causes for the same thing.

197

00:12:55,823 --> 00:12:57,644

So in this case, it's low birth weight.

198

00:12:57,944 --> 00:13:01,885

And here's what I think is the simplest

explanation of the low birth weight

199

00:13:01,885 --> 00:13:07,287

paradox, which is there are two things

that will cause a baby to be low birth

200

00:13:07,287 --> 00:13:13,308

weight, either the mother smoked or

there's something else going on like a

201

00:13:13,308 --> 00:13:14,489

birth defect.

202

00:13:17,302 --> 00:13:20,524

The maternal smoking is relatively benign.

203

00:13:20,824 --> 00:13:28,169

It's not good for you, but it's not quite

as bad as the other effects.

204

00:13:28,169 --> 00:13:30,551

So you could imagine being a doctor.

205

00:13:30,551 --> 00:13:32,752

You've been called in to treat a patient.

206

00:13:32,913 --> 00:13:35,694

The baby is born at a low birth weight.

207

00:13:36,075 --> 00:13:37,256

And now you're worried.

208

00:13:37,256 --> 00:13:40,318

You're saying to yourself, oh, this might

be a birth defect.

209

00:13:40,658 --> 00:13:42,960

And then you find out that the mother

smoked.

210

00:13:43,140 --> 00:13:45,301

You would be relieved.

211

00:13:45,398 --> 00:13:50,619

because that explains the low birth weight

and it decreases the probability that

212

00:13:50,619 --> 00:13:54,300

there's something else worse going on.

213

00:13:54,300 --> 00:13:55,960

So that's the effect.

214

00:13:55,960 --> 00:14:00,081

And again, it's caused because when they

selected the sample, they selected low

215

00:14:00,081 --> 00:14:01,422

birth weight babies.

216

00:14:01,742 --> 00:14:04,463

So in that sense, they selected on a

collider.

217

00:14:05,003 --> 00:14:06,843

And that's where everything goes wrong.
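
Here is a minimal Python sketch of the mechanism Allen describes, with entirely invented numbers: smoking and a much more dangerous birth defect are the two causes flowing into the low-birth-weight collider, and selecting babies on that collider makes smoking look protective.

import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
smoker = rng.random(n) < 0.4
defect = rng.random(n) < 0.02                        # rarer, but far more serious

# either cause can produce low birth weight (the collider)
low_bw = (smoker & (rng.random(n) < 0.3)) | (defect & (rng.random(n) < 0.9))

# mortality: the defect is much worse than smoking
p_death = 0.01 + 0.01 * smoker + 0.20 * defect
died = rng.random(n) < p_death

print("all babies      :", died[smoker].mean(), "vs", died[~smoker].mean())
print("low-birth-weight:", died[low_bw & smoker].mean(), "vs",
      died[low_bw & ~smoker].mean())
# In the full sample smoking is clearly worse; in the selected low-birth-weight
# sample it looks protective, because non-smoking low-birth-weight babies are
# mostly the ones with the defect.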

218

00:14:07,003 --> 00:14:07,703

Yeah.

219

00:14:07,704 --> 00:14:14,065

And it's like, I find that really

interesting and fascinating because in a

220

00:14:14,065 --> 00:14:14,705

way,

221

00:14:15,090 --> 00:14:18,732

it comes down to a bias in the sample in a

way here.

222

00:14:20,033 --> 00:14:26,817

But also, like, so here, in a way, you

don't really have any way of

223

00:14:28,962 --> 00:14:34,604

doing the analysis without going back to

the data collecting step.

224

00:14:35,145 --> 00:14:43,269

But also, colliders are very tricky in the

sense that if you so you have that path,

225

00:14:43,269 --> 00:14:44,130

as you were saying.

226

00:14:44,130 --> 00:14:50,593

So the collider is a common effect of two

causes.

227

00:14:51,194 --> 00:14:55,736

And the two causes can be completely

unrelated.

228

00:14:58,874 --> 00:15:04,776

As is often said, if you control for the

collider, then it's going to open the path

229

00:15:04,796 --> 00:15:10,699

and it's going to allow information to

flow from, let's say, X to Y and C is the

230

00:15:10,699 --> 00:15:11,699

collider.

231

00:15:11,859 --> 00:15:14,560

X is not related to Y in the causal graph.

232

00:15:14,560 --> 00:15:19,722

But if you control for C, then X is going

to become related to Y.

233

00:15:20,243 --> 00:15:21,963

That's really the tricky thing.

234

00:15:21,963 --> 00:15:26,185

That's why we're telling people, do not

just throw.

235

00:15:27,426 --> 00:15:32,469

predictors at random in your models when

you're doing the linear regression, for

236

00:15:32,469 --> 00:15:32,849

instance.

237

00:15:32,849 --> 00:15:38,113

Because if there is a collider in your

graph, and very probably there is one at

238

00:15:38,113 --> 00:15:44,237

some point, if it's a complicated enough

situation, then you're going to have

239

00:15:44,237 --> 00:15:48,320

spurious statistical correlations which

are not causal.

240

00:15:48,320 --> 00:15:52,002

But you've created that by basically

opening the collider path.

241

00:15:52,002 --> 00:15:57,050

So the good news is that the path is

closed if you like.

242

00:15:57,050 --> 00:15:58,090

naturally.

243

00:15:58,130 --> 00:16:02,752

So if you don't control for that, if you

don't add that in your model, you're good.

244

00:16:02,973 --> 00:16:07,595

But if you start adding just predictors

all over the place, you're very probably

245

00:16:07,595 --> 00:16:10,617

going to create collider biases like that.

246

00:16:10,617 --> 00:16:16,960

So that's why it's not as easy when you

have a confound, which is kind of the

247

00:16:16,960 --> 00:16:18,581

opposite situation.

248

00:16:18,581 --> 00:16:21,683

So let's say now C is the common cause of

X and Y.

249

00:16:21,683 --> 00:16:26,805

Well, then if you have a confound, you

want to block the path.

250

00:16:26,978 --> 00:16:32,661

the path that's going from X to Y through

C to see if there is a path, direct path

251

00:16:32,661 --> 00:16:34,322

from X to Y.

252

00:16:34,402 --> 00:16:37,664

Then you want to control for C, but if

it's a collider, you don't.

253

00:16:37,664 --> 00:16:40,406

So that's why, like, don't control for

everything.

254

00:16:40,406 --> 00:16:46,309

Don't put predictors all over the place

because that can be very tricky.
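
A quick sketch of the X, Y, C situation just described, on synthetic data: X and Y are independent, C is their common effect, and "controlling for" C in a regression makes X look associated with Y.

import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x = rng.normal(size=n)
y = rng.normal(size=n)              # x and y are unrelated in the causal graph
c = x + y + rng.normal(size=n)      # c is a collider: x -> c <- y

def ols_coefs(y, *predictors):
    # ordinary least squares with an intercept; returns the fitted coefficients
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

print(ols_coefs(y, x))     # coefficient on x is ~0: no association, correctly
print(ols_coefs(y, x, c))  # coefficient on x is now clearly negative: spurious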

255

00:16:47,690 --> 00:16:52,593

Yeah, and I think that's a really valuable

insight because when people start playing

256

00:16:52,593 --> 00:16:53,714

with regression,

257

00:16:53,714 --> 00:16:57,635

Sure, they just, you know, you add more to

the model, more is better.

258

00:16:57,855 --> 00:17:04,578

And yes, once you think about colliders

and mediators, and I think this vocabulary

259

00:17:04,578 --> 00:17:10,061

is super helpful for thinking about these

problems, you know, understanding what

260

00:17:10,061 --> 00:17:13,342

should and shouldn't be in your model if

what you're trying to do is causal.

261

00:17:13,342 --> 00:17:17,544

Yeah.

262

00:17:17,544 --> 00:17:20,362

And that's also definitely something I...

263

00:17:20,362 --> 00:17:20,902

can see a lot.

264

00:17:20,902 --> 00:17:23,003

It depends on where the students are

coming from.

265

00:17:23,003 --> 00:17:27,665

But yeah, where it's like they show me a

regression with, I don't know, 10

266

00:17:27,665 --> 00:17:29,065

predictors already.

267

00:17:29,785 --> 00:17:30,766

And then I can't.

268

00:17:30,766 --> 00:17:32,687

I swear the model doesn't really make

sense.

269

00:17:32,687 --> 00:17:36,708

I'm like, wait, did you try with fewer

predictors?

270

00:17:36,708 --> 00:17:41,490

Like, did you first do the model with

just an intercept and then build up from

271

00:17:41,490 --> 00:17:42,210

that?

272

00:17:42,831 --> 00:17:46,552

And no, often it turns out it's the first

version of the model with 10 predictors.

273

00:17:46,552 --> 00:17:48,053

So you're like, oh, wait.

274

00:17:52,499 --> 00:17:56,241

Let's look at that again from another

perspective, from a more minimalist

275

00:17:56,241 --> 00:17:58,722

perspective.

276

00:17:58,722 --> 00:18:00,103

But that's awesome.

277

00:18:00,543 --> 00:18:02,944

I really love that you're talking about

that in the book.

278

00:18:04,365 --> 00:18:08,488

I recommend people then looking at it

because it's not only very interesting,

279

00:18:08,488 --> 00:18:16,492

it's also very important if you're looking

into, well, are my models telling me

280

00:18:16,492 --> 00:18:17,772

something valuable?

281

00:18:17,793 --> 00:18:18,573

Are they?

282

00:18:20,066 --> 00:18:24,609

helping me understand what's going on or

is it just something that helps me predict

283

00:18:24,609 --> 00:18:25,070

better?

284

00:18:25,070 --> 00:18:28,933

But other than that, I cannot say a lot.

285

00:18:28,933 --> 00:18:32,596

So definitely listeners refer to that.

286

00:18:32,596 --> 00:18:39,541

And actually, the editor was really

kind to me and Allen because, well, first

287

00:18:39,541 --> 00:18:43,324

10 of the patrons are going to get the

book for free at random.

288

00:18:43,324 --> 00:18:45,406

So thank you so much.

289

00:18:50,542 --> 00:18:56,366

And second, with the link that you have in the show notes, you

can buy the book at a 30% discount.

290

00:18:56,406 --> 00:19:00,309

So, even if you don't win, you will win.

291

00:19:00,309 --> 00:19:06,873

So, definitely go there and buy the book,

or if you're a patron, enter the random

292

00:19:06,873 --> 00:19:12,557

draw, and we'll see what randomness has in

store for you.

293

00:19:13,418 --> 00:19:18,314

And actually, so we already started diving

into one of your chapters, but

294

00:19:18,314 --> 00:19:26,099

Maybe let's take a step back and can you

provide an overview of your new book

295

00:19:26,099 --> 00:19:32,884

that's called Probably Overthinking It and

what inspired you to write it?

296

00:19:32,884 --> 00:19:37,667

Yeah, well, Probably Overthinking It is

the name of my blog from more than 10

297

00:19:37,667 --> 00:19:38,787

years ago.

298

00:19:38,888 --> 00:19:43,091

And so one of the things that got this

project started was kind of a greatest

299

00:19:43,091 --> 00:19:44,932

hits from the blog.

300

00:19:44,932 --> 00:19:47,358

There were a number of articles that

had...

301

00:19:47,358 --> 00:19:50,860

either got a lot of attention or where I

thought there was something really

302

00:19:50,860 --> 00:19:56,944

important there that I wanted to collect

and present a little bit more completely

303

00:19:56,944 --> 00:19:59,226

and more carefully in a book.

304

00:19:59,226 --> 00:20:01,647

So that's what started it.

305

00:20:01,707 --> 00:20:08,272

And it was partly like a collection of

puzzles, a collection of paradoxes, the

306

00:20:08,272 --> 00:20:10,914

strange things that we see in data.

307

00:20:10,914 --> 00:20:16,597

So like collider bias, for which Berkson's

paradox is the other name.

308

00:20:17,302 --> 00:20:19,223

There's Simpson's paradox.

309

00:20:19,223 --> 00:20:21,705

There's one paradox after another.

310

00:20:22,566 --> 00:20:25,769

And when I started, I thought that

was what the book was going to be about.

311

00:20:25,769 --> 00:20:28,251

It was, here are all these interesting

puzzles.

312

00:20:28,251 --> 00:20:29,572

Let's think about them.

313

00:20:29,913 --> 00:20:35,958

But then what I found in every chapter was

that there was at least one example that

314

00:20:35,958 --> 00:20:41,583

bubbled up where these paradoxes were

having real effects in the world.

315

00:20:41,583 --> 00:20:44,385

People were getting things genuinely

wrong.

316

00:20:44,445 --> 00:20:44,938

And.

317

00:20:44,938 --> 00:20:51,183

those errors had consequences for public

health, for criminal justice, for all

318

00:20:51,183 --> 00:20:54,566

kinds of real things that affect real

lives.

319

00:20:54,886 --> 00:21:01,672

And that's where the book kind of took a

turn toward not so much the paradox

320

00:21:01,672 --> 00:21:08,458

because it's fun to think about, although

it is, but the places where we use data to

321

00:21:08,458 --> 00:21:12,341

make better decisions and get better

outcomes.

322

00:21:12,766 --> 00:21:17,291

And then a little bit of the warnings

about what can go wrong when we make some

323

00:21:17,291 --> 00:21:19,514

of these errors.

324

00:21:19,514 --> 00:21:25,702

And most of them boil down, when you think

about it, to one form of sampling bias or

325

00:21:25,702 --> 00:21:26,063

another.

326

00:21:26,063 --> 00:21:31,068

That should be the subtitle of this book

is like 12 chapters of sampling bias.

327

00:21:33,770 --> 00:21:40,893

Yeah, I mean, that's really interesting to

see that a lot of problems come from

328

00:21:40,893 --> 00:21:46,175

sampling biases, which is almost

disappointing in the sense that it sounds

329

00:21:46,175 --> 00:21:47,375

really simple.

330

00:21:48,476 --> 00:21:54,398

But I mean, as we can see in your book,

it's maybe easy to understand the problem,

331

00:21:54,398 --> 00:21:58,500

but then solving it is not necessarily

easy.

332

00:21:58,500 --> 00:22:00,521

So that's one thing.

333

00:22:01,221 --> 00:22:02,641

And then I'm wondering.

334

00:22:04,098 --> 00:22:11,203

How would you say Probably Overthinking It

helps the readers differentiate between

335

00:22:11,564 --> 00:22:15,167

the right and wrong ways of looking at

statistical data?

336

00:22:17,129 --> 00:22:21,232

Yeah, I think there are really two

messages in this book.

337

00:22:21,572 --> 00:22:30,059

One of them is the optimistic view that we

can use data to answer questions and

338

00:22:30,059 --> 00:22:33,190

settle debates and make better decisions.

339

00:22:33,190 --> 00:22:35,230

and we will be better off if we do.

340

00:22:35,691 --> 00:22:39,152

And most of the time, it's not super hard.

341

00:22:39,372 --> 00:22:44,994

If you can find or collect the right data,

most of the time you don't need fancy

342

00:22:44,994 --> 00:22:50,336

statistics to answer the questions you

care about with the right data.

343

00:22:50,396 --> 00:22:55,579

And usually a good data visualization, you

can show what you wanna show in a

344

00:22:55,579 --> 00:22:57,279

compelling way.

345

00:22:57,279 --> 00:22:58,660

So that's the good news.

346

00:22:59,120 --> 00:23:01,821

And then the bad news is these warnings.

347

00:23:03,518 --> 00:23:08,740

I think the key to these things is to

think about them and to see a lot of

348

00:23:08,740 --> 00:23:09,860

examples.

349

00:23:10,640 --> 00:23:13,742

And I'll take like Simpson's paradox as an

example.

350

00:23:13,742 --> 00:23:18,604

If you take an intro stats class, you

might see one or two examples.

351

00:23:18,604 --> 00:23:22,945

And I think you come away thinking that

it's just weird, like, oh, those were

352

00:23:22,945 --> 00:23:27,127

really confusing and I'm not sure I really

understand what's happening.

353

00:23:33,866 --> 00:23:38,108

where at some point you start thinking

about Simpson's paradox and you just

354

00:23:38,108 --> 00:23:41,870

realize that there's no paradox there.

355

00:23:41,950 --> 00:23:46,633

It's just a thing that can happen because

why not?

356

00:23:46,633 --> 00:23:53,396

If you have different groups and you plot

a line that connects the two groups, that

357

00:23:53,396 --> 00:23:55,137

line might have one slope.

358

00:23:55,617 --> 00:24:00,140

And then when you zoom in and look at one

of those groups in isolation and plot a

359

00:24:00,140 --> 00:24:03,381

line through it, there's just no reason.

360

00:24:03,618 --> 00:24:08,580

that second line within the group should

have the same slope as the line that

361

00:24:08,580 --> 00:24:10,460

connects the different groups.

362

00:24:10,961 --> 00:24:15,342

And so I think that's an example where

when you see a lot of examples, it changes

363

00:24:15,342 --> 00:24:17,423

the way you think about the thing.

364

00:24:17,843 --> 00:24:23,566

Not from, oh, this is a weird, confusing

thing to, well, actually, it's not a thing

365

00:24:23,566 --> 00:24:24,546

at all.

366

00:24:24,946 --> 00:24:28,428

The only thing that was confusing is that

my expectation was wrong.
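
A small synthetic example of the slopes Allen is talking about: both groups have a within-group slope of about +1, but the pooled line that runs from one group to the other slopes downward. All numbers here are made up.

import numpy as np

rng = np.random.default_rng(2)

def make_group(n, x_center, y_offset):
    x = rng.normal(x_center, 1.0, size=n)
    y = x + y_offset + rng.normal(0.0, 0.5, size=n)   # within-group slope = +1
    return x, y

xa, ya = make_group(500, 0.0, 5.0)     # group A: low x, high baseline
xb, yb = make_group(500, 5.0, -5.0)    # group B: high x, low baseline

slope = lambda x, y: np.polyfit(x, y, 1)[0]
print(slope(xa, ya), slope(xb, yb))    # both roughly +1
print(slope(np.concatenate([xa, xb]),
            np.concatenate([ya, yb]))) # pooled slope is negative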

367

00:24:30,189 --> 00:24:30,789

Yeah, true.

368

00:24:30,789 --> 00:24:31,329

Yeah, I love that.

369

00:24:31,329 --> 00:24:32,349

I agree.

370

00:24:32,850 --> 00:24:40,436

I've always found it a bit weird to call all

these phenomena paradoxes in a way.

371

00:24:40,636 --> 00:24:50,224

Because as you're saying, it's more prior

expectation that makes it a paradox.

372

00:24:50,284 --> 00:25:00,573

Whereas, why should nature obey our simple

minds and priors?

373

00:25:01,778 --> 00:25:04,900

there is nothing that says it should.

374

00:25:04,900 --> 00:25:10,404

And so most of the time, it's just that,

well, reality is not the way we thought it

375

00:25:10,404 --> 00:25:11,184

was.

376

00:25:12,005 --> 00:25:12,625

That's OK.

377

00:25:12,625 --> 00:25:16,988

And I mean, in a way, thankfully,

otherwise, it would be quite boring.

378

00:25:17,088 --> 00:25:25,614

But yeah, that's a bit like when data is

dispersed a lot, there is a lot of

379

00:25:25,614 --> 00:25:26,775

variability in the data.

380

00:25:26,775 --> 00:25:31,277

And then we tend to say data is over

dispersed.

381

00:25:31,838 --> 00:25:33,419

which I always find weird.

382

00:25:33,439 --> 00:25:35,820

It's like, well, it's not the data that's

over dispersed.

383

00:25:35,820 --> 00:25:38,081

It's the model that's under dispersed.

384

00:25:38,682 --> 00:25:40,343

The data doesn't have to do anything.

385

00:25:40,343 --> 00:25:42,924

It's the model that has to adapt to the

data.

386

00:25:43,465 --> 00:25:45,025

So just adapt the model.

387

00:25:45,286 --> 00:25:53,350

But yeah, it's a fun way of phrasing it,

whereas it's like it's the data's fault.

388

00:25:53,350 --> 00:25:54,431

But no, not really.

389

00:25:54,431 --> 00:25:57,793

It's just, well, it's just a lot of

variation.

390

00:25:57,793 --> 00:25:59,158

And.

391

00:25:59,158 --> 00:26:03,760

And that made me think actually the

Simpson paradox that also made me think

392

00:26:03,760 --> 00:26:11,565

about, did you see that recent paper by, I

mean from this year, so it's quite recent

393

00:26:11,565 --> 00:26:16,287

for a paper from Andrew Gelman, Jessica

Hullman, and Lauren Kennedy about the

394

00:26:16,287 --> 00:26:18,589

causal quartets?

395

00:26:18,589 --> 00:26:21,130

No, I missed it.

396

00:26:21,130 --> 00:26:24,832

Awesome, well I'll send that your way and I'll

put that in the show notes.

397

00:26:24,832 --> 00:26:27,173

But basically the idea is,

398

00:26:27,494 --> 00:26:32,235

taking Simpson's paradox, but instead of

looking at it from a correlation

399

00:26:32,235 --> 00:26:35,516

perspective, looking at it from a causal

perspective.

400

00:26:35,756 --> 00:26:37,917

And so that's basically the same thing.

401

00:26:37,917 --> 00:26:42,518

It's different ways to get the same

average treatment effect.

402

00:26:42,658 --> 00:26:48,320

So, you know, like Simpson's paradox where

you have four different data points and

403

00:26:48,320 --> 00:26:54,821

you get the same correlation between them,

well, here you have four different

404

00:26:55,138 --> 00:26:59,098

causal structures that give you different

data points.

405

00:26:59,639 --> 00:27:03,700

But if you just look at the average

treatment effect, you will think that it's

406

00:27:03,700 --> 00:27:06,661

the same for the four, whereas it's not.

407

00:27:06,661 --> 00:27:11,862

You know, so the point is also, well,

that's why you should not only look at the

408

00:27:11,862 --> 00:27:13,583

average treatment effect, right?

409

00:27:13,583 --> 00:27:18,024

Look at the whole distribution of

treatment effects, because if you just

410

00:27:18,024 --> 00:27:21,205

look at the average, you might be in a

situation where the population is really

411

00:27:21,205 --> 00:27:24,006

not diverse and then yeah, the average

treatment

412

00:27:24,006 --> 00:27:25,807

effect is something representative.

413

00:27:25,807 --> 00:27:35,474

But what if you're in a very dispersed

population and the treatment effects can

414

00:27:35,474 --> 00:27:39,837

be very negative or very positive, but

then if you look at the averages, it looks

415

00:27:39,837 --> 00:27:42,039

like there is no average treatment effect.

416

00:27:42,039 --> 00:27:44,881

So then you could conclude that there is

no treatment effect, whereas there is

417

00:27:44,881 --> 00:27:48,563

actually a big treatment effect just that

when you look at the average, it cancels

418

00:27:48,563 --> 00:27:49,244

out.

419

00:27:49,724 --> 00:27:51,825

So yeah, like the...

420

00:27:52,146 --> 00:27:55,106

That's the main idea of

the paper.

421

00:27:55,447 --> 00:28:01,708

And that's, I mean, I think this will be

completely trivial to you, but I think

422

00:28:01,708 --> 00:28:09,690

it's a good way of teaching this, where

you can, if you just look at the average,

423

00:28:10,851 --> 00:28:13,512

you can get bitten by that later on.

424

00:28:13,652 --> 00:28:15,952

Because basically, if you average, you

summarize.

425

00:28:15,952 --> 00:28:18,573

And if you summarize, you're losing some

information somewhere.

426

00:28:18,573 --> 00:28:20,010

So you're going to

427

00:28:20,010 --> 00:28:24,393

have to cut some dimension of

information to average naturally.

428

00:28:24,393 --> 00:28:27,335

So if you do that, it comes at a cost.

429

00:28:28,236 --> 00:28:32,239

And the paper does a good job at showing

that.
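
To make the averaging point concrete, here is a toy version with made-up numbers: a treatment that strongly helps half the population and strongly harms the other half has an average treatment effect of roughly zero.

import numpy as np

rng = np.random.default_rng(3)
n = 10_000
responder = rng.random(n) < 0.5               # half the population benefits...
effect = np.where(responder, +2.0, -2.0)      # ...and the other half is harmed

print(effect.mean())                          # average treatment effect ~ 0
print(effect[responder].mean(), effect[~responder].mean())   # +2.0 and -2.0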

430

00:28:32,500 --> 00:28:36,503

Yes, that's really interesting because

maybe coincidentally, this is something

431

00:28:36,503 --> 00:28:43,850

that I was thinking about recently,

looking at the evidence for pharmaceutical

432

00:28:43,850 --> 00:28:45,491

treatments for depression.

433

00:28:45,831 --> 00:28:48,733

There was a meta-analysis a few months

ago.

434

00:28:48,982 --> 00:28:55,245

that really showed quite modest treatment

effects, that the average is not great.

435

00:28:56,286 --> 00:29:02,850

And the conclusion that the paper drew was

that the medications were effective for

436

00:29:02,850 --> 00:29:08,074

some people and they said something like

15%, which is also not great, but

437

00:29:08,074 --> 00:29:14,617

effective for 15% and ineffective or

minimally effective for others.

438

00:29:14,770 --> 00:29:18,691

And I was actually surprised by that

result because it was not clear to me how

439

00:29:18,691 --> 00:29:25,234

they were distinguishing between having a

very modest effect for everybody or a

440

00:29:25,234 --> 00:29:30,936

large effect for a minority that was

averaged in with a zero effect for

441

00:29:30,936 --> 00:29:34,818

everybody else, or even the example that

you mentioned, which is that you could

442

00:29:34,818 --> 00:29:39,340

have something that's highly effective for

one group and detrimental for another

443

00:29:39,340 --> 00:29:40,120

group.

444

00:29:40,120 --> 00:29:43,641

And exactly as you said, if you're only

looking at the mean, you can't tell the

445

00:29:43,641 --> 00:29:44,521

difference.

446

00:29:45,042 --> 00:29:50,526

But what I don't know and I still want to

find out is in this study, how did they

447

00:29:50,526 --> 00:29:56,411

draw the conclusion that they drew, which

is they specified that it's effective for

448

00:29:56,411 --> 00:29:58,813

15% and not for others.

449

00:29:59,033 --> 00:30:04,298

So yeah, I'll definitely read that paper

and see if I can connect it with that

450

00:30:04,298 --> 00:30:05,299

research I was looking at.

451

00:30:05,299 --> 00:30:05,899

Yeah.

452

00:30:07,080 --> 00:30:12,324

Yeah, I'll send it to you and I already

put it in the show notes for people who

453

00:30:12,324 --> 00:30:13,505

want to dig deeper.

454

00:30:13,958 --> 00:30:21,922

And I mean, that's a very common pitfall,

especially in the social sciences, where

455

00:30:22,302 --> 00:30:28,125

doing big experiments with lots of

subjects is hard and very costly.

456

00:30:28,666 --> 00:30:31,827

And so often you're doing inferences on

very small groups.

457

00:30:32,247 --> 00:30:36,970

And that's even more complicated to just

look at the average treatment effect.

458

00:30:36,970 --> 00:30:39,791

It can be very problematic.

459

00:30:39,792 --> 00:30:42,933

And interestingly, I talked about that.

460

00:30:44,082 --> 00:30:53,567

I mentioned that paper first in episode 89

with Eric Trexler, who works on the

461

00:30:53,567 --> 00:30:56,669

science of nutrition and exercise,

basically.

462

00:30:56,669 --> 00:31:03,773

So in this field, especially, it's very

hard to have big samples when they do

463

00:31:04,134 --> 00:31:05,134

experiments.

464

00:31:05,134 --> 00:31:09,337

And so most of the time, they have 10, 20

people per group.

465

00:31:11,638 --> 00:31:16,139

It's like each time I read that literature,

first they don't use Bayesian stats a lot.

466

00:31:16,139 --> 00:31:23,201

And I'm like, with such low sample sizes,

yeah, you should use more,

467

00:31:23,201 --> 00:31:28,522

use brms, use Bambi, if you don't really

know how to do the models, but really, you

468

00:31:28,522 --> 00:31:29,283

should.

469

00:31:30,863 --> 00:31:36,285

And also, if you do that, and then you

also only look at the average treatment

470

00:31:36,285 --> 00:31:37,145

effects.

471

00:31:38,065 --> 00:31:39,254

I'm guessing you have.

472

00:31:39,254 --> 00:31:44,216

big uncertainties on the conclusions you

can draw.

473

00:31:44,216 --> 00:31:49,038

So yeah, I will put that episode also in

the show notes for people who when I

474

00:31:49,038 --> 00:31:53,479

referred to it, that was a very

interesting episode where we talked about

475

00:31:53,739 --> 00:31:59,342

exercise science, nutrition, how that

relates to weight management and how from

476

00:31:59,342 --> 00:32:06,745

an anthropological perspective, also how

the body reacts to these effects.

477

00:32:09,402 --> 00:32:13,784

It mostly will fight you when you're trying

to lose a lot of weight, but doesn't

478

00:32:13,784 --> 00:32:16,425

really fight you when you gain a lot of

weight.

479

00:32:16,805 --> 00:32:24,808

And that's also very interesting to know

about these things, especially with the

480

00:32:24,808 --> 00:32:31,111

rampant amount of obesity in Western

societies, where it's really concerning.

481

00:32:31,111 --> 00:32:37,873

And so this science helps us understand what's

going on and also how we can help

482

00:32:38,222 --> 00:32:46,306

people get onto trajectories that

are better for their health, which is the

483

00:32:46,306 --> 00:32:52,029

main point basically of that research.

484

00:32:53,170 --> 00:32:59,473

I'm also wondering, if your book, when you

wrote it, and especially now that you've

485

00:32:59,473 --> 00:33:07,797

written it, what would you say, what do

you see as the key takeaways for readers?

486

00:33:08,074 --> 00:33:17,377

And especially for readers who may not

have a strong background in statistics.

487

00:33:17,377 --> 00:33:25,139

Part of it is I hope that it's empowering

in the sense that people will feel like

488

00:33:25,459 --> 00:33:28,400

they can use data to answer questions.

489

00:33:28,740 --> 00:33:32,701

As I said before, it often doesn't require

fancy statistics.

490

00:33:32,941 --> 00:33:33,601

So...

491

00:33:34,222 --> 00:33:35,582

There are two parts of this, I think.

492

00:33:35,582 --> 00:33:40,525

And one part is as a consumer of data, you

don't have to be powerless.

493

00:33:40,525 --> 00:33:47,089

You can read data journalism and

understand the analysis that they did,

494

00:33:47,089 --> 00:33:54,132

interpret the figures and maintain an

appropriate level of skepticism.

495

00:33:54,773 --> 00:34:02,297

In my classes, I sometimes talk about

this, a skeptometer, where if you believe

496

00:34:02,297 --> 00:34:03,677

everything that you read,

497

00:34:04,054 --> 00:34:06,674

That is clearly a problem.

498

00:34:06,995 --> 00:34:11,717

But at the other extreme, I often

encounter students who have become so

499

00:34:11,717 --> 00:34:16,139

skeptical of everything that they read

that they just won't accept an answer to a

500

00:34:16,139 --> 00:34:17,919

question ever.

501

00:34:18,540 --> 00:34:21,141

Because there's always something wrong

with a study.

502

00:34:21,141 --> 00:34:26,963

You can always look at a statistical

argument and find a potential flaw.

503

00:34:27,403 --> 00:34:32,285

But that's not enough to just dismiss

everything that you read.

504

00:34:33,454 --> 00:34:37,375

If you think you have found a potential

flaw, there's still a lot of work to do to

505

00:34:37,375 --> 00:34:45,699

show that actually that flaw is big enough

to affect the outcome substantially.

506

00:34:45,919 --> 00:34:50,581

So I think one of my hopes is that people

will come away with a well-calibrated

507

00:34:50,581 --> 00:34:57,224

skeptometer, which is to look at things

carefully and think about the kinds of

508

00:34:57,224 --> 00:35:01,605

errors that there can be, but also take

the win.

509

00:35:01,906 --> 00:35:07,927

If we have the data and we come up with a

satisfactory answer, you can accept that

510

00:35:07,927 --> 00:35:11,108

question as provisionally answered.

511

00:35:11,628 --> 00:35:15,129

Of course, it's always possible that

something will come along later and show

512

00:35:15,129 --> 00:35:20,951

that we got it wrong, but provisionally,

we can use that answer to make good

513

00:35:20,951 --> 00:35:22,051

decisions.

514

00:35:22,152 --> 00:35:25,312

And by and large, we are better off.

515

00:35:25,853 --> 00:35:28,393

This is my argument for evidence and

reason.

516

00:35:28,930 --> 00:35:32,991

But by and large, if we make decisions

that are based on evidence and reason, we

517

00:35:32,991 --> 00:35:35,871

are better off than if we don't.

518

00:35:35,871 --> 00:35:36,912

Yeah, yeah.

519

00:35:37,392 --> 00:35:38,872

I mean, of course I agree with that.

520

00:35:38,872 --> 00:35:42,973

It's like preaching to the choir.

521

00:35:42,973 --> 00:35:44,694

It shouldn't be controversial.

522

00:35:44,694 --> 00:35:46,935

No, yeah, for sure.

523

00:35:46,935 --> 00:35:53,196

A difficulty I have though is how do you

explain to people why they should care?

524

00:35:53,716 --> 00:35:54,437

You know?

525

00:35:55,797 --> 00:35:56,937

Why do you think...

526

00:35:57,974 --> 00:36:03,155

we should care about even making decisions

based on data.

527

00:36:03,555 --> 00:36:05,896

Why is that even important?

528

00:36:05,896 --> 00:36:07,816

Because that's just more work.

529

00:36:08,636 --> 00:36:11,237

So why should people care?

530

00:36:12,558 --> 00:36:18,979

Well, that's where, as I said, in every

chapter, something bubbled up where I was

531

00:36:18,979 --> 00:36:23,160

a little bit surprised and said, this

thing that I thought was just kind of an

532

00:36:23,160 --> 00:36:25,801

academic puzzle actually matters.

533

00:36:25,801 --> 00:36:27,421

People are getting it wrong.

534

00:36:27,574 --> 00:36:28,734

because of this.

535

00:36:28,834 --> 00:36:33,195

And there are examples in the book,

several from public health, several from

536

00:36:33,195 --> 00:36:36,916

criminal justice, where we don't have a

choice about making decisions.

537

00:36:36,916 --> 00:36:39,036

We're making decisions all the time.

538

00:36:39,037 --> 00:36:42,157

The only choice is whether they're

informed or not.

539

00:36:43,098 --> 00:36:47,219

And so one of the example, actually,

Simpson's paradox is a nice example.

540

00:36:47,219 --> 00:36:50,039

Let me see if I remember this.

541

00:36:50,120 --> 00:36:54,761

It came from a journalist, and I

deliberately don't name him in the book

542

00:36:54,761 --> 00:36:57,541

because I just don't want to give him any

publicity at all.

543

00:36:57,666 --> 00:37:06,991

but the Atlantic magazine named him the

pandemic's wrongest man because he made a

544

00:37:06,991 --> 00:37:13,455

career out of committing statistical

errors and misleading people.

545

00:37:13,455 --> 00:37:17,937

And he actually features in two chapters

because he commits the base rate fallacy

546

00:37:17,937 --> 00:37:25,341

in one and then gets fooled by Simpson's

paradox in another.

547

00:37:25,806 --> 00:37:31,847

And if I remember right, in the Simpson's

paradox example, he looked at people who

548

00:37:31,847 --> 00:37:37,249

were vaccinated and compared them to

people who were not vaccinated and found

549

00:37:37,249 --> 00:37:45,012

that during a particular period of time in

the UK, the death rate was higher for

550

00:37:45,012 --> 00:37:48,093

people who were vaccinated.

551

00:37:48,093 --> 00:37:51,593

The death rate was lower for people who

had not been vaccinated.

552

00:37:52,394 --> 00:37:54,994

So on the face of it, okay, well, that's

surprising.

553

00:37:54,994 --> 00:37:57,035

Okay, that's something we need to explain.

554

00:37:57,455 --> 00:38:03,397

It turns out to be an example of Simpson's

paradox, which is the group that he was

555

00:38:03,397 --> 00:38:11,919

looking at was a very wide age range from

I think 15 to 89 or something like that.

556

00:38:12,820 --> 00:38:18,381

And at that point in time during the

pandemic, by and large, the older people

557

00:38:18,590 --> 00:38:23,252

had been vaccinated and younger people had

not, because that was the priority

558

00:38:23,312 --> 00:38:26,434

ordering when the vaccines came out.

559

00:38:26,434 --> 00:38:31,437

So in the group that he compared, the ones

who were vaccinated were substantially

560

00:38:31,437 --> 00:38:34,479

older than the ones who were unvaccinated.

561

00:38:35,079 --> 00:38:40,643

And the death rates, of course, were much

higher in older age groups.

562

00:38:40,643 --> 00:38:43,744

So that explained it.

563

00:38:46,450 --> 00:38:50,893

If you lumped the whole range of ages together into one group, you

saw one effect.

564

00:38:50,933 --> 00:38:56,338

And if you broke it up into small age

ranges, that effect reversed itself.

565

00:38:56,338 --> 00:38:59,160

So it was a Simpson's paradox.

566

00:38:59,861 --> 00:39:04,165

If you appropriately break people up by

age, you would find that in every single

567

00:39:04,165 --> 00:39:11,130

age group, death rates were lower among

the vaccinated, just as you would expect

568

00:39:11,130 --> 00:39:13,672

if the vaccine was safe and effective.
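
Here is the same reversal with invented counts, in the spirit of that UK example (these are not the real figures): within each age band the vaccinated have the lower death rate, but pooling the bands flips the comparison, because vaccination is concentrated in the older, higher-risk band.

groups = {
    # age band: (vaccinated n, vaccinated deaths, unvaccinated n, unvaccinated deaths)
    "15-49": (1_000, 1, 10_000, 20),
    "50-89": (10_000, 100, 1_000, 15),
}

totals = [0, 0, 0, 0]
for band, counts in groups.items():
    vn, vd, un, ud = counts
    print(band, "vaccinated:", vd / vn, "unvaccinated:", ud / un)
    totals = [t + c for t, c in zip(totals, counts)]

vn, vd, un, ud = totals
print("pooled", "vaccinated:", vd / vn, "unvaccinated:", ud / un)
# 15-49:  0.001 vs 0.002   (vaccinated lower)
# 50-89:  0.010 vs 0.015   (vaccinated lower)
# pooled: ~0.009 vs ~0.003 (vaccinated higher: Simpson's paradox)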

569

00:39:17,254 --> 00:39:24,236

And that's also where I feel like if you

start thinking about the causal graph, you

570

00:39:24,236 --> 00:39:27,797

know, and the causal structure, that's

also where that would definitely help.

571

00:39:27,797 --> 00:39:29,157

Because it's not that hard, right?

572

00:39:29,157 --> 00:39:30,897

The idea here is not hard.

573

00:39:30,897 --> 00:39:32,878

It's not even hard mathematically.

574

00:39:32,878 --> 00:39:36,659

I think anybody can understand it even if

they don't have a mathematical background.

575

00:39:37,719 --> 00:39:41,680

So yeah, it's mainly that.

576

00:39:41,680 --> 00:39:45,081

And I think the most important point is

that, yeah.

577

00:39:46,262 --> 00:39:51,663

matters because it affects decisions in

the real world.

578

00:39:52,123 --> 00:39:57,445

That thing has literally life and death

consequences.

579

00:39:59,345 --> 00:40:04,807

I'm glad you mentioned it because you do

discuss the base rate fallacy and its

580

00:40:04,807 --> 00:40:08,828

connection to Bayesian thinking in the

book, right?

581

00:40:15,426 --> 00:40:21,411

It starts with the example that everybody

uses, which is interpreting the results of

582

00:40:21,411 --> 00:40:22,732

a medical test.

583

00:40:23,072 --> 00:40:27,656

Because that's a case that's surprising

when you first hear about it and where

584

00:40:27,656 --> 00:40:32,920

Bayesian thinking clarifies the picture

completely.

585

00:40:33,021 --> 00:40:36,804

Once you get your head around it, it is

like these other examples.

586

00:40:37,725 --> 00:40:40,767

Not only gets explained, it stops being

surprising.

587

00:40:42,509 --> 00:40:43,389

And this I'll...

588

00:40:43,530 --> 00:40:47,332

Give the example, I'm sure this is

familiar to a lot of your listeners, but

589

00:40:47,332 --> 00:40:53,495

if you take a medical test, let's take a

COVID test as an example, and suppose that

590

00:40:53,495 --> 00:40:59,999

the test is accurate, 90% accurate, and

let's suppose that means both specificity

591

00:40:59,999 --> 00:41:01,360

and sensitivity.

592

00:41:01,360 --> 00:41:06,042

So if you have the condition, there's a

90% chance that you correctly get a

593

00:41:06,042 --> 00:41:07,203

positive test.

594

00:41:07,203 --> 00:41:11,125

If you don't have the condition, there's a

90% chance that you correctly get a

595

00:41:11,125 --> 00:41:12,225

negative test.

596

00:41:12,542 --> 00:41:17,324

And so now the question is, you take the

test, it comes back positive, what's the

597

00:41:17,324 --> 00:41:19,585

probability that you have the condition?

598

00:41:20,606 --> 00:41:26,209

And that's where people kind of jump onto

that accuracy statistic.

599

00:41:26,309 --> 00:41:30,872

And they think, well, the test is 90%

accurate, so there's a 90% chance that I

600

00:41:30,872 --> 00:41:33,473

have, let's say, COVID in this example.

601

00:41:33,893 --> 00:41:40,137

And that can be totally wrong, depending

on the base rate or, in Bayesian terms,

602

00:41:40,137 --> 00:41:41,537

depending on the prior.

603

00:41:42,006 --> 00:41:45,708

And here's where the Bayesian thinking

comes out, which is that different people

604

00:41:45,708 --> 00:41:49,210

are going to have very different priors in

this case.

605

00:41:49,490 --> 00:41:55,713

If you know that you were exposed to

somebody with COVID and three days later, you

606

00:41:55,713 --> 00:41:56,974

feel a scratchy throat.

607

00:41:56,974 --> 00:42:00,656

The next day you wake up with flu

symptoms.

608

00:42:00,656 --> 00:42:05,339

Before you even take a test, I'm going to

say there's at least a 50% chance that you

609

00:42:05,339 --> 00:42:07,300

have COVID, maybe higher.

610

00:42:07,300 --> 00:42:08,400

Could be a cold.

611

00:42:08,401 --> 00:42:09,801

So, you know, it's not 100%.

612

00:42:10,486 --> 00:42:11,866

So let's say it's 50-50.

613

00:42:13,047 --> 00:42:14,728

You take this COVID test.

614

00:42:14,789 --> 00:42:19,352

And let's say, again, 90% accuracy, which

is lower than the home test.

615

00:42:19,352 --> 00:42:20,773

So I'm being a little bit unfair here.

616

00:42:20,773 --> 00:42:21,953

But let's say 90%.

617

00:42:23,375 --> 00:42:25,156

Your prior was 50-50.

618

00:42:25,916 --> 00:42:28,738

The likelihood ratio is about 9 to 1.

619

00:42:29,279 --> 00:42:34,182

And so your posterior belief is about 9 to

1, which is roughly 90%.

620

00:42:34,182 --> 00:42:38,505

So quite likely that test is correct,

621

00:42:39,518 --> 00:42:41,198

and you do, in this example, have COVID.

622

00:42:42,019 --> 00:42:48,301

But the flip side is, let's say you're in

New Zealand, which has a very low rate of

623

00:42:48,301 --> 00:42:49,642

COVID infection.

624

00:42:50,062 --> 00:42:51,162

You haven't been exposed.

625

00:42:51,162 --> 00:42:55,424

You've been working from home for a week,

and you have no symptoms at all.

626

00:42:55,424 --> 00:42:57,305

You feel totally fine.

627

00:42:58,646 --> 00:42:59,906

What's your base rate there?

628

00:42:59,906 --> 00:43:03,287

What's the probability that you

miraculously have COVID?

629

00:43:03,768 --> 00:43:06,509

1 in 1,000 at most, probably lower.

630

00:43:07,149 --> 00:43:08,269

And so if you.

631

00:43:08,406 --> 00:43:13,927

took a test and it came back positive,

it's still probably only about one in a

632

00:43:13,927 --> 00:43:21,470

hundred that you actually have COVID and a

99% chance that that's a false positive.

633

00:43:21,470 --> 00:43:23,410

So that's, you know, as I said, that's the

usual example.

634

00:43:23,410 --> 00:43:28,612

It's probably familiar, but it's a case

where if you neglect the prior, if you

635

00:43:28,612 --> 00:43:33,833

neglect the base rate, you can be not just

a little bit wrong, but wrong by orders of

636

00:43:33,833 --> 00:43:34,853

magnitude.
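
To make that arithmetic concrete, here is a minimal sketch of the odds form of Bayes' rule in Python, using the numbers from the conversation (90% sensitivity and specificity, a 50-50 prior in the first scenario, a 1-in-1,000 prior in the second); the function name is just for illustration.

    # Posterior odds = prior odds * likelihood ratio.
    def posterior_prob(prior_prob, sensitivity=0.90, specificity=0.90):
        prior_odds = prior_prob / (1 - prior_prob)
        likelihood_ratio = sensitivity / (1 - specificity)  # P(+|sick) / P(+|well) = 9
        posterior_odds = prior_odds * likelihood_ratio
        return posterior_odds / (1 + posterior_odds)

    print(posterior_prob(0.5))    # exposed and symptomatic: ~0.90
    print(posterior_prob(0.001))  # no exposure, no symptoms: ~0.009, so ~99% chance of a false positive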

637

00:43:36,798 --> 00:43:38,318

Yeah, exactly.

638

00:43:38,878 --> 00:43:45,820

And it is a classical example for us in

the stats world, but I think it's very

639

00:43:45,820 --> 00:43:51,081

effective for non-stats people because

it also speaks to them.

640

00:43:52,022 --> 00:43:58,303

And also, the gut reaction to a

positive test is so geared towards

641

00:43:58,303 --> 00:44:06,325

thinking you do have the disease that I

think that's also why

642

00:44:06,866 --> 00:44:07,906

It's a good one.

643

00:44:10,247 --> 00:44:15,650

Another paradox you're talking about in

the book is the Overton paradox.

644

00:44:17,171 --> 00:44:20,413

Could you share some insights into this

one?

645

00:44:20,533 --> 00:44:26,276

I don't think I know that one and how

Bayesian analysis plays a role in

646

00:44:26,416 --> 00:44:29,097

understanding it, if any.

647

00:44:29,097 --> 00:44:29,918

Sure.

648

00:44:29,918 --> 00:44:33,500

Well, you may not have heard of the

Overton paradox, and that's because I made

649

00:44:33,500 --> 00:44:34,620

the name up.

650

00:44:36,790 --> 00:44:38,771

We'll see, I don't know if it will stick.

651

00:44:39,051 --> 00:44:42,614

One of the things I'm a little bit afraid

of is it's possible that this is something

652

00:44:42,614 --> 00:44:46,696

that has been studied and is well known

and I just haven't found it in the

653

00:44:46,696 --> 00:44:47,837

literature.

654

00:44:48,217 --> 00:44:53,341

I've done my best and I've asked a number

of people, but I think it's a thing that

655

00:44:53,341 --> 00:44:55,262

has not been given a name.

656

00:44:55,282 --> 00:44:59,125

So maybe I've given it a name, but we'll

find out.

657

00:44:59,345 --> 00:45:00,606

But that's not important.

658

00:45:00,606 --> 00:45:04,428

The important part is I think it answers

an interesting question.

659

00:45:05,009 --> 00:45:06,089

And this is

660

00:45:06,290 --> 00:45:13,693

If you compare older people and younger

people in terms of their political

661

00:45:13,693 --> 00:45:18,115

beliefs, you will find in general that

older people are more conservative.

662

00:45:18,475 --> 00:45:23,097

So younger people, more liberal, older

people are more conservative.

663

00:45:23,277 --> 00:45:31,101

And if you follow people over time and you

ask them, are you liberal or conservative,

664

00:45:31,581 --> 00:45:32,921

it crosses over.

665

00:45:33,194 --> 00:45:37,056

When people are roughly 25 years old, they

are more likely to say liberal.

666

00:45:37,296 --> 00:45:41,439

By the time they're 35 or 40, they are

more likely to say conservative.

667

00:45:41,780 --> 00:45:43,101

So we have two patterns here.

668

00:45:43,101 --> 00:45:47,944

We have older people actually hold more

conservative beliefs.

669

00:45:48,344 --> 00:45:53,908

And as people get older, they are more

likely to say that they are conservative.

670

00:45:54,829 --> 00:46:00,413

Nevertheless, if you follow people over

time, their beliefs become more liberal.

671

00:46:02,386 --> 00:46:03,546

So that's the paradox.

672

00:46:03,546 --> 00:46:07,048

By and large, people don't change their

beliefs a lot over the course of their

673

00:46:07,048 --> 00:46:07,868

lives.

674

00:46:07,949 --> 00:46:11,170

Excuse me.

675

00:46:11,170 --> 00:46:13,932

But when they do, they become a little bit

more liberal.

676

00:46:13,952 --> 00:46:18,494

But nevertheless, they are more likely to

say that they are conservative.

677

00:46:19,275 --> 00:46:20,716

So that's the paradox.

678

00:46:20,716 --> 00:46:22,157

And let me put it to you.

679

00:46:22,157 --> 00:46:23,237

Do you know why?

680

00:46:31,122 --> 00:46:41,446

I've heard about the two in isolation, but

I don't think I've heard them linked that

681

00:46:41,446 --> 00:46:42,226

way.

682

00:46:43,447 --> 00:46:47,268

And no, for now, I don't have an intuitive

explanation to that.

683

00:46:47,268 --> 00:46:48,589

So I'm very curious.

684

00:46:49,429 --> 00:46:55,291

So here's my theory, and it is partly that

conservative and liberal are relative

685

00:46:55,291 --> 00:46:56,192

terms.

686

00:47:00,638 --> 00:47:08,002

When I say I am conservative, it means I am to the right of where I perceive the

center of mass to be.

687

00:47:08,002 --> 00:47:10,644

And the center of mass is moving over

time.

688

00:47:11,084 --> 00:47:15,327

And that's the key, primarily because of

generational replacement.

689

00:47:15,667 --> 00:47:21,330

So as older people die and they are

replaced by younger people, the mean

690

00:47:21,731 --> 00:47:26,153

shifts toward liberal pretty consistently

over time.

691

00:47:26,418 --> 00:47:31,040

And it happens in all three groups among

people who identify themselves as

692

00:47:31,040 --> 00:47:33,240

conservative, liberal, or moderate.

693

00:47:33,801 --> 00:47:41,024

All three of those lines are moving almost

in parallel toward more liberal beliefs.

694

00:47:41,404 --> 00:47:48,427

And what that means is if you took a time

machine to 1970 and you collected the

695

00:47:48,427 --> 00:47:53,409

average liberal and you put them in a time

machine and you bring them to the year

696

00:47:53,409 --> 00:47:54,049

2000.

697

00:47:54,978 --> 00:47:59,399

they would be indistinguishable from a

moderate in the year 2000.

698

00:48:00,100 --> 00:48:03,901

And if you bring them all the way to the

present, they would be indistinguishable

699

00:48:03,901 --> 00:48:08,884

from a current conservative, which is a

strange thing to realize.

700

00:48:08,884 --> 00:48:13,306

If you have this mental image of people in

tie dye with peace medallions from the

701

00:48:13,306 --> 00:48:18,988

seventies being transported into the

present, they would be relatively

702

00:48:18,988 --> 00:48:22,329

conservative compared to current views.

703

00:48:23,498 --> 00:48:27,981

And that time traveler

example is almost exactly what happens to

704

00:48:27,981 --> 00:48:30,142

people over the course of their lives.

705

00:48:30,722 --> 00:48:36,206

That in their youth, they hold views that

are left of center.

706

00:48:37,367 --> 00:48:43,051

And their views change slowly over time,

but the center moves faster.

707

00:48:43,972 --> 00:48:46,993

And that's, I call it chasing the Overton

window.

708

00:48:48,174 --> 00:48:52,177

The Overton window, I should explain where

that term comes from, is in political

709

00:48:52,177 --> 00:48:52,946

science.

710

00:48:52,946 --> 00:48:57,547

It is the set of ideas that are

politically acceptable at any point in

711

00:48:57,547 --> 00:48:58,327

time.

712

00:48:58,487 --> 00:49:02,468

And it shifts over time, which means

something that might have been radical in

713

00:49:02,468 --> 00:49:05,369

the 1970s, might be mainstream now.

714

00:49:05,869 --> 00:49:10,730

And there are a number of views from the

seventies that were pretty mainstream.

715

00:49:10,890 --> 00:49:13,491

Like a large fraction.

716

00:49:13,491 --> 00:49:16,012

I don't think it was a majority, but I

forget the number.

717

00:49:16,012 --> 00:49:20,813

It might have been 30% of people in

the 1970s thought that mixed race

718

00:49:20,813 --> 00:49:22,973

marriages should be illegal.

719

00:49:23,186 --> 00:49:24,106

Yeah.

720

00:49:24,106 --> 00:49:27,409

That wasn't the majority view, but it was

mainstream.

721

00:49:28,170 --> 00:49:30,872

And now that's pretty out there.

722

00:49:30,872 --> 00:49:35,836

It's a pretty small minority that still holds

that view, and it's considered extreme.

723

00:49:36,077 --> 00:49:39,299

Yeah, and it changed quite, quite fast.

724

00:49:39,680 --> 00:49:40,380

Yes.

725

00:49:40,601 --> 00:49:48,267

Also, like, the acceptability of same sex

marriage really changed very fast.

726

00:49:48,327 --> 00:49:49,828

If you look at it from a, you know,

727

00:49:51,682 --> 00:49:53,382

time series perspective.

728

00:49:53,422 --> 00:49:59,423

That's also a very interesting thing that

these opinions can change very fast.

729

00:49:59,644 --> 00:50:01,284

So yeah, okay.

730

00:50:01,364 --> 00:50:02,244

I understand.

731

00:50:02,244 --> 00:50:08,946

It's kind of like how you define liberal

and conservative in a way explains that

732

00:50:08,946 --> 00:50:09,946

paradox.

733

00:50:11,607 --> 00:50:13,487

Very interesting.

734

00:50:13,607 --> 00:50:18,569

This is a little speculative, but that's

something that might have accelerated

735

00:50:19,329 --> 00:50:20,909

since the 1990s.

736

00:50:21,482 --> 00:50:26,624

Many of the trends that I saw

between 1970 and 1990, they were

737

00:50:26,624 --> 00:50:30,425

relatively slow and they were being driven

by generational replacement.

738

00:50:30,766 --> 00:50:33,247

By and large, people were not changing

their minds.

739

00:50:33,247 --> 00:50:36,508

It's just that people would die and be

replaced.

740

00:50:37,769 --> 00:50:41,890

There's a line from the sciences that says

that the sciences progress one funeral at

741

00:50:41,890 --> 00:50:43,311

a time.

742

00:50:43,311 --> 00:50:44,931

Just a little morbid.

743

00:50:45,692 --> 00:50:48,873

But that is in some sense the baseline

rate

744

00:50:49,866 --> 00:50:52,127

of societal change, and it's relatively slow.

745

00:50:52,627 --> 00:50:53,568

It's about 1% a year.

746

00:50:53,568 --> 00:50:54,268

Yeah.

747

00:50:55,869 --> 00:50:59,731

But starting in the 1990s, and

particularly you mentioned support for

748

00:50:59,731 --> 00:51:04,914

same sex marriage, also just general

acceptance of homosexuality changed

749

00:51:04,914 --> 00:51:05,914

radically.

750

00:51:05,914 --> 00:51:11,277

In 1990, about 75% of the US

population would have said that

751

00:51:11,277 --> 00:51:12,998

homosexuality was wrong.

752

00:51:12,998 --> 00:51:15,540

That was one of the questions in the

general social survey.

753

00:51:15,540 --> 00:51:16,740

Do you think it's wrong?

754

00:51:16,740 --> 00:51:17,441

75%?

755

00:51:17,441 --> 00:51:19,161

That's

756

00:51:19,334 --> 00:51:21,755

I think below 30 now.

757

00:51:21,755 --> 00:51:26,236

So between 1990 and now, let's say roughly

30 years, it changed by more than 40

758

00:51:26,236 --> 00:51:27,477

percentage points.

759

00:51:28,217 --> 00:51:33,019

So that's about the speed of light in

terms of societal change.

760

00:51:33,419 --> 00:51:37,141

And one of the things that I did in the

book was to try to break that down into

761

00:51:37,141 --> 00:51:40,262

how much of that is generational

replacement and how much of that is people

762

00:51:40,262 --> 00:51:41,903

actually changing their minds.

763

00:51:42,823 --> 00:51:49,065

And that was an example where I think 80%

of the change was changed minds,

764

00:51:49,598 --> 00:51:51,698

not just one funeral at a time.

765

00:51:52,778 --> 00:51:56,199

So that's something that might be

different now.

766

00:51:56,199 --> 00:51:58,860

And one obvious culprit is the internet.

767

00:51:58,860 --> 00:52:04,442

So we'll see.

768

00:52:04,442 --> 00:52:04,542

Yeah.

769

00:52:04,542 --> 00:52:09,103

And another proof that the internet is

neither good nor bad, right?

770

00:52:09,103 --> 00:52:12,644

It's just a tool, and it depends on what

we're doing with it.

771

00:52:12,644 --> 00:52:16,225

The internet is helping us right now

having that conversation and me having

772

00:52:16,225 --> 00:52:18,125

that podcast for four years.

773

00:52:18,145 --> 00:52:19,585

Otherwise, that would have been

774

00:52:19,606 --> 00:52:20,846

virtually impossible.

775

00:52:20,846 --> 00:52:25,449

So yeah, really depends on what you're

doing with it.

776

00:52:26,810 --> 00:52:34,236

And another topic, I mean, I don't

remember it being in the book, but I

777

00:52:34,236 --> 00:52:39,679

think you mentioned it in one of your blog

posts, is the idea of a Bayesian killer app.

778

00:52:40,020 --> 00:52:41,681

So I have to ask you about that.

779

00:52:41,681 --> 00:52:46,584

Why is it important in the context of

decision making and statistics?

780

00:52:48,990 --> 00:52:52,490

I think it speaks to a perpetual question, which is,

you know, if Bayesian methods are so

781

00:52:52,490 --> 00:52:54,751

great, why are they not taking off?

782

00:52:54,751 --> 00:52:57,932

Why isn't everybody using them?

783

00:52:58,132 --> 00:53:03,153

And I think one of the problems is that

when people do the comparison of Bayesian

784

00:53:03,153 --> 00:53:07,775

and frequentism, and they have tried out

the usual debates, they often show an

785

00:53:07,775 --> 00:53:13,436

example where you do the frequentist

analysis and you get a point estimate.

786

00:53:13,696 --> 00:53:18,757

And then you do the Bayesian analysis and

you generate a point estimate.

787

00:53:19,070 --> 00:53:21,812

And sometimes it's the same or roughly the

same.

788

00:53:21,812 --> 00:53:26,336

And so people sort of shrug and say, well,

you know, what's the big deal?

789

00:53:26,336 --> 00:53:30,580

The problem there is that when you do the

Bayesian analysis, the result is a

790

00:53:30,580 --> 00:53:36,945

posterior distribution that contains all

of the information that you have about

791

00:53:36,945 --> 00:53:39,788

whatever it was that you were trying to

estimate.

792

00:53:39,788 --> 00:53:44,331

And if you boil it down to a point

estimate, you've discarded all the useful

793

00:53:44,331 --> 00:53:44,992

information.

794

00:53:44,992 --> 00:53:45,532

So.

795

00:53:47,538 --> 00:53:51,440

If all you do is compare point estimates,

you're really missing the point.

796

00:53:52,361 --> 00:53:57,966

And that's where I was thinking about what

is the killer app that really shows the

797

00:53:57,966 --> 00:54:02,249

difference between Bayesian methods and

the alternatives.

798

00:54:02,249 --> 00:54:08,674

And my favorite example is the Bayesian

bandit strategy or Thompson sampling,

799

00:54:08,795 --> 00:54:15,340

which is an application to anything that's

like A-B testing or running a medical test

800

00:54:15,340 --> 00:54:17,501

where you're comparing two different

treatments.

801

00:54:18,218 --> 00:54:24,884

you are always making a decision about

which thing to try next, A or B or one

802

00:54:24,884 --> 00:54:29,988

treatment or the other, and then when you

see the result you're updating your

803

00:54:29,988 --> 00:54:30,768

beliefs.

804

00:54:30,768 --> 00:54:36,292

So you're constantly collecting data and

using that data to make decisions.

805

00:54:36,353 --> 00:54:41,817

And that's where I think the Bayesian

methods show what they're really good for,

806

00:54:41,817 --> 00:54:45,079

because if you are making decisions and

those decisions

807

00:54:47,266 --> 00:54:52,848

the whole posterior distribution because

most of the time you're doing some kind of

808

00:54:52,848 --> 00:54:54,088

optimization.

809

00:54:54,108 --> 00:54:59,891

You are integrating over the posterior or

in discrete world, you're just looping

810

00:54:59,891 --> 00:55:05,573

over the posterior and for every possible

outcome, figuring out the cost or the

811

00:55:05,573 --> 00:55:09,554

benefit and weighting it by its posterior

probability.

812

00:55:09,695 --> 00:55:12,856

That's where you get the real benefit.

813

00:55:12,856 --> 00:55:14,536

And so, Thompson sampling is an

814

00:55:16,838 --> 00:55:22,839

end-to-end application where people

understand the problem and where the

815

00:55:22,839 --> 00:55:26,240

solution is a remarkably elegant and

simple one.

816

00:55:26,580 --> 00:55:32,802

And you can point to the outcome and say,

this is an optimal balance of exploitation

817

00:55:32,802 --> 00:55:33,942

and exploration.

818

00:55:33,942 --> 00:55:38,764

You are always making the best decision

based on the information that you have at

819

00:55:38,764 --> 00:55:40,104

that point in time.

820

00:55:40,144 --> 00:55:40,944

Yeah.

821

00:55:40,944 --> 00:55:43,685

Yeah, I see what you're saying.

822

00:55:43,785 --> 00:55:45,105

And I...

823

00:55:45,402 --> 00:55:51,125

In a way, it's a bit of a shame that it's

the simplest application because it's not

824

00:55:51,125 --> 00:55:52,166

that simple.

825

00:55:54,268 --> 00:55:57,450

But yeah, I agree with that example.

826

00:55:57,450 --> 00:56:06,176

And for people, I put this blog post where

you talk about that Bayesian killer app in

827

00:56:06,176 --> 00:56:13,240

the show notes because yeah, it's not

super easy,

828

00:56:14,922 --> 00:56:20,206

I think it's way better in a written

format, or at least a video.

829

00:56:20,566 --> 00:56:25,831

But yeah, definitely these kind of

situations in a way where you have lots of

830

00:56:25,831 --> 00:56:30,534

uncertainty and you really care about

updating your belief as accurately as

831

00:56:30,534 --> 00:56:33,516

possible, which happens a lot.

832

00:56:34,538 --> 00:56:39,361

But yeah, in this case also, I think it's

extremely valuable.

833

00:56:44,714 --> 00:56:46,394

But I think it can be.

834

00:56:46,754 --> 00:56:51,675

Because first of all, I think if you do it

using conjugate priors, then the update

835

00:56:51,675 --> 00:56:53,076

step is trivial.

836

00:56:53,076 --> 00:56:55,556

You're just updating beta distributions.

837

00:56:55,777 --> 00:57:02,879

And every time a new data comes in, a new

datum, you're just adding one to one of

838

00:57:02,879 --> 00:57:03,499

your parameters.

839

00:57:03,499 --> 00:57:08,640

So the computational work is the increment

operator, which is not too bad.
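
A minimal sketch of that update inside a Thompson sampling loop, assuming a two-arm Bernoulli bandit; the arm names and the "true" conversion rates below are invented for illustration.

    import random

    # Beta(1, 1) priors on each arm's success probability, stored as [alpha, beta].
    params = {"A": [1, 1], "B": [1, 1]}
    true_rates = {"A": 0.04, "B": 0.06}   # hypothetical, unknown to the algorithm

    for _ in range(10_000):
        # Thompson sampling: draw from each posterior, play the arm with the largest draw.
        draws = {arm: random.betavariate(a, b) for arm, (a, b) in params.items()}
        arm = max(draws, key=draws.get)
        reward = random.random() < true_rates[arm]
        # The Bayesian update really is just the increment operator.
        params[arm][0 if reward else 1] += 1

    print(params)   # most trials end up allocated to the better arm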

840

00:57:08,640 --> 00:57:12,461

But I've also done a version of Thompson

sampling as a dice game.

841

00:57:13,202 --> 00:57:15,323

I want to take this opportunity to point

people to it.

842

00:57:15,323 --> 00:57:19,546

I gave you the link, so I hope it'll be in

the notes.

843

00:57:19,546 --> 00:57:21,808

But the game is called The Shakes.

844

00:57:22,429 --> 00:57:26,352

And I've got it up on a GitHub repository.

845

00:57:26,352 --> 00:57:29,414

But you can do Thompson sampling just by

rolling dice.

846

00:57:29,414 --> 00:57:29,894

Yeah.

847

00:57:29,894 --> 00:57:32,596

So we'll definitely put that in the show

notes.

848

00:57:33,397 --> 00:57:39,741

And also to come back to something you

said just a bit earlier.

849

00:57:40,542 --> 00:57:40,962

For sure.

850

00:57:40,962 --> 00:57:45,946

Then also something that puzzles me is

when people have a really good Bayesian

851

00:57:45,946 --> 00:57:48,468

model, it's awesome.

852

00:57:48,748 --> 00:57:54,693

It's a good representation of the

underlying data generating process.

853

00:57:55,634 --> 00:57:57,255

It's complex enough, but not too much.

854

00:57:57,255 --> 00:57:58,476

It samples well.

855

00:57:58,476 --> 00:58:02,819

And then they do decision making based on

the mean of the posterior estimates.

856

00:58:03,220 --> 00:58:06,622

And I'm like, no, that's a shame.

857

00:58:07,123 --> 00:58:10,245

Why are you doing that? Pass the whole

distribution

858

00:58:10,346 --> 00:58:15,007

to your optimizer so that you can make

decisions based on the full uncertainty of

859

00:58:15,007 --> 00:58:19,669

the model and not just take the most

probable outcome.

860

00:58:20,290 --> 00:58:23,071

Because first, maybe that's not really

what you care about.

861

00:58:23,071 --> 00:58:26,392

And also, by definition, it's going to

sample your decision.

862

00:58:26,392 --> 00:58:27,993

It's going to bias your decision.

863

00:58:27,993 --> 00:58:32,094

So yeah, that always kind of breaks my

heart.

864

00:58:32,535 --> 00:58:35,736

But you've worked so well to get that.

865

00:58:35,736 --> 00:58:38,237

It's so hard to get those posterior

distributions.

866

00:58:38,237 --> 00:58:39,186

And now you're just.

867

00:58:39,186 --> 00:58:40,306

throwing everything away.

868

00:58:40,306 --> 00:58:42,228

That's a shame.

869

00:58:42,228 --> 00:58:42,869

Yeah.

870

00:58:42,869 --> 00:58:45,811

Do Bayesian decision making, folks.

871

00:58:45,811 --> 00:58:48,233

You're losing all that information.

872

00:58:48,433 --> 00:58:55,659

And especially in any case where you've

got very nonlinear costs, nonlinear in the

873

00:58:55,659 --> 00:58:59,442

size of the error, and especially if it's

asymmetric.

874

00:58:59,602 --> 00:59:03,746

Thinking about almost anything that you

build, you always have a trade off between

875

00:59:03,746 --> 00:59:05,447

under building and over building.

876

00:59:06,508 --> 00:59:08,869

Over building is bad because it's

expensive.

877

00:59:08,998 --> 00:59:12,358

And underbuilding is bad because it will

fail catastrophically.

878

00:59:12,619 --> 00:59:17,900

So that's a case where you have very

nonlinear costs and very asymmetric.

879

00:59:18,040 --> 00:59:22,862

If you have the whole distribution, you

can take into account what's the

880

00:59:22,862 --> 00:59:28,343

probability of extreme catastrophic

effects, where the tail of that

881

00:59:28,343 --> 00:59:31,624

distribution is really important to

potential outcomes.
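
A minimal sketch of that kind of decision in Python, with invented numbers: the posterior samples stand in for whatever model produced them, the cost function is deliberately nonlinear and asymmetric, and the chosen capacity ends up well above the posterior mean.

    import numpy as np

    rng = np.random.default_rng(1)
    # Stand-in posterior samples for the load a structure must carry (hypothetical units).
    posterior_load = rng.lognormal(mean=2.0, sigma=0.4, size=10_000)

    # Overbuilding costs a little per unit of capacity; underbuilding fails catastrophically.
    candidates = np.linspace(5, 40, 200)
    expected_cost = [
        10.0 * c + 10_000.0 * np.mean(posterior_load > c)  # build cost + expected failure cost
        for c in candidates
    ]
    best = candidates[int(np.argmin(expected_cost))]

    print("posterior mean load:", posterior_load.mean())   # around 8
    print("capacity minimizing expected cost:", best)      # far above the posterior mean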

882

00:59:31,624 --> 00:59:34,125

Yeah, definitely.

883

00:59:34,125 --> 00:59:36,105

And.

884

00:59:36,586 --> 00:59:40,067

What I mean, I could continue, but we're

getting short on time and I still have a

885

00:59:40,067 --> 00:59:41,248

lot of things to ask you.

886

00:59:41,248 --> 00:59:43,469

So let's move on.

887

00:59:43,469 --> 00:59:48,412

And actually, I think you mentioned it a

bit at the beginning of your answer to my

888

00:59:48,412 --> 00:59:49,492

last question.

889

00:59:49,973 --> 00:59:53,955

But in another of your blog posts, you

addressed the claim that Bayesian

890

00:59:53,955 --> 00:59:57,037

and frequentist methods often yield the same

results.

891

00:59:57,497 --> 01:00:00,179

And so I know you like to talk about that.

892

01:00:00,179 --> 01:00:04,641

So could you elaborate on this and why

you're saying it's a false claim?

893

01:00:06,162 --> 01:00:11,227

Yeah, as I mentioned this earlier, you

know, frequentist methods produce a point

894

01:00:11,227 --> 01:00:14,029

estimate and a confidence interval.

895

01:00:14,470 --> 01:00:17,954

And Bayesian methods produce a posterior

distribution.

896

01:00:17,954 --> 01:00:20,516

So they are different kinds of things.

897

01:00:20,516 --> 01:00:22,378

They cannot be the same.

898

01:00:22,579 --> 01:00:27,564

And I think Bayesians sometimes say this

as a way of being conciliatory that, you

899

01:00:27,564 --> 01:00:29,665

know, we're trying to let's all get along.

900

01:00:29,970 --> 01:00:33,333

And often, frequentist and Bayesian

methods are compatible.

901

01:00:33,333 --> 01:00:34,473

So that's good.

902

01:00:34,473 --> 01:00:36,395

The Bayesian methods aren't scary.

903

01:00:37,096 --> 01:00:42,020

I think strategically that might be a

mistake, because you're conceding the

904

01:00:42,020 --> 01:00:44,621

thing that makes Bayesian methods better.

905

01:00:44,962 --> 01:00:49,766

It's the posterior distribution that is

useful for all the reasons that we just

906

01:00:49,766 --> 01:00:50,387

said.

907

01:00:50,387 --> 01:00:53,209

So it is never the same.

908

01:00:53,289 --> 01:00:57,973

It is sometimes the case that if you take

the posterior distribution and you

909

01:00:57,973 --> 01:00:59,018

summarize it,

910

01:00:59,018 --> 01:01:04,499

with a point estimate or an interval, that

yes, sometimes those are the same as the

911

01:01:04,499 --> 01:01:05,799

frequentist methods.

912

01:01:05,799 --> 01:01:13,402

But the analogy that I use is, if you are

comparing a car and an airplane, but the

913

01:01:13,402 --> 01:01:18,483

rule is that the airplane has to stay on

the ground, then you would come away and

914

01:01:18,483 --> 01:01:25,925

you would think, wow, that airplane is a

complicated, expensive, inefficient way to

915

01:01:25,925 --> 01:01:27,305

drive on the highway.

916

01:01:28,746 --> 01:01:29,667

And you're right.

917

01:01:29,667 --> 01:01:33,590

If you want to drive on the highway, an

airplane is a terrible idea.

918

01:01:33,691 --> 01:01:37,294

The whole point of an airplane is that it

flies.

919

01:01:37,294 --> 01:01:41,777

If you don't fly the plane, you are not

getting the benefit of an airplane.

920

01:01:42,338 --> 01:01:43,859

That is a good point.

921

01:01:44,279 --> 01:01:47,622

And same, if you are not using the

posterior distribution, you are not

922

01:01:47,622 --> 01:01:55,268

getting the benefit of doing Bayesian

analysis.

923

01:01:55,268 --> 01:01:55,689

Yeah.

924

01:01:55,689 --> 01:01:56,669

Yeah, exactly.

925

01:01:57,646 --> 01:02:02,608

Don't drive airplanes on the highway.

Well.

926

01:02:02,608 --> 01:02:10,051

Actually, a really good question is that

you can really see, and I think I do, and

927

01:02:10,051 --> 01:02:17,675

I'm pretty sure you do in your work, you

do see many practitioners that might be

928

01:02:17,675 --> 01:02:23,837

hesitant to adopt Bayesian methods due to

some perceived complexity most of the

929

01:02:23,837 --> 01:02:24,557

time.

930

01:02:24,886 --> 01:02:30,213

So I wonder in general, what resources or

strategies you recommend to those who want

931

01:02:30,213 --> 01:02:32,956

to learn and apply Bayesian techniques in

their work.

932

01:02:36,030 --> 01:02:41,371

Yeah, I think Bayesian methods get the

reputation for complexity, I think largely

933

01:02:41,371 --> 01:02:42,911

because of MCMC.

934

01:02:42,911 --> 01:02:48,993

That if that's your first exposure, that's

scary and complicated.

935

01:02:49,173 --> 01:02:54,515

Or if you do it mathematically and you

start with big scary integrals, I think

936

01:02:54,515 --> 01:02:57,655

that also makes it seem more complex than

it needs to be.

937

01:02:57,655 --> 01:02:59,136

I think there are a couple of

alternatives.

938

01:02:59,136 --> 01:03:04,497

And the one that I use in think Bayes is

everything is discrete and everything is

939

01:03:04,497 --> 01:03:05,697

computational.

940

01:03:05,962 --> 01:03:11,023

So all of those integrals become for loops

or just array operations.

941

01:03:11,943 --> 01:03:13,924

And I think that helps a lot.

942

01:03:13,924 --> 01:03:15,644

So those are using grid algorithms.

943

01:03:15,644 --> 01:03:22,006

I think grid algorithms can get you a

really long way with very little tooling,

944

01:03:22,006 --> 01:03:24,006

basically arrays.

945

01:03:25,407 --> 01:03:29,848

You lay out a grid, you compute a prior,

you compute a likelihood, you do a

946

01:03:29,848 --> 01:03:34,149

multiplication, which is usually just an

array multiplication, and you normalize,

947

01:03:34,149 --> 01:03:35,689

divide through by the total.

948

01:03:36,118 --> 01:03:36,598

That's it.

949

01:03:36,598 --> 01:03:38,018

That's a Bayesian update.
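
A minimal sketch of that grid update for estimating a proportion, in the spirit of Think Bayes; the data (140 successes, 110 failures) are made up.

    import numpy as np

    grid = np.linspace(0, 1, 1001)          # candidate values of the unknown proportion
    prior = np.ones_like(grid)              # uniform prior over the grid

    successes, failures = 140, 110          # hypothetical data
    likelihood = grid**successes * (1 - grid)**failures

    posterior = prior * likelihood          # multiply...
    posterior /= posterior.sum()            # ...and normalize: that's the update

    print("posterior mean:", np.sum(grid * posterior))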

950

01:03:38,639 --> 01:03:40,679

So I think that's one approach.

951

01:03:40,879 --> 01:03:46,082

The other one, I would consider an

introductory stats class that does

952

01:03:46,082 --> 01:03:50,203

everything using Bayesian methods, using

conjugate priors.

953

01:03:50,423 --> 01:03:51,604

And don't derive anything.

954

01:03:51,604 --> 01:03:58,266

Don't compute why the beta binomial model

works.

955

01:03:58,367 --> 01:04:04,109

But if you just take it as given, that

when you are estimating a proportion, you

956

01:04:04,109 --> 01:04:05,509

run a bunch of trials.

957

01:04:05,810 --> 01:04:08,951

and you'll have some number of successes

and some number of failures.

958

01:04:08,951 --> 01:04:12,633

Let's call it A and B.

959

01:04:12,633 --> 01:04:17,595

You build a beta distribution that has the

parameters A plus one, B plus one.

960

01:04:18,236 --> 01:04:18,836

That's it.

961

01:04:18,836 --> 01:04:20,296

That's your posterior.

962

01:04:20,557 --> 01:04:26,459

And now you can take that posterior beta

distribution and answer all the questions.

963

01:04:26,800 --> 01:04:27,960

What's the mean?

964

01:04:27,980 --> 01:04:30,541

What's a confidence or credible interval?

965

01:04:31,018 --> 01:04:33,278

But more importantly, like what are the

tail probabilities?

966

01:04:33,278 --> 01:04:36,579

What's the probability that I could exceed

some critical value?

967

01:04:37,259 --> 01:04:43,381

Or, again, loop over that posterior and

answer interesting questions with it.

968

01:04:44,141 --> 01:04:48,942

You could do all of that on the first day

of a statistics class.

969

01:04:49,443 --> 01:04:52,823

And use a computer, because we can

compute.

970

01:04:52,984 --> 01:04:57,385

scipy.stats.beta will tell you everything

you want to know about a beta

971

01:04:57,385 --> 01:04:58,385

distribution.
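
For example, a minimal sketch of that first-day exercise, with made-up counts (140 successes, 110 failures) and a uniform prior:

    from scipy.stats import beta

    a, b = 140, 110                   # hypothetical successes and failures
    posterior = beta(a + 1, b + 1)    # posterior under a uniform prior

    print(posterior.mean())           # posterior mean of the proportion
    print(posterior.interval(0.9))    # 90% credible interval
    print(1 - posterior.cdf(0.6))     # probability the proportion exceeds 0.6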

972

01:04:59,946 --> 01:05:02,807

So that's day one of a stats class: estimating

proportions.

973

01:05:02,807 --> 01:05:05,988

It's everything you need to do.

974

01:05:05,988 --> 01:05:07,889

And it handles all of the weird cases.

975

01:05:07,889 --> 01:05:12,711

Like if you want to estimate a very small

probability, it's okay.

976

01:05:12,751 --> 01:05:14,392

You can still get a confidence interval.

977

01:05:14,392 --> 01:05:16,452

It's all perfectly well behaved.

978

01:05:16,733 --> 01:05:20,334

If you have an informative prior, sure, no

problem.

979

01:05:20,334 --> 01:05:26,297

Just start with some pre-counts in your

beta distribution.

980

01:05:26,297 --> 01:05:28,777

So day one, estimating proportions.

981

01:05:28,798 --> 01:05:30,699

Day two, estimate rates.

982

01:05:30,699 --> 01:05:35,202

You could do exactly the same thing with a

Poisson gamma model.

983

01:05:35,522 --> 01:05:37,603

And the update is just as trivial.

984

01:05:38,664 --> 01:05:42,907

And you could talk about Poisson

distributions and exponential

985

01:05:42,907 --> 01:05:45,008

distributions and estimating rates.

986

01:05:45,229 --> 01:05:51,313

My favorite example is I always use either

soccer, football, or hockey as my example

987

01:05:51,313 --> 01:05:54,055

of goal scoring rates.

988

01:05:54,055 --> 01:05:55,215

And you can generate predictions.

989

01:05:55,215 --> 01:05:58,817

You can say, what are the likely outcomes

of the next game?

990

01:05:58,922 --> 01:06:03,183

What's the chance that I'm going to win,

let's say, a best-of-seven series?

991

01:06:03,203 --> 01:06:06,183

The update is computationally nothing.

992

01:06:06,484 --> 01:06:07,264

Yeah.

993

01:06:07,264 --> 01:06:11,365

And you can answer all the interesting

questions about rates.

994

01:06:11,365 --> 01:06:12,725

So that's day two.
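
And a minimal sketch of that second-day exercise: a gamma prior on a team's goal-scoring rate, the conjugate update, and a posterior predictive simulation for the next game. The prior parameters and the observed goal counts are invented.

    import numpy as np

    rng = np.random.default_rng(0)

    alpha, beta = 2.0, 1.0            # gamma prior on goals per game (invented)
    goals = [2, 3, 1, 0, 2]           # hypothetical goals in the last five games

    # Conjugate update: add total goals to alpha, number of games to beta.
    alpha_post = alpha + sum(goals)
    beta_post = beta + len(goals)

    # Posterior predictive for the next game: draw a rate, then draw a goal count.
    rates = rng.gamma(alpha_post, scale=1 / beta_post, size=10_000)
    next_game = rng.poisson(rates)

    print("posterior mean rate:", alpha_post / beta_post)          # 10 goals / 6 games
    print("P(3 or more goals next game):", (next_game >= 3).mean())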

995

01:06:13,205 --> 01:06:17,947

I don't know what to do with the rest of

the semester because we've just done 90%

996

01:06:17,947 --> 01:06:19,907

of an intro stats class.

997

01:06:19,967 --> 01:06:20,747

Yes.

998

01:06:21,788 --> 01:06:27,089

Yeah, that sounds like something I think

that would work in the sense that at least

999

01:06:27,089 --> 01:06:28,589

that was my experience.

Speaker:

01:06:29,130 --> 01:06:34,332

Funny story, I used to not like stats,

which is funny when you see what I'm doing

Speaker:

01:06:34,332 --> 01:06:35,052

today.

Speaker:

01:06:35,052 --> 01:06:40,054

But when I was in university, I did a lot

of math.

Speaker:

01:06:40,194 --> 01:06:43,995

And the thing is, the stats we were doing

with was pen and paper.

Speaker:

01:06:44,076 --> 01:06:46,216

So it was incredibly boring.

Speaker:

01:06:46,216 --> 01:06:51,939

It was always, you know, dice problems and

very trivial stuff that you have to do

Speaker:

01:06:51,939 --> 01:06:55,920

that because the human brain is not good

at computing that kind of stuff, you know.

Speaker:

01:06:59,118 --> 01:07:05,682

That changed when I started having to use

statistics to do electoral forecasting.

Speaker:

01:07:05,843 --> 01:07:06,984

I was like, but this is awesome.

Speaker:

01:07:06,984 --> 01:07:09,246

Like I can just simulate the distribution.

Speaker:

01:07:09,246 --> 01:07:12,028

I can see them on the screen.

Speaker:

01:07:12,028 --> 01:07:14,630

I can really almost touch them.

Speaker:

01:07:14,630 --> 01:07:20,455

You know, and that was much more concrete

first of all, and also much more empowering

Speaker:

01:07:20,455 --> 01:07:26,800

because I could work on topics that were

not trivial stuff that I only would use

Speaker:

01:07:26,800 --> 01:07:28,021

for board games.

Speaker:

01:07:28,021 --> 01:07:28,581

You know?

Speaker:

01:07:28,581 --> 01:07:29,201

So.

Speaker:

01:07:30,234 --> 01:07:32,955

I think it's a very powerful way of

teaching for sure.

Speaker:

01:07:34,975 --> 01:07:43,338

So to play us out, I'd like to zoom out a

bit and ask you what you hope readers will

Speaker:

01:07:43,338 --> 01:07:49,380

take away from Probably Overthinking It

and how can the insights from your book be

Speaker:

01:07:49,380 --> 01:07:53,001

applied to improve decision making in

various fields?

Speaker:

01:07:53,001 --> 01:07:53,361

Yeah.

Speaker:

01:07:53,361 --> 01:07:54,402

Well, I think I'll...

Speaker:

01:07:54,402 --> 01:07:59,863

come back to where we started, which is it

is about using data to answer questions,

Speaker:

01:08:00,243 --> 01:08:01,843

make better decisions.

Speaker:

01:08:02,104 --> 01:08:08,465

And my thesis again is that we are better

off when we use evidence and reason than

Speaker:

01:08:08,465 --> 01:08:09,466

when we don't.

Speaker:

01:08:09,466 --> 01:08:11,066

So I hope it's empowering.

Speaker:

01:08:11,066 --> 01:08:17,408

I hope people come away from it thinking

that you don't need graduate degrees in

Speaker:

01:08:17,408 --> 01:08:23,146

statistics to work with data to interpret

the results that you're seeing in

Speaker:

01:08:23,146 --> 01:08:29,730

research papers, in newspapers, that it

can be straightforward.

Speaker:

01:08:30,051 --> 01:08:33,453

And then occasionally there are some

surprises that you need to know about.

Speaker:

01:08:35,210 --> 01:08:38,331

Yeah.

Speaker:

01:08:38,331 --> 01:08:38,731

For sure.

Speaker:

01:08:38,731 --> 01:08:45,214

Personally, have you changed some of the

ways you're making decisions based on your

Speaker:

01:08:45,214 --> 01:08:46,514

work for this book, Allen?

Speaker:

01:08:48,735 --> 01:08:49,255

Maybe.

Speaker:

01:08:49,255 --> 01:08:56,578

I think a lot of the examples in the book

come from me thinking about something in

Speaker:

01:08:56,578 --> 01:08:57,619

real life.

Speaker:

01:08:58,779 --> 01:09:04,462

There's one example where when I was

running a relay race, I noticed that

Speaker:

01:09:05,182 --> 01:09:09,843

everybody was either much slower than me

or much faster than me.

Speaker:

01:09:09,843 --> 01:09:13,984

And it seemed like there was nobody else

in the race who was running at my speed.

Speaker:

01:09:15,284 --> 01:09:19,426

And that's the kind of thing where when

you're running and you're oxygen deprived,

Speaker:

01:09:19,426 --> 01:09:21,146

it seems really confusing.

Speaker:

01:09:21,526 --> 01:09:24,907

And then with a little bit of reflection,

you realize, well, there's some

Speaker:

01:09:24,907 --> 01:09:29,788

statistical bias there, which is, if

someone is running the same speed as me,

Speaker:

01:09:29,788 --> 01:09:31,769

I'm unlikely to see them.

Speaker:

01:09:33,249 --> 01:09:33,718

Yeah.

Speaker:

01:09:33,718 --> 01:09:38,759

But if they are much faster or much

slower, then I'm going to overtake them or

Speaker:

01:09:38,759 --> 01:09:40,439

they're going to overtake me.

Speaker:

01:09:40,439 --> 01:09:42,200

Yeah, exactly.

Speaker:

01:09:42,200 --> 01:09:46,321

And that makes me think about an

absolutely awesome joke from, of course, I

Speaker:

01:09:46,321 --> 01:09:54,523

don't remember the name of the comedian,

but very, very well-known US comedian that

Speaker:

01:09:54,523 --> 01:09:55,103

you may know.

Speaker:

01:09:55,103 --> 01:10:00,865

And the joke was, have you ever noticed

that everybody that drives slower than you

Speaker:

01:10:00,865 --> 01:10:02,814

on the road is a jackass?

Speaker:

01:10:02,814 --> 01:10:08,536

and everybody that drives faster than you

is a moron.

Speaker:

01:10:08,576 --> 01:10:10,837

It's really the same idea, right?

Speaker:

01:10:10,837 --> 01:10:16,539

It's like you have the right speed and

you're doing the right thing and everybody

Speaker:

01:10:16,539 --> 01:10:21,061

else is just either a moron or a jackass.

Speaker:

01:10:21,061 --> 01:10:21,902

That's exactly right.

Speaker:

01:10:21,902 --> 01:10:23,902

I believe that is George Carlin.

Speaker:

01:10:24,083 --> 01:10:26,604

This exactly George Carlin, yeah, yeah.

Speaker:

01:10:26,604 --> 01:10:30,165

And amazing, I mean, George Carlin is just

absolutely incredible.

Speaker:

01:10:30,165 --> 01:10:30,825

But...

Speaker:

01:10:30,846 --> 01:10:36,467

Yeah, that's what is already a very keen

observation of the human nature also, I

Speaker:

01:10:36,467 --> 01:10:39,128

think.

Speaker:

01:10:39,128 --> 01:10:48,031

Which is also an interesting joke in the

sense that it relates to one, you know,

Speaker:

01:10:48,031 --> 01:10:54,312

concepts of how minds change and how

people think about reality and so on.

Speaker:

01:10:56,293 --> 01:10:57,574

And I find it...

Speaker:

01:10:57,574 --> 01:10:58,514

I find it very interesting.

Speaker:

01:10:58,514 --> 01:11:01,616

So for people interested, I know we're

short on time, so I'm just going to

Speaker:

01:11:01,616 --> 01:11:07,699

mention there is an awesome book that's

called How Minds Change by David McRaney.

Speaker:

01:11:07,699 --> 01:11:09,300

I'll put that in the show notes.

Speaker:

01:11:09,300 --> 01:11:14,683

And he talks about these kind of topics

and that's especially interesting.

Speaker:

01:11:14,683 --> 01:11:19,306

And of course, Bayesian statistics are

mentioned in the book because if you're

Speaker:

01:11:19,306 --> 01:11:24,669

interested in optimal decision making at

some point, you're going to talk about

Speaker:

01:11:24,669 --> 01:11:25,729

Bayesian stats.

Speaker:

01:11:26,134 --> 01:11:27,114

But he's a journalist.

Speaker:

01:11:27,114 --> 01:11:31,198

Like he doesn't know at all about Bayesian

stats originally.

Speaker:

01:11:31,198 --> 01:11:33,359

And then at some point, it just appears.

Speaker:

01:11:34,420 --> 01:11:35,801

I will check that out.

Speaker:

01:11:35,801 --> 01:11:38,343

Yeah, I'll put that into the show notes.

Speaker:

01:11:39,825 --> 01:11:44,808

So before asking you the last two

questions, Allen, I'm curious about your

Speaker:

01:11:45,850 --> 01:11:52,334

predictions, because we're all scientists

here, and we're interested in predictions.

Speaker:

01:11:53,576 --> 01:11:55,762

I wonder if you think there is a way

Speaker:

01:11:55,762 --> 01:12:01,832

In the realm of statistics education, are

there any innovative approaches or

Speaker:

01:12:01,832 --> 01:12:07,282

technologies that you believe have the

potential to change, transform how people

Speaker:

01:12:07,282 --> 01:12:09,965

learn and apply statistical concepts?

Speaker:

01:12:11,866 --> 01:12:16,268

Well, I think the things we've been

talking about, computation, simulation,

Speaker:

01:12:16,328 --> 01:12:23,232

and Bayesian methods, I think have the

best chance to really change statistics

Speaker:

01:12:23,232 --> 01:12:25,253

education.

Speaker:

01:12:25,253 --> 01:12:27,294

I'm not sure how it will happen.

Speaker:

01:12:27,774 --> 01:12:34,558

It doesn't look like statistics

departments are changing enough or fast

Speaker:

01:12:34,558 --> 01:12:35,358

enough.

Speaker:

01:12:35,779 --> 01:12:39,741

I think what's going to happen is that

data science departments are going to be

Speaker:

01:12:39,741 --> 01:12:40,901

created

Speaker:

01:12:41,126 --> 01:12:43,867

And I think that's where the innovation

will be.

Speaker:

01:12:44,748 --> 01:12:48,231

But I think the question is, what that

will mean?

Speaker:

01:12:48,231 --> 01:12:53,314

When you create a data science department,

is it going to be all machine learning and

Speaker:

01:12:53,735 --> 01:13:01,781

algorithms or statistical thinking and

basically using data for decision making, as

Speaker:

01:13:01,781 --> 01:13:03,562

I'm advocating for?

Speaker:

01:13:03,942 --> 01:13:06,004

So obviously, I hope it's the latter.

Speaker:

01:13:06,004 --> 01:13:08,425

I hope data science becomes

Speaker:

01:13:08,926 --> 01:13:13,990

in some sense, what statistics should have

been and starts doing a better job of

Speaker:

01:13:13,990 --> 01:13:20,035

using, as I said, computation, simulation,

Bayesian thinking, and causal inference, I

Speaker:

01:13:20,035 --> 01:13:21,816

think is probably the other big one.

Speaker:

01:13:22,837 --> 01:13:23,698

Yeah.

Speaker:

01:13:23,698 --> 01:13:25,018

Yeah, exactly.

Speaker:

01:13:25,199 --> 01:13:30,363

And they really go hand in hand also, as

we were seeing at the very beginning of

Speaker:

01:13:30,363 --> 01:13:31,143

the show.

Speaker:

01:13:32,265 --> 01:13:38,329

Of course, I do hope that that's going to

be the case.

Speaker:

01:13:38,570 --> 01:13:40,831

You've already been very generous with

your time.

Speaker:

01:13:40,831 --> 01:13:46,394

So let's ask you the last two questions,

ask everyone at the end of the show.

Speaker:

01:13:46,394 --> 01:13:50,996

And you're in a very privileged position

because it's your second episode here.

Speaker:

01:13:50,996 --> 01:13:57,760

So you're in the position where you can

answer something else from your previous

Speaker:

01:13:57,760 --> 01:14:02,783

answers, which is a very privileged

position because usually the difficulty of

Speaker:

01:14:02,783 --> 01:14:08,325

these questions is that you have to choose

and you cannot answer all of it.

Speaker:

01:14:08,822 --> 01:14:12,148

you get to have a second round, Allen.

Speaker:

01:14:12,148 --> 01:14:18,361

So first, if you had unlimited time and

resources, which problem would you try to

Speaker:

01:14:18,361 --> 01:14:19,081

solve?

Speaker:

01:14:21,206 --> 01:14:30,069

I think the problem of the 21st century is

how do we get to 2100 with a habitable

Speaker:

01:14:30,069 --> 01:14:33,830

planet and a good quality of life for

everybody on it?

Speaker:

01:14:34,411 --> 01:14:37,532

And I think there is a path that gets us

there.

Speaker:

01:14:37,972 --> 01:14:42,474

It's a little hard to believe when you

focus on the problems that we currently

Speaker:

01:14:42,474 --> 01:14:43,234

see.

Speaker:

01:14:43,294 --> 01:14:45,075

But I'm optimistic.

Speaker:

01:14:45,075 --> 01:14:48,716

I really do think we can solve climate

change.

Speaker:

01:14:50,934 --> 01:14:55,056

And I believe in the slow process of making things better.

Speaker:

01:14:55,557 --> 01:15:01,381

If you look at history on a long enough

term, you will find that almost everything

Speaker:

01:15:01,381 --> 01:15:08,826

is getting better in ways that are often

invisible, because bad things happen

Speaker:

01:15:08,826 --> 01:15:14,189

quickly and visibly, and good things

happen slowly and in the background.

Speaker:

01:15:14,770 --> 01:15:19,633

But my hope for the 21st century is that

we will continue to make slow, gradual

Speaker:

01:15:19,633 --> 01:15:20,593

progress

Speaker:

01:15:21,154 --> 01:15:23,916

and reach a good ending for everybody on the

planet.

Speaker:

01:15:23,916 --> 01:15:26,017

So that's what I want to work on.

Speaker:

01:15:26,017 --> 01:15:33,301

Yeah, I love the optimistic tone to close

out the show.

Speaker:

01:15:34,202 --> 01:15:38,164

And second question, if you could have

dinner with any great scientific mind,

Speaker:

01:15:38,164 --> 01:15:40,246

dead, alive, or fictional,

Speaker:

01:15:40,246 --> 01:15:42,107

Who would it be?

Speaker:

01:15:43,007 --> 01:15:45,308

I think I'm going to argue with the

question.

Speaker:

01:15:46,694 --> 01:15:51,137

I think it's based on this idea of great

scientific minds, which is a little bit

Speaker:

01:15:51,137 --> 01:15:56,260

related to the great person theory of

history, which is that big changes come

Speaker:

01:15:56,260 --> 01:15:59,862

from unique, special individuals.

Speaker:

01:16:00,423 --> 01:16:02,104

I'm not sure I buy it.

Speaker:

01:16:02,104 --> 01:16:06,667

I think the thing about science that is

exciting to me is that it is a social

Speaker:

01:16:06,667 --> 01:16:07,828

enterprise.

Speaker:

01:16:07,948 --> 01:16:10,509

It is intrinsically collaborative.

Speaker:

01:16:10,570 --> 01:16:12,150

It is cumulative.

Speaker:

01:16:17,462 --> 01:16:21,825

Making a large contribution, I think, very

often comes down to being the right person in the right

Speaker:

01:16:21,825 --> 01:16:23,506

place at the right time.

Speaker:

01:16:24,167 --> 01:16:28,450

And I think often they deserve that

recognition.

Speaker:

01:16:29,351 --> 01:16:32,573

But even then, I'm going to say it's the

system.

Speaker:

01:16:32,874 --> 01:16:36,336

It's the social enterprise of science that

makes progress.

Speaker:

01:16:36,977 --> 01:16:41,020

So that's, I want to have dinner with the

social enterprise of science.

Speaker:

01:16:42,081 --> 01:16:44,503

Well, you call me if you know how to do

that.

Speaker:

01:16:45,424 --> 01:16:46,245

But yeah.

Speaker:

01:16:46,245 --> 01:16:46,945

I mean.

Speaker:

01:16:47,126 --> 01:16:52,149

Joking aside, I completely agree with you

and I think also it's a very good reminder

Speaker:

01:16:52,490 --> 01:16:57,074

to say it right now because we're

recording very close to the time where

Speaker:

01:16:57,074 --> 01:17:05,341

Nobel prizes are awarded, and yeah, these prizes

participate in the fame game, making

Speaker:

01:17:05,341 --> 01:17:11,846

science basically kind of like another

movie industry or industries like that, which are

Speaker:

01:17:11,846 --> 01:17:15,369

driven by just fame.

Speaker:

01:17:16,598 --> 01:17:19,018

and all that comes with it.

Speaker:

01:17:19,018 --> 01:17:23,179

And yeah, I completely agree that this is

especially a big problem in science

Speaker:

01:17:23,179 --> 01:17:29,481

because scientists are often specialized

in a very small part of their field.

Speaker:

01:17:29,481 --> 01:17:36,843

And usually for me, it's a red flag, and

that happened a lot in COVID, where some

Speaker:

01:17:36,843 --> 01:17:41,464

scientists started talking about

epidemiology, whereas it was not their

Speaker:

01:17:43,305 --> 01:17:43,965

specialty.

Speaker:

01:17:43,965 --> 01:17:44,565

And

Speaker:

01:17:45,622 --> 01:17:48,744

To me, usually that's a red flag, but the

problem is that if they are very

Speaker:

01:17:48,744 --> 01:17:53,768

well-known scientists who may end up

having the Nobel Prize, well, then

Speaker:

01:17:53,768 --> 01:17:56,450

everybody listens to them, even though

they probably shouldn't.

Speaker:

01:17:56,811 --> 01:18:02,615

When you rely too much on fame and

popularity, that's a huge problem.

Speaker:

01:18:02,715 --> 01:18:11,042

Just trying to make heroes is a big

problem because it helps from a narrative

Speaker:

01:18:11,042 --> 01:18:13,804

perspective to make people interested in

science.

Speaker:

01:18:16,214 --> 01:18:19,414

basically that people start learning about

them.

Speaker:

01:18:19,514 --> 01:18:23,755

But there is a limit where it also

discourages people.

Speaker:

01:18:24,696 --> 01:18:30,217

Because, you know, if it's that hard, if

you have to be that smart, if you have to

Speaker:

01:18:30,217 --> 01:18:37,959

be Einstein or Oppenheimer or any of these

big or Laplace, you know, then it's just

Speaker:

01:18:37,959 --> 01:18:40,600

like, you don't even want to start.

Speaker:

01:18:46,206 --> 01:18:47,306

working on this.

Speaker:

01:18:47,306 --> 01:18:54,088

And that's a big problem because as you're

saying, scientific progress

Speaker:

01:18:54,088 --> 01:18:59,469

is small incremental steps done by

a community that works together.

Speaker:

01:19:00,390 --> 01:19:04,691

And there is competition of course, but

that really works together.

Speaker:

01:19:04,691 --> 01:19:11,893

And yeah, if you start implying that most

of that is just you have to be a once in a

Speaker:

01:19:11,893 --> 01:19:14,393

century genius to make science.

Speaker:

01:19:14,478 --> 01:19:19,580

We're going to have problems, especially

HR problems in the universities.

Speaker:

01:19:19,580 --> 01:19:21,561

So yeah, no, you don't need that.

Speaker:

01:19:21,561 --> 01:19:28,563

And also you're right that if you look

into the previous work, like even for

Speaker:

01:19:28,563 --> 01:19:33,986

Einstein, the idea of relativity was

already there at the time.

Speaker:

01:19:34,306 --> 01:19:40,188

If you look at some writings from

Poincaré, one of the main French

Speaker:

01:19:40,188 --> 01:19:43,049

mathematicians of the 20th century.

Speaker:

01:19:43,114 --> 01:19:47,996

Poincaré, just a few years before

Einstein is already talking about this

Speaker:

01:19:47,996 --> 01:19:50,978

idea of relativity and you can see the

equations also in one of his books

Speaker:

01:19:50,978 --> 01:19:52,959

previous to Einstein's publications.

Speaker:

01:19:52,959 --> 01:20:00,163

So it's like often it's, as you were

saying, an incredible person that's also

Speaker:

01:20:00,163 --> 01:20:08,227

here at the right time, at the right

place, who is immersed in the ideas of his time.

Speaker:

01:20:08,308 --> 01:20:10,729

So that's also very important to

highlight.

Speaker:

01:20:10,729 --> 01:20:12,029

I completely agree with that.

Speaker:

01:20:13,362 --> 01:20:17,745

Yeah, in almost every case that you look

at, if you ask the question, if this

Speaker:

01:20:17,745 --> 01:20:22,828

person had not done X, when would it have

happened?

Speaker:

01:20:22,848 --> 01:20:24,549

Or who else might have done it?

Speaker:

01:20:24,549 --> 01:20:30,653

And almost every time the ideas were

there, they would have come together.

Speaker:

01:20:30,653 --> 01:20:36,997

Yeah, maybe a bit later, or even maybe a

bit earlier, we never know.

Speaker:

01:20:36,997 --> 01:20:41,040

But yeah, that's definitely the case.

Speaker:

01:20:41,060 --> 01:20:43,166

And I think the best

Speaker:

01:20:43,166 --> 01:20:49,529

proxy to the dinner we wanted to have is

to have a dinner with the LBS community.

Speaker:

01:20:49,749 --> 01:20:54,652

So we should organize that, you know, like

an LBS dinner where everybody can join.

Speaker:

01:20:55,472 --> 01:20:57,053

That would actually be very fun.

Speaker:

01:20:57,053 --> 01:20:58,414

Maybe one day I'll get to do that.

Speaker:

01:20:58,414 --> 01:21:05,918

One of my wildest dreams is to organize a,

you know, live episode somewhere where

Speaker:

01:21:05,918 --> 01:21:12,241

people could come join the show live and

have a live audience and so on.

Speaker:

01:21:13,206 --> 01:21:15,107

We'll see if I can do that one day.

Speaker:

01:21:15,788 --> 01:21:19,991

If you have ideas or opportunities, feel

free to let me know.

Speaker:

01:21:19,991 --> 01:21:22,953

And I think about it.

Speaker:

01:21:25,022 --> 01:21:25,542

Awesome.

Speaker:

01:21:25,542 --> 01:21:27,642

Allen, let's call it a show.

Speaker:

01:21:27,642 --> 01:21:30,083

I could really record with you for like

three hours.

Speaker:

01:21:30,083 --> 01:21:36,125

I literally still have a lot of questions

on my cheat sheet, but let's call it a

Speaker:

01:21:36,125 --> 01:21:41,846

show and allow you to go to your main

activities for the day.

Speaker:

01:21:41,846 --> 01:21:44,407

So thank you a lot, Allen.

Speaker:

01:21:44,527 --> 01:21:48,128

As I was saying, I put a lot of resources

and a link to your website in the show

Speaker:

01:21:48,128 --> 01:21:49,968

notes for those who want to dig deeper.

Speaker:

01:21:50,289 --> 01:21:53,449

Thanks again, Allen, for taking the time

and being on this show.

Speaker:

01:21:54,210 --> 01:21:54,670

Thank you.

Speaker:

01:21:54,670 --> 01:21:55,470

It's been really great.

Speaker:

01:21:55,470 --> 01:21:58,371

It's always a pleasure to talk with you.

Speaker:

01:21:58,371 --> 01:21:58,612

Yeah.

Speaker:

01:21:58,612 --> 01:22:02,793

Feel free to come back to the show and

answer the last two questions for a third

Speaker:

01:22:02,793 --> 01:22:03,534

time.
