Learning Bayesian Statistics

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

If there is one guest I don’t need to introduce, it’s mister Andrew Gelman. So… I won’t! I will refer you back to his two previous appearances on the show though, because learning from Andrew is always a pleasure. So go ahead and listen to episodes 20 and 27.

In this episode, Andrew and I discuss his new book, Active Statistics, which focuses on teaching and learning statistics through active student participation. Like this episode, the book is divided into three parts: 1) The ideas of statistics, regression, and causal inference; 2) The value of storytelling to make statistical concepts more relatable and interesting; 3) The importance of teaching statistics in an active learning environment, where students are engaged in problem-solving and discussion.

And Andrew is so active and knowledgeable that we of course touched on a variety of other topics — but for that, you’ll have to listen 😉

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary and Blake Walters.

Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag 😉

Takeaways:

– Active learning is essential for teaching and learning statistics.

– Storytelling can make statistical concepts more relatable and interesting.

– Teaching statistics in an active learning environment engages students in problem-solving and discussion.

– The book Active Statistics includes 52 stories, class participation activities, computer demonstrations, and homework assignments to facilitate active learning.

– Active learning, where students actively engage with the material through activities and discussions, is an effective approach to teaching statistics.

– The flipped classroom model, where students read and prepare before class and engage in problem-solving activities during class, can enhance learning and understanding.

– Clear organization and fluency in teaching statistics are important for student comprehension and engagement.

– Visualization plays a crucial role in understanding statistical concepts and aids in comprehension.

– The future of statistical education may involve new approaches and technologies, but the challenge lies in finding effective ways to teach basic concepts and make them relevant to real-world problems.

Chapters:

00:00 Introduction and Background

08:09 The Importance of Stories in Statistics Education

30:28 Using ‘Two Truths and a Lie’ to Teach Logistic Regression

38:08 The Power of Storytelling in Teaching Statistics

57:26 The Importance of Visualization in Understanding Statistics

01:07:03 The Future of Statistical Education

Links from the show:

Transcript

This is an automatic transcript and may therefore contain errors. Please get in touch if you’re willing to correct them.


If there is one guest I don't need to

introduce, it is Mr.

2

00:00:09,242 --> 00:00:10,782

Andrew Gelman.

3

00:00:10,782 --> 00:00:12,552

So I won't.

4

00:00:12,552 --> 00:00:18,062

I will refer you back to his two previous

appearances on the show, though, because

5

00:00:18,062 --> 00:00:20,042

learning from Andrew is always a pleasure.

6

00:00:20,042 --> 00:00:23,942

So go ahead and listen to episodes 20 and

27.

7

00:00:23,942 --> 00:00:25,982

The links are in the show notes.

8

00:00:25,982 --> 00:00:30,190

In this episode, Andrew and I discuss his

new book, Active Statistics,

9

00:00:30,190 --> 00:00:33,830

which focuses on teaching and learning

statistics through active student

10

00:00:33,830 --> 00:00:34,890

participation.

11

00:00:34,890 --> 00:00:37,910

Like this episode, the book is divided

into three parts.

12

00:00:37,910 --> 00:00:41,930

One, the ideas of statistics, regression,

and causal inference.

13

00:00:41,930 --> 00:00:46,730

Two, the value of storytelling to make

statistical concepts more relatable and

14

00:00:46,730 --> 00:00:47,510

interesting.

15

00:00:47,510 --> 00:00:51,450

And three, the importance of teaching

statistics in an active learning

16

00:00:51,450 --> 00:00:56,370

environment where students are engaged in

problem solving and discussion.

17

00:00:56,830 --> 00:00:59,886

And well, Andrew is so active and

knowledgeable,

18

00:00:59,886 --> 00:01:05,766

that we of course touched on a variety of

other topics, but for that, you'll have to

19

00:01:05,766 --> 00:01:06,466

listen.

20

00:01:06,466 --> 00:01:13,066

This is Learning Bayesian Statistics, episode

106, recorded April 2, 2024.

21

00:01:26,510 --> 00:01:34,510

Welcome to Learning Bayesian Statistics, a

podcast about Bayesian inference, the

22

00:01:34,510 --> 00:01:38,090

methods, the projects, and the people who

make it possible.

23

00:01:38,090 --> 00:01:40,320

I'm your host, Alex Andorra.

24

00:01:40,320 --> 00:01:44,690

You can follow me on Twitter at

alex_andorra, like the country.

25

00:01:44,690 --> 00:01:48,950

For any info about the show,

learnbayesstats.com is Laplace to be.

26

00:01:48,950 --> 00:01:53,790

Show notes, becoming a corporate sponsor,

unlocking Bayesian merch, supporting the

27

00:01:53,790 --> 00:01:56,430

show on Patreon, everything is in there.

28

00:01:56,430 --> 00:01:58,270

That's learnbayesstats.com.

29

00:01:58,270 --> 00:02:02,710

If you're interested in one-on-one

mentorship, online courses, or statistical

30

00:02:02,710 --> 00:02:07,890

consulting, feel free to reach out and

book a call at

31

00:02:07,890 --> 00:02:09,770

topmate.io/alex_andorra.

32

00:02:09,770 --> 00:02:13,638

See you around, folks, and best Bayesian

wishes to you all.

33

00:02:37,038 --> 00:02:44,158

on LBS now, so for curious listeners, I

definitely recommend episode 20, which was

34

00:02:44,158 --> 00:02:46,818

your first one with Andrew Gelman.

35

00:02:46,818 --> 00:02:47,998

Yes, you were here.

36

00:02:47,998 --> 00:02:53,277

And with Aki Vehtari and Jennifer Hill, it

was about your previous book, Regression

37

00:02:53,277 --> 00:02:54,738

and Other Stories.

38

00:02:54,798 --> 00:03:03,338

And then episode 27 with Merlin

Heidemanns, where we talked about the 2020

39

00:03:03,338 --> 00:03:05,938

US presidential elections.

40

00:03:06,030 --> 00:03:11,150

We talked about the model you folks did

for The Economist.

41

00:03:11,510 --> 00:03:15,610

So definitely recommend checking this one

out because I'm guessing this is going to

42

00:03:15,610 --> 00:03:18,410

be interesting also for this year's

election.

43

00:03:18,410 --> 00:03:21,970

Yeah, we're working with them for 2024 as

well.

44

00:03:21,970 --> 00:03:24,490

So we're trying to improve the model.

45

00:03:24,910 --> 00:03:25,590

Perfect.

46

00:03:25,590 --> 00:03:25,970

Yeah.

47

00:03:25,970 --> 00:03:30,690

So it seems like you're releasing a book

every four years just before the US

48

00:03:30,690 --> 00:03:31,850

election.

49

00:03:34,958 --> 00:03:38,598

I hope it won't be four years before the

next book comes out.

50

00:03:38,598 --> 00:03:41,858

We're trying to finish our Bayesian

workflow book.

51

00:03:41,858 --> 00:03:46,978

So we're hoping that will be done by the

end of the year.

52

00:03:47,638 --> 00:03:50,738

Well, yeah, definitely curious to check

this one out.

53

00:03:50,738 --> 00:03:56,398

I think I also saw that you're working on

an MRP update book.

54

00:03:56,398 --> 00:03:58,318

Is that still the case?

55

00:03:58,458 --> 00:04:03,918

Yeah, I think Yajuan and some Lauren...

56

00:04:03,918 --> 00:04:11,678

Uh, Kennedy and some other people are

organizing this, um, uh, MRP book edited

57

00:04:11,678 --> 00:04:13,518

book we're putting together.

58

00:04:13,518 --> 00:04:14,258

Yeah.

59

00:04:14,258 --> 00:04:16,498

Um, I will definitely check these out.

60

00:04:16,498 --> 00:04:20,898

Well, writing books is a lot of fun

because you can write whatever you want

61

00:04:20,898 --> 00:04:23,038

because you're trying to communicate with

the audience.

62

00:04:23,038 --> 00:04:29,018

When you write an article, you're trying

to communicate with the reviewers who

63

00:04:29,018 --> 00:04:30,038

aren't the readers.

64

00:04:30,038 --> 00:04:32,178

It's a very weird indirect thing.

65

00:04:32,178 --> 00:04:32,494

It's.

66

00:04:32,494 --> 00:04:37,294

I guess similarly, if you're trying to

write a TV show, you have to convince the

67

00:04:37,294 --> 00:04:41,014

TV network to produce the show, but

they're not the people who are watching it

68

00:04:41,014 --> 00:04:43,114

and articles are like that too.

69

00:04:43,114 --> 00:04:44,454

But a book is so simple.

70

00:04:44,454 --> 00:04:47,214

You just write a book and you're just

aiming to reach people.

71

00:04:47,214 --> 00:04:48,894

It's very pleasant.

72

00:04:48,914 --> 00:04:50,974

I recommend it.

73

00:04:51,514 --> 00:04:51,894

Yeah.

74

00:04:51,894 --> 00:04:56,134

I can see that it's something you really

enjoy because you're such a prolific

75

00:04:56,134 --> 00:04:57,734

author.

76

00:04:58,554 --> 00:04:59,574

Yeah.

77

00:04:59,574 --> 00:05:00,878

I am.

78

00:05:00,878 --> 00:05:06,118

Personally, I use MRP quite a lot and

often, so I'm definitely super curious to

79

00:05:06,118 --> 00:05:08,728

see what's going to be in this book.

80

00:05:08,728 --> 00:05:12,618

I'm sure I'm going to learn things

personally, and that's also going to help

81

00:05:12,618 --> 00:05:17,078

me teach MRP, which I'm doing from time to

time.

82

00:05:17,098 --> 00:05:17,978

Thanks a lot.

83

00:05:17,978 --> 00:05:23,678

We have a research project I'm very

excited about now, which is integrating

84

00:05:23,678 --> 00:05:26,718

survey weights into MRPs.

85

00:05:26,738 --> 00:05:28,334

So people do it now, though.

86

00:05:28,334 --> 00:05:32,094

They'll think they'll run weighted

regression or they'll do like in, they'll

87

00:05:32,094 --> 00:05:36,894

have the model in Stan and use power

likelihood, but it's not really quite

88

00:05:36,894 --> 00:05:37,214

right.

89

00:05:37,214 --> 00:05:41,174

So we have what I think is a better

approach, but that's not what you have me

90

00:05:41,174 --> 00:05:42,044

here today, right?

91

00:05:42,044 --> 00:05:46,014

Here I'm supposed to talk about our Active

Statistics book, my new book with Aki.

92

00:05:46,014 --> 00:05:46,854

Yeah, yeah.

93

00:05:46,854 --> 00:05:48,214

Yeah, exactly.

94

00:05:48,214 --> 00:05:54,474

I would, we can put whatever you want, but

yeah, the main focus is going to be your

95

00:05:54,474 --> 00:05:55,278

new book.

96

00:05:55,278 --> 00:05:58,918

Active Statistics with Aki Vehtari.

97

00:05:59,978 --> 00:06:09,518

And yeah, so maybe can you give us an idea

of the genesis of the book and thanks for

98

00:06:09,518 --> 00:06:11,998

showing up the book on the video.

99

00:06:11,998 --> 00:06:14,018

So those watching on YouTube.

100

00:06:14,298 --> 00:06:19,618

So it's for people learning statistics or

teaching statistics.

101

00:06:19,618 --> 00:06:22,862

So the story is that everybody says you

102

00:06:22,862 --> 00:06:27,322

want to do active learning, so students

should be working together; class

103

00:06:27,322 --> 00:06:32,802

time should be an active time for students

to be thinking about problems, discussing

104

00:06:32,802 --> 00:06:33,962

problems.

105

00:06:33,962 --> 00:06:38,982

I notice so what.

106

00:06:39,062 --> 00:06:46,562

Okay, I teach a class based on Regression

and Other Stories, and it's two semesters and

107

00:06:46,562 --> 00:06:50,682

each semester is 13 weeks and each week

has two classes.

108

00:06:50,682 --> 00:06:52,654

So that's 52 classes.

109

00:06:52,654 --> 00:06:59,354

And we cover the book. Every class is an

hour and a half long, or I guess, 75

110

00:06:59,354 --> 00:07:01,234

minutes long. And each class,

111

00:07:01,234 --> 00:07:07,414

I have a story, a class participation

activity, a computer, a computer

112

00:07:07,414 --> 00:07:12,774

demonstration, some quick drills for

students to work on in class, and then the

113

00:07:12,774 --> 00:07:16,794

discussion problem for students to talk

about and think more.

114

00:07:17,134 --> 00:07:21,814

I don't always have time in every class to

do all of these, but sometimes I do and I

115

00:07:21,814 --> 00:07:24,834

can always do most of them.

116

00:07:24,854 --> 00:07:32,174

I found when I had been teaching

statistics, I told stories a lot, but what

117

00:07:32,174 --> 00:07:37,494

happened, it's tricky to tell a story,

partly because for other, not every

118

00:07:37,494 --> 00:07:41,054

teacher has a lot of experience, so they

don't always have a lot of good stories.

119

00:07:41,054 --> 00:07:41,390

So,

120

00:07:41,390 --> 00:07:43,000

So, okay, so our book, it's okay.

121

00:07:43,000 --> 00:07:47,250

Our book is 52 stories, 52 class

participation activities, 52 computer

122

00:07:47,250 --> 00:07:49,590

demonstrations, et cetera, one for each

class.

123

00:07:49,590 --> 00:07:54,490

So, first, these are 52 stories that are

pretty good that I've come up with or that

124

00:07:54,490 --> 00:07:57,250

Aki and I have encountered in our careers.

125

00:07:57,250 --> 00:08:02,210

So, they're high-quality stories, but

also when you tell a story, when I tell a

126

00:08:02,210 --> 00:08:05,210

story in class, sometimes it gets a little

disorganized.

127

00:08:05,210 --> 00:08:06,280

So, it worked good.

128

00:08:06,280 --> 00:08:09,126

It worked well to write the stories down.

129

00:08:09,198 --> 00:08:15,638

And for each story, we very explicitly say

how it connects to the week's topic, the

130

00:08:15,638 --> 00:08:19,078

week's reading, and also how it connects

to the course as a whole.

131

00:08:19,078 --> 00:08:21,208

And I felt that had been missing before.

132

00:08:21,208 --> 00:08:24,578

It wasn't hard for me to tell an

entertaining story with statistical

133

00:08:24,578 --> 00:08:29,358

content, but I wasn't always making that

connection with what was happening in

134

00:08:29,358 --> 00:08:29,858

class.

135

00:08:29,858 --> 00:08:34,878

So I feel that if you're a student and you

want to learn statistics, you can read

136

00:08:34,878 --> 00:08:36,301

these stories and...

137

00:08:36,301 --> 00:08:37,541

They're great little stories.

138

00:08:37,541 --> 00:08:41,281

There aren't a lot of sources for

statistics stories out there.

139

00:08:41,281 --> 00:08:44,141

Textbooks tend to have boring examples.

140

00:08:44,141 --> 00:08:47,921

They want to set it up like here's how to

turn the crank.

141

00:08:47,921 --> 00:08:51,831

Sometimes textbooks tell stories, but they

don't tell them well.

142

00:08:51,831 --> 00:08:55,461

And I'll give you an example of that in a

moment.

143

00:08:55,921 --> 00:08:58,561

There isn't really anything like this.

144

00:08:58,561 --> 00:09:01,281

And so maybe we should have just had a

little book.

145

00:09:01,281 --> 00:09:03,111

Our book is, how long is it?

146

00:09:03,111 --> 00:09:05,549

It's three and two fifty pages long.

147

00:09:05,549 --> 00:09:10,229

Maybe we should have had just a book that

was like 50 or 100 pages long with just

148

00:09:10,229 --> 00:09:13,169

the stories, because that already is

great.

149

00:09:13,169 --> 00:09:16,169

Maybe it should have been several

pamphlets rather than one book.

150

00:09:16,169 --> 00:09:19,389

Then we have class participation

activities.

151

00:09:19,389 --> 00:09:22,349

These are things where the class gets

involved.

152

00:09:22,349 --> 00:09:25,965

They're filling out survey forms, or

153

00:09:25,965 --> 00:09:31,125

they're doing an experiment on each other

or we do an experiment on them or they're

154

00:09:31,125 --> 00:09:35,765

weighing bags of things and trying to get

estimates, they're flipping coins.

155

00:09:35,765 --> 00:09:37,765

I love these.

156

00:09:37,765 --> 00:09:41,485

Deb Nolan and I had a book a few years

ago, Teaching Statistics, A Bag of Tricks,

157

00:09:41,485 --> 00:09:45,285

which had a few activities, but this is a

million times better.

158

00:09:45,285 --> 00:09:50,565

First, we didn't have 52 activities, but

also these are lined up with the course.

159

00:09:50,565 --> 00:09:52,045

So they go in sequence.

160

00:09:52,045 --> 00:09:54,105

So they're not just fun things to do.

161

00:09:54,105 --> 00:09:56,545

They're things that line up with

particular lessons.

162

00:09:56,545 --> 00:10:01,725

And I just love that people tell me

they'll say, Oh, I liked your book and I

163

00:10:01,725 --> 00:10:04,045

used one of your activities in one of my

classes.

164

00:10:04,045 --> 00:10:08,264

And it makes you want to scream and like,

you know, throw something at the TV or

165

00:10:08,264 --> 00:10:09,505

punch the wall or whatever.

166

00:10:09,505 --> 00:10:13,765

I want you to do it in every class, every

class should have an activity or at least

167

00:10:13,765 --> 00:10:15,165

most of the time.

168

00:10:15,525 --> 00:10:19,745

So that was a lot of effort because we had

a bunch, but a bunch of them, like we just

169

00:10:19,745 --> 00:10:21,357

created from scratch.

170

00:10:21,357 --> 00:10:23,057

We need an activity for this.

171

00:10:23,057 --> 00:10:23,917

And that's really great.

172

00:10:23,917 --> 00:10:27,557

So that could have been its own pamphlet,

another 50 pages.

173

00:10:27,557 --> 00:10:30,227

Then we have computer demonstrations.

174

00:10:30,227 --> 00:10:33,897

And I find that live demos are great.

175

00:10:33,897 --> 00:10:38,377

But if you try to do it from scratch, you

get tangled in the code.

176

00:10:38,377 --> 00:10:41,437

So it's good to have pre-written live

demos.

177

00:10:41,877 --> 00:10:44,257

And so that's like to say you should have

a demo.

178

00:10:44,257 --> 00:10:45,797

And it's surprisingly hard.

179

00:10:45,797 --> 00:10:49,917

You create even something simple, simulate

fake data and run a regression.

180

00:10:50,029 --> 00:10:53,049

You have to have good values of the

parameters or else you're not really

181

00:10:53,049 --> 00:10:54,549

demonstrating the point you want to make.

182

00:10:54,549 --> 00:10:57,009

If it has some curvature, how much to

have.

183

00:10:57,009 --> 00:11:00,809

So we tested them out and did them in

class.
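
The point about needing good parameter values in a fake-data demo can be sketched as follows. This is a minimal illustration only, assuming Python's standard library rather than the book's own demo code, which is not shown in this episode. The function name `simulate_and_fit` and the values of `a`, `b`, and `sigma` are hypothetical choices made for this sketch.

```python
import random
import statistics


def simulate_and_fit(a=0.2, b=0.3, sigma=0.5, n=100, seed=1):
    """Simulate y = a + b*x + noise, then fit a line by least squares.

    All default values are illustrative assumptions, not taken from the book.
    Returns the estimated intercept, slope, and residual standard deviation.
    """
    rng = random.Random(seed)
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [a + b * xi + rng.gauss(0, sigma) for xi in x]

    # Closed-form least squares for a single predictor.
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * x_bar

    # Residual standard deviation, using n - 2 degrees of freedom.
    rss = sum((yi - (a_hat + b_hat * xi)) ** 2 for xi, yi in zip(x, y))
    sigma_hat = (rss / (n - 2)) ** 0.5
    return a_hat, b_hat, sigma_hat


# Noise small relative to the trend: the fit should land near the true a and b.
print(simulate_and_fit(sigma=0.5))

# Same true line, but noise this large can swamp the slope,
# so the demo may no longer show the point it was meant to make.
print(simulate_and_fit(sigma=20.0))
```

Running the two calls side by side is one way to see, before class, whether a chosen set of parameter values actually makes the intended point.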

184

00:11:00,889 --> 00:11:04,929

And so that way when I teach, I can always

have a live demo, which is everybody's

185

00:11:04,929 --> 00:11:07,749

favorite part of class and so forth with

the others.

186

00:11:07,749 --> 00:11:12,789

And then we have some homework assignments

and we have some chapters at the beginning

187

00:11:12,789 --> 00:11:16,769

where we talk about how to set up the

class and how to learn better.

188

00:11:16,769 --> 00:11:19,597

It's not really just for teachers, as I

said, it should be for

189

00:11:19,597 --> 00:11:20,497

students.

190

00:11:20,497 --> 00:11:22,957

So that's what's in it.

191

00:11:22,957 --> 00:11:24,747

Yeah, well, thanks a lot, Andrew.

192

00:11:24,747 --> 00:11:30,097

I already have a lot of follow-up

questions for you.

193

00:11:30,497 --> 00:11:38,257

But something also you've told me in

preparing the episode is that you have

194

00:11:38,257 --> 00:11:43,637

thought about the book in three distinct

parts.

195

00:11:43,637 --> 00:11:48,537

All right, so first one being the idea of

statistics, regression and causal

196

00:11:48,537 --> 00:11:49,325

inference.

197

00:11:49,325 --> 00:11:54,485

Then another pillar would be like using

stories to explain statistics.

198

00:11:54,485 --> 00:11:59,045

And the third pillar would be the method

of teaching with active student

199

00:11:59,045 --> 00:12:00,305

participation.

200

00:12:00,465 --> 00:12:08,165

So why did you choose these three

different pillars and how do you think

201

00:12:08,165 --> 00:12:13,665

they are helping an active learning of

statistics, which is one of the goals of

202

00:12:13,665 --> 00:12:14,785

your book?

203

00:12:15,305 --> 00:12:16,397

So.

204

00:12:16,397 --> 00:12:19,047

Teaching or learning is like a vector.

205

00:12:19,047 --> 00:12:21,457

It has a magnitude and a direction.

206

00:12:21,457 --> 00:12:25,657

And the magnitude is how hard you work to

figure stuff out.

207

00:12:25,657 --> 00:12:30,317

And the direction is what you're learning.

208

00:12:30,317 --> 00:12:34,977

So yeah, I think applied regression and

causal inference is super important.

209

00:12:35,197 --> 00:12:41,907

The typical audience for this book would

be students who took one statistics class.

210

00:12:41,907 --> 00:12:45,709

Maybe they already took statistics in high

school, or at university

211

00:12:45,709 --> 00:12:50,329

took that one class where they learned

about sampling and experimenting and

212

00:12:50,329 --> 00:12:54,429

estimation, intervals, normal

distribution, stuff like this.

213

00:12:54,429 --> 00:12:57,609

This is all about using it, about going

beyond that.

214

00:12:57,609 --> 00:12:59,709

So, yeah, I think applied statistics is

great.

215

00:12:59,709 --> 00:13:05,169

I want to teach regression about, like,

the most important thing is understanding

216

00:13:05,169 --> 00:13:07,229

the model and being able to use it.

217

00:13:07,229 --> 00:13:11,109

Not so much the mathematical theorem

about…

218

00:13:11,245 --> 00:13:13,315

least squares estimation.

219

00:13:13,315 --> 00:13:14,425

That's important too.

220

00:13:14,425 --> 00:13:16,985

There's other places to learn that.

221

00:13:17,805 --> 00:13:23,365

So yeah, the direction is that it's

applied statistics.

222

00:13:23,985 --> 00:13:29,025

I think the magnitude is about how to make

that work, how to get people to learn.

223

00:13:29,125 --> 00:13:35,025

And so most of the learning is not done in

class, but at least if students are doing

224

00:13:35,025 --> 00:13:36,333

these activities,

225

00:13:36,333 --> 00:13:39,833

in class that the hour and a half or the

three hours a week they're spending in

226

00:13:39,833 --> 00:13:43,333

class, they are already heavily thinking

about it.

227

00:13:43,333 --> 00:13:46,953

Which, and you know, I just like, it's

kind of horrible for the students because

228

00:13:46,953 --> 00:13:47,893

you really make them work.

229

00:13:47,893 --> 00:13:49,973

It's like teaching a foreign language

class, right?

230

00:13:49,973 --> 00:13:53,493

If you go and take a usual class in

college, you sit in the back and you zone

231

00:13:53,493 --> 00:13:55,613

out and you're like, oh, this is pleasant.

232

00:13:55,613 --> 00:13:58,053

It's like watching a movie, maybe.

233

00:13:58,053 --> 00:14:01,133

But if you're in a foreign language class,

you're working all the time, right?

234

00:14:01,133 --> 00:14:03,473

The teacher's always making you talk and

listen.

235

00:14:03,473 --> 00:14:05,933

If you lose focus for a second, it's

236

00:14:05,933 --> 00:14:10,533

difficult. Statistics is a foreign language,

and you can learn by speaking it and

237

00:14:10,533 --> 00:14:16,093

practicing it. So I think it's important in

class to be able to do that, or if you're

238

00:14:16,093 --> 00:14:23,173

studying at home, to have these activities

and stories. That there isn't, I mean, it's

239

00:14:23,173 --> 00:14:27,073

and of course the computer I'll say like

my computer code is pretty bad.

240

00:14:27,073 --> 00:14:28,143

So that's good, right?

241

00:14:28,143 --> 00:14:29,713

Because that's like student code.

242

00:14:29,713 --> 00:14:31,073

It's all crappy code.

243

00:14:31,073 --> 00:14:35,753

So it's realistic. I know it's not the

world's cleanest always

244

00:14:36,237 --> 00:14:39,087

I would say, but it runs, but maybe it

doesn't all run either.

245

00:14:39,087 --> 00:14:41,717

It ran when I wrote it.

246

00:14:41,757 --> 00:14:46,657

But it's supposed to be, when I do code

demos in class, what I like to do is

247

00:14:46,657 --> 00:14:49,657

actually type in the code, not copy and

paste it.

248

00:14:49,657 --> 00:14:51,927

So that's modeling how someone might do

it.

249

00:14:51,927 --> 00:14:56,277

So we try to keep them short enough that

you can do that.

250

00:14:59,245 --> 00:15:03,745

Yeah, thanks a lot.

251

00:15:03,745 --> 00:15:09,365

I see what you're doing and I really

appreciate it because that's also helping

252

00:15:09,365 --> 00:15:22,865

me in my own teaching philosophy because I

do have the same experience where the

253

00:15:22,865 --> 00:15:27,325

students who end up learning the most are

usually the most active ones.

254

00:15:28,109 --> 00:15:32,789

but then the main question is, okay, how

do I make them all active?

255

00:15:32,789 --> 00:15:36,209

Or at least give them the opportunity to

all be active.

256

00:15:36,209 --> 00:15:38,409

And that's really one of the things.

257

00:15:38,409 --> 00:15:41,249

Yeah, when I teach, I make them talk.

258

00:15:41,249 --> 00:15:45,749

Like even it could be a class with 50 or

more students, but I'll tell the story and

259

00:15:45,749 --> 00:15:48,459

then I'll pause and then say, well, what

do you think?

260

00:15:48,459 --> 00:15:49,789

Talk to your neighbor about this.

261

00:15:49,789 --> 00:15:51,479

And I look and I make sure they're

talking.

262

00:15:51,479 --> 00:15:55,109

And if they're not talking, then I walk

over and say, you know, I go like this to

263

00:15:55,109 --> 00:15:55,597

them.

264

00:15:55,597 --> 00:16:01,357

and if their computer is out, I look, and

if they're on their social media, I ask

265

00:16:01,357 --> 00:16:04,257

them to close their computer and if their

phone is out, I ask them to close their

266

00:16:04,257 --> 00:16:05,957

phone and so forth.

267

00:16:06,017 --> 00:16:09,937

The funny thing is as a teacher, that's

hard, it's easier as a teacher to just

268

00:16:09,937 --> 00:16:12,657

talk and talk and talk and talk, like I'm

talking now, I'm just talking.

269

00:16:12,657 --> 00:16:16,197

It's easy to talk and you have complete

control over it.

270

00:16:16,197 --> 00:16:20,457

So that's why I really needed to structure

this in this way.

271

00:16:20,457 --> 00:16:23,405

That was my original motivation for all of

this.

272

00:16:23,405 --> 00:16:28,365

was that many years ago I was teaching a

class and I couldn't make it because I had

273

00:16:28,365 --> 00:16:33,145

my co-teacher, another faculty member in

the department was teaching the same level

274

00:16:33,145 --> 00:16:37,345

class, teach my class, and then I went and

taught hers and she said, oh, your

275

00:16:37,345 --> 00:16:38,765

students were just dead.

276

00:16:38,765 --> 00:16:43,265

And then I talked to her class and they

were so lively and I realized not that she

277

00:16:43,265 --> 00:16:48,065

was lucky, but that they had been in that

habit of participating in class.

278

00:16:48,065 --> 00:16:50,115

She's just a natural great teacher.

279

00:16:50,115 --> 00:16:51,985

I'm naturally not a good teacher.

280

00:16:51,985 --> 00:16:53,229

And so I...

281

00:16:53,229 --> 00:16:58,429

do this shtick in order to get them

involved.

282

00:16:58,829 --> 00:17:00,729

And then I just wanted to do it well.

283

00:17:00,729 --> 00:17:06,069

I want to tell stories, but I want to be

able to make the point, to help them learn

284

00:17:06,069 --> 00:17:07,049

it.

285

00:17:08,029 --> 00:17:13,789

Yeah, that's interesting because me, when

the teachers were doing that to me, it's

286

00:17:13,789 --> 00:17:15,989

because I was talking too much.

287

00:17:16,869 --> 00:17:19,289

That happened quite a lot.

288

00:17:19,309 --> 00:17:22,189

Maybe that's why I have a podcast now.

289

00:17:22,669 --> 00:17:25,569

Apart from these philosophical

considerations.

290

00:17:27,089 --> 00:17:28,799

Yeah, that's very interesting.

291

00:17:28,799 --> 00:17:31,029

I'm going to try that in my own classes.

292

00:17:31,449 --> 00:17:37,709

The thing is I personally teach a lot of

online courses and so I cannot be there and

293

00:17:37,709 --> 00:17:38,869

see the screens.

294

00:17:38,869 --> 00:17:39,579

So that's pretty hard.

295

00:17:39,579 --> 00:17:41,089

Yeah, it's tough.

296

00:17:41,089 --> 00:17:46,509

I remember when I was doing the class over

Zoom and you could try to put them in a

297

00:17:46,509 --> 00:17:50,689

little room so they work in pairs, but yet

if you can't see them doing it, I think

298

00:17:50,689 --> 00:17:52,813

there is some online...

299

00:17:52,813 --> 00:17:57,913

conferencing software where you can

actually see the pairs and then then or

300

00:17:57,913 --> 00:18:02,613

the small groups, but I don't I don't know

the full story with that, but I could get

301

00:18:02,613 --> 00:18:03,793

so I gave you an example.

302

00:18:03,793 --> 00:18:06,313

There's something, one of the things that's

difficult,

303

00:18:06,313 --> 00:18:06,953

and I don't know

304

00:18:06,953 --> 00:18:11,693

if there's any answer to this, about the

stories, is that if they're

305

00:18:11,693 --> 00:18:13,563

too simple, that's boring.

306

00:18:13,563 --> 00:18:17,533

But if they're too complicated, then you

know, that's not good either.

307

00:18:17,613 --> 00:18:22,381

One thing I like to say like I I want to

send the message that.

308

00:18:22,381 --> 00:18:25,261

Statistics, how did I put it in the book?

309

00:18:25,261 --> 00:18:31,311

I had a slogan that statistics is hard.

310

00:18:31,311 --> 00:18:32,961

It should not feel tricky.

311

00:18:32,961 --> 00:18:34,671

So I don't like those.

312

00:18:34,671 --> 00:18:35,501

I don't like this.

313

00:18:35,501 --> 00:18:40,041

I like statistics stories with a twist,

but I don't like the kind of stories where

314

00:18:40,041 --> 00:18:43,981

the message is, this is just hard, like this,

like the Monty Hall problem.

315

00:18:43,981 --> 00:18:47,981

I hate that because it's just so confusing

to people.

316

00:18:47,981 --> 00:18:50,101

Like, what's the lesson that you're

teaching?

317

00:18:50,101 --> 00:18:50,271

Right?

318

00:18:50,271 --> 00:18:51,811

Like, this is really, really confusing.

319

00:18:51,811 --> 00:18:53,041

I don't want to teach that.

320

00:18:53,041 --> 00:18:54,421

But here's an example.

321

00:18:54,421 --> 00:19:00,621

And this is a very standard example used

in United States statistics classes where

322

00:19:00,621 --> 00:19:04,181

we put another twist on it based on the

recent literature.

323

00:19:04,361 --> 00:19:11,571

So this was a survey that was done in 1936

by a magazine called the Literary Digest.

324

00:19:11,571 --> 00:19:15,405

And it's a very famous example in statistics

books.

325

00:19:15,405 --> 00:19:20,005

They did a survey for the presidential

election and it was the presidential

326

00:19:20,005 --> 00:19:24,665

election was Franklin Roosevelt running

for reelection against somebody who wasn't

327

00:19:24,665 --> 00:19:25,845

Franklin Roosevelt.

328

00:19:25,845 --> 00:19:28,465

So you kind of know who won that election.

329

00:19:28,465 --> 00:19:33,705

But in their poll, actually, Franklin

Roosevelt was going to get destroyed.

330

00:19:33,705 --> 00:19:38,285

They did a poll where they surveyed 10

million people and two and a half million

331

00:19:38,285 --> 00:19:39,585

of those responded.

332

00:19:39,585 --> 00:19:44,013

And out of that, it looked like Roosevelt

was completely getting smoked.

333

00:19:44,013 --> 00:19:46,053

Well, there were two things happening.

334

00:19:46,053 --> 00:19:50,893

One is the two and a half million

respondents were not a random sample of the

335

00:19:50,893 --> 00:19:52,493

10 million people.

336

00:19:52,493 --> 00:19:56,073

Second, the 10 million people were

themselves not representative of Americans

337

00:19:56,073 --> 00:20:02,233

because it was from lists of people who

owned cars and things like that, richer people.

338

00:20:02,413 --> 00:20:06,813

So it wasn't a representative sample and

usually it just stops there.

339

00:20:06,813 --> 00:20:11,245

But that's not a good place to stop for a

couple of reasons.

340

00:20:11,245 --> 00:20:13,755

One of which is what lesson are you

telling people?

341

00:20:13,755 --> 00:20:16,325

If you don't have a random sample, your

survey is no good.

342

00:20:16,325 --> 00:20:19,205

Well, unfortunately, no surveys are random

samples.

343

00:20:19,205 --> 00:20:22,085

I mean, no surveys of humans, no political

polls are.

344

00:20:22,085 --> 00:20:26,055

So the message would be, oh, you can't

ever trust any political poll.

345

00:20:26,055 --> 00:20:29,875

Well, that would be a mistake because

political polls, even when they're off,

346

00:20:29,875 --> 00:20:32,805

they tend only to be off by a couple of

percentage points.

347

00:20:32,805 --> 00:20:34,525

So what goes on with political polls?

348

00:20:34,525 --> 00:20:37,305

Well, OK, so let's look at this

survey.

349

00:20:37,305 --> 00:20:40,557

The first thing is that the same magazine

had

350

00:20:40,557 --> 00:20:43,777

done this survey in previous elections and

it had worked well.

351

00:20:43,777 --> 00:20:45,157

So they had some track record.

352

00:20:45,157 --> 00:20:47,557

It wasn't as dumb as it sounds.

353

00:20:48,277 --> 00:20:53,877

Second thing, and this is something that

two statisticians recently looked into, I

354

00:20:53,877 --> 00:20:55,647

was able to take advantage of their work.

355

00:20:55,647 --> 00:21:01,217

So Sharon Lohr and Michael Brick had

written a paper on this 1936 Literary

356

00:21:01,217 --> 00:21:06,937

Digest Survey where they realized that, or

the data from the survey are actually

357

00:21:06,937 --> 00:21:09,317

somewhere, like they're available.

358

00:21:10,221 --> 00:21:15,801

And one of the things is, the survey

asked people who they would vote for, but

359

00:21:15,801 --> 00:21:18,801

it also asked who they voted for in the

previous election.

360

00:21:18,801 --> 00:21:22,561

So you can adjust for that because you

know the election outcome, the previous

361

00:21:22,561 --> 00:21:23,581

election outcome.

362

00:21:23,581 --> 00:21:24,501

Well, it's not perfect.

363

00:21:24,501 --> 00:21:26,961

Not everybody voted in the previous

election.

364

00:21:26,961 --> 00:21:29,681

But it's pretty good.

365

00:21:29,681 --> 00:21:35,321

And when you do that adjustment, you get,

well, you find that Roosevelt was supposed

366

00:21:35,321 --> 00:21:35,711

to win.

367

00:21:35,711 --> 00:21:37,401

Well, it's not a perfect adjustment.

368

00:21:37,401 --> 00:21:38,661

It's still quite a bit off.

369

00:21:38,661 --> 00:21:39,021

It's.

370

00:21:39,021 --> 00:21:42,881

Even after doing this adjustment, it's

still not a representative sample.

371

00:21:42,881 --> 00:21:47,571

But now we've changed the lesson from,

hey, it's not a random sample, you fool,

372

00:21:47,571 --> 00:21:52,581

blah, blah, blah, to, hey, this sample is

not a representative sample, but

373

00:21:52,581 --> 00:21:54,421

statistics can be used to adjust it.

374

00:21:54,421 --> 00:21:55,561

Look at this.

375

00:21:55,561 --> 00:21:57,231

But the adjustment is imperfect.

376

00:21:57,231 --> 00:21:59,149

So it's a more subtle message.
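
[Editor's sketch] The past-vote adjustment described above can be illustrated with a small reweighting example. This is only a sketch under stated assumptions: the counts and population shares below are invented for illustration (they are not the Literary Digest data), the function names are hypothetical, and people who did not vote in the previous election are ignored, which is one reason the real adjustment is imperfect.

```python
def raw_shares(sample_counts):
    """Unadjusted share of each current choice in the sample."""
    total = sum(sum(row.values()) for row in sample_counts.values())
    shares = {}
    for row in sample_counts.values():
        for choice, n in row.items():
            shares[choice] = shares.get(choice, 0.0) + n / total
    return shares


def reweight_by_past_vote(sample_counts, population_shares):
    """Reweight respondents so past-vote groups match known election results.

    sample_counts[past][current] is the number of respondents who report
    voting `past` last time and intend to vote `current` now.
    population_shares[past] is the known share of `past` in the last election.
    """
    total = sum(sum(row.values()) for row in sample_counts.values())
    adjusted = {}
    for past, row in sample_counts.items():
        sample_share = sum(row.values()) / total
        weight = population_shares[past] / sample_share
        for choice, n in row.items():
            adjusted[choice] = adjusted.get(choice, 0.0) + weight * n / total
    return adjusted


# Invented numbers: the sample over-represents past challenger voters.
sample = {
    "past_incumbent": {"incumbent": 320, "challenger": 80},
    "past_challenger": {"incumbent": 60, "challenger": 540},
}
# Assumed (hypothetical) result of the previous election.
population = {"past_incumbent": 0.6, "past_challenger": 0.4}

raw = raw_shares(sample)
adjusted = reweight_by_past_vote(sample, population)
print("raw:", raw)
print("adjusted:", adjusted)
```

In this made-up example the raw sample has the incumbent losing, while the reweighted estimate has the incumbent ahead, which is the qualitative point of the story: the sample is unrepresentative, but it can be partly corrected.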

377

00:21:59,149 --> 00:22:01,329

Well, it's trickier to teach.

378

00:22:01,329 --> 00:22:06,009

That's one reason why I like having the

story written as a story very clearly in

379

00:22:06,009 --> 00:22:09,799

the book, because then the student or the

teacher can read through the whole thing.

380

00:22:09,799 --> 00:22:11,169

If you're a student, you can read it

through.

381

00:22:11,169 --> 00:22:16,389

And if you're a teacher, you can first

read it before trying to teach it.

382

00:22:16,389 --> 00:22:17,409

And there it is.

383

00:22:17,409 --> 00:22:21,049

It's on page 36 and 37 of our book.

384

00:22:21,049 --> 00:22:24,069

There's a copy of the survey form.

385

00:22:24,289 --> 00:22:24,909

And.

386

00:22:24,909 --> 00:22:26,089

It takes it.

387

00:22:26,089 --> 00:22:33,669

It's literally, like, the

description takes up one page of

388

00:22:33,669 --> 00:22:34,949

the book.

389

00:22:35,129 --> 00:22:39,209

Almost all of it is a quote from

Lohr and Brick because they're the ones who

390

00:22:39,209 --> 00:22:43,669

did it and then a little discussion of how

it relates to the class.

391

00:22:43,789 --> 00:22:46,949

But everything is like these stories are

all like that.

392

00:22:46,949 --> 00:22:49,269

Like, with all of them, you have to balance it.

393

00:22:49,269 --> 00:22:53,293

And it's tricky, like there almost

should be another

394

00:22:53,293 --> 00:22:56,613

booklet of the really simple stories that

we haven't been including because they're too

395

00:22:56,613 --> 00:22:59,973

boring for me, but maybe still interesting

for the students.

396

00:23:00,513 --> 00:23:00,983

I don't know.

397

00:23:00,983 --> 00:23:02,333

We went back and forth.

398

00:23:02,333 --> 00:23:05,173

It's structured from beginning to end of

the course.

399

00:23:05,173 --> 00:23:09,033

So, well, there's a

couple of introductory chapters and then

400

00:23:09,033 --> 00:23:14,013

there's 13 sections for the first semester

and then 13 sections for the second

401

00:23:14,013 --> 00:23:14,443

semester.

402

00:23:14,443 --> 00:23:18,653

So most of the book is

26 sections.

403

00:23:18,653 --> 00:23:21,805

And in each one we have a story and the

404

00:23:21,805 --> 00:23:23,145

participation activity.

405

00:23:23,145 --> 00:23:26,545

And we went back and forth about whether

to do it that way or whether to put all

406

00:23:26,545 --> 00:23:29,545

the stories in one place and all the

activities in one place.

407

00:23:29,545 --> 00:23:30,595

And I don't know.

408

00:23:30,595 --> 00:23:33,085

Now I'm thinking I wish we had done it

that way.

409

00:23:33,085 --> 00:23:37,565

But Aki and I went around and around on

this a million times.

410

00:23:37,565 --> 00:23:41,285

There's no, you don't need to hear about

this.

411

00:23:41,585 --> 00:23:43,715

I wanted it to look right.

412

00:23:43,715 --> 00:23:46,985

The thing is, if you opened up at random,

you might get a page of homework

413

00:23:46,985 --> 00:23:49,445

assignments and then it might look like a

textbook.

414

00:23:49,445 --> 00:23:51,341

So it's like, that's the...

415

00:23:51,341 --> 00:23:53,601

it all kind of looks the same.

416

00:23:53,601 --> 00:23:57,781

So maybe if we had separately done the

different things, it would have then

417

00:23:57,781 --> 00:24:00,060

there'd be a whole section of stories.

418

00:24:00,061 --> 00:24:03,781

But when you're teaching, it's convenient

that it's in order because you just go to the

419

00:24:03,781 --> 00:24:06,131

week of your class and then you can see

what to do that week.

420

00:24:06,131 --> 00:24:09,341

So that's, I used it to teach.

421

00:24:11,277 --> 00:24:18,177

Yeah, and I mean, I really love also your

focus on the stories, right?

422

00:24:18,177 --> 00:24:25,797

I see it's definitely a theme of your work

recently, and I really love that because I

423

00:24:25,797 --> 00:24:34,057

think it also puts an emphasis on the fact

that statistics is not done in a vacuum,

424

00:24:34,057 --> 00:24:34,297

right?

425

00:24:34,297 --> 00:24:37,317

And it's also done by humans.

426

00:24:37,773 --> 00:24:42,853

with their biases and also their

motivations and so on.

427

00:24:42,853 --> 00:24:47,353

And I found that way more interesting, way

more realistic.

428

00:24:47,353 --> 00:24:52,693

And also that captures more the

imagination of the students rather than

429

00:24:52,693 --> 00:24:58,753

teaching them theorems and formulas, which

often is quite intimidating to a lot of

430

00:24:58,753 --> 00:24:59,833

them.

431

00:24:59,993 --> 00:25:05,213

So yeah, I have to admit the stories are

like all things that I can personally

432

00:25:05,213 --> 00:25:06,221

relate to.

433

00:25:06,221 --> 00:25:10,861

Like, either

it's research I was involved in

434

00:25:10,861 --> 00:25:13,351

or it's something close enough to what I

do.

435

00:25:13,351 --> 00:25:16,061

Like I'm interested in the question being

asked.

436

00:25:16,061 --> 00:25:21,181

And

it's the same with the

437

00:25:21,181 --> 00:25:21,681

activities.

438

00:25:21,681 --> 00:25:23,691

The activities have a lot of simulated

data.

439

00:25:23,691 --> 00:25:25,241

I'm a big fan of that.

440

00:25:25,241 --> 00:25:27,901

Yeah, you are.

441

00:25:27,901 --> 00:25:31,001

In a lot of your books, you

talk about that.

442

00:25:31,001 --> 00:25:33,677

Do you want to talk

a

443

00:25:33,677 --> 00:25:40,297

bit more about that or you think we've

covered already the idea of simulated data

444

00:25:40,297 --> 00:25:42,147

in the traditional data?

445

00:25:42,147 --> 00:25:47,037

Well, I'll just say briefly that I think

we are, as statisticians or computer

446

00:25:47,037 --> 00:25:50,757

scientists or whatever, we're used to the

idea of here is a data set, let's see what

447

00:25:50,757 --> 00:25:52,177

we can learn.

448

00:25:52,697 --> 00:25:58,017

But science, I mean, sometimes we proceed

that way in learning.

449

00:25:58,017 --> 00:26:01,537

We want to understand the world, you're

curious about something, someone gets a

450

00:26:01,537 --> 00:26:03,869

bunch of data from

451

00:26:03,981 --> 00:26:07,451

Basketball or whatever, and then you play

around and see what you can get.

452

00:26:07,451 --> 00:26:10,861

So that happens, but often things are more

directly motivated.

453

00:26:10,861 --> 00:26:15,581

Like, yes, in a public opinion poll,

you're really starting with the question.

454

00:26:15,801 --> 00:26:23,381

When demonstrating a method and coding

examples, it's super great to have

455

00:26:23,381 --> 00:26:24,877

simulation.

456

00:26:24,877 --> 00:26:28,057

partly because it's like it's the dual

problem, right?

457

00:26:28,057 --> 00:26:30,697

If I simulate the data, then I fit

the model.

458

00:26:30,697 --> 00:26:34,687

I can check, I can see if the parameter

estimates are similar to the true value,

459

00:26:34,687 --> 00:26:40,497

but also just the act of simulation is the

time reversal of the act of inference.

460

00:26:40,497 --> 00:26:44,297

So it makes sense to show the forward

process too.

461

00:26:44,297 --> 00:26:46,677

And I think it's kind of a bit of a power

thing.

462

00:26:46,677 --> 00:26:49,067

As a student, it's like, I can

simulate data.

463

00:26:49,067 --> 00:26:51,937

I can make fake data myself, right?

464

00:26:51,937 --> 00:26:52,749

That's.

465

00:26:52,749 --> 00:26:54,449

That's something that can be done.

466

00:26:54,449 --> 00:26:59,169

Traditionally, we do simulation when we're

teaching probability, like you'll teach

467

00:26:59,169 --> 00:27:02,589

the central limit theorem by simulating

draws.

468

00:27:03,729 --> 00:27:06,449

But just a lot of examples come up.

469

00:27:06,449 --> 00:27:11,509

Simulation is kind of, it's

like a universal solvent.

470

00:27:11,509 --> 00:27:15,749

Like, for example, I think one of our

discussion problems in class is, I show

471

00:27:15,749 --> 00:27:18,669

them data from some regression, which is

based on real data.

472

00:27:18,669 --> 00:27:21,789

And I don't remember the example, but

something where there's some treatment

473

00:27:21,789 --> 00:27:22,477

effect.

474

00:27:22,477 --> 00:27:25,337

which you maybe expect is positive.

475

00:27:25,337 --> 00:27:30,557

Maybe the estimate is, let's say the

estimate is 0.3 and the standard error is

476

00:27:30,557 --> 00:27:31,957

0.2.

477

00:27:31,957 --> 00:27:37,037

And so then I say, and it's based on 100

data points.

478

00:27:37,177 --> 00:27:42,037

So the estimate is

0.3, the standard error is 0.2.

479

00:27:42,037 --> 00:27:47,757

So I'd say how large a sample would you

need to get a result that's two standard

480

00:27:47,757 --> 00:27:49,261

errors away from zero?

481

00:27:49,261 --> 00:27:53,121

That's statistically significant, a term

that I don't like to use, but of course

482

00:27:53,121 --> 00:27:55,881

they need to know how it gets used.

483

00:27:55,881 --> 00:28:01,521

So you'd say, oh well, the standard error

is 0.2, but really the standard error would

484

00:28:01,521 --> 00:28:05,321

have to be 0.15 for it to be 2

standard errors away from 0.

485

00:28:05,321 --> 00:28:10,581

So the sample size would have to increase

by a factor of 2 divided by 1.5, squared.

486

00:28:10,581 --> 00:28:16,221

So you take 2 over 1.5, squared, and

that's, you know, so you can do that, you

487

00:28:16,221 --> 00:28:17,517

know, and you say here,

488

00:28:17,517 --> 00:28:24,077

2 over 1.5, squared, times 100, and that's

177.

489

00:28:24,077 --> 00:28:28,697

So you'd say, well, you need a sample size

of 177 to get your estimate to be two standard errors from zero.

490

00:28:28,697 --> 00:28:30,817

So work that out.
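
[Editor's note] Written out, the naive calculation being described takes the estimate of 0.3 and standard error of 0.2 from the example, assumes the point estimate stays fixed, and assumes the standard error shrinks like one over the square root of the sample size. The symbols below are the editor's notation, not the book's:

```latex
\[
\mathrm{se}_{\text{target}} = \frac{0.3}{2} = 0.15,
\qquad
n_{\text{new}} = n \left( \frac{\mathrm{se}}{\mathrm{se}_{\text{target}}} \right)^{2}
= 100 \times \left( \frac{0.2}{0.15} \right)^{2} \approx 177.8 .
\]
```

This gives the figure of about 177 quoted in the discussion.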

491

00:28:31,017 --> 00:28:32,027

That's wrong.

492

00:28:32,027 --> 00:28:33,557

That's not the correct answer.

493

00:28:33,557 --> 00:28:38,477

Because if you redo a study with 177

people, there's no reason to think the

494

00:28:38,477 --> 00:28:40,437

point estimate will be the same.

495

00:28:40,437 --> 00:28:41,165

In fact,

496

00:28:41,165 --> 00:28:45,825

Like the whole point of saying that the

estimate is less than two standard errors

497

00:28:45,825 --> 00:28:49,385

away from zero and you don't know whether

to believe it, somehow the whole point

498

00:28:49,385 --> 00:28:53,085

from a Bayesian point of view, the point

is that it's likely to be closer to zero.

499

00:28:53,085 --> 00:28:57,465

From a classical point of view, the idea

is that you can't rule out zero as an

500

00:28:57,465 --> 00:29:01,785

explanation and zero is like typically a

privileged value there.

501

00:29:01,785 --> 00:29:06,025

So if you're replicating a study or even

doing it longer,

502

00:29:06,445 --> 00:29:11,205

you would have to, the answer depends on

the true treatment effect, not on the

503

00:29:11,205 --> 00:29:12,505

coefficient estimate.

504

00:29:12,505 --> 00:29:14,535

And well, that's harder, right?

505

00:29:14,535 --> 00:29:17,965

But the point is you can show that with a

simulation.

506

00:29:17,965 --> 00:29:23,645

If it's based on real data, it's trickier

to show because what are you doing?

507

00:29:23,645 --> 00:29:27,605

But if I then do a simulation and then I

say, well, look, let me try simulating

508

00:29:27,605 --> 00:29:28,429

100.

509

00:29:28,429 --> 00:29:31,039

with this true treatment effect and then I

see what I get.

510

00:29:31,039 --> 00:29:34,419

I say, well, shoot, I didn't get a

treatment effect of 0.3.

511

00:29:34,419 --> 00:29:35,949

I suppose I have to keep doing it.

512

00:29:35,949 --> 00:29:38,389

And then you realize you're selecting just

some.

513

00:29:38,389 --> 00:29:41,469

So to me, it brings it to life.

514

00:29:42,269 --> 00:29:48,389

The applied point gets demonstrated in a

way that's harder to do with just one data

515

00:29:48,389 --> 00:29:49,349

set.
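
[Editor's sketch] A minimal simulation of the point made above, using only the Python standard library. The assumptions are the editor's: a one-sample design in which each observation is a noisy measurement of the treatment effect, a per-observation standard deviation of 2.0 (chosen so that 100 observations give a standard error of about 0.2), and a hypothetical helper named `fraction_significant`.

```python
import math
import random
import statistics


def fraction_significant(true_effect, n, sd=2.0, n_sims=1000, seed=0):
    """Share of simulated studies whose estimate is > 2 standard errors from 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Simulate one study of size n with the assumed true effect.
        data = [rng.gauss(true_effect, sd) for _ in range(n)]
        estimate = statistics.fmean(data)
        std_error = statistics.stdev(data) / math.sqrt(n)
        if abs(estimate) > 2 * std_error:
            hits += 1
    return hits / n_sims


# Redo the study with 177 people under different assumed true effects.
results = {
    effect: fraction_significant(effect, n=177)
    for effect in (0.3, 0.1, 0.0)
}
for effect, share in results.items():
    print(f"true effect {effect}: {share:.0%} of studies exceed 2 standard errors")
```

Even if the true effect happens to equal the original estimate of 0.3, a study of 177 should clear the two-standard-error bar only about half the time, and far less often if the true effect is smaller. That is why the answer depends on the true treatment effect rather than on the coefficient estimate.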

516

00:29:50,429 --> 00:29:51,109

Yeah.

517

00:29:51,109 --> 00:29:52,119

Yeah, yeah, yeah.

518

00:29:52,119 --> 00:29:53,189

I really love that.

519

00:29:53,189 --> 00:29:53,519

I agree.

520

00:29:53,519 --> 00:29:57,229

And that's also something I tend to use.

521

00:29:58,093 --> 00:30:03,793

On a lot of questions people have on, you

know, A/B test settings, things like

522

00:30:03,793 --> 00:30:04,293

that.

523

00:30:04,293 --> 00:30:09,313

There's a lot of questions about these,

the sample size, the duration, things

524

00:30:09,313 --> 00:30:10,313

like that.

525

00:30:10,313 --> 00:30:16,373

And I find personally, I have to do the

simulated data studies to answer these

526

00:30:16,373 --> 00:30:17,373

kinds of questions.

527

00:30:17,373 --> 00:30:21,753

Like I, I'm bad at like remembering, you

know, all those rules of thumb.

528

00:30:21,753 --> 00:30:25,993

Like, let's do that kind of study

with simulated data and that gives me a

529

00:30:25,993 --> 00:30:27,199

way better idea.

530

00:30:28,845 --> 00:30:34,205

So in a completely unrelated topic, I can

tell you about our two truths and a lie

531

00:30:34,205 --> 00:30:35,425

example.

532

00:30:35,425 --> 00:30:37,365

That's a demonstration we do.

533

00:30:37,365 --> 00:30:43,845

I'm mentioning that partly because writing

a book is like writing a hundred articles.

534

00:30:43,845 --> 00:30:49,185

So at one point I thought, well, maybe I

should publish these as a hundred articles

535

00:30:49,185 --> 00:30:54,965

because each story could be, well, that

just takes a lot of work and maybe more

536

00:30:54,965 --> 00:30:56,575

people will read it in book form.

537

00:30:56,575 --> 00:30:58,061

So I didn't do that, but.

538

00:30:58,061 --> 00:30:59,301

I did one of them.

539

00:30:59,301 --> 00:31:01,451

I did one or maybe I did one or two.

540

00:31:01,451 --> 00:31:03,821

It takes a while to publish an article.

541

00:31:03,821 --> 00:31:08,401

And for the bad reason that it's just

formatted in a different way, for the

542

00:31:08,401 --> 00:31:12,381

moderately good reason that you need to

explain more if it's in an article rather

543

00:31:12,381 --> 00:31:17,021

than a book because you need the context,

for the pretty good reason that you're

544

00:31:17,021 --> 00:31:22,541

forced to, like, you have an

opportunity to expand because you have

545

00:31:22,541 --> 00:31:24,341

more space. In the book,

546

00:31:24,341 --> 00:31:26,349

I can't take up too much.

547

00:31:26,349 --> 00:31:28,669

I can't have each thing take too long.

548

00:31:28,749 --> 00:31:32,289

And probably the biggest thing is

you get useful reviewer comments and

549

00:31:32,289 --> 00:31:33,939

people point out problems anyway.

550

00:31:33,939 --> 00:31:38,609

So one of the activities I did

write up as an article was two truths and

551

00:31:38,609 --> 00:31:38,979

a lie.

552

00:31:38,979 --> 00:31:42,989

And I gave a link to the article version,

which is longer than what's in the book.

553

00:31:42,989 --> 00:31:44,549

But I love the story.

554

00:31:44,549 --> 00:31:49,769

OK, so the story of how it came about is that

there's this game which did not exist when

555

00:31:49,769 --> 00:31:50,569

I was a child.

556

00:31:50,569 --> 00:31:52,509

But I don't know if they do it in Europe.

557

00:31:52,509 --> 00:31:55,685

It's big, it's popular

558

00:31:55,693 --> 00:31:56,673

in the U.S.

559

00:31:56,673 --> 00:32:02,913

The kids do it as an icebreaker in

class, you'll have a group of people and

560

00:32:02,913 --> 00:32:08,073

one person is the storyteller and this

person tells three things about

561

00:32:08,073 --> 00:32:09,073

themselves.

562

00:32:09,073 --> 00:32:13,713

Two of them have to be true and one has to

be a lie and then the other people discuss

563

00:32:13,713 --> 00:32:17,213

and try to figure out which is the truth

or which is the lie.

564

00:32:17,213 --> 00:32:21,513

So it's such a fun activity.

565

00:32:21,513 --> 00:32:24,941

I like to use it as an icebreaker in my

statistics class.

566

00:32:24,941 --> 00:32:26,841

But it has no statistics content.

567

00:32:26,841 --> 00:32:29,861

I mean, it is because there's uncertainty,

but what do you do with it?

568

00:32:29,861 --> 00:32:34,961

So I thought about it and thought about it and

well, I decided to put it in the second

569

00:32:34,961 --> 00:32:35,591

semester.

570

00:32:35,591 --> 00:32:40,321

I was ready for a good icebreaker and the

second semester started with logistic

571

00:32:40,321 --> 00:32:41,401

regression.

572

00:32:41,401 --> 00:32:47,661

Okay, I can make it a logistic regression

problem because you can say, what's the

573

00:32:47,661 --> 00:32:50,681

probability you get it right?

574

00:32:51,021 --> 00:32:52,691

What's the probability you guess correct?

575

00:32:52,691 --> 00:32:54,605

But then you need some predictor.

576

00:32:54,605 --> 00:32:56,385

So, oh, predictor.

577

00:32:56,385 --> 00:32:59,745

Well, you can have when you guess, you

also have to give a certainty score, some

578

00:32:59,745 --> 00:33:04,245

number between zero and 10 representing

how certain you are that you're correct.

579

00:33:04,245 --> 00:33:06,465

Then it has to be done in groups.

580

00:33:06,465 --> 00:33:08,685

So I figured it out.

581

00:33:08,685 --> 00:33:11,065

You divide the class into groups of

four.

582

00:33:11,065 --> 00:33:13,485

Usually we do pairs, but this one, four.

583

00:33:13,485 --> 00:33:19,905

In each group, one student is the

storyteller, tells the three statements.

584

00:33:19,905 --> 00:33:22,745

The other three discuss together.

585

00:33:22,905 --> 00:33:24,301

And then,

586

00:33:24,301 --> 00:33:29,441

come up with a guess of which they think

is true, which of them they think is

587

00:33:29,441 --> 00:33:31,441

a lie, and a certainty score.

588

00:33:31,441 --> 00:33:35,161

So write the certainty score down on a

sheet of paper, then find out whether your

589

00:33:35,161 --> 00:33:37,071

guess was correct and write that down too.

590

00:33:37,071 --> 00:33:38,881

So they find out.

591

00:33:38,881 --> 00:33:41,311

Then there's four of you in the group, so

you rotate.

592

00:33:41,311 --> 00:33:43,791

Then the next person does it.

593

00:33:43,791 --> 00:33:49,069

So as a result, as a group, each group has

four certainty scores and four.

594

00:33:49,069 --> 00:33:51,129

correct or incorrect answers.

595

00:33:51,129 --> 00:33:56,009

So they have eight

numbers, first four numbers between zero

596

00:33:56,009 --> 00:33:59,029

and 10, and then the four numbers which

are zeros and ones.

597

00:33:59,029 --> 00:34:05,669

And so, by the way, when you do this, I

have a slide prepared, or I write it on

598

00:34:05,669 --> 00:34:07,149

the board, the exact instructions.

599

00:34:07,149 --> 00:34:11,049

You need to give them, you can't just tell

it, people aren't paying attention for one

600

00:34:11,049 --> 00:34:11,379

second.

601

00:34:11,379 --> 00:34:14,749

I'm just doing this for you in that thing,

but actually we have the instructions

602

00:34:14,749 --> 00:34:15,437

there.

603

00:34:15,437 --> 00:34:18,357

Then there's this thing I discovered a couple

of years ago.

604

00:34:18,357 --> 00:34:20,857

It's putting things on Google Forms.

605

00:34:20,857 --> 00:34:26,417

So live in class, I create a Google Form,

I open Google, type it in right there.

606

00:34:26,417 --> 00:34:29,447

So this is also a power thing for

them.

607

00:34:29,447 --> 00:34:30,147

Look at this.

608

00:34:30,147 --> 00:34:31,497

I didn't have to prepare this.

609

00:34:31,497 --> 00:34:36,517

I type the Google Form, I put question

one, certainty score, make it a response

610

00:34:36,517 --> 00:34:38,057

from zero to 10.

611

00:34:38,057 --> 00:34:41,377

Question two, yes or no, did you get it?

612

00:34:41,377 --> 00:34:43,053

Was your guess correct?

613

00:34:43,053 --> 00:34:47,413

So with each group, I want you to go, oh,

and then we use tiny URL to get a URL.

614

00:34:47,413 --> 00:34:51,413

And then for each group, I say, pull out

your phone or your computer, and one

615

00:34:51,413 --> 00:34:53,593

person from the group, enter your four

data points.

616

00:34:53,593 --> 00:34:55,453

So we set it up with four.

617

00:34:55,453 --> 00:34:58,753

So there's actually eight responses, the

first one, the first one.

618

00:34:58,753 --> 00:35:03,323

Then we get the data, it takes them a

minute to type it in.

619

00:35:03,323 --> 00:35:04,693

Then I have it all prepared.

620

00:35:04,693 --> 00:35:06,093

I've done it before, right?

621

00:35:06,093 --> 00:35:07,413

So I have the code ready.

622

00:35:07,413 --> 00:35:08,037

I.

623

00:35:08,173 --> 00:35:13,553

So I go to the Google page, I download it,

I put it on the desktop.

624

00:35:13,553 --> 00:35:17,273

It's not even my laptop, it's just a

computer that's in the classroom.

625

00:35:17,273 --> 00:35:22,033

Then I go, I open R, I read it in, and I

have the code prepared so I can do it.

626

00:35:22,033 --> 00:35:24,113

And then we can make graphs.

627

00:35:24,213 --> 00:35:27,773

So we fit a logit, so, but then I did

something I always like to do.

628

00:35:27,773 --> 00:35:29,203

I set it all up.

629

00:35:29,203 --> 00:35:30,701

Okay, we have the data.

630

00:35:30,701 --> 00:35:33,321

I type in the code for logistic

regression.

631

00:35:33,321 --> 00:35:34,941

Again, I have a pause.

632

00:35:34,941 --> 00:35:38,301

I say, well, write the code with your

neighbor what the logistic regression code

633

00:35:38,301 --> 00:35:39,121

would look like.

634

00:35:39,121 --> 00:35:43,921

So, yeah, and then I do it and then I type

it and I said, then I do display, you

635

00:35:43,921 --> 00:35:46,621

know, of the fitted regression.

636

00:35:47,021 --> 00:35:49,981

And before hitting carriage return, I

said, this is what it's going to look

637

00:35:49,981 --> 00:35:50,231

like.

638

00:35:50,231 --> 00:35:54,121

There's going to be a coefficient estimate,

standard error.

639

00:35:54,281 --> 00:35:55,989

What are they going to be?

640

00:35:56,429 --> 00:35:59,969

You and your neighbor have to figure out,

try to guess what the estimate and the

641

00:35:59,969 --> 00:36:01,089

standard error are gonna be.

642

00:36:01,089 --> 00:36:03,729

Well, the standard error is tricky, like

that's hard.

643

00:36:03,729 --> 00:36:06,689

So I said, just figure out, guess what the

estimate will be.

644

00:36:06,689 --> 00:36:10,949

And so then I have them do it, I go around

the room, I make sure they're all drawing

645

00:36:10,949 --> 00:36:14,969

the curve, and then I have someone go on

the board and draw what they had done.

646

00:36:14,969 --> 00:36:17,689

And then I ask people, do you think this

is reasonable?

647

00:36:17,689 --> 00:36:19,209

Do you think this slope is reasonable?

648

00:36:19,209 --> 00:36:21,219

Now what do you think the standard error

will be?

649

00:36:21,219 --> 00:36:24,969

Do you think the slope will be more than

two standard errors away from zero?

650

00:36:24,969 --> 00:36:25,933

Then you fit it.

651

00:36:25,933 --> 00:36:29,253

and you have the scatter plot and they can

see and they've thought about it and

652

00:36:29,253 --> 00:36:30,773

committed to it.

653

00:36:30,773 --> 00:36:32,353

So that's logistic regression.
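
[Editor's sketch] The class described here fits the model in R on the data collected from the form. As a rough stand-in, the following Python sketch fits the same kind of logistic regression (probability of a correct guess as a function of the 0 to 10 certainty score) by Newton-Raphson, using only the standard library. Everything in it is an assumption for illustration: the data are simulated rather than real classroom responses, the "true" intercept and slope are made up, and the function names are hypothetical.

```python
import math
import random


def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))


def information(a, b, x):
    """Entries of the 2x2 Fisher information matrix at (a, b)."""
    h_aa = h_ab = h_bb = 0.0
    for xi in x:
        p = inv_logit(a + b * xi)
        w = p * (1.0 - p)
        h_aa += w
        h_ab += w * xi
        h_bb += w * xi * xi
    return h_aa, h_ab, h_bb


def fit_logistic(x, y, n_iter=25):
    """Fit Pr(y = 1) = inv_logit(a + b * x); return (a, b, se_a, se_b)."""
    a = b = 0.0
    for _ in range(n_iter):
        # Gradient of the log-likelihood at the current (a, b).
        g_a = sum(yi - inv_logit(a + b * xi) for xi, yi in zip(x, y))
        g_b = sum((yi - inv_logit(a + b * xi)) * xi for xi, yi in zip(x, y))
        h_aa, h_ab, h_bb = information(a, b, x)
        det = h_aa * h_bb - h_ab ** 2
        # Newton step: add (information matrix)^-1 times the gradient.
        a, b = (a + (h_bb * g_a - h_ab * g_b) / det,
                b + (h_aa * g_b - h_ab * g_a) / det)
    h_aa, h_ab, h_bb = information(a, b, x)
    det = h_aa * h_bb - h_ab ** 2
    # Standard errors from the inverse information matrix.
    return a, b, math.sqrt(h_bb / det), math.sqrt(h_aa / det)


# Simulated stand-in for the class data: certainty scores and 0/1 outcomes.
rng = random.Random(1)
scores = [rng.randint(0, 10) for _ in range(400)]
correct = [1 if rng.random() < inv_logit(-0.5 + 0.2 * s) else 0 for s in scores]

intercept, slope, se_intercept, se_slope = fit_logistic(scores, correct)
print(f"intercept {intercept:.2f} (se {se_intercept:.2f})")
print(f"slope     {slope:.2f} (se {se_slope:.2f})")
```

With real classroom data the sample is much smaller, so the standard error on the slope would be correspondingly larger, which is part of what makes guessing it in advance a good exercise.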

654

00:36:32,353 --> 00:36:36,413

But when I wrote up the article, the

people in the journal said, well, what

655

00:36:36,413 --> 00:36:37,283

about other classes?

656

00:36:37,283 --> 00:36:39,473

And then I realized you can use this to

teach measurement.

657

00:36:39,473 --> 00:36:43,813

You can use it to teach experimentation,

like all sorts of things.

658

00:36:44,153 --> 00:36:46,501

You could do a lot with that.

659

00:36:46,637 --> 00:36:51,897

But I felt so satisfied because just I

felt like it was just created out of

660

00:36:51,897 --> 00:36:52,597

nothing.

661

00:36:52,597 --> 00:36:56,837

I wanted a two truths and a lie activity and now

there is one.

662

00:36:56,837 --> 00:37:00,317

So that just felt so good

to have created.

663

00:37:00,317 --> 00:37:06,357

Now I want everyone to do it because now

that I created this beautiful thing

664

00:37:06,357 --> 00:37:09,317

out of nothing, it did not exist.

665

00:37:09,317 --> 00:37:12,597

Anyway, just I'm very happy about that.

666

00:37:13,077 --> 00:37:14,577

Yeah, I love that.

667

00:37:14,577 --> 00:37:16,781

I'll definitely try that in my own

668

00:37:16,781 --> 00:37:21,841

classes. Seems like a good thing to

do on the first or second class, isn't it?

669

00:37:21,841 --> 00:37:22,741

Right, exactly.

670

00:37:22,741 --> 00:37:26,041

Now the point is that you're killing two

birds there.

671

00:37:26,921 --> 00:37:27,671

Yeah, yeah.

672

00:37:27,671 --> 00:37:28,781

No, that's super cool.

673

00:37:28,781 --> 00:37:31,701

Definitely going to try that for sure.

674

00:37:31,701 --> 00:37:34,901

So, and it's like, I have a commitment

device now.

675

00:37:34,901 --> 00:37:37,511

I have officially publicly committed to do

that.

676

00:37:37,511 --> 00:37:39,341

So I have to do it and then.

677

00:37:39,341 --> 00:37:41,941

Come back to you, Andrew, to tell you how

it went.

678

00:37:41,941 --> 00:37:46,521

The other thing you can do is there are

certain fun psychology experiments from

679

00:37:46,521 --> 00:37:51,381

the literature that can be done in class,

because things that have very large

680

00:37:51,381 --> 00:37:58,981

effects, like some of the classic Tversky

and Kahneman experiments on cognitive

681

00:37:58,981 --> 00:38:02,621

illusions, we have one of those examples

too.

682

00:38:02,621 --> 00:38:05,581

You can do it live in class.

683

00:38:08,589 --> 00:38:11,989

Yeah, that sounds also super cool.

684

00:38:12,869 --> 00:38:21,029

I also saw in preparing the episode that

you have a flipped classroom, like you

685

00:38:21,029 --> 00:38:23,289

emphasize a flipped classroom environment.

686

00:38:23,289 --> 00:38:28,089

I don't think I've ever heard you talk

about that.

687

00:38:28,749 --> 00:38:35,069

Could you explain what this approach is

and how you think that enhances the

688

00:38:35,069 --> 00:38:38,883

learning of applied regression and causal

inference?

689

00:38:38,957 --> 00:38:43,497

I think to me the flipped classroom is

pretty much the same as traditional high

690

00:38:43,497 --> 00:38:45,737

school classes, high school math class.

691

00:38:45,737 --> 00:38:51,537

So if you take math in high school, you

have a book you're supposed to read and

692

00:38:51,537 --> 00:38:52,877

there's homework assignments.

693

00:38:52,877 --> 00:38:56,017

Usually you read just enough of the book

to allow you to do the homework

694

00:38:56,017 --> 00:38:56,797

assignments.

695

00:38:56,797 --> 00:39:00,797

Then in class, the teacher does a couple

things on the board and most of the time

696

00:39:00,797 --> 00:39:05,257

in class you spend working on problems in

pairs or small groups and then people go

697

00:39:05,257 --> 00:39:07,981

up to the board and share their answers.

698

00:39:07,981 --> 00:39:10,601

That's kind of what I think it should be.

699

00:39:10,601 --> 00:39:14,291

So that's the model, so it's very

traditional.

700

00:39:14,291 --> 00:39:17,961

The flipping part is, you know, I don't

have videos.

701

00:39:17,961 --> 00:39:19,491

I guess I could, but I don't.

702

00:39:19,491 --> 00:39:22,301

Aki has videos for his classes that I

have.

703

00:39:22,301 --> 00:39:24,401

But the flip part is the reading.

704

00:39:24,401 --> 00:39:24,661

Right.

705

00:39:24,661 --> 00:39:30,581

So I'm not lecturing because they're

supposed to have read the book.

706

00:39:30,581 --> 00:39:34,341

Now, what happens, you know, it works only

if you have a book that you can lean

707

00:39:34,341 --> 00:39:34,551

on.

708

00:39:34,551 --> 00:39:35,901

But I think that's very important.

709

00:39:35,901 --> 00:39:37,421

This semester, I'm teaching in a

710

00:39:37,421 --> 00:39:42,081

statistics class teaching some multilevel

modeling and some other things.

711

00:39:42,081 --> 00:39:47,281

My book with Aki and Jennifer on advanced

regression and multilevel modeling

712

00:39:47,281 --> 00:39:49,641

doesn't exist yet.

713

00:39:50,021 --> 00:39:53,361

It's supposed to be the updated version of

my book with Jennifer.

714

00:39:53,361 --> 00:39:57,261

I couldn't quite bring myself to teach out

of my book with Jennifer just because the

715

00:39:57,261 --> 00:40:00,921

code is old, but then I don't have a new

book.

716

00:40:00,921 --> 00:40:03,917

And so as a result, the class I'm teaching

this semester,

717

00:40:03,917 --> 00:40:04,757

It's fun.

718

00:40:04,757 --> 00:40:10,157

I think the students are enjoying it, but

it's not going as perfectly as it

719

00:40:10,157 --> 00:40:14,877

could because I can't really do the flip

thing because I end up spending a

720

00:40:14,877 --> 00:40:19,317

lot of time in class like my computer

demos typically end up being me doing the

721

00:40:19,317 --> 00:40:24,637

homeworks, working out the homeworks

that were just due, which is fine, but

722

00:40:24,637 --> 00:40:27,853

they're a little bit more

elaborate than.

723

00:40:27,853 --> 00:40:30,193

Ideally, I think computer demos would be

shorter.

724

00:40:30,193 --> 00:40:34,193

They don't have enough to read before, so

I end up spending a lot of time lecturing.

725

00:40:34,193 --> 00:40:37,153

I think I spent most of today's class just

talking.

726

00:40:37,153 --> 00:40:39,613

I felt a little bad about that.

727

00:40:39,613 --> 00:40:40,273

I don't know.

728

00:40:40,273 --> 00:40:41,293

I think it's still fine.

729

00:40:41,293 --> 00:40:44,433

It's still a breath of fresh air compared

to other classes they're taking.

730

00:40:44,433 --> 00:40:48,333

I'm sure if all the classes were like

mine, then that would be horrible.

731

00:40:48,333 --> 00:40:52,273

But an occasional class that's like mine

can be good.

732

00:40:52,273 --> 00:40:54,763

I think in general, students like more

organization.

733

00:40:54,763 --> 00:40:56,313

A book is better.

734

00:40:56,313 --> 00:40:57,709

Even

735

00:40:57,709 --> 00:41:02,729

when I teach Regression and Other

Stories, that's super organized, but it's

736

00:41:02,729 --> 00:41:07,589

not always what students want because they

want a set of methods and formulas and

737

00:41:07,589 --> 00:41:09,289

theorems and so forth.

738

00:41:09,289 --> 00:41:11,569

So I'm not always giving people what they

want.

739

00:41:11,569 --> 00:41:16,409

Anyway, I think that they again, I think

they're really looking for very clear.

740

00:41:17,249 --> 00:41:22,349

I have this thing: the goal is to

be fluent in a foreign language, but I

741

00:41:22,349 --> 00:41:24,149

don't think people usually think of it

that way.

742

00:41:24,149 --> 00:41:26,861

I think that they're looking for.

743

00:41:26,861 --> 00:41:27,981

something different.

744

00:41:27,981 --> 00:41:31,861

But what that means is that it puts a

special burden on me to be super organized

745

00:41:31,861 --> 00:41:37,181

because if I'm not super organized, then I

think students will not see the point.

746

00:41:37,181 --> 00:41:39,841

So my class this semester, it doesn't use

the book.

747

00:41:39,841 --> 00:41:41,851

It's not as flipped as it could be.

748

00:41:41,851 --> 00:41:48,361

I still have them talking with each other

in class, but not having the flipped

749

00:41:48,361 --> 00:41:52,717

classroom makes it a little more of a

passive experience for them.

750

00:41:52,717 --> 00:41:55,697

And then when I do have them talking,

they're often just talking to each other

751

00:41:55,697 --> 00:41:57,897

saying, oh, I have no idea what's going on

here.

752

00:41:57,897 --> 00:42:00,877

It's like, oh, good that I know that, I

guess.

753

00:42:01,177 --> 00:42:04,377

That's true.

754

00:42:04,677 --> 00:42:04,947

Yeah.

755

00:42:04,947 --> 00:42:12,577

And I mean, I do relate to this idea of

the, you know, getting fluent in a foreign

756

00:42:12,577 --> 00:42:13,637

language.

757

00:42:13,637 --> 00:42:20,337

That's actually also a metaphor I use

quite a lot to people who are curious

758

00:42:20,337 --> 00:42:22,021

about what the...

759

00:42:22,477 --> 00:42:25,857

work of a statistical modeler is.

760

00:42:26,917 --> 00:42:33,897

And that's funny because there's that

weird human brain bias of just thinking

761

00:42:33,897 --> 00:42:40,097

that someone who is doing something that

looks hard to you must have been

762

00:42:40,097 --> 00:42:42,377

good at it since the beginning.

763

00:42:43,157 --> 00:42:46,257

And at least for me, it couldn't be

further from the truth.

764

00:42:46,257 --> 00:42:48,557

It comes from a lot.

765

00:42:48,557 --> 00:42:51,565

As you were saying, I think you were

saying learning is a

766

00:42:51,565 --> 00:42:54,985

vector: it's magnitude and direction, right?

767

00:42:54,985 --> 00:42:59,285

So definitely magnitude is very important

for me each time I learn something.

768

00:43:00,865 --> 00:43:09,065

And often I'm saying, yeah, well, it looks

hard because you have to learn kind of two

769

00:43:09,065 --> 00:43:14,365

languages, the language of stats and the

language, like the actual programming

770

00:43:14,365 --> 00:43:17,485

language that you need to do the stats.

771

00:43:17,485 --> 00:43:19,981

But it's just as any other language, you

need to...

772

00:43:19,981 --> 00:43:25,061

talk to people in that language and with

time you'll see your brain just getting

773

00:43:25,061 --> 00:43:26,481

there.

774

00:43:27,341 --> 00:43:36,301

So it does go through to people, but at

the same time they need to see some

775

00:43:36,301 --> 00:43:43,441

results along the way because otherwise

the motivation is gonna fall down.

776

00:43:43,801 --> 00:43:50,147

So it's always that needle that's a bit

hard to thread in my experience.

777

00:43:50,669 --> 00:43:52,289

Yeah, well, I like this book.

778

00:43:52,289 --> 00:43:55,709

See, I seriously think this book is just

fun to read.

779

00:43:55,709 --> 00:44:01,209

Although, as I said, I kind of

wish I had separated it out in a different

780

00:44:01,209 --> 00:44:05,289

way because I do feel that when you

open it at random, you might see

781

00:44:05,289 --> 00:44:10,189

some code or you might see a homework

assignment or you might like it's not

782

00:44:10,189 --> 00:44:13,709

always clear what like you're not

necessarily opening into the middle of a

783

00:44:13,709 --> 00:44:14,669

story.

784

00:44:14,669 --> 00:44:17,809

And so like homework assignments don't

look like fun and code doesn't look like

785

00:44:17,809 --> 00:44:18,069

fun.

786

00:44:18,069 --> 00:44:19,657

So I'm.

787

00:44:20,301 --> 00:44:25,101

I don't think I realized: you don't see the

book until it's a book. Before that, it's this

788

00:44:25,101 --> 00:44:30,941

PDF on the screen, and it's a

different experience that way, and

789

00:44:30,941 --> 00:44:35,161

Aki's gonna kill me for saying this

because we went back and forth, but

790

00:44:35,161 --> 00:44:41,401

like, now I think we really should have,

I really think we made a mistake by not

791

00:44:41,401 --> 00:44:45,401

doing it the other way because I think it

would look a lot more fun that way if

792

00:44:45,401 --> 00:44:50,331

like all the stories were in one place and

all the activities were in another place

793

00:44:50,413 --> 00:44:53,883

I'm really feeling bad about that.

794

00:44:53,883 --> 00:44:54,833

I still love it.

795

00:44:54,833 --> 00:44:58,453

It's just, we just have so many fun

things.

796

00:44:58,453 --> 00:45:05,773

Oh, then we have, for the final exam, we

made, it's multiple choice.

797

00:45:05,773 --> 00:45:12,353

So what I do is I have four or more

questions per chapter.

798

00:45:12,353 --> 00:45:15,309

It's like, it's, it's,

799

00:45:15,309 --> 00:45:19,159

So there's 12 chapters for

the fall and 12 for the spring.

800

00:45:19,159 --> 00:45:21,589

So each chapter, I have four or more

questions.

801

00:45:21,589 --> 00:45:25,809

What I do is I randomly sample one per

chapter and give that to the students as

802

00:45:25,809 --> 00:45:27,209

their practice exam.

803

00:45:27,209 --> 00:45:31,869

Then I randomly sample two per chapter and

make that the final exam.

804

00:45:31,869 --> 00:45:36,289

So therefore, by construction, the

practice exam is representative of the

805

00:45:36,289 --> 00:45:40,009

final exam because they're two random

samples from the same population.
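
The sampling scheme described here can be sketched in a few lines. This is only an illustration under stated assumptions: the question IDs are placeholders, the bank has exactly four questions per chapter, and the practice and final questions are drawn without overlap (the transcript does not say whether a practice question can reappear on the final).

```python
import random

def draw_exams(question_bank, seed=0):
    """Draw a practice exam (1 question per chapter) and a final exam
    (2 questions per chapter) from the same per-chapter pools."""
    rng = random.Random(seed)
    practice, final = [], []
    for chapter in sorted(question_bank):
        # Assumption: draw 3 distinct questions so the practice question
        # never reappears on the final. Needs at least 3 per chapter.
        picked = rng.sample(question_bank[chapter], 3)
        practice.append(picked[0])
        final.extend(picked[1:])
    return practice, final

# Hypothetical bank: 12 chapters, 4 placeholder question IDs each.
bank = {ch: [f"ch{ch:02d}-q{i}" for i in range(1, 5)] for ch in range(1, 13)}
practice_exam, final_exam = draw_exams(bank)
print(len(practice_exam), len(final_exam))  # 12 24
```

Because both exams are random draws from the same per-chapter pools, the practice exam covers the same chapters in the same proportions as the final.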

806

00:45:40,009 --> 00:45:44,109

So I think that's great to be

able to do that now.

807

00:45:44,109 --> 00:45:50,909

Of course, all the problems are now in the

book, although without the answers.

808

00:45:50,909 --> 00:45:53,769

So you'd have to figure out which it is.

809

00:45:53,769 --> 00:45:56,469

But in theory, someone could read through

all of those.

810

00:45:56,469 --> 00:45:59,729

But of course, the usual story is if

someone really goes to the trouble of

811

00:45:59,729 --> 00:46:03,089

reading through all of them and figuring

them all out, that's probably good anyway.

812

00:46:03,089 --> 00:46:07,169

So I don't mind if they then do well on

the exam.

813

00:46:08,141 --> 00:46:10,621

But it took a lot of effort to write.

814

00:46:10,621 --> 00:46:14,601

These multiple choice questions are hard

to write, but I think they're easier to

815

00:46:14,601 --> 00:46:15,361

grade.

816

00:46:15,361 --> 00:46:19,521

And I think they're testing something

that's a bit more focused.

817

00:46:19,521 --> 00:46:23,021

It's very easy to write open-ended

questions and not know what you're

818

00:46:23,021 --> 00:46:24,241

testing.

819

00:46:25,681 --> 00:46:26,821

True.

820

00:46:27,101 --> 00:46:27,991

Yeah.

821

00:46:27,991 --> 00:46:28,141

Yeah.

822

00:46:28,141 --> 00:46:34,001

It's a bit more like astrology, where you

always find something you're satisfied

823

00:46:34,001 --> 00:46:34,829

about.

824

00:46:34,829 --> 00:46:36,519

Yeah, yeah, exactly.

825

00:46:36,519 --> 00:46:41,269

And it also encourages a certain behavior

among students to just keep writing and

826

00:46:41,269 --> 00:46:44,469

trying to like touch all the bases.

827

00:46:44,589 --> 00:46:45,529

True.

828

00:46:45,529 --> 00:46:46,509

Yeah, yeah.

829

00:46:46,509 --> 00:46:51,068

As a pure product of the French

educational system, I can tell you open-ended

830

00:46:51,068 --> 00:46:54,469

questions are like my bread and

butter.

831

00:46:54,469 --> 00:46:59,209

I've been trained at that a lot.

832

00:46:59,209 --> 00:47:03,853

So if someone have to answer, like I have

a weird feeling of familiarity and that...

833

00:47:03,853 --> 00:47:07,033

At the same time, I like it and I dread

it.

834

00:47:07,033 --> 00:47:07,673

So that's what...

835

00:47:07,673 --> 00:47:13,613

Many years ago, I taught a class in France

and the students were supposed to do

836

00:47:13,613 --> 00:47:15,213

projects and it just happened.

837

00:47:15,213 --> 00:47:16,293

Yeah, everybody's busy.

838

00:47:16,293 --> 00:47:18,163

So one of the groups, they did

nothing.

839

00:47:18,163 --> 00:47:22,293

They turned something in, which was pretty

much they had just like, it wasn't

840

00:47:22,293 --> 00:47:25,213

plagiarized, but they had just copied

stuff from the internet.

841

00:47:25,213 --> 00:47:28,753

Like, you know, they just literally copied

some images and it was essentially

842

00:47:28,753 --> 00:47:29,453

nothing.

843

00:47:29,453 --> 00:47:33,093

So I talked to the...

844

00:47:33,261 --> 00:47:37,280

The head instructor of the class, I said,

well, I want to give them a two out of 20

845

00:47:37,280 --> 00:47:37,801

on this.

846

00:47:37,801 --> 00:47:41,521

Like, I guess, you know, I, I, maybe I

don't give them zero because they wrote

847

00:47:41,521 --> 00:47:45,221

a sentence or two, but like, can I, can

I give them a two out of 20?

848

00:47:45,221 --> 00:47:47,641

He said, well, yeah, you're giving the

grade.

849

00:47:47,641 --> 00:47:51,061

I said, in the US, if you want to give

someone a low grade, you have to ask for

850

00:47:51,061 --> 00:47:55,281

permission because you're afraid they

might sue you or complain or something.

851

00:47:55,281 --> 00:47:59,081

And, but he said, no, in France, you can

give people, you know, two out of 20.

852

00:47:59,081 --> 00:48:01,501

They might even think it's a good grade.

853

00:48:02,605 --> 00:48:04,565

So it is a different...

854

00:48:04,565 --> 00:48:10,245

The French system is a little rougher in

how the grading goes.

855

00:48:10,525 --> 00:48:12,165

I don't remember that.

856

00:48:12,165 --> 00:48:12,625

Yeah.

857

00:48:12,625 --> 00:48:13,275

I mean, it depends.

858

00:48:13,275 --> 00:48:16,665

I don't know at what level you're

teaching, but if you're teaching in the...

859

00:48:16,665 --> 00:48:20,165

especially in the classe préparatoire, you

know, so that weird stuff we have in

860

00:48:20,165 --> 00:48:21,835

between high school and universities.

861

00:48:21,835 --> 00:48:25,045

These were graduate students.

862

00:48:25,045 --> 00:48:25,825

Yeah.

863

00:48:25,825 --> 00:48:28,045

So you can definitely do that.

864

00:48:28,165 --> 00:48:31,213

I know, like, my first philosophy...

865

00:48:31,213 --> 00:48:36,233

dissertations when I was in the class,

were absolutely a disaster.

866

00:48:36,913 --> 00:48:40,913

Um, it was, that was, I think I got four

out of 20, something like that.

867

00:48:40,913 --> 00:48:43,573

And that was not even the worst grade.

868

00:48:43,573 --> 00:48:50,233

You know how like in gymnastics, like it's

like 9.8, 9.9, 9.93, like that, like

869

00:48:50,233 --> 00:48:52,273

the grading system did that.

870

00:48:52,273 --> 00:48:54,483

But statistics is, it's really hard.

871

00:48:54,483 --> 00:48:59,501

Like, I think for real-world problems, I

wouldn't give myself

872

00:48:59,501 --> 00:49:07,540

a 20 out of 20 in my analysis, because if

you're doing an experiment in political

873

00:49:07,540 --> 00:49:12,161

science or psychology or economics or an

observational study, everybody knows about

874

00:49:12,161 --> 00:49:16,421

identification being difficult, but

there's a lot of other difficulties.

875

00:49:16,421 --> 00:49:21,531

So usually if you're doing a causal study,

you wanna have within-person comparisons,

876

00:49:21,531 --> 00:49:25,001

or in political science or economics, it

would be called a panel study.

877

00:49:25,001 --> 00:49:26,029

You wanna have...

878

00:49:26,029 --> 00:49:29,119

Ideally, you do the treatment and the

control on each person.

879

00:49:29,119 --> 00:49:31,989

But if you can't do that, you want to make

comparisons.

880

00:49:31,989 --> 00:49:36,259

That's super important, partly for

statistical efficiency and for balance.

881

00:49:36,259 --> 00:49:41,029

And it's also kind of a measurement issue

because measurements can be biased and

882

00:49:41,029 --> 00:49:43,379

biases can actually vary with the

treatment.

883

00:49:43,379 --> 00:49:47,149

The treatment can affect the measurement

bias and you can even have treatments that

884

00:49:47,149 --> 00:49:49,649

affect the measurement bias without

affecting the outcome.

885

00:49:49,649 --> 00:49:53,421

Like, there's this naive view that if you just

886

00:49:53,421 --> 00:49:57,651

give randomly assigned treatment and

control, then you have a kosher estimate

887

00:49:57,651 --> 00:50:02,001

of the causal effect. That's not really right

in general, because that assumes that the

888

00:50:02,001 --> 00:50:06,421

measurement bias doesn't vary with the

treatment, and that's often a mistake.
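
A small fake-data simulation makes this point concrete. It is a sketch under invented assumptions: the true treatment effect is set to zero, and the measurement is shifted by a fixed amount (`bias_under_treatment`, a hypothetical parameter) only for treated units. Randomization is perfect, yet the difference in measured means recovers the bias rather than the true effect.

```python
import random
import statistics

def simulate(n=20000, true_effect=0.0, bias_under_treatment=1.0, seed=1):
    """Return the difference in mean *measured* outcome, treated minus control."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0           # randomized assignment
        outcome = rng.gauss(0, 1) + true_effect * z  # true (unobserved) outcome
        # The measurement is biased only when the unit was treated.
        measured = outcome + bias_under_treatment * z + rng.gauss(0, 0.5)
        (treated if z else control).append(measured)
    return statistics.mean(treated) - statistics.mean(control)

estimate_biased = simulate(bias_under_treatment=1.0)  # close to 1, not 0
estimate_clean = simulate(bias_under_treatment=0.0)   # close to the true effect, 0
print(round(estimate_biased, 2), round(estimate_clean, 2))
```

With the bias switched off, the same comparison lands near the true effect, which is the assumption the simple randomized analysis silently relies on.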

889

00:50:06,421 --> 00:50:11,441

So you really want to have panel structure

or repeated measurements with within-person

890

00:50:11,441 --> 00:50:12,121

designs.

891

00:50:12,121 --> 00:50:15,291

That means you want to start fitting

multilevel models.

892

00:50:15,291 --> 00:50:19,881

So if you don't have a lot of observations

or a lot of groups, then your inferences

893

00:50:19,881 --> 00:50:22,701

can depend on the prior, which it really

does.

894

00:50:22,701 --> 00:50:25,491

You could act really tough and

say, oh, I'm really tough.

895

00:50:25,491 --> 00:50:28,441

I'm not using a prior, but then it just

means your inference is really noisy.

896

00:50:28,441 --> 00:50:29,971

And that's, that's not good either.
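
One way to see the trade-off is the textbook normal-normal update, where the posterior mean is a precision-weighted average of the data estimate and the prior mean. The numbers below are invented for illustration, and this single-parameter calculation is a stand-in for a full multilevel model, not a description of one.

```python
def posterior_mean_sd(estimate, se, prior_mean, prior_sd):
    """Combine a noisy estimate with a normal prior (known variances)."""
    w_data = 1 / se ** 2          # precision of the data estimate
    w_prior = 1 / prior_sd ** 2   # precision of the prior
    post_var = 1 / (w_data + w_prior)
    post_mean = post_var * (w_data * estimate + w_prior * prior_mean)
    return post_mean, post_var ** 0.5

# Hypothetical noisy estimate: 10 with standard error 5.
with_prior = posterior_mean_sd(10.0, 5.0, prior_mean=0.0, prior_sd=2.0)
no_prior = posterior_mean_sd(10.0, 5.0, prior_mean=0.0, prior_sd=1e6)
print(with_prior)  # pulled strongly toward 0, with a smaller sd
print(no_prior)    # essentially the raw estimate, and just as noisy
```

With little data the answer depends visibly on the prior; dropping the prior does not remove the problem, it only leaves the inference as noisy as the raw estimate.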

897

00:50:29,971 --> 00:50:31,181

It means you can get bad things.

898

00:50:31,181 --> 00:50:35,221

And then, what predictors to include? In

theory, everything should be interacted

899

00:50:35,221 --> 00:50:38,121

with everything because otherwise that can

induce bias.

900

00:50:38,121 --> 00:50:42,021

But in practice, if you do that, you have

a lot of coefficients running around.

901

00:50:42,021 --> 00:50:48,845

So even the simplest problems are like,

like there's no right way of doing it.

902

00:50:48,845 --> 00:50:51,945

which gives me a lot of sympathy for

researchers.

903

00:50:51,945 --> 00:50:55,825

And I know here we're not talking about

like the crisis in science, but I'll say

904

00:50:55,825 --> 00:51:01,085

that like sometimes people will say that

you should pre-register your design and

905

00:51:01,085 --> 00:51:01,715

analysis.

906

00:51:01,715 --> 00:51:06,545

And I think that's great, but it's not

gonna solve a lot of problems because if I

907

00:51:06,545 --> 00:51:09,585

don't know the right analysis to do, I

don't know what I'm supposed to be

908

00:51:09,585 --> 00:51:10,155

pre-registering.

909

00:51:10,155 --> 00:51:11,885

It's really difficult.

910

00:51:11,885 --> 00:51:16,621

It's not, we can't just do better science

by just like.

911

00:51:16,621 --> 00:51:19,661

Like there's this phrase, questionable

research practices.

912

00:51:19,661 --> 00:51:23,361

Like it's not like you can just stop doing

questionable research practices and

913

00:51:23,361 --> 00:51:24,251

everything will be okay.

914

00:51:24,251 --> 00:51:25,261

It's not clear.

915

00:51:25,261 --> 00:51:28,351

Doing it right is not just the absence of

making mistakes.

916

00:51:28,351 --> 00:51:30,521

It's very difficult.

917

00:51:31,061 --> 00:51:35,201

And so when we're teaching or when you're

learning, I'll say, cause I really would

918

00:51:35,201 --> 00:51:39,441

like our book to be read by people who are

not necessarily teaching a class, but just

919

00:51:39,441 --> 00:51:42,885

want to learn the stuff that when you're

learning, there is this.

920

00:51:42,893 --> 00:51:47,873

weird thing where you have to learn the

skills and at the same time realize the

921

00:51:47,873 --> 00:51:48,893

limitations.

922

00:51:48,893 --> 00:51:51,773

And it is, it's hard to teach in that way.

923

00:51:51,773 --> 00:51:56,733

It's easier to teach

something like physics or chemistry where

924

00:51:56,733 --> 00:51:58,473

you say, here's what we're doing.

925

00:51:58,473 --> 00:52:01,833

And then later on, we're going to tell you

why these ideas aren't correct.

926

00:52:01,833 --> 00:52:04,993

And we're going to do something more

elaborate. In statistics,

927

00:52:04,993 --> 00:52:09,053

it's hard to reach that, like, plateau where

you say, well, here's the basics, learn

928

00:52:09,053 --> 00:52:09,813

the basics.

929

00:52:09,813 --> 00:52:11,789

Once you're learning the basics, you keep

930

00:52:11,789 --> 00:52:13,899

seeing all the problems at the same time.

931

00:52:13,899 --> 00:52:17,829

So it makes it very fun to learn, but also

challenging.

932

00:52:17,909 --> 00:52:19,309

Yeah, true.

933

00:52:19,429 --> 00:52:19,709

Yeah.

934

00:52:19,709 --> 00:52:29,709

And actually that makes me wonder, how do

you think, so for people who are going to

935

00:52:29,709 --> 00:52:36,089

use your book for teaching, so

instructors, how can they adapt the

936

00:52:36,089 --> 00:52:38,925

materials for different educational

settings like...

937

00:52:38,925 --> 00:52:43,425

such as introductory courses or more

advanced courses.

938

00:52:44,425 --> 00:52:50,385

So it's set up for this class on applied

regression and causal inference.

939

00:52:50,385 --> 00:52:53,665

So if you're teaching out of Regression

and Other Stories, it's very easy.

940

00:52:53,665 --> 00:52:56,945

It just gives you a whole template for a

two-semester class.

941

00:52:56,945 --> 00:53:02,885

I've also taught a one-semester version

where I just do one activity. Each week

942

00:53:02,885 --> 00:53:04,025

I have two of everything.

943

00:53:04,025 --> 00:53:07,525

So instead I just pick one story, one

activity and so forth.

944

00:53:07,525 --> 00:53:08,429

That's actually what I did

945

00:53:08,429 --> 00:53:15,509

last semester. If it's a more advanced

class, I would say, or more basic,

946

00:53:15,509 --> 00:53:18,809

if it's a more basic class, I think it

still pretty much works.

947

00:53:18,809 --> 00:53:22,609

You just have to simplify. The code

demonstrations are going to be way too

948

00:53:22,609 --> 00:53:24,509

complicated for a more basic class.

949

00:53:24,509 --> 00:53:27,849

But I think the stories work and the

activities work.

950

00:53:27,849 --> 00:53:31,029

You just maybe have to change it a little.

951

00:53:31,029 --> 00:53:31,685

So.

952

00:53:31,757 --> 00:53:36,857

In two truths and a lie, you wouldn't do

logistic regression, but for example, you

953

00:53:36,857 --> 00:53:41,837

could still make a scatter plot and you

could still compare the probability, the

954

00:53:41,837 --> 00:53:46,137

proportion of correct guesses for people with

certainty scores higher than five or lower

955

00:53:46,137 --> 00:53:47,057

than five.
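
For a more basic class, that comparison needs nothing beyond counting. The sketch below uses made-up records, not data from any real class, and assumes certainty is scored on a 0 to 10 scale; scores of exactly five are left out, following the "higher than five or lower than five" split.

```python
def proportion_correct(guesses):
    """Share of guesses that were correct; each record is (certainty, correct)."""
    return sum(1 for _, correct in guesses if correct) / len(guesses)

# Hypothetical records: (certainty score 0-10, was the guess correct?)
guesses = [(9, True), (8, True), (7, False), (6, True), (8, True),
           (4, False), (3, True), (2, False), (4, False), (1, True)]

high = [g for g in guesses if g[0] > 5]  # more certain guessers
low = [g for g in guesses if g[0] < 5]   # less certain guessers
print(proportion_correct(high), proportion_correct(low))  # 0.8 0.4
```

A scatter plot of correctness against certainty shows the same pattern visually, and logistic regression is the natural next step once the class is ready for it.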

956

00:53:47,057 --> 00:53:48,877

You can adapt it.

957

00:53:48,877 --> 00:53:53,837

I think a lot of the activities are like

that in the stories.

958

00:53:53,837 --> 00:53:58,857

For a more advanced class, I think again, it

works in the other direction that this can

959

00:53:58,857 --> 00:54:00,017

be a starting point.

960

00:54:00,017 --> 00:54:01,805

You give the story and...

961

00:54:01,805 --> 00:54:04,385

And also people have their own stories.

962

00:54:04,385 --> 00:54:09,844

So reading my story might help you as a

teacher, think of your own story and tell

963

00:54:09,844 --> 00:54:11,305

it in the same way.

964

00:54:11,305 --> 00:54:13,245

Yeah.

965

00:54:13,245 --> 00:54:13,575

Okay.

966

00:54:13,575 --> 00:54:15,365

Yeah, I see what you mean.

967

00:54:15,865 --> 00:54:19,025

I'm thinking randomly.

968

00:54:19,025 --> 00:54:27,425

It sounds like you would be interested,

Andrew, at some point in writing some

969

00:54:27,425 --> 00:54:31,725

fictional stats-based stories.

970

00:54:31,725 --> 00:54:37,725

something like, I think Carl Sagan, right,

did write some science fiction.

971

00:54:37,725 --> 00:54:42,125

Would you be like, do you see yourself

doing that at some point so that you are

972

00:54:42,125 --> 00:54:48,745

forced to maybe not use any modeling or

things like that in the book and you have

973

00:54:48,745 --> 00:54:53,885

to completely only tell stats through the

stories and all?

974

00:54:53,885 --> 00:54:56,815

Well, well, fake data for sure.

975

00:54:56,815 --> 00:54:57,885

I did have an idea.

976

00:54:57,885 --> 00:55:01,293

I was thinking about having a book where

it's

977

00:55:01,293 --> 00:55:06,733

all, like, learning statistics through

fake data simulation, where everything is

978

00:55:06,733 --> 00:55:10,353

just, you start with some very simple

things. Like, that's the

979

00:55:10,353 --> 00:55:13,993

gimmick, right? The gimmick is: here are all the

principles of probability and statistics,

980

00:55:13,993 --> 00:55:18,013

and you're not allowed to use

any real data; you're only allowed to do

981

00:55:18,013 --> 00:55:22,953

fake data simulation. And you can cover a

lot, like all sorts of things: the

982

00:55:22,953 --> 00:55:27,673

attenuation of the estimated

treatment effect when you have measurement

983

00:55:27,673 --> 00:55:29,581

error in your predictor and

984

00:55:29,581 --> 00:55:32,521

Like anyway, all sorts of things you might

want to cover.

985

00:55:32,521 --> 00:55:33,641

You could do that way.
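
The attenuation example is easy to try with fake data alone. In this sketch all numbers are invented: the true slope is 2, the predictor has standard deviation 1, and the measurement error added to it also has standard deviation 1, so the standard errors-in-variables result predicts the fitted slope shrinks by a factor of 1 / (1 + 1), to about 1.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(2)
n = 20000
true_slope = 2.0
x_true = [rng.gauss(0, 1) for _ in range(n)]
y = [true_slope * x + rng.gauss(0, 1) for x in x_true]
x_noisy = [x + rng.gauss(0, 1) for x in x_true]  # predictor measured with error

slope_clean = ols_slope(x_true, y)   # close to 2
slope_noisy = ols_slope(x_noisy, y)  # attenuated toward 0, close to 1
print(round(slope_clean, 2), round(slope_noisy, 2))
```

Changing the error standard deviation and rerunning shows how the shrinkage grows with the noise, which is the kind of exploration a simulation-only course could be built around.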

986

00:55:33,641 --> 00:55:35,261

So I thought that would be fun.

987

00:55:35,261 --> 00:55:37,361

Maybe a fun future book.

988

00:55:37,361 --> 00:55:40,881

I mean, fiction, you know, Jessica and I

wrote a play, Jessica Hullman and I wrote a

989

00:55:40,881 --> 00:55:42,441

play, Recursion, which is fiction.

990

00:55:42,441 --> 00:55:44,141

It has a computer science theme.

991

00:55:44,141 --> 00:55:47,521

It was performed at a computer science

conference recently.

992

00:55:47,521 --> 00:55:51,881

So, so I guess, yeah, we have written

fiction.

993

00:55:51,881 --> 00:55:55,681

It didn't really have, it had some

statistical principles in there.

994

00:55:55,681 --> 00:55:58,331

There were, there were some, it had some.

995

00:55:58,797 --> 00:56:06,317

Like we, yeah, I think we had some line

where one of the characters talked about

996

00:56:06,317 --> 00:56:10,857

their code being beautiful, and then

somebody else said, code that runs is

997

00:56:10,857 --> 00:56:11,857

beautiful.

998

00:56:11,857 --> 00:56:17,277

And then somebody else says, code that

runs and you know it runs is beautiful.

999

00:56:17,277 --> 00:56:19,447

So that's like some workflow principle.

Speaker:

00:56:19,447 --> 00:56:24,797

So we were able to put in some of our

thoughts about statistical workflow in

Speaker:

00:56:24,797 --> 00:56:25,777

fiction.

Speaker:

00:56:26,377 --> 00:56:27,997

So yeah, it's possible.

Speaker:

00:56:28,301 --> 00:56:30,001

I knew it.

Speaker:

00:56:30,001 --> 00:56:31,001

I knew it.

Speaker:

00:56:31,001 --> 00:56:31,721

Yeah.

Speaker:

00:56:31,801 --> 00:56:33,921

I love to hear that.

Speaker:

00:56:33,921 --> 00:56:35,401

I love to hear that.

Speaker:

00:56:35,401 --> 00:56:37,041

Write that book.

Speaker:

00:56:37,041 --> 00:56:43,301

And I was saying here, because I think,

and you could even record the audio

Speaker:

00:56:43,301 --> 00:56:44,381

version yourself.

Speaker:

00:56:44,381 --> 00:56:45,771

I think that'd be awesome.

Speaker:

00:56:45,771 --> 00:56:45,981

Yeah.

Speaker:

00:56:45,981 --> 00:56:49,221

Well, that performance apparently went

well, but they didn't video it.

Speaker:

00:56:49,221 --> 00:56:52,561

So we want to get it performed somewhere

else.

Speaker:

00:56:52,561 --> 00:56:53,741

Yeah.

Speaker:

00:56:54,101 --> 00:56:55,461

Well, let's try that.

Speaker:

00:56:55,461 --> 00:56:56,749

If there is...

Speaker:

00:56:56,749 --> 00:57:05,289

One day if I manage to do a live LBS

dinner, that should definitely be

Speaker:

00:57:05,289 --> 00:57:08,369

performed at that dinner.

Speaker:

00:57:08,449 --> 00:57:10,929

That's a must.

Speaker:

00:57:13,369 --> 00:57:21,769

Now I'd like to ask you something about, I

know a topic that's dear to your heart is

Speaker:

00:57:21,769 --> 00:57:25,609

visualization and its tie to

understanding.

Speaker:

00:57:25,889 --> 00:57:26,605

Because...

Speaker:

00:57:26,605 --> 00:57:32,465

the focus on visualization is a key aspect

of your book, Active Statistics.

Speaker:

00:57:32,465 --> 00:57:36,745

It's also a key aspect of almost all your

work.

Speaker:

00:57:36,885 --> 00:57:39,205

So I'd like to hear your thought about

that.

Speaker:

00:57:39,205 --> 00:57:46,185

How do you think visualization aids in the

comprehension of statistics and causal

Speaker:

00:57:46,185 --> 00:57:47,045

models?

Speaker:

00:57:47,365 --> 00:57:49,505

Well, so I'll talk about two things.

Speaker:

00:57:49,505 --> 00:57:54,381

First, visualization in teaching and

second, visualization in statistical.

Speaker:

00:57:54,381 --> 00:57:55,801

Like applied statistics.

Speaker:

00:57:55,801 --> 00:58:01,561

So with teaching, I think the

deterministic part is usually the more

Speaker:

00:58:01,561 --> 00:58:02,521

important part of the model.

Speaker:

00:58:02,521 --> 00:58:05,661

So I want people to be able to visualize

what is the line?

Speaker:

00:58:05,661 --> 00:58:07,761

y equals a + bx.

Speaker:

00:58:07,761 --> 00:58:10,241

What does it look like if I have an

interaction?

Speaker:

00:58:10,241 --> 00:58:12,501

What would the two lines look like?

Speaker:

00:58:12,501 --> 00:58:15,681

What does a logistic curve look like?

Speaker:

00:58:16,201 --> 00:58:22,081

I think it's a mistake when

statistics books start with things like a

Speaker:

00:58:22,081 --> 00:58:23,021

histogram.

Speaker:

00:58:23,021 --> 00:58:25,761

Histogram is not fundamental.

Speaker:

00:58:25,761 --> 00:58:27,541

Actually, it's very confusing.

Speaker:

00:58:27,541 --> 00:58:35,301

I used to do this assignment where I would

say to students, gather between 30 and 50

Speaker:

00:58:35,301 --> 00:58:39,681

data points on anything and make a

histogram of it.

Speaker:

00:58:39,681 --> 00:58:42,271

And about half the students would do it.

Speaker:

00:58:42,271 --> 00:58:46,221

Like they might gather data on 30

countries or 50 states, or they might take

Speaker:

00:58:46,221 --> 00:58:49,541

30 observations of something and make a

histogram.

Speaker:

00:58:49,981 --> 00:58:52,127

The other half would.

Speaker:

00:58:52,141 --> 00:58:57,641

make a bar chart showing their 30

observations in time order.

Speaker:

00:58:57,641 --> 00:59:00,921

So it would be like, basically it was a

time series except it would just be

Speaker:

00:59:00,921 --> 00:59:03,241

displayed in bars because it was a

histogram.

Speaker:

00:59:03,241 --> 00:59:07,681

And so like you see the problem is that a

histogram is supposed to convey a

Speaker:

00:59:07,681 --> 00:59:11,041

distribution, but what people are getting

out of it is it looks like a bunch of bars

Speaker:

00:59:11,041 --> 00:59:13,341

and half the students didn't get the

point.

Speaker:

00:59:13,341 --> 00:59:16,781

The concept of a distribution is very

abstract because...

Speaker:

00:59:16,781 --> 00:59:21,621

The height of the bar represents the

number of cases or the proportion of

Speaker:

00:59:21,621 --> 00:59:22,781

cases.
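
A minimal sketch of the distinction described here, in Python with made-up data (the numbers, bin width, and variable names are illustrative assumptions, not from the episode). In the time-order view each bar is one observation and its height is the measured value; in the histogram each bar is a bin and its height is the number of cases in that bin.

```python
import random
from collections import Counter

random.seed(1)

# Hypothetical data: 30 observations recorded one after another.
observations = [round(random.gauss(50, 10)) for _ in range(30)]

# Time-order view: one bar per observation; bar height is the measured value.
time_series = list(enumerate(observations, start=1))

# Histogram view: one bar per bin; bar height is the count of cases in the bin.
bin_width = 10
counts = Counter((x // bin_width) * bin_width for x in observations)

# Print a text histogram: each '#' is one case falling in that bin.
for left in sorted(counts):
    print(f"{left:>3}-{left + bin_width - 1:<3} {'#' * counts[left]}")
```

The histogram discards the order of the observations entirely, which is the abstraction the students who drew time-ordered bars had missed.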

Speaker:

00:59:22,781 --> 00:59:25,011

It's not like a scatter plot.

Speaker:

00:59:25,011 --> 00:59:27,041

A scatter plot, I think, is actually more intuitive.

Speaker:

00:59:27,041 --> 00:59:31,781

But I noticed that statistics classes were

always focusing on that because, oh,

Speaker:

00:59:31,781 --> 00:59:33,251

histogram is one dimensional.

Speaker:

00:59:33,251 --> 00:59:35,011

What could be more simple than that?

Speaker:

00:59:35,011 --> 00:59:37,851

I think a time series is really much more

basic.

Speaker:

00:59:37,851 --> 00:59:42,941

So when it comes to plotting data, I think

we really have to get a little closer to

Speaker:

00:59:42,941 --> 00:59:44,525

what we care about.

Speaker:

00:59:44,525 --> 00:59:47,905

Um, a lot of just stupid stuff, like box

plots.

Speaker:

00:59:47,905 --> 00:59:48,565

I hate that.

Speaker:

00:59:48,565 --> 00:59:49,455

I hate that stuff.

Speaker:

00:59:49,455 --> 00:59:51,285

And it's like, I don't see it.

Speaker:

00:59:51,285 --> 00:59:55,725

It's just like, people just do things that

are conventional and I think are

Speaker:

00:59:55,725 --> 00:59:56,745

absolutely horrible.

Speaker:

00:59:56,745 --> 01:00:02,345

But anyway, all this focus on

distributions, I think the linear, the

Speaker:

01:00:02,345 --> 01:00:04,535

deterministic part of the model is more

important.

Speaker:

01:00:04,535 --> 01:00:07,505

And so that's what I try to convey.

Speaker:

01:00:07,565 --> 01:00:08,621

I do.

Speaker:

01:00:08,621 --> 01:00:12,841

One thing I noticed is that students will

learn stuff if it's on the homework and on

Speaker:

01:00:12,841 --> 01:00:13,411

the exam.

Speaker:

01:00:13,411 --> 01:00:17,681

They won't learn it just because it's on

the blackboard in class or in your slides.

Speaker:

01:00:17,681 --> 01:00:26,381

So I found that when I do my work, I

often make sketches of graphs.

Speaker:

01:00:26,641 --> 01:00:29,521

And so I have homework

assignments where you have to make a

Speaker:

01:00:29,521 --> 01:00:32,761

sketch, sketch what you think it's going

to look like, then fit the model.

Speaker:

01:00:32,761 --> 01:00:35,521

Because if you don't ask people to do

that, they won't.

Speaker:

01:00:35,521 --> 01:00:37,965

So teaching has to be.

Speaker:

01:00:37,965 --> 01:00:42,505

Like you want people to actually practice

that kind of workflow.

Speaker:

01:00:42,805 --> 01:00:45,985

So that's then I had something else to

say, but I won't.

Speaker:

01:00:45,985 --> 01:00:49,765

We can say it another time about

statistical graphics.

Speaker:

01:00:49,885 --> 01:00:51,765

It's already kind of going on a little

bit.

Speaker:

01:00:51,765 --> 01:00:55,825

So if we ever talk about statistical

graphics again, just ask me to tell you

Speaker:

01:00:55,825 --> 01:00:59,925

what I think is this really super

important aspect of statistical graphics

Speaker:

01:00:59,925 --> 01:01:01,805

within statistical inference.

Speaker:

01:01:01,805 --> 01:01:03,705

And I'll tell you about that.

Speaker:

01:01:03,705 --> 01:01:04,725

Okay, perfect.

Speaker:

01:01:04,725 --> 01:01:06,669

Well, definitely.

Speaker:

01:01:06,669 --> 01:01:08,829

Definitely tell you.

Speaker:

01:01:08,829 --> 01:01:12,849

Do you still have time for one or two

questions or should we?

Speaker:

01:01:12,849 --> 01:01:13,429

Yeah, sure.

Speaker:

01:01:13,429 --> 01:01:15,969

I have time for one or two questions,

sure.

Speaker:

01:01:15,969 --> 01:01:16,789

Okay, awesome.

Speaker:

01:01:16,789 --> 01:01:18,849

Let's continue.

Speaker:

01:01:18,849 --> 01:01:22,809

I'm curious about that.

Speaker:

01:01:23,169 --> 01:01:30,749

How do you handle the distinction and/or

the transition from regression analysis to

Speaker:

01:01:30,749 --> 01:01:31,709

causal inference?

Speaker:

01:01:31,709 --> 01:01:36,429

How do you navigate these two topics in

the classroom setting

Speaker:

01:01:36,429 --> 01:01:41,809

to ensure that students grasp both

concepts effectively?

Speaker:

01:01:42,109 --> 01:01:43,389

So I overlap.

Speaker:

01:01:43,389 --> 01:01:48,469

So I start talking about causal inference

at the very beginning, partly because they

Speaker:

01:01:48,469 --> 01:01:49,689

can't avoid it.

Speaker:

01:01:49,689 --> 01:01:51,949

So we'll have a regression.

Speaker:

01:01:52,408 --> 01:01:57,489

One of the examples we use

in Regression and Other Stories is predicting

Speaker:

01:01:57,489 --> 01:02:00,949

from some survey, predicting earnings from

height.

Speaker:

01:02:00,949 --> 01:02:04,709

Taller people make a little bit more money

than...

Speaker:

01:02:05,101 --> 01:02:11,461

shorter people and then you can also you

can throw sex into the model and men make

Speaker:

01:02:11,461 --> 01:02:13,061

more money than women, and men are taller.

Speaker:

01:02:13,061 --> 01:02:15,621

So you can say how do you interpret the

coefficient of height?

Speaker:

01:02:15,621 --> 01:02:19,791

Well, you might say: for every

inch taller you make this much more money.

Speaker:

01:02:19,791 --> 01:02:20,921

So that's not right.

Speaker:

01:02:20,921 --> 01:02:26,701

You have to say comparing two people of

the same sex one of whom is one inch

Speaker:

01:02:26,701 --> 01:02:32,921

taller than the other under the model on

average the taller person will be making

Speaker:

01:02:32,921 --> 01:02:34,061

this much more money.

Speaker:

01:02:34,061 --> 01:02:35,941

So what are the things you need to say?

Speaker:

01:02:35,941 --> 01:02:38,801

You have to say comparing, because it's

all comparative.

Speaker:

01:02:38,801 --> 01:02:40,701

There's no causal language.

Speaker:

01:02:40,701 --> 01:02:46,661

You have to say, on average, you have to

say according to the model.

Speaker:

01:02:46,661 --> 01:02:52,981

And you have to say not controlling for

blah, blah, but comparing two people who

Speaker:

01:02:52,981 --> 01:02:54,721

are the same in these other predictors.

Speaker:

01:02:54,721 --> 01:02:56,621

You're not holding everything else

constant.

Speaker:

01:02:56,621 --> 01:02:58,157

You're doing this comparison.
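
To make the wording concrete, here is a small Python sketch under stated assumptions: the data are simulated (the 500-per-inch and 10,000 figures are invented for illustration, not taken from the survey mentioned above), and `ols` is a hypothetical helper that solves the normal equations directly. The point is only the phrasing of the printed sentence, which is comparative, on average, and conditional on the model.

```python
import random

def ols(X, y):
    """Least-squares coefficients via the normal equations (fine for tiny problems)."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(A[row][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for row in range(k):
            if row != c:
                f = A[row][c]
                A[row] = [vr - f * vc for vr, vc in zip(A[row], A[c])]
    return [A[i][k] for i in range(k)]

random.seed(0)
rows, earnings = [], []
for _ in range(2000):
    male = random.random() < 0.5
    height = random.gauss(70 if male else 64.5, 3)  # inches, made-up distribution
    # Simulated truth (invented): 500 per inch, 10000 more for men, plus noise.
    y = 5000 + 500 * height + 10000 * male + random.gauss(0, 8000)
    rows.append([1.0, height, float(male)])
    earnings.append(y)

intercept, b_height, b_male = ols(rows, earnings)
print("Comparing two people of the same sex who differ by one inch in height, "
      f"under the model the taller person earns on average {b_height:.0f} more.")
```

Nothing in the fit says what would happen to one person's earnings if that person grew an inch; the coefficient only summarizes a comparison between people.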

Speaker:

01:02:58,157 --> 01:03:01,777

So I do this, I have a drill in class

where they have to do it.

Speaker:

01:03:01,777 --> 01:03:02,937

And then they laugh.

Speaker:

01:03:02,937 --> 01:03:07,137

It's like a joke as I say, here's a

regression, explain each coefficient in

Speaker:

01:03:07,137 --> 01:03:07,637

words.

Speaker:

01:03:07,637 --> 01:03:11,117

And they say, like, what's the coefficient

of the intercept of this model?

Speaker:

01:03:11,117 --> 01:03:14,337

It's like something I'm predicting

something as a function of time.

Speaker:

01:03:14,337 --> 01:03:18,497

So this says in the year Jesus was born,

this is well, that's the intercept right

Speaker:

01:03:18,497 --> 01:03:19,977

at year zero.

Speaker:

01:03:19,981 --> 01:03:22,081

So is that interpretable?

Speaker:

01:03:22,081 --> 01:03:23,501

Well, maybe it's interpretable.

Speaker:

01:03:23,501 --> 01:03:27,741

If you have a time series going from 1900

to 2000, maybe we're not particularly

Speaker:

01:03:27,741 --> 01:03:30,661

interested in what happened in the year

Jesus was born.

Speaker:

01:03:30,661 --> 01:03:34,261

That's a bit of an extrapolation that

implies.
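
One common way to get an intercept that can be read directly is to shift the time variable so that zero falls inside the data. The Python sketch below assumes a made-up series from 1900 to 2000 and a hypothetical helper `fit_line`; the choice of 1950 as the reference year is also just an illustration.

```python
import random

def fit_line(x, y):
    """Simple least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

random.seed(2)
years = list(range(1900, 2001))
# Made-up series: rises by 0.3 per year from a value of 40 in 1900.
y = [40 + 0.3 * (t - 1900) + random.gauss(0, 2) for t in years]

# Raw year as predictor: the intercept is the fitted value at year 0,
# an extrapolation of nineteen centuries beyond the data.
a_raw, b_raw = fit_line(years, y)

# Year minus 1950 as predictor: the intercept is the fitted value in 1950.
a_ctr, b_ctr = fit_line([t - 1950 for t in years], y)

print(f"raw:      intercept {a_raw:8.1f}, slope {b_raw:.3f}")
print(f"centered: intercept {a_ctr:8.1f}, slope {b_ctr:.3f}")
```

The slope is the same in both fits; only the meaning of the intercept changes.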

Speaker:

01:03:34,381 --> 01:03:35,981

So, but same with the coefficient.

Speaker:

01:03:35,981 --> 01:03:37,421

So it's like a joke in class.

Speaker:

01:03:37,421 --> 01:03:42,361

It's a fun inside joke we have in class

that I'll ask them to explain the

Speaker:

01:03:42,361 --> 01:03:47,381

regression coefficient and they have to

say it without using the wrong language.

Speaker:

01:03:47,381 --> 01:03:48,865

And it's like,

Speaker:

01:03:48,941 --> 01:03:53,161

It's like the game you play as a kid where

you're

Speaker:

01:03:53,161 --> 01:03:54,281

not allowed to say the word no.

Speaker:

01:03:54,281 --> 01:03:55,741

If you say the word no, you lose.

Speaker:

01:03:55,741 --> 01:03:58,061

You have to figure out a way to decline.

Speaker:

01:03:58,061 --> 01:04:00,561

Will you give me your cake?

Speaker:

01:04:00,561 --> 01:04:02,921

I choose not to give you my cake.

Speaker:

01:04:02,921 --> 01:04:05,661

You know, like I choose to do something

else or whatever.

Speaker:

01:04:05,661 --> 01:04:08,441

So similarly, you're not allowed to use

this word.

Speaker:

01:04:08,441 --> 01:04:13,901

And so right away, we're introducing the

idea that causation is important.

Speaker:

01:04:13,901 --> 01:04:15,021

And.

Speaker:

01:04:15,021 --> 01:04:18,961

Then when we get to causal inference,

well, we have regression already.

Speaker:

01:04:18,961 --> 01:04:23,541

So we use that not for controlling for

things, but for adjusting for things.

Speaker:

01:04:23,541 --> 01:04:27,521

So we've already done non-causal

examples, like the survey example, where

Speaker:

01:04:27,521 --> 01:04:31,441

we adjust for differences in order to

poststratify.

Speaker:

01:04:31,441 --> 01:04:34,111

So then it fits in.

Speaker:

01:04:34,111 --> 01:04:38,961

So there's a lot of specific things about

causal inference, but the first half is we

Speaker:

01:04:38,961 --> 01:04:40,161

don't cheat at the beginning.

Speaker:

01:04:40,161 --> 01:04:42,661

We don't pretend to be causal when we're

not.

Speaker:

01:04:42,661 --> 01:04:44,749

Then when we get to causal inference,

Speaker:

01:04:44,749 --> 01:04:49,009

We make use of what we've already done

rather than treating it as an entirely new

Speaker:

01:04:49,009 --> 01:04:49,969

topic.

Speaker:

01:04:49,969 --> 01:04:56,309

My little particular pet thing is that the

usual way causal inference is taught is

Speaker:

01:04:56,309 --> 01:04:58,469

there's an outcome and a treatment.

Speaker:

01:04:58,469 --> 01:05:00,868

And some people get the treatment, some

get the control.

Speaker:

01:05:00,868 --> 01:05:05,379

I say the basic setup is there's a pre-test

measurement, a treatment, and an outcome,

Speaker:

01:05:05,379 --> 01:05:06,729

and that's in time order.

Speaker:

01:05:06,729 --> 01:05:08,249

So it introduces time.

Speaker:

01:05:08,249 --> 01:05:11,169

You don't have to have a pre-test, but

you should.

Speaker:

01:05:11,169 --> 01:05:13,805

And so it's good practice, but also it...

Speaker:

01:05:13,805 --> 01:05:18,545

It puts you into the regression framework

already, which is helpful.

Speaker:

01:05:19,285 --> 01:05:22,505

So sometimes things that are too simple

are harder to understand.

Speaker:

01:05:22,505 --> 01:05:25,005

A little context can help.

Speaker:

01:05:25,805 --> 01:05:31,265

Yeah, I found so the...

Speaker:

01:05:31,265 --> 01:05:40,205

Directed acyclic graphs do help quite a lot

in teaching the causal inference concepts,

Speaker:

01:05:40,225 --> 01:05:43,237

especially because you can then...

Speaker:

01:05:43,277 --> 01:05:47,237

marry that with the graphical

representation of the Bayesian model that

Speaker:

01:05:47,237 --> 01:05:48,557

you can come up with.

Speaker:

01:05:48,557 --> 01:05:50,717

And then you use simulated data.

Speaker:

01:05:50,717 --> 01:05:56,257

You can come up with the model, then write

the model, and then just simulate data and

Speaker:

01:05:56,257 --> 01:05:58,327

see what the model tells you.

Speaker:

01:05:58,327 --> 01:06:04,597

And if it's able to recover the true

parameters, I find these fit pretty well

Speaker:

01:06:04,597 --> 01:06:06,283

together in the workflow.
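
The simulate-then-recover step can be sketched in a few lines of Python. This is a simplified stand-in rather than the Bayesian workflow itself: it uses ordinary least squares instead of a posterior fit, and the true values, sample sizes, and the helpers `fit_line` and `simulate_and_fit` are all assumptions made for illustration.

```python
import random
import statistics

def fit_line(x, y):
    """Simple least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def simulate_and_fit(a, b, sigma, n, rng):
    """Draw fake data from y = a + b * x + noise, then fit the same model."""
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [a + b * xi + rng.gauss(0, sigma) for xi in x]
    return fit_line(x, y)

rng = random.Random(3)
true_a, true_b, sigma = 1.5, -0.7, 1.0  # chosen by us, so we know the answer

# Repeat the simulation many times and check the estimates center on the truth.
fits = [simulate_and_fit(true_a, true_b, sigma, 100, rng) for _ in range(200)]
mean_a = statistics.mean(a for a, _ in fits)
mean_b = statistics.mean(b for _, b in fits)

print(f"true a = {true_a}, average estimate = {mean_a:.3f}")
print(f"true b = {true_b}, average estimate = {mean_b:.3f}")
```

If the averages were far from the values used to generate the data, that would point to a problem in the model code or in the fitting procedure before any real data are involved.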

Speaker:

01:06:08,525 --> 01:06:09,505

Good.

Speaker:

01:06:09,505 --> 01:06:14,685

Yeah, I think there's a lot of different

ways of teaching these things and using

Speaker:

01:06:14,685 --> 01:06:15,525

these.

Speaker:

01:06:15,525 --> 01:06:19,465

There are different frameworks that can

work well.

Speaker:

01:06:19,465 --> 01:06:23,005

And I think it's good that that's the

case.

Speaker:

01:06:23,005 --> 01:06:28,045

There's more than one way of explaining

things and understanding things.

Speaker:

01:06:28,045 --> 01:06:30,025

Yeah, true.

Speaker:

01:06:30,345 --> 01:06:37,389

Actually, I'm curious, based on the

methodologies and...

Speaker:

01:06:37,389 --> 01:06:43,369

also the philosophies that you present in

Active Statistics, how do you see the

Speaker:

01:06:43,369 --> 01:06:49,289

future of statistical education evolving,

particularly with the advent of new

Speaker:

01:06:49,289 --> 01:06:50,609

technologies?

Speaker:

01:06:50,669 --> 01:06:54,429

And how do you see that play out in the

coming years?

Speaker:

01:06:54,449 --> 01:06:55,689

I don't know.

Speaker:

01:06:55,689 --> 01:07:00,529

I mean, I'm still unhappy with how

statistics is usually taught.

Speaker:

01:07:00,529 --> 01:07:03,853

So introductory statistics, it's really

been...

Speaker:

01:07:03,853 --> 01:07:08,653

Like the textbooks now are almost all

pretty much the same as the textbooks from

Speaker:

01:07:08,653 --> 01:07:09,813

40 years ago.

Speaker:

01:07:09,813 --> 01:07:17,113

I mean, they look different, but it's

based on this thing where they teach, like

Speaker:

01:07:17,113 --> 01:07:21,753

there is this, they teach these

distributions and it, so it starts by

Speaker:

01:07:21,753 --> 01:07:27,013

focusing on variation, which I think is

not even really quite right.

Speaker:

01:07:27,013 --> 01:07:30,653

And then, it's not really focusing on the

questions that are being asked, it's

Speaker:

01:07:30,653 --> 01:07:32,397

really focused on the error term.

Speaker:

01:07:32,397 --> 01:07:37,617

And then there's all this stuff about the

sampling distribution of the sample mean,

Speaker:

01:07:37,617 --> 01:07:38,957

which is just kind of weird.

Speaker:

01:07:38,957 --> 01:07:43,537

Nobody cares about the sample mean, or

rarely do.

Speaker:

01:07:43,537 --> 01:07:46,597

It becomes very abstract and hard to

follow.

Speaker:

01:07:46,597 --> 01:07:50,897

And then there are these like confidence

intervals, like a huge amount of work to

Speaker:

01:07:50,897 --> 01:07:55,177

create these little summaries that you

don't really want to be using along with a

Speaker:

01:07:55,177 --> 01:07:56,057

bunch of messages.

Speaker:

01:07:56,057 --> 01:07:58,297

If you don't have random assignment,

you're screwed.

Speaker:

01:07:58,297 --> 01:08:00,653

If you don't have random sampling, you're

screwed.

Speaker:

01:08:00,653 --> 01:08:04,413

Then at the end, there's some stuff like

regression and chi-squared tests and

Speaker:

01:08:04,413 --> 01:08:06,013

things that people do.

Speaker:

01:08:06,013 --> 01:08:08,672

And it's just kind of a disaster.

Speaker:

01:08:08,672 --> 01:08:09,853

I really, I really hate it.

Speaker:

01:08:09,853 --> 01:08:14,363

And I would like things to be much more

focused on the questions being asked.

Speaker:

01:08:14,363 --> 01:08:18,473

It's hard for me to think exactly how to

construct the introductory class to do

Speaker:

01:08:18,473 --> 01:08:19,133

this.

Speaker:

01:08:19,133 --> 01:08:22,853

But for the second class in statistics,

like the one that we teach on applied

Speaker:

01:08:22,853 --> 01:08:26,853

regression and causal inference, I do like

how we do it in Regression and Other

Speaker:

01:08:26,853 --> 01:08:27,273

Stories.

Speaker:

01:08:27,273 --> 01:08:30,221

I feel like we developed through the

models.

Speaker:

01:08:30,221 --> 01:08:32,901

in a way that makes sense.

Speaker:

01:08:33,061 --> 01:08:35,441

I try to do that in Active Statistics.

Speaker:

01:08:35,441 --> 01:08:40,881

But really, the most important part of

teaching are the most basic classes.

Speaker:

01:08:42,221 --> 01:08:47,060

And there, we're still working on how to

do that.

Speaker:

01:08:47,121 --> 01:08:51,201

So I don't really know what the future is.

Speaker:

01:08:51,201 --> 01:08:57,181

There's a lot of statistics and machine

learning methods out there, but a lot

Speaker:

01:08:57,181 --> 01:08:57,925

of...

Speaker:

01:08:58,189 --> 01:09:02,369

basic concepts, of course, are still

coming up no matter how you do it, like

Speaker:

01:09:02,369 --> 01:09:07,509

issues of adjustment and bias and

variation.

Speaker:

01:09:07,829 --> 01:09:12,069

So it's hard; it is hard to get it all

to feel like it's all in one place.

Speaker:

01:09:12,069 --> 01:09:12,889

It's frustrating.

Speaker:

01:09:12,889 --> 01:09:13,549

Yeah.

Speaker:

01:09:13,549 --> 01:09:14,229

Yeah.

Speaker:

01:09:14,229 --> 01:09:14,749

Yeah.

Speaker:

01:09:14,749 --> 01:09:15,869

No, I agree with that.

Speaker:

01:09:15,869 --> 01:09:20,569

I'm also asking the question because I'm

pretty curious about it because I'm also

Speaker:

01:09:20,569 --> 01:09:24,169

personally a bit lost when I start

thinking about these things.

Speaker:

01:09:24,169 --> 01:09:25,389

It's so cute.

Speaker:

01:09:25,389 --> 01:09:25,901

And, uh,

Speaker:

01:09:25,901 --> 01:09:31,921

Like for now, I don't have a clear

organization in my head, you know.

Speaker:

01:09:32,281 --> 01:09:38,121

Maybe one last question for you, Andrew,

before I let you go, because you've

Speaker:

01:09:38,121 --> 01:09:43,241

already been extremely generous with your

time and you know me, I could really

Speaker:

01:09:43,241 --> 01:09:44,941

interview you for like three hours, no

problem.

Speaker:

01:09:44,941 --> 01:09:46,781

I have so many questions.

Speaker:

01:09:46,901 --> 01:09:50,521

But maybe what's next for you?

Speaker:

01:09:50,521 --> 01:09:55,575

What are your upcoming projects, maybe in

this coming year?

Speaker:

01:09:56,461 --> 01:09:58,361

Well, we're trying to finish.

Speaker:

01:09:58,361 --> 01:10:04,421

Well, Aki and I are trying to finish our

Bayesian workflow book, and we'd like to

Speaker:

01:10:04,421 --> 01:10:07,761

do our advanced regression and multilevel

models book.

Speaker:

01:10:07,761 --> 01:10:12,281

It would be fun to get Recursion performed

somewhere by some university theater group

Speaker:

01:10:12,281 --> 01:10:13,261

somewhere.

Speaker:

01:10:13,261 --> 01:10:22,481

Doing this research on combining, you

know, multilevel regression and

Speaker:

01:10:22,481 --> 01:10:24,229

poststratification

Speaker:

01:10:24,269 --> 01:10:27,529

with sampling weights, which I think is

really important.

Speaker:

01:10:27,529 --> 01:10:31,709

And I think also this could be useful for

causal inference too, because people use

Speaker:

01:10:31,709 --> 01:10:32,729

weighting there.

Speaker:

01:10:32,729 --> 01:10:39,489

So that's probably the one project I'm

most excited about from that direction.

Speaker:

01:10:39,889 --> 01:10:42,109

And then we're trying to write.

Speaker:

01:10:42,109 --> 01:10:45,149

I have a list.

Speaker:

01:10:45,149 --> 01:10:49,089

I have on my web page, I have a list of

published, unpublished, and unwritten

Speaker:

01:10:49,089 --> 01:10:50,609

research articles.

Speaker:

01:10:50,609 --> 01:10:53,229

So the unwritten is a list of like,

Speaker:

01:10:53,229 --> 01:10:56,349

things that I want to do or write up.

Speaker:

01:10:56,349 --> 01:10:57,809

So there's a long list of that.

Speaker:

01:10:57,809 --> 01:11:00,009

I'm collaborating with an economist.

Speaker:

01:11:00,029 --> 01:11:07,809

We're trying to create a unified framework

for causal inference for panel data, which

Speaker:

01:11:07,809 --> 01:11:12,749

really includes things like before-after

studies and regression discontinuities and

Speaker:

01:11:12,749 --> 01:11:19,789

difference-in-differences and just regular

regression, time series.

Speaker:

01:11:19,789 --> 01:11:21,293

I have a...

Speaker:

01:11:21,293 --> 01:11:24,973

Like just as a simple example, if you're

doing linear regression, like you have a

Speaker:

01:11:24,973 --> 01:11:29,233

pretest, you regress, you condition on the

pretest, you adjust for that, really.

Speaker:

01:11:29,233 --> 01:11:34,153

But usually, in econ,

things are measured with error.

Speaker:

01:11:34,153 --> 01:11:37,343

And so you won't really want to regress on

the pretest.

Speaker:

01:11:37,343 --> 01:11:40,833

What you really want to do is regress on

the latent value that the pretest is a

Speaker:

01:11:40,833 --> 01:11:41,873

measurement of.
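
Why regressing on a noisy pretest is a problem can be shown with a short simulation. The Python sketch below is an illustration under invented assumptions (a true coefficient of 2.0, unit variances, and the hypothetical helper `slope`); it only demonstrates the attenuation, and fitting the latent-variable model itself, for example in Stan as mentioned here, is beyond this sketch.

```python
import random

def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
            / sum((xi - mx) ** 2 for xi in x))

rng = random.Random(4)
n = 5000

# The latent value is what the pretest is trying to measure.
latent = [rng.gauss(0, 1) for _ in range(n)]
# The observed pretest is the latent value plus measurement error.
pretest = [z + rng.gauss(0, 1) for z in latent]
# The outcome depends on the latent value, not on the noisy measurement.
outcome = [2.0 * z + rng.gauss(0, 1) for z in latent]

b_latent = slope(latent, outcome)    # near the true coefficient, 2.0
b_pretest = slope(pretest, outcome)  # pulled toward zero by the measurement error

print(f"slope on latent value:  {b_latent:.2f}")
print(f"slope on noisy pretest: {b_pretest:.2f}")
```

With equal variances for the latent value and the measurement error, the slope on the observed pretest is expected to shrink to about half the true coefficient, so adjusting for the pretest alone under-adjusts.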

Speaker:

01:11:41,873 --> 01:11:43,633

Well, you can do that in Stan now.

Speaker:

01:11:43,633 --> 01:11:48,393

So now in Stan, you can write these models

and do Bayesian models with latent

Speaker:

01:11:48,393 --> 01:11:49,613

variables and.

Speaker:

01:11:49,613 --> 01:11:54,773

I think there's some theoretical results

to be done to show how or see how these

Speaker:

01:11:54,773 --> 01:11:57,653

things reduce to other things in special

cases.

Speaker:

01:11:57,653 --> 01:12:02,773

It's a little related to my chickens paper

that I did a couple of years ago, which I

Speaker:

01:12:02,773 --> 01:12:04,233

really enjoyed.

Speaker:

01:12:04,233 --> 01:12:06,693

That's another story.

Speaker:

01:12:06,853 --> 01:12:11,833

The chicken story is not in the Active

Statistics book.

Speaker:

01:12:11,933 --> 01:12:15,973

I don't think it's like there's more

stories.

Speaker:

01:12:15,973 --> 01:12:19,349

There's room for another 52 stories, I'm

sure.

Speaker:

01:12:19,725 --> 01:12:21,285

in the future.

Speaker:

01:12:22,125 --> 01:12:23,525

Yeah, for sure.

Speaker:

01:12:23,905 --> 01:12:29,825

And the, yeah, we should link to your

chicken paper, actually, in the show

Speaker:

01:12:29,825 --> 01:12:29,855

notes.

Speaker:

01:12:29,855 --> 01:12:30,945

I like the chicken paper.

Speaker:

01:12:30,945 --> 01:12:33,525

It's not the world's most readable.

Speaker:

01:12:33,525 --> 01:12:35,585

I mean, it's technical, but I like it.

Speaker:

01:12:35,585 --> 01:12:36,425

It's Bayesian.

Speaker:

01:12:36,425 --> 01:12:37,685

It's good.

Speaker:

01:12:37,945 --> 01:12:38,845

Yeah.

Speaker:

01:12:38,985 --> 01:12:43,665

Are you referencing the one from

2021?

Speaker:

01:12:44,745 --> 01:12:45,125

Or is that...

Speaker:

01:12:45,125 --> 01:12:46,795

Yeah, yeah.

Speaker:

01:12:46,795 --> 01:12:48,205

Slamming the sham.

Speaker:

01:12:48,205 --> 01:12:52,164

A Bayesian model for adaptive adjustment

with noisy control data.

Speaker:

01:12:52,164 --> 01:12:56,045

Yeah, it's published in Statistics in

Medicine, which is, like, a journal nobody

Speaker:

01:12:56,045 --> 01:12:57,125

reads.

Speaker:

01:12:57,125 --> 01:12:58,315

But what can you do?

Speaker:

01:12:58,315 --> 01:13:00,365

I guess nobody reads any journal anymore.

Speaker:

01:13:00,365 --> 01:13:02,305

So that's fine, perhaps.

Speaker:

01:13:02,305 --> 01:13:04,665

Nobody reads anything.

Speaker:

01:13:05,265 --> 01:13:06,535

Nobody reads anything.

Speaker:

01:13:06,535 --> 01:13:08,605

They're too busy reading stuff.

Speaker:

01:13:09,605 --> 01:13:14,285

Yeah, I mean, definitely that's why it's

very good that you come on the show.

Speaker:

01:13:14,285 --> 01:13:16,845

And also that you write these books.

Speaker:

01:13:17,133 --> 01:13:21,192

I think it's extremely important because

definitely the general public doesn't read

Speaker:

01:13:21,192 --> 01:13:22,233

papers.

Speaker:

01:13:22,493 --> 01:13:27,533

I know I do read papers, but it's mainly

because I have to for my job.

Speaker:

01:13:27,533 --> 01:13:34,353

I almost never read a paper for pleasure

because it's just like, yeah, the way it's

Speaker:

01:13:34,353 --> 01:13:39,153

written is just like so dry, you know, and

I really love a story, as you were saying.

Speaker:

01:13:39,153 --> 01:13:42,973

That's also why I really love your

writings in your books, in your blog,

Speaker:

01:13:42,973 --> 01:13:46,605

because it's always wrapped.

Speaker:

01:13:46,605 --> 01:13:51,835

in a story and in a context and the papers

are mainly just, okay, this is the result,

Speaker:

01:13:51,835 --> 01:13:56,205

this is what we're doing, but it's just

too dry for me and so I'm not reading

Speaker:

01:13:56,205 --> 01:14:00,285

that when I'm trying to just read for fun,

you know.

Speaker:

01:14:00,445 --> 01:14:04,985

But yeah, awesome, well thanks a lot

Andrew.

Speaker:

01:14:04,985 --> 01:14:11,745

I will, that being said, I will link to

this chicken paper in the show notes for

Speaker:

01:14:11,745 --> 01:14:13,665

people who want to dig deeper.

Speaker:

01:14:14,065 --> 01:14:16,013

Thank you so much Andrew for...

Speaker:

01:14:16,013 --> 01:14:20,093

again, taking the time and being on this

show.

Speaker:

01:14:20,693 --> 01:14:28,053

Two patrons will have the chance of

receiving for free a hard copy of your

Speaker:

01:14:28,053 --> 01:14:29,613

book, thanks to your publisher.

Speaker:

01:14:29,613 --> 01:14:33,213

So thank you so much, Cambridge University

Press.

Speaker:

01:14:33,693 --> 01:14:39,833

And in the show notes, you will have the

links also to buy the book on the

Speaker:

01:14:39,833 --> 01:14:42,633

Cambridge University Press website.

Speaker:

01:14:42,873 --> 01:14:43,085

So...

Speaker:

01:14:43,085 --> 01:14:44,625

Go ahead and do that.

Speaker:

01:14:44,625 --> 01:14:50,605

You have a 20% discount active until July

15, 2024.

Speaker:

01:14:51,385 --> 01:14:57,625

The code is in the show notes of this

episode, so definitely go there.

Speaker:

01:14:57,905 --> 01:14:59,145

And buy Andrew's book.

Speaker:

01:14:59,145 --> 01:15:03,725

This one is really fun and you can read it

on the beach this summer, you know, and

Speaker:

01:15:03,725 --> 01:15:09,025

then you'll have a lot of cool stories to

tell your children or at the bar at night,

Speaker:

01:15:09,025 --> 01:15:10,945

so definitely do that.

Speaker:

01:15:11,405 --> 01:15:16,245

Thanks again, Andrew, and of course,

welcome back on the show anytime you

Speaker:

01:15:16,245 --> 01:15:20,645

finish your 15 upcoming books.

Speaker:

01:15:21,565 --> 01:15:25,845

Merci encore pour l'opportunité de parler

avec toi.

Speaker:

01:15:25,945 --> 01:15:30,865

Perfect, as you can hear, Andrew speaks

very good French.

Speaker:

01:15:34,989 --> 01:15:38,729

This has been another episode of Learning

Bayesian Statistics.

Speaker:

01:15:38,729 --> 01:15:43,689

Be sure to rate, review, and follow the

show on your favorite podcatcher, and

Speaker:

01:15:43,689 --> 01:15:48,609

visit learnbayesstats.com for more

resources about today's topics, as well as

Speaker:

01:15:48,609 --> 01:15:53,349

access to more episodes to help you reach

true Bayesian state of mind.

Speaker:

01:15:53,349 --> 01:15:55,259

That's learnbayesstats.com.

Speaker:

01:15:55,259 --> 01:16:00,119

Our theme music is Good Bayesian by Baba

Brinkman, feat. MC Lars and Mega Ran.

Speaker:

01:16:00,119 --> 01:16:03,279

Check out his awesome work at

bababrinkman.com.

Speaker:

01:16:03,279 --> 01:16:04,429

I'm your host,

Speaker:

01:16:04,429 --> 01:16:05,429

Alex Andorra.

Speaker:

01:16:05,429 --> 01:16:09,669

You can follow me on Twitter at

alex_andorra, like the country.

Speaker:

01:16:09,669 --> 01:16:14,749

You can support the show and unlock

exclusive benefits by visiting

Speaker:

01:16:14,749 --> 01:16:16,929

patreon.com/learnbayesstats.

Speaker:

01:16:16,929 --> 01:16:19,389

Thank you so much for listening and for

your support.

Speaker:

01:16:19,389 --> 01:16:25,269

You're truly a good Bayesian change your

predictions after taking information and

Speaker:

01:16:25,269 --> 01:16:28,569

if you're thinking I'll be less than

amazing.

Speaker:

01:16:28,569 --> 01:16:31,725

Let's adjust those expectations.

Speaker:

01:16:31,725 --> 01:16:37,145

Let me show you how to be a good Bayesian

Change calculations after taking fresh

Speaker:

01:16:37,145 --> 01:16:43,185

data in Those predictions that your brain

is making Let's get them on a solid

Speaker:

01:16:43,185 --> 01:16:44,965

foundation
