Learning Bayesian Statistics

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!

Visit our Patreon page to unlock exclusive Bayesian swag 😉

Takeaways:

  • Bob’s research focuses on corruption and political economy.
  • Measuring corruption is challenging due to the unobservable nature of the behavior.
  • The challenge of studying corruption lies in obtaining honest data.
  • Innovative survey techniques, like randomized response, can help gather sensitive data.
  • Non-traditional backgrounds can enhance statistical research perspectives.
  • Bayesian methods are particularly useful for estimating latent variables.
  • Bayesian methods shine in situations with prior information.
  • Expert surveys can help estimate uncertain outcomes effectively.
  • Bob’s novel, ‘The Bayesian Hitman,’ explores academia through a fictional lens.
  • Writing fiction can enhance academic writing skills and creativity.
  • The importance of community in statistics is emphasized, especially in the Stan community.
  • Real-time online surveys could revolutionize data collection in social science.

Chapters:

00:00 Introduction to Bayesian Statistics and Bob Kubinec

06:01 Bob’s Academic Journey and Research Focus

12:40 Measuring Corruption: Challenges and Methods

18:54 Transition from Government to Academia

26:41 The Influence of Non-Traditional Backgrounds in Statistics

34:51 Bayesian Methods in Political Science Research

42:08 Bayesian Methods in COVID Measurement

51:12 The Journey of Writing a Novel

01:00:24 The Intersection of Fiction and Academia

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan, Francesco Madrisotti, Ivy Huang, Gary Clarke, Robert Flannery, Rasmus Hindström and Stefan.

Links from the show:

Transcript

This is an automatic transcript and may therefore contain errors. Please get in touch if you’re willing to correct them.

Did you know there is a novel out there about...

2

00:00:08,397 --> 00:00:09,998

...Bayesian statistics?

3

00:00:09,998 --> 00:00:18,143

It even has a great title, The Bayesian Hitman, and an even greater author, Robert Kubinec.

4

00:00:18,143 --> 00:00:23,686

When I heard about that, I, of course, had to invite Bob on the show.

5

00:00:23,686 --> 00:00:31,438

An assistant professor at the University of South Carolina, Bob's research focuses on

wealth creation and democratization,

6

00:00:31,438 --> 00:00:34,238

causal inference, and Bayesian statistics.

7

00:00:34,238 --> 00:00:46,118

In this episode, Bob takes us through his fascinating journey from working in government

to pursuing a career in academia, exploring his current work on measuring corruption and

8

00:00:46,118 --> 00:00:50,078

how Bayesian methods help in estimating latent variables.

9

00:00:50,078 --> 00:00:57,728

This is Learning Bayesian Statistics, episode 119, recorded October 8, 2024.

10

00:01:01,694 --> 00:01:23,094

Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods, the

projects, and the people who make it possible.

11

00:01:23,094 --> 00:01:25,334

I'm your host, Alex Andorra.

12

00:01:25,334 --> 00:01:28,876

You can follow me on Twitter at alex_andorra,

13

00:01:28,876 --> 00:01:29,717

like the country.

14

00:01:29,717 --> 00:01:33,950

For any info about the show, learnbayesstats.com is Laplace to be.

15

00:01:33,950 --> 00:01:41,115

Show notes, becoming a corporate sponsor, unlocking Bayesian Merch, supporting the show on

Patreon, everything is in there.

16

00:01:41,115 --> 00:01:43,047

That's learnbayesstats.com.

17

00:01:43,047 --> 00:01:54,085

If you're interested in one-on-one mentorship, online courses, or statistical consulting,

feel free to reach out and book a call at topmate.io/alex_andorra.

18

00:01:54,085 --> 00:01:55,596

See you around, folks.

19

00:01:55,596 --> 00:01:57,486

and best Bayesian wishes to you all.

20

00:01:57,486 --> 00:02:04,552

And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can

help bring them to life.

21

00:02:04,552 --> 00:02:07,013

Check us out at pymc-labs.com.

22

00:02:10,529 --> 00:02:12,130

Hello my dear Bayesians!

23

00:02:12,130 --> 00:02:16,271

Today I want to welcome two new patrons in the Learn Bayes Stats family.

24

00:02:16,271 --> 00:02:22,633

Thank you so much to Rasmus Hindström and to the mysterious Stefan.

25

00:02:22,873 --> 00:02:25,934

Your support truly makes this show possible.

26

00:02:25,934 --> 00:02:34,697

I can't wait to talk with you in the Slack channel and I hope you will enjoy being there

and talking about everything Bayes.

27

00:02:34,697 --> 00:02:36,878

Okay, on to the show now.

28

00:02:40,920 --> 00:02:44,913

Bob Kubinec, welcome to Learning Bayesian Statistics.

29

00:02:45,314 --> 00:02:46,430

Thank you.

30

00:02:46,430 --> 00:02:47,696

Alex, it's so great to be on here.

31

00:02:47,696 --> 00:02:49,277

Thanks so much for inviting me.

32

00:02:49,778 --> 00:02:51,019

Yeah, that's awesome.

33

00:02:51,019 --> 00:03:02,580

And I also really love how that episode came around because I discovered your book, I

mean, your novel, which is your first novel.

34

00:03:02,580 --> 00:03:07,182

We'll talk about that during the show, which is called The Bayesian Hitman.

35

00:03:07,182 --> 00:03:09,832

Of course, everybody should go and check it out.

36

00:03:09,832 --> 00:03:11,762

It will be in the show notes.

37

00:03:13,082 --> 00:03:18,782

And so I discovered that book when I was at StanCon a few weeks ago.

38

00:03:19,102 --> 00:03:24,172

And I recorded a bunch of, like, two live episodes.

39

00:03:24,172 --> 00:03:27,492

And then of course, afterwards you go for a drink.

40

00:03:27,492 --> 00:03:35,082

And I was having a drink with, I think it was Francis D'Italia and Richard McElreath.

41

00:03:35,082 --> 00:03:37,002

And we started.

42

00:03:37,002 --> 00:03:42,437

Of course, we were talking about Bayes stuff, and I don't know why, at some point

Francis mentioned your book.

43

00:03:42,437 --> 00:03:46,130

It was like, Richard, I remember.

44

00:03:46,130 --> 00:03:50,875

And I was like, wait, there is a novel that's based around Bayesian statistics?

45

00:03:50,875 --> 00:03:53,577

I need to have that person on the show.

46

00:03:53,577 --> 00:03:58,902

So that's how it came to my knowledge.

47

00:03:58,902 --> 00:04:03,045

And then Francis, I think tagged you on Twitter and we talked a bit.

48

00:04:03,572 --> 00:04:15,420

on Twitter, and then here we are. So that's definitely one of the most random episodes that

I've ever done. Yeah, I mean, that's fantastic. And hopefully Richard McElreath does

49

00:04:15,420 --> 00:04:28,659

read my novel. I'd love to get his feedback on it. His textbook, like for many

people, was very influential for me in doing Bayesian statistics, and he's an excellent

50

00:04:28,659 --> 00:04:33,152

writer. I don't know if he's written any fiction. The fun thing I will say about writing a

novel:

51

00:04:33,646 --> 00:04:37,306

What's unusual about it isn't actually so much that I wrote a novel.

52

00:04:37,306 --> 00:04:39,796

There actually are a fair number of novels written by academics.

53

00:04:39,796 --> 00:04:42,086

It's more that I wrote it under my actual name.

54

00:04:42,086 --> 00:04:47,506

There's some untold number of academics that are publishing under pen names.

55

00:04:48,486 --> 00:04:51,106

Yeah, which is really fun.

56

00:04:51,766 --> 00:04:55,846

So next time you see a book that's anonymous, it could be by an academic.

57

00:04:55,846 --> 00:05:01,776

For me, I use my real name because the book actually is also about academia, even about

the work that I do.

58

00:05:01,776 --> 00:05:03,106

And I wanted

59

00:05:03,174 --> 00:05:05,155

I wanted people in the field to read it.

60

00:05:05,155 --> 00:05:08,515

That was part of the fun of writing it.

61

00:05:08,515 --> 00:05:13,157

That's why it's under my actual name, not some cool made up name.

62

00:05:13,157 --> 00:05:14,067

Okay.

63

00:05:14,067 --> 00:05:15,667

Well, that's already interesting.

64

00:05:15,667 --> 00:05:19,018

I wasn't aware of that at all.

65

00:05:19,018 --> 00:05:21,019

We'll definitely get back to that.

66

00:05:21,899 --> 00:05:24,020

But that's my job here to do the teasing.

67

00:05:24,020 --> 00:05:29,081

Stay tuned for more about why Bob wrote the book.

68

00:05:29,121 --> 00:05:32,462

But first, let's talk about your

69

00:05:32,568 --> 00:05:38,542

your origin story and all that because you have a very interesting one.

70

00:05:38,542 --> 00:05:44,386

But first, as usual, can you tell us what you're doing nowadays?

71

00:05:44,727 --> 00:05:45,127

Yeah.

72

00:05:45,127 --> 00:05:49,890

So I'm an assistant professor of political science at the University of South Carolina.

73

00:05:50,231 --> 00:05:51,671

I just started here.

74

00:05:51,671 --> 00:05:57,815

Before this, I was at New York University Abu Dhabi, in the United Arab Emirates,

for five years.

75

00:05:58,716 --> 00:06:01,294

I did my PhD at the University of Virginia.

76

00:06:01,294 --> 00:06:04,335

and I did a one-year postdoc at Princeton.

77

00:06:05,615 --> 00:06:12,237

My work here, so in political science, I'm what they call a political economist.

78

00:06:12,237 --> 00:06:20,740

I study a lot of, what I tell people is you take kind of the worst parts about political

science and economics, then you put them together, and you've got political economy.

79

00:06:20,740 --> 00:06:27,982

So it's all like analyzing things like corruption, money in politics, business influence

over politics.

80

00:06:28,334 --> 00:06:32,384

And maybe on the brighter side, I do some work on economic development.

81

00:06:32,384 --> 00:06:41,974

I have a big project right now studying entrepreneurship in developing countries and how

young people can kind of get more involved.

82

00:06:41,974 --> 00:06:44,954

But yeah, I do a lot of this sort of dark money work.

83

00:06:45,914 --> 00:06:52,214

The connection really to Bayesian statistics is that a lot of the stuff that I study is very,

very hard to measure.

84

00:06:52,214 --> 00:06:57,430

And a lot of my work in Bayesian statistics deals with measurement and using

85

00:06:57,430 --> 00:07:04,093

models, especially latent variable models, on very difficult measurement problems.

86

00:07:05,894 --> 00:07:07,035

How do you find an estimate?

87

00:07:07,035 --> 00:07:10,036

I'm just writing a reviewer response today about this.

88

00:07:10,036 --> 00:07:16,639

How do you find an estimate of something that's hard to study that you can't directly

observe that kind of best incorporates everything you know?

89

00:07:16,639 --> 00:07:20,240

And Bayesian frameworks are really, I think, best at that.

90

00:07:23,092 --> 00:07:31,086

You know, roughly half of my research is sort of, let's say, statistical in nature, making new

models, especially in measurement modeling.

91

00:07:31,806 --> 00:07:37,969

And then the other half is sort of empirical, like going out into the real world and

trying to discover new things about political economy.

92

00:07:37,969 --> 00:07:39,750

That is super fun.

93

00:07:39,750 --> 00:07:44,902

So you mean that people don't self-declare that they are corrupted or corrupting someone?

94

00:07:44,902 --> 00:07:47,613

Yeah, I it.

95

00:07:48,332 --> 00:07:50,283

Yeah, yeah, it is really strange.

96

00:07:50,283 --> 00:08:01,167

And corruption is this really fun thing to study, precisely because, you know, how do you

get people to admit to it, right?

97

00:08:01,347 --> 00:08:10,361

And there's this whole, you know, field of what they call sensitive survey questions,

which is, you know, if someone has done something wrong or illegal, how can you get them

98

00:08:10,361 --> 00:08:17,838

to admit that on a survey in a way that, you know, and in some sense, it's sort of, you

talk about things that are unobservable, you can't

99

00:08:17,838 --> 00:08:26,078

it's very hard to observe someone doing something completely illegal or unethical because

sort of by definition if it's illegal they're not going to do it in front of other people

100

00:08:26,078 --> 00:08:27,418

and they're not going to admit to it.

101

00:08:27,418 --> 00:08:38,838

So you have this sort of, you know, big central problem of, you know, how do you determine or how

do you study social behavior that you can't observe, right?

102

00:08:39,038 --> 00:08:42,258

And yeah it's a complex issue.

103

00:08:42,338 --> 00:08:45,574

Yeah for sure but that sounds...

104

00:08:46,590 --> 00:08:48,171

That sounds fascinating.

105

00:08:48,752 --> 00:08:53,074

Maybe you can talk a bit about that, and then we'll go back to your origin story.

106

00:08:53,074 --> 00:08:56,696

Since we're talking about that, how do you do that?

107

00:08:56,696 --> 00:09:03,480

How do you make people admit that they did something wrong without having them admit it?

108

00:09:03,480 --> 00:09:06,101

Do you use hypnosis?

109

00:09:06,842 --> 00:09:08,582

What are you using, Bob?

110

00:09:10,424 --> 00:09:12,195

There's two approaches.

111

00:09:12,195 --> 00:09:14,830

One is to rely on...

112

00:09:14,830 --> 00:09:16,631

sort of government data.

113

00:09:16,912 --> 00:09:21,315

And this they do a lot in developed democracies.

114

00:09:21,596 --> 00:09:23,293

So primarily the EU and the United States.

115

00:09:23,293 --> 00:09:30,073

And that's because, you know, in these countries, they have enough regulatory capability

that they can force people to report.

116

00:09:30,073 --> 00:09:42,193

So like in the US and the EU, there are like lobbying registries, right, where every time

a business deals with a politician, there's a record of it.

117

00:09:42,734 --> 00:09:47,394

Also, even when there's not a record of things, there's so much data available.

118

00:09:47,394 --> 00:09:50,394

For example, there's a lot of work on contracting.

119

00:09:50,894 --> 00:09:59,014

people will get access to the records of all the government contracts that have been

issued in a certain country and then search through them and try to find companies that

120

00:09:59,014 --> 00:10:02,314

have political connections to certain politicians.

121

00:10:02,714 --> 00:10:08,094

And there, again, it's tricky because you can't necessarily prove that there was a corrupt

transaction or something like this.

122

00:10:08,094 --> 00:10:12,766

But let's say if you see that companies that have many more

123

00:10:12,766 --> 00:10:22,510

they have someone on the board who was a former member of parliament, if they get contracts

at a higher rate relative to other companies that don't, you know, that sort of

124

00:10:22,510 --> 00:10:31,804

suggests things. That's kind of one way of doing it. And the other is through, you know,

trying, and this is more what I do, to directly collect information about

125

00:10:31,804 --> 00:10:40,997

corruption, these sorts of issues. And that's a lot about trying to assure people some

kind of anonymity, confidentiality.

126

00:10:42,510 --> 00:10:55,114

And I do a lot of online survey research which is better at confidentiality because if you

can fill out something on your phone or in your computer by yourself, I've done a lot of

127

00:10:55,114 --> 00:10:58,414

getting employees to talk about what their companies do.

128

00:10:59,055 --> 00:11:03,176

So they don't necessarily have to report that they themselves did a corrupt transaction.

129

00:11:03,176 --> 00:11:10,530

But do you know if your business is working with some political parties or has offered

130

00:11:10,530 --> 00:11:17,273

Do you know if your boss or CEO seems to be involved in some kind of transactions with

political parties?

131

00:11:17,853 --> 00:11:20,614

Which happens all the time in developing countries.

132

00:11:20,614 --> 00:11:23,676

Anyway, yeah.

133

00:11:23,676 --> 00:11:26,657

So that's how I've been really getting at it.

134

00:11:26,657 --> 00:11:28,237

Lately, I've been experimenting.

135

00:11:28,237 --> 00:11:32,019

I kind of mentioned sensitive survey questions, and these are really fun.

136

00:11:32,019 --> 00:11:35,921

They're basically ways of trying to encrypt a survey.

137

00:11:35,921 --> 00:11:39,732

So encryption uses keys, right?

138

00:11:40,170 --> 00:11:49,748

And so there's a method called randomized response, and I actually have a blog post about

this. I have a recent study that I did in Tunisia using this, where essentially you use a

139

00:11:49,748 --> 00:11:56,663

key. The key is the respondent's birthday. So you ask the respondent:

140

00:11:57,344 --> 00:12:00,366

Were you born in these three months of the year?

141

00:12:01,528 --> 00:12:07,852

And then you ask them a question that combines that answer, which is random, right,

142

00:12:08,293 --> 00:12:09,974

with the actual question. So you say,

143

00:12:10,420 --> 00:12:17,754

you know, did you, let's say, give a bribe or

something, or do this sensitive thing?

144

00:12:19,315 --> 00:12:24,517

One answer is, like, you did that or you were born in these three months of the year.

145

00:12:24,758 --> 00:12:28,880

And the other answer is, you know, yes, I did it or something like that.

146

00:12:28,880 --> 00:12:38,645

What you do is you combine this natural randomness from this question that's

irrelevant, but random with the actual thing you're trying to measure.

147

00:12:38,666 --> 00:12:39,936

And just like

148

00:12:40,909 --> 00:12:46,470

encryption that you use on a computer or whatever, it's actually the same process.

149

00:12:46,470 --> 00:12:50,290

Because I don't know the respondent's birthday, I don't know their true answer.

150

00:12:50,290 --> 00:13:00,070

But when I take all that data, because I know the proportion of people in the population

who are born in those three months of the year, which is roughly uniform, you can then

151

00:13:00,070 --> 00:13:04,050

back out what the population estimate is, the latent estimate.
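
[Editor's sketch] The "backing out" step described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the design or code used in the Tunisia study: it assumes a respondent answers "yes" if they did the sensitive thing OR were born in the designated three months, that birth month is independent of the sensitive behavior, and that the birthday window covers a known share of the population (0.25 here). All function names and numbers are hypothetical. Under those assumptions P(yes) = rate + (1 - rate) * p_birth, so rate = (P(yes) - p_birth) / (1 - p_birth).

```python
import random


def simulate_responses(n, true_rate, p_birth, rng):
    """Simulate n masked answers: 'yes' if the person did the sensitive
    thing OR was born in the designated months (assumed independent)."""
    answers = []
    for _ in range(n):
        did_it = rng.random() < true_rate
        born_in_window = rng.random() < p_birth
        answers.append(did_it or born_in_window)
    return answers


def estimate_prevalence(answers, p_birth):
    """Back out the population rate from the share of 'yes' answers.

    Inverts P(yes) = rate + (1 - rate) * p_birth, then clips to [0, 1]
    because sampling noise can push the raw estimate outside that range.
    """
    yes_share = sum(answers) / len(answers)
    raw = (yes_share - p_birth) / (1.0 - p_birth)
    return min(1.0, max(0.0, raw))


# Hypothetical numbers: 10% true prevalence, birthday window of ~1/4 of the year.
rng = random.Random(42)
answers = simulate_responses(20000, 0.10, 0.25, rng)
estimate = estimate_prevalence(answers, 0.25)
print(round(estimate, 3))
```

No single "yes" reveals whether that respondent did the sensitive thing, yet the aggregate estimate lands close to the simulated prevalence. The price of the masking is extra variance, which is one reason to combine this kind of design with a model that can use prior information.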

152

00:13:04,410 --> 00:13:06,872

So this is some really clever, very

153

00:13:06,872 --> 00:13:08,523

counterintuitive methods.

154

00:13:08,523 --> 00:13:13,205

And so I'm experimenting with some of these now and yeah, doing things like that.

155

00:13:13,205 --> 00:13:13,965

But.

156

00:13:14,745 --> 00:13:18,087

This is super fun.

157

00:13:18,087 --> 00:13:22,749

Sounds a bit like also, you know, detective work.

158

00:13:23,289 --> 00:13:24,170

Yeah.

159

00:13:24,170 --> 00:13:25,510

That's really fun.

160

00:13:25,810 --> 00:13:26,180

Yeah.

161

00:13:26,180 --> 00:13:28,291

Anti-corruption research is fun.

162

00:13:28,491 --> 00:13:32,443

It's extremely difficult and difficult to get data.

163

00:13:32,443 --> 00:13:34,936

And often you're kind of guessing with like...

164

00:13:34,936 --> 00:13:38,367

what you can gather, but I do enjoy it a lot.

165

00:13:39,068 --> 00:13:39,828

Yeah.

166

00:13:39,828 --> 00:13:50,192

That makes me think a bit of a project I worked on with some researchers from Dalhousie

University.

167

00:13:50,732 --> 00:14:04,738

That sounds very different, but it made me think of it because they were trying to infer

the trade of shark meat across, you know, between countries.

168

00:14:04,958 --> 00:14:12,141

But they were trying to do it at the species level, and countries don't have to report the

species

169

00:14:12,141 --> 00:14:20,365

they trade. They have to report the species they fish, but not the species they

trade.

170

00:14:20,365 --> 00:14:24,226

And the thing is, there are some species they cannot trade.

171

00:14:24,446 --> 00:14:33,262

And so, of course, they can report the species that they trade, the

species

172

00:14:33,262 --> 00:14:34,873

for trade, but they don't always do it.

173

00:14:34,873 --> 00:14:40,067

So you're like, okay, if they don't do it, does that mean they are trading some species that they

are not supposed to?

174

00:14:40,848 --> 00:14:56,009

So actually the whole work was to try and infer from both the trade data and the landings

data, so the fishing data, which species are actually traded by which country to which

175

00:14:56,009 --> 00:14:57,099

country.

176

00:14:57,700 --> 00:15:00,982

And so that's a bit like this where you don't have.

177

00:15:01,871 --> 00:15:05,433

The data is based on self-reports.

178

00:15:05,433 --> 00:15:14,879

The reports are not very constraining, so you have to do all that detective work and

that's where all the Bayesian methods are very powerful.

179

00:15:15,721 --> 00:15:16,661

Yeah, totally.

180

00:15:16,661 --> 00:15:20,394

That's where I've gotten a lot of leverage from them in my own work.

181

00:15:20,394 --> 00:15:25,007

I think too, because most Bayesian models have a frequentist analog.

182

00:15:25,007 --> 00:15:29,410

For a lot of people, that's like, well, I don't see the difference.

183

00:15:29,762 --> 00:15:32,454

When you're running a regression model, often there really isn't, right?

184

00:15:32,454 --> 00:15:37,026

If it's like a maximum likelihood, simple linear regression model.

185

00:15:37,447 --> 00:15:45,621

But where the Bayesian methods really shine is when you're studying some latent quantity

and especially when you have prior information about that.

186

00:15:45,621 --> 00:15:57,298

Because when you have some, let's say, very subtle prior information, like let's say,

you know, experts think that the wildlife trade isn't higher than, like, this threshold.

187

00:15:57,304 --> 00:15:58,234

But there's uncertainty.

188

00:15:58,234 --> 00:16:00,895

You're not sure exactly what the threshold is.

189

00:16:03,557 --> 00:16:11,150

If you do your work with Stan and everything, you can include that information basically

almost directly into the model in a way that will give you much better estimates than

190

00:16:11,150 --> 00:16:16,482

starting from some population sampling perspective.

191

00:16:16,742 --> 00:16:26,226

And that's, I'd say, my favorite projects, the ones where Bayesian approaches have been so

helpful are those where you have this subtle prior information.

192

00:16:26,382 --> 00:16:28,482

and that's where Bayes can really shine.
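
[Editor's sketch] To make the "subtle prior information" idea concrete, here is a minimal grid-approximation sketch in plain Python. It is not Stan and not a model from any of the projects discussed; it only illustrates how an expert belief of the form "the rate is probably not above some threshold, but we are unsure exactly where the threshold is" can enter a model directly. The likelihood reuses the birthday-masked survey setup described earlier (a "yes" means the respondent did the sensitive thing or was born in the designated months). The soft logistic cutoff, the function names, and all numbers are assumptions chosen for illustration.

```python
import math


def grid_posterior(yes, n, p_birth, threshold, softness, grid_size=1000):
    """Grid-approximate posterior over the sensitive-behavior rate.

    Likelihood: yes ~ Binomial(n, rate + (1 - rate) * p_birth).
    Prior (assumed): roughly flat below `threshold`, decaying smoothly
    above it; `softness` encodes uncertainty about where the cutoff is.
    Returns (grid, weights) with weights summing to 1.
    """
    # Cell midpoints, so rate is never exactly 0 or 1 (avoids log(0)).
    grid = [(i + 0.5) / grid_size for i in range(grid_size)]
    log_post = []
    for rate in grid:
        p_yes = rate + (1.0 - rate) * p_birth
        log_lik = yes * math.log(p_yes) + (n - yes) * math.log(1.0 - p_yes)
        # Log of a logistic "step down" centred on the expert threshold.
        log_prior = -math.log1p(math.exp((rate - threshold) / softness))
        log_post.append(log_lik + log_prior)
    top = max(log_post)  # subtract the max before exponentiating for stability
    unnorm = [math.exp(lp - top) for lp in log_post]
    total = sum(unnorm)
    return grid, [u / total for u in unnorm]


def posterior_mean(grid, weights):
    return sum(g * w for g, w in zip(grid, weights))


# Hypothetical small survey: 20 "yes" answers out of 50, window covering ~1/4 of births.
grid, w_expert = grid_posterior(20, 50, 0.25, threshold=0.15, softness=0.02)
# A threshold far above 1 makes the prior effectively flat, for comparison.
_, w_flat = grid_posterior(20, 50, 0.25, threshold=10.0, softness=0.02)
mean_expert = posterior_mean(grid, w_expert)
mean_flat = posterior_mean(grid, w_flat)
print(round(mean_expert, 3), round(mean_flat, 3))
```

With only 50 noisy responses, the expert prior pulls the estimate below what the flat-prior analysis gives. Whether that is an improvement depends entirely on how trustworthy the expert threshold is, which is why the softness of the cutoff is made an explicit input here.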

193

00:16:28,482 --> 00:16:30,332

The flip side, of course, is it's not easy.

194

00:16:30,332 --> 00:16:41,442

I'm sure it's not easy to do that type of modeling, and especially when you start deriving

custom models, custom distributions, it's intense.

195

00:16:41,442 --> 00:16:50,662

It can take a long time, but the answer can be just so much better than alternatives

because it's just so much more nuanced.

196

00:16:51,962 --> 00:16:55,822

I could preach on this topic for a long time.

197

00:16:58,839 --> 00:17:01,380

No, I mean, that's great.

198

00:17:01,380 --> 00:17:03,751

Although you would be preaching to the choir here.

199

00:17:03,751 --> 00:17:05,481

Yeah, which is great.

200

00:17:05,481 --> 00:17:06,682

That's wonderful.

201

00:17:07,002 --> 00:17:08,943

Yeah, for sure.

202

00:17:08,943 --> 00:17:12,584

But yeah, that just sounds super fun.

203

00:17:12,584 --> 00:17:24,439

And I'm happy that I was able also to bring up that project because that fish project,

that shark trade project, we'll have one of the main authors on the show very soon, in

204

00:17:24,439 --> 00:17:26,286

November, Aaron MacNeil,

205

00:17:26,286 --> 00:17:33,906

from Dalhousie University with whom I've worked on this project and the whole team.

206

00:17:34,086 --> 00:17:37,766

So stay tuned, guys, for that episode.

207

00:17:37,766 --> 00:17:40,376

That's going to be a very fun one.

208

00:17:40,376 --> 00:17:47,586

Aaron is a very good communicator and also a very interesting person to talk to.

209

00:17:47,586 --> 00:17:50,366

So it should be a very cool episode.

210

00:17:50,406 --> 00:17:55,724

But let's get back to you, because I said that you had a...

211

00:17:55,724 --> 00:18:01,035

a very interesting origin story, very original one.

212

00:18:01,035 --> 00:18:12,589

And that's because when I, so to prepare the show, I of course stalk all my future guests,

right?

213

00:18:12,589 --> 00:18:13,959

I hope you understand.

214

00:18:15,419 --> 00:18:25,676

And while stalking you, I saw that you've transitioned from working at IBM and the US

Department of State to

215

00:18:25,676 --> 00:18:30,409

Well, now academia, which is what you just said.

216

00:18:31,290 --> 00:18:32,010

So I love that.

217

00:18:32,010 --> 00:18:34,051

How did that happen?

218

00:18:34,912 --> 00:18:38,254

Yeah, you know, life takes a lot of twists and turns.

219

00:18:39,135 --> 00:18:47,311

And yeah, so I'd say essentially for the first part of my career, I very, very much wanted

to work in government, especially in foreign affairs.

220

00:18:47,311 --> 00:18:53,985

But, you know, life changed.

221

00:18:54,732 --> 00:18:57,023

Course changes are always a bit complicated.

222

00:18:57,183 --> 00:18:59,574

I did one tour with the State Department in Saudi Arabia.

223

00:18:59,574 --> 00:19:05,386

And a lot of my research is in the Middle East and North Africa, so that hasn't really

changed a whole lot.

224

00:19:06,447 --> 00:19:12,290

But there are things that I loved about the State Department, and some of my colleagues,

just really amazing people.

225

00:19:12,290 --> 00:19:23,154

Personally, actually, when I was finishing my master's degree at George Washington, I was at

a policy school; it wasn't a program that was really emphasizing preparation for a PhD.

226

00:19:23,416 --> 00:19:28,828

But I did some research there as part of my thesis sort of thing.

227

00:19:28,828 --> 00:19:34,809

And I realized I had this sort of epiphany that I actually really liked doing research.

228

00:19:35,189 --> 00:19:37,470

It was this eureka moment.

229

00:19:37,770 --> 00:19:42,351

And what I also really came to value was independence.

230

00:19:42,351 --> 00:19:51,874

And I think you can probably see some of that with my academic trajectory, that I like to

work on the things that I work on, and I like to kind of take different approaches.

231

00:19:52,044 --> 00:19:54,655

And the State Department is a giant bureaucracy.

232

00:19:55,775 --> 00:20:00,637

You know, and it kind of has to be; its job is to implement the legislation and all that

stuff.

233

00:20:00,637 --> 00:20:09,999

And I just sort of personally realized that I would do better in a sort of less structured

environment, which, you know, there are still rules in universities and stuff, but

234

00:20:09,999 --> 00:20:13,460

relatively speaking, it's much less structured.

235

00:20:14,261 --> 00:20:19,692

And, you know, you work in smaller teams and your work is really more your own.

236

00:20:20,206 --> 00:20:21,827

And so I really valued that.

237

00:20:21,827 --> 00:20:22,718

So that was part of it.

238

00:20:22,718 --> 00:20:24,488

And the other part was personal.

239

00:20:24,609 --> 00:20:30,132

My fiancée at the time was in the US and she was studying in a graduate program.

240

00:20:31,033 --> 00:20:33,500

At the State Department, you have to have worldwide availability.

241

00:20:33,500 --> 00:20:35,515

You have to go wherever they tell you to go.

242

00:20:35,555 --> 00:20:38,116

They were going to send me to Mexico next.

243

00:20:38,697 --> 00:20:42,899

That was the closest that they could send me back to the States.

244

00:20:43,020 --> 00:20:43,670

And I talked to them.

245

00:20:43,670 --> 00:20:47,598

I was like, my fiancée is in Richmond, Virginia, in the US,

246

00:20:47,790 --> 00:20:49,681

and they said, yeah, we're sending you to Mexico.

247

00:20:49,681 --> 00:20:51,772

Because in the State Department world, that's close.

248

00:20:51,772 --> 00:20:56,534

You're in the same hemisphere, but it's not that close.

249

00:20:56,534 --> 00:21:02,457

And it was a combination of that and then getting accepted to the PhD program at

University of Virginia.

250

00:21:02,457 --> 00:21:04,577

So it was kind of all these things combined.

251

00:21:05,619 --> 00:21:12,992

And honestly, there are definitely challenges to working in academia, which I'm sure you're

familiar with, but I really

252

00:21:13,094 --> 00:21:15,015

think I made the right choice, at least for me.

253

00:21:15,015 --> 00:21:18,265

I really do enjoy the freedom and flexibility of academia.

254

00:21:18,265 --> 00:21:20,078

That's probably why I've stuck with it.

255

00:21:20,938 --> 00:21:26,342

One thing I really care a lot about is open science and transparency.

256

00:21:26,922 --> 00:21:36,328

Unfortunately, those aren't always hallmarks of academia, but I do think that at least in

principle, we have the ability to be much more honest and transparent than people in

257

00:21:36,328 --> 00:21:37,288

government.

258

00:21:37,409 --> 00:21:40,350

And so that's one thing I've always really enjoyed about.

259

00:21:40,874 --> 00:21:50,923

academic research is the ability to just be upfront about what you think, release your

conclusions without a lot of political pressure to change them.

260

00:21:52,305 --> 00:21:53,285

Yeah.

261

00:21:54,207 --> 00:21:56,138

This is really interesting.

262

00:21:56,138 --> 00:22:02,213

And I can definitely relate to that background, because it's also what happened to

me.

263

00:22:03,275 --> 00:22:06,786

I was working at the French Central Bank, so it's not

264

00:22:06,786 --> 00:22:13,911

the Department of State, but I actually worked a bit before the Central Bank for the

French Foreign Ministry.

265

00:22:14,452 --> 00:22:16,063

Definitely had the same experience.

266

00:22:16,063 --> 00:22:18,635

Interesting.

267

00:22:18,635 --> 00:22:18,995

Okay.

268

00:22:18,995 --> 00:22:19,626

Yeah.

269

00:22:19,626 --> 00:22:20,457

Cool.

270

00:22:20,457 --> 00:22:33,271

And then, yeah, for sure, getting much more autonomy and freedom in not only what I do,

but how I do it, is

271

00:22:33,506 --> 00:22:44,392

Definitely something I tremendously appreciate since then and for sure I've never looked

back since that time.

272

00:22:45,813 --> 00:22:58,520

But I'm curious though, did your background in the State Department influence your

research methodologies in political science?

273

00:22:59,541 --> 00:23:01,898

How much of a back and forth

274

00:23:01,898 --> 00:23:08,240

is there between the work you once did and the work you're doing now?

275

00:23:08,981 --> 00:23:09,641

Yeah.

276

00:23:09,641 --> 00:23:14,673

I think that where you start out has a lot of influence.

277

00:23:14,673 --> 00:23:16,024

That's where you train.

278

00:23:16,024 --> 00:23:18,665

Those are the questions you're exposed to.

279

00:23:19,798 --> 00:23:25,568

And absolutely, working at the State Department influenced a lot of the stuff I worked on

later.

280

00:23:25,568 --> 00:23:30,470

Part of it simply might be that I tend to work on sort of what they call policy-relevant

topics.

281

00:23:30,604 --> 00:23:34,195

Most of my research is on very contemporary Middle East issues.

282

00:23:35,296 --> 00:23:44,599

I have colleagues who do a lot of amazing historical research, which can be really fascinating

stuff, about the long-term influence of history on the contemporary world.

283

00:23:44,599 --> 00:23:51,641

And I don't, I tend to focus more on what's kind of currently developing or happening in

different countries.

284

00:23:51,641 --> 00:23:55,640

I certainly do spend time writing

285

00:23:55,640 --> 00:24:09,323

for a policy audience. So I've written for the Washington Post, the Carnegie Endowment, and had something in the

Brookings Institution. So, you know, I could certainly do more of that, but I do. And I think,

286

00:24:09,323 --> 00:24:20,293

too, that probably influences my writing style, which tends to kind of focus on

simplifying things and making them clear. And that was certainly, you know, when I

287

00:24:20,293 --> 00:24:21,666

was a diplomat, that was

288

00:24:21,666 --> 00:24:33,030

very much stressed, because you're writing for a policymaker audience and you cannot

use lots of jargon. You cannot give them ideas they can't digest in 30 seconds, because

289

00:24:33,030 --> 00:24:42,180

that's all the time they have. And so there's, you know, a real focus on being

short and to the point. I think I benefited from that. I think it also, yeah, definitely

290

00:24:42,180 --> 00:24:46,302

influences the way that I do things. And I think, too, that

291

00:24:47,714 --> 00:24:53,959

I came into my PhD program and definitely into sort of the statistical quantitative world

with a very nontraditional background.

292

00:24:53,959 --> 00:24:57,931

So the State Department is like the least quantitative place in the world.

293

00:24:57,931 --> 00:25:08,368

Like it's all diplomats who, you know, just sort of know a little bit about everything

and, you know, write, you know, pieces that just reflect their experience.

294

00:25:09,709 --> 00:25:17,184

And I think honestly, my feeling is that coming into my program with that background was

actually super helpful.

295

00:25:17,578 --> 00:25:21,622

because I wasn't sort of locked into existing paradigms.

296

00:25:21,622 --> 00:25:25,885

So I really had studied some statistics prior to grad school, but not very much.

297

00:25:25,965 --> 00:25:31,590

And so when I kind of came around to be introduced to Bayesian methods, I was just like,

this is great.

298

00:25:32,251 --> 00:25:39,717

And I think too that it's led me to question things maybe in a way that I wouldn't have

otherwise.

299

00:25:39,717 --> 00:25:45,902

And a lot of my papers and projects in the statistics world have often come out of me kind

of questioning things and

300

00:25:45,902 --> 00:25:51,182

thinking that things are unclear and wanting to know why that's the case.

301

00:25:51,342 --> 00:25:59,462

I think, too, you know, there are people listening to this podcast, let's say, who are just

starting out in the world of statistics and don't have that background. You know, they didn't

302

00:25:59,462 --> 00:26:09,362

grow up doing, let's say, the math Olympiad, playing chess all the time or whatever the

stereotype is, you know, that's really fine.

303

00:26:09,662 --> 00:26:12,930

And there's a lot that you can contribute.

304

00:26:12,930 --> 00:26:17,533

Like I really think almost anyone can do statistics or data science.

305

00:26:17,533 --> 00:26:19,594

They're going to do it differently.

306

00:26:19,695 --> 00:26:20,948

They're going to stress different things.

307

00:26:20,948 --> 00:26:22,048

They're going to learn differently.

308

00:26:22,048 --> 00:26:23,917

They're going to communicate differently.

309

00:26:24,839 --> 00:26:26,440

But you can make a contribution.

310

00:26:26,440 --> 00:26:28,861

And part of it is just that, as people, we think differently.

311

00:26:28,861 --> 00:26:35,246

And if you think somewhat differently than the average statistician, that actually can be

a really good thing, especially when you're doing research.

312

00:26:35,246 --> 00:26:38,588

Because research is all about finding solutions, right?

313

00:26:38,588 --> 00:26:41,366

And if you want to find a solution, it has to be something that

314

00:26:41,366 --> 00:26:42,927

no one else has thought of yet.

315

00:26:42,927 --> 00:26:52,251

You know, I think that for me, it's actually been really fun being a statistician without

that background.

316

00:26:52,251 --> 00:26:53,991

It's also been at times intimidating.

317

00:26:53,991 --> 00:26:55,892

So you mentioned StanCon.

318

00:26:56,072 --> 00:27:08,137

I went to the first StanCon in 2018, and that was also really the first time that I had

presented at, like, these political methodology conferences, where you have a lot of

319

00:27:08,137 --> 00:27:10,388

quantitative social scientists, but

320

00:27:10,734 --> 00:27:23,654

It was definitely my first presentation that had a math stats focus, and I was just

absolutely petrified that someone like Andy Gelman was going to ask me this horrifically

321

00:27:23,654 --> 00:27:31,594

hard question about deriving the analytical posterior distribution or something, and I was

just going to kind of collapse on stage.

322

00:27:31,594 --> 00:27:35,174

And that didn't happen, and I had a lot of fun.

323

00:27:36,172 --> 00:27:42,284

And I even had time to talk to some of the Stan devs and ask them these deep questions

about Hamiltonian Monte Carlo.

324

00:27:42,284 --> 00:27:44,494

And I didn't really understand the answers, but it was still fun.

325

00:27:44,494 --> 00:27:54,817

Yeah, I think that that reputation of intimidation and stuff is not good for the field.

326

00:27:54,817 --> 00:27:57,828

But when people get past that, it actually can be a lot of fun.

327

00:28:00,449 --> 00:28:04,096

I definitely, yeah, I agree with

328

00:28:04,096 --> 00:28:11,351

everything you just said and definitely recommend people to check out events like

StanCon.

329

00:28:11,652 --> 00:28:17,036

They are really absolutely fantastic.

330

00:28:18,378 --> 00:28:20,900

As I said, I recorded two live episodes there.

331

00:28:20,900 --> 00:28:25,083

They are not out yet at the time when your episode is going to be out.

332

00:28:25,524 --> 00:28:31,548

They require a bit more editing, but they will drop in your feed, folks, in a few weeks.

333

00:28:32,826 --> 00:28:33,442

Cool.

334

00:28:33,442 --> 00:28:48,248

But yeah, that's just a fantastic experience because a lot of the Stan developers are

there and you can ask them all the questions that you want that you think are stupid, but

335

00:28:48,248 --> 00:28:50,549

actually are very interesting.

336

00:28:51,290 --> 00:28:56,992

And yeah, I get that it can be intimidating, but that's the great thing about the Bayesian

community.

337

00:28:56,992 --> 00:29:01,994

From the beginning, I found that it's a very welcoming

338

00:29:02,206 --> 00:29:03,447

community.

339

00:29:03,707 --> 00:29:14,033

As you were saying, you started going into that world without a math degree, you know, or an

engineering degree.

340

00:29:14,153 --> 00:29:15,394

It's the same for me.

341

00:29:15,394 --> 00:29:17,675

I studied management and political science.

342

00:29:17,675 --> 00:29:28,031

So for a long time, I was, you know, kind of fearing that part of my background, with

a lot of imposter syndrome.

343

00:29:28,031 --> 00:29:29,302

But as you're saying,

344

00:29:29,492 --> 00:29:43,890

actually, that can make you an interesting statistician, precisely because you

haven't gone through the classic way to statistics.

345

00:29:43,890 --> 00:29:58,918

So for sure, you're not going to weigh in on the math computation, matrix routines, if you

don't want to, because that's clearly another wheelhouse.

346

00:29:59,010 --> 00:30:13,007

But you'll have a lot to say on a lot of other topics, especially applied statistics, which in

the end is extremely important, because all this software is here to be applied to use

347

00:30:13,007 --> 00:30:13,957

cases.

348

00:30:15,418 --> 00:30:15,838

Yeah.

349

00:30:15,838 --> 00:30:23,402

And I think for listeners, and I'm sure you talk about this more, but if you are

interested in Bayesian statistics in general,

350

00:30:23,656 --> 00:30:24,946

and getting into it.

351

00:30:24,946 --> 00:30:27,447

the Stan community is the place to be.

352

00:30:27,447 --> 00:30:32,488

There's a Discourse site where you can post questions about using Stan.

353

00:30:32,769 --> 00:30:41,991

But if you just simply dig up a lot of the documentation that they've made, both for Stan,

but also they have these case studies online, they're excellent.

354

00:30:41,991 --> 00:30:45,692

And a lot of this has to do really with Andy Gelman.

355

00:30:46,293 --> 00:30:51,764

And Andy, just, you know, his own writing, if you read his articles, is very clear.

356

00:30:51,966 --> 00:31:02,829

And that always set him apart in the statistics world for this love of clarity, this love

of simplicity over formal notation and dense and impenetrable text.

357

00:31:03,269 --> 00:31:07,040

And that has really defined the Stan community as well.

358

00:31:07,040 --> 00:31:11,671

I don't know if it's as much so now, Stan is now so big.

359

00:31:11,671 --> 00:31:14,842

I kind of started out relatively early in it.

360

00:31:14,842 --> 00:31:21,518

And part of it was, so John Kropko was sort of my stats advisor at UVA and he had

361

00:31:21,518 --> 00:31:29,378

done a two-year postdoc with Andy Gelman's team as they were developing the first

edition of Stan.

362

00:31:29,818 --> 00:31:39,798

So I was sort of exposed to it relatively early on, and, you know, back then the

community was small enough that it was a Google group, and, like, you know, people really

363

00:31:39,798 --> 00:31:48,998

knew each other if they posted on there. And now it's a lot bigger, but I think that ethos

is still there, and I, you know, really encourage people

364

00:31:49,154 --> 00:31:51,456

to post, to ask questions.

365

00:31:52,697 --> 00:32:02,386

Obviously, people can always be rude to newcomers and things, but it really is a great, if

you're going to start somewhere, it's a great place to start because the ethos is, how can

366

00:32:02,386 --> 00:32:04,367

we include more people?

367

00:32:04,974 --> 00:32:12,234

I was just looking at Stan, they were talking about how they, I think they have, like,

368

00:32:12,936 --> 00:32:20,291

They're talking about how they're measuring who goes to StanCon and who has sort of a

non-traditional background and stuff like this.

369

00:32:20,291 --> 00:32:22,082

And that's because they care about that.

370

00:32:22,082 --> 00:32:24,033

And a lot of people don't.

371

00:32:25,035 --> 00:32:26,526

So that's a beautiful thing.

372

00:32:26,526 --> 00:32:35,942

And I think Stan has been responsible really for raising the level of statistical literacy

really across the entire applied statistics community, because

373

00:32:37,194 --> 00:32:42,519

They're smart, they do amazing work, but they also explain things, right?

374

00:32:42,519 --> 00:32:51,186

And there's certain models that I know how they work because of the Stan documentation.

375

00:32:51,186 --> 00:32:55,730

Because it's just a lot clearer and to the point.

376

00:32:55,730 --> 00:32:56,930

you know, that's...

377

00:32:57,071 --> 00:33:00,274

And I think, I honestly think there's a lot of...

378

00:33:00,274 --> 00:33:05,818

There's actually a fair amount of bad vibes against Stan in the Bayesian statistics world

among people who...

379

00:33:06,359 --> 00:33:12,066

were kind of around before it, or are attached to different, older-style methods.

380

00:33:12,066 --> 00:33:17,554

But I think part of it, too, is they don't like the vibe, like the, you know, anyone can do

this.

381

00:33:17,554 --> 00:33:21,439

We can help anyone understand. That's not popular in all circles.

382

00:33:21,439 --> 00:33:22,800

I'll just say that.

383

00:33:23,302 --> 00:33:24,052

Yeah.

384

00:33:25,874 --> 00:33:28,395

Damn, yeah, that's the first time I hear that.

385

00:33:28,755 --> 00:33:30,555

But that's good to know.

386

00:33:30,555 --> 00:33:34,976

Yeah, and for sure, completely second everything you just said.

387

00:33:35,097 --> 00:33:40,448

If you're coming from the Python world, I also definitely encourage you to look at the

PyMC community.

388

00:33:40,448 --> 00:33:42,198

That's where I started.

389

00:33:42,659 --> 00:33:48,860

So the PyMC Discourse is a great place to get your questions answered.

390

00:33:48,860 --> 00:33:54,702

Also answer some questions yourself because that's really how you're going to learn.

391

00:33:55,276 --> 00:34:03,207

So definitely make use of all that, of all that open source community, and make some PRs.

392

00:34:03,207 --> 00:34:06,349

PRs are always welcome, I can tell you that.

393

00:34:07,990 --> 00:34:22,094

And I'm actually curious also, before we switch to your novel, to talking about your

novel, how, how...

394

00:34:23,565 --> 00:34:28,845

Like first, do you remember when you were first introduced to Bayesian stats?

395

00:34:28,845 --> 00:34:38,625

And also if these methods have shaped your research approach in political science?

396

00:34:38,865 --> 00:34:39,145

Yeah.

397

00:34:39,145 --> 00:34:39,505

Yeah.

398

00:34:39,505 --> 00:34:39,694

Good.

399

00:34:39,694 --> 00:34:39,995

Yeah.

400

00:34:39,995 --> 00:34:45,585

I should talk about like some of my actual work in this field as part of this podcast.

401

00:34:45,825 --> 00:34:46,245

Yeah.

402

00:34:46,245 --> 00:34:51,263

So it's really funny, but my first project, so I was working on this paper.

403

00:34:51,403 --> 00:34:58,044

that I had to do in grad school that was like a capstone paper for one of my minors.

404

00:34:58,044 --> 00:35:02,245

It was an applied statistics field concentration.

405

00:35:02,245 --> 00:35:05,856

I had to write this paper and I was doing mixture modeling.

406

00:35:06,696 --> 00:35:11,457

And I think the funny thing about this project is that it was in many ways like a failure.

407

00:35:12,497 --> 00:35:18,688

Because if you've ever played around with mixtures of Gaussians, mixtures of other

distributions,

408

00:35:18,808 --> 00:35:21,720

there are these horrific identification issues.

409

00:35:22,261 --> 00:35:30,319

And I was trying to fit a model where I had to identify certain clusters in the data, or I

was trying to do this with mixture modeling.

410

00:35:30,319 --> 00:35:35,013

And it just wasn't working.

411

00:35:36,074 --> 00:35:41,760

So essentially, my advisor, I was using a frequentist package, an R package, called

FlexMix.

412

00:35:41,760 --> 00:35:43,121

I remember this very distinctly.

413

00:35:43,121 --> 00:35:44,386

And it was just...

414

00:35:44,386 --> 00:35:46,807

Basically, every time I ran it, I got a different result.

415

00:35:46,807 --> 00:35:50,629

And that was because of having a multimodal likelihood.

416

00:35:50,629 --> 00:35:55,572

There wasn't a single solution, and so the algorithm would end up in a different place

each time.

417

00:35:55,572 --> 00:35:57,033

And this was driving me nuts.

418

00:35:57,033 --> 00:36:02,996

So then I switched to Stan, an early version of Stan at the time.

419

00:36:03,537 --> 00:36:07,839

But there was sufficient documentation of mixture modeling.

420

00:36:07,839 --> 00:36:11,281

And the funny thing was the model didn't actually work any better in Stan.

421

00:36:12,842 --> 00:36:13,422

But...

422

00:36:13,422 --> 00:36:16,742

it was by doing it in Stan that I actually understood the model.

423

00:36:16,842 --> 00:36:17,582

Right?

424

00:36:17,582 --> 00:36:22,162

And after beating my head against the wall for a few months, I was like, I understand.

425

00:36:22,162 --> 00:36:24,362

Yes, of course, this model is not identified.

426

00:36:24,362 --> 00:36:32,242

A mixture is, you know, without sufficient prior information about the location of the

mixtures, like you don't know where they are.
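
A minimal sketch of the identification problem described here, assuming a two-component Gaussian mixture and simulated data (the function names and numbers are illustrative, not taken from the capstone project). Swapping the component labels leaves the likelihood unchanged, so there is no single best solution for an optimizer or sampler to settle on.

```python
import math
import random


def normal_logpdf(y, mu, sigma):
    """Log density of a normal distribution at y."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2


def mixture_loglik(data, weights, mus, sigmas):
    """Log-likelihood of the data under a finite Gaussian mixture."""
    total = 0.0
    for y in data:
        terms = [math.log(w) + normal_logpdf(y, m, s)
                 for w, m, s in zip(weights, mus, sigmas)]
        top = max(terms)  # log-sum-exp for numerical stability
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total


random.seed(0)
# Simulated data: two clusters, one near -2 and one near 3 (made-up values).
data = ([random.gauss(-2.0, 1.0) for _ in range(200)]
        + [random.gauss(3.0, 1.0) for _ in range(100)])

# Same model, but with the component labels swapped.
ll_original = mixture_loglik(data, [2 / 3, 1 / 3], [-2.0, 3.0], [1.0, 1.0])
ll_swapped = mixture_loglik(data, [1 / 3, 2 / 3], [3.0, -2.0], [1.0, 1.0])

print(ll_original, ll_swapped)
```

Both calls describe the same fit, so the likelihood has at least two equally good modes. Prior information on where the components sit, or an ordering constraint on the means (for example Stan's `ordered` vector type), is one common way to break that symmetry.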

427

00:36:32,242 --> 00:36:36,542

And so that was sort of my gateway drug.

428

00:36:37,082 --> 00:36:42,122

And then I was like, then I read McElreath's text, like when I was, because it had,

429

00:36:42,220 --> 00:36:44,111

I was just lucky, you know, we all get lucky.

430

00:36:44,111 --> 00:36:53,817

It came out just as I, his first edition came out just as I was starting out and I was

using Twitter, which, you know, now is let's say not as useful as it was, but the social

431

00:36:53,817 --> 00:36:54,698

media things are useful.

432

00:36:54,698 --> 00:36:58,039

It appeared on social media, people were like, wow, this is really different.

433

00:36:58,039 --> 00:36:59,300

And I got it.

434

00:36:59,300 --> 00:37:01,381

And it just blew me away.

435

00:37:01,381 --> 00:37:07,484

I mean, just the clarity, the simplicity, how he connected a lot of things that I had

thought of before.

436

00:37:07,685 --> 00:37:10,166

And, you know, I think

437

00:37:10,254 --> 00:37:14,674

after reading it I went from being like, this is cool, to really being a little bit more

of a zealot.

438

00:37:14,674 --> 00:37:18,973

You know, this is how we should do research, this can really make a big difference.

439

00:37:20,474 --> 00:37:25,314

And then of course I also read, you know, like the canonical Bayesian Data Analysis.

440

00:37:25,514 --> 00:37:28,614

The first time I went through it, it was a little intense.

441

00:37:29,554 --> 00:37:32,894

McElreath's text is definitely more approachable.

442

00:37:34,374 --> 00:37:44,943

But then a lot of my learning was, yeah, these informal case studies, blog posts, people

sharing online, people answering my questions on the forum.

443

00:37:45,904 --> 00:37:48,526

That was all super, super helpful.

444

00:37:48,767 --> 00:37:50,808

ultimately, what I really got into...

445

00:37:51,008 --> 00:37:56,092

So I have two R packages that are Bayesian modeling.

446

00:37:56,133 --> 00:37:59,715

One is called ordbetareg, or ordered beta regression.

447

00:37:59,796 --> 00:38:02,538

And this is a model that...

448

00:38:03,416 --> 00:38:13,692

We don't have to go all into the weeds, but traditional beta regression is supposed to be a

model of proportions, but the beta distribution can't handle observations so-called at the

449

00:38:13,692 --> 00:38:20,485

bounds, meaning if your proportion goes from 0 to 1, you can't include any observations that

have a 0 or a 1.

450

00:38:20,516 --> 00:38:31,141

And ordered beta regression is basically a compound model where one part is a beta

distribution and the other part is actually a simple sort of ordered logit.

451

00:38:31,874 --> 00:38:35,237

that allows the model to include those.

452

00:38:35,237 --> 00:38:39,601

So there's two cut points in the model, like an ordered logistic regression.

453

00:38:39,621 --> 00:38:42,554

And you have one linear model.

454

00:38:42,554 --> 00:38:49,510

you can include, basically, the model has sort of three components, zero, anything between

zero and one, and then one.
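
A minimal sketch of the three-component structure described here, assuming the usual way such a model is written: one linear predictor `eta`, two ordered cut points, and a beta density with precision `phi` for values strictly between the bounds. The function and argument names are illustrative and this is not the ordbetareg source code.

```python
import math


def logistic(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def component_probs(eta, cut_low, cut_high):
    """Probabilities of y == 0, 0 < y < 1, and y == 1 for one linear predictor."""
    p_zero = 1.0 - logistic(eta - cut_low)
    p_one = logistic(eta - cut_high)
    p_middle = logistic(eta - cut_low) - logistic(eta - cut_high)
    return p_zero, p_middle, p_one


def ordered_beta_logpdf(y, eta, cut_low, cut_high, phi):
    """Log density of an outcome y in [0, 1], including the bounds."""
    if not cut_low < cut_high:
        raise ValueError("cut points must be ordered")
    p_zero, p_middle, p_one = component_probs(eta, cut_low, cut_high)
    if y == 0.0:
        return math.log(p_zero)
    if y == 1.0:
        return math.log(p_one)
    # Continuous part: beta density with mean logistic(eta) and precision phi.
    mu = logistic(eta)
    a, b = mu * phi, (1.0 - mu) * phi
    beta_logpdf = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                   + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))
    return math.log(p_middle) + beta_logpdf


probs = component_probs(eta=0.4, cut_low=-1.0, cut_high=2.0)
print(probs, sum(probs))
```

The same `eta` drives all three pieces, which is why the model stays close to plain beta regression in complexity while still accepting zeros and ones.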

455

00:38:50,231 --> 00:38:57,877

And the payoff to the model, and it's getting a lot of exposure, is that

456

00:38:58,190 --> 00:39:00,550

It's a pretty simple model.

457

00:39:00,550 --> 00:39:08,470

It's not much more complicated than beta regression, but you can include an outcome that has

zeros and ones or zero and 100 or whatever your scale is.

458

00:39:08,470 --> 00:39:11,850

And that's very useful for people, especially in the social sciences.

459

00:39:12,670 --> 00:39:13,850

My other...

460

00:39:13,850 --> 00:39:14,860

And that one's available.

461

00:39:14,860 --> 00:39:20,990

It's really a wrapper around brms to allow people to fit these types of models.

462

00:39:21,890 --> 00:39:24,251

And that's out on CRAN and everything.

463

00:39:24,251 --> 00:39:28,772

The other one is much more ambitious and is actually just coming to fruition.

464

00:39:28,812 --> 00:39:31,193

And it's called idealstan.

465

00:39:31,693 --> 00:39:42,437

And ideal there comes from a social science model called the Ideal Point Model, which is

itself a variant of something called item response theory, which people may have heard

466

00:39:42,437 --> 00:39:42,537

of.

467

00:39:42,537 --> 00:39:44,177

It's from psychometrics.
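
For reference, a minimal sketch of the two-parameter item response form in which ideal point models are commonly written. The parameter names are illustrative assumptions, and this is not idealstan's code.

```python
import math


def vote_probability(ideal_point, discrimination, difficulty):
    """Probability of a 'yes' response under a two-parameter logistic IRT model.

    ideal_point:    latent position of the person (e.g. a legislator)
    discrimination: how strongly, and in which direction, the item separates people
    difficulty:     baseline threshold of the item (e.g. a bill)
    """
    return 1.0 / (1.0 + math.exp(-(discrimination * ideal_point - difficulty)))


# A person to the right (ideal point 1.5) and one to the left (-1.5) on the same item.
p_right = vote_probability(1.5, discrimination=2.0, difficulty=0.0)
p_left = vote_probability(-1.5, discrimination=2.0, difficulty=0.0)
print(p_right, p_left)
```

Allowing the discrimination to be negative is what lets the same latent scale explain items that appeal to either end of the spectrum.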

468

00:39:44,417 --> 00:39:47,396

And what I've been doing is

469

00:39:47,542 --> 00:39:50,955

sort of expanding and generalizing this model using Stan.

470

00:39:50,955 --> 00:40:02,214

And what it allows people to do, and I've used it for multiple papers and publications

already, but I've never released the final version, is it allows you to apply what's

471

00:40:02,214 --> 00:40:09,540

essentially a very general purpose measurement model to all kinds of distributions of data

that people couldn't before.

472

00:40:09,540 --> 00:40:14,304

So one of the big innovations is non-ignorable missing data.

473

00:40:14,304 --> 00:40:16,576

Like when you're fitting a latent variable model,

474

00:40:16,780 --> 00:40:23,932

or any kind of measurement model, missing data is really, really tough because missing

data is essentially its own latent problem.

475

00:40:23,932 --> 00:40:27,293

When you have missing data and a latent variable, what do you do?

476

00:40:27,353 --> 00:40:37,416

So this package has a way of dealing with that in a way that avoids bias in your estimate

of the latent variable.

477

00:40:37,676 --> 00:40:39,657

But it also has a bunch of other things.

478

00:40:39,657 --> 00:40:45,218

So Stan is really good at time, so there's a lot of work I've done on time-varying latent

variables.

479

00:40:46,062 --> 00:40:54,215

And that's particularly useful nowadays because as social scientists and data scientists,

we're getting so much more fine-grained time data, right?

480

00:40:54,215 --> 00:40:55,545

From Twitter.

481

00:40:56,245 --> 00:40:57,846

Well, I guess not anymore from Twitter.

482

00:40:57,846 --> 00:41:03,888

But there's plenty of data sets that have, and can get, timestamps right down to

seconds.

483

00:41:03,888 --> 00:41:11,510

And if you want to estimate some latent quality from that, like let's say you want to know

about corruption or polarization or...

484

00:41:12,352 --> 00:41:23,881

some other quantity and it's noisy, how do you handle that variation over time, especially

when you have these sparse data sets, right?

485

00:41:23,881 --> 00:41:30,055

You have only a few observations in a given time window, but your time series is really

long.

486

00:41:30,055 --> 00:41:37,331

So that's the kind of stuff I've been working on, and I'm hoping to finally release a

final version of that by the end of this year.

487

00:41:37,331 --> 00:41:39,086

Knock on wood.

488

00:41:39,086 --> 00:41:42,687

And I've already used it for a range of things.

489

00:41:42,687 --> 00:41:46,448

It's really useful for survey data sets when you have like missing data.

490

00:41:47,188 --> 00:41:55,510

Like I used it to measure essentially people's wealth from a survey when there was a lot

of missing data in different variables in the survey.

491

00:41:56,071 --> 00:42:07,373

I've used it to measure countries' policy responses to COVID-19 when there's a lot of

complexity in how they respond and which countries are the most prepared.

492

00:42:08,846 --> 00:42:11,706

And yeah, so on and so forth.

493

00:42:11,946 --> 00:42:19,526

So that package hopefully will be out soon and that uses Stan, that uses like raw Stan

code in an R framework.

494

00:42:19,526 --> 00:42:27,586

I know you're, it sounds like you're a Python guy. In principle, it can be estimated in

Python as well, but that would have to be a future project.

495

00:42:27,586 --> 00:42:32,106

But that's my, those are my big kind of Bayesian method stuff.

496

00:42:32,106 --> 00:42:37,922

I have a new paper out in the Journal of the Royal Statistical Society

497

00:42:37,922 --> 00:42:48,788

that fits a big Bayesian measurement model in Stan to measure COVID infections in the US

in the early pandemic period.

498

00:42:48,788 --> 00:42:55,932

So, and this is talking about, you know, prior information, how useful it is in the US

context.

499

00:42:55,932 --> 00:43:01,775

Well, most countries, right, during the early part of COVID, like we didn't know, we

didn't even have biased data, right?

500

00:43:01,775 --> 00:43:06,700

Like there were so few tests available and

501

00:43:06,700 --> 00:43:13,291

I started working on this actually in sort of the early part of the pandemic and it just

was recently published, but I got really fascinated by this topic of like, well, how do

502

00:43:13,291 --> 00:43:16,793

you measure COVID infections if you just don't have any data, right?

503

00:43:16,793 --> 00:43:19,854

Like you can't test, you can't do anything.

504

00:43:19,854 --> 00:43:29,646

And it's a really cool problem from a Bayesian point of view, because as a Bayesian, you

think, well, the best answer that you can give to a problem is to include all of your

505

00:43:29,646 --> 00:43:31,477

prior information, right?

506

00:43:31,957 --> 00:43:36,098

Beyond that, that's the best answer you can give.

507

00:43:37,086 --> 00:43:45,960

And so I started to think about that from a very kind of purely, so I've told people that

this is my most Bayesian project ever because I just kind of was like sat down and I

508

00:43:45,960 --> 00:43:51,533

worked with an epidemiologist guy, Luis Carvalho, who's also a big Stan person.

509

00:43:51,533 --> 00:43:59,776

He's on the Discourse site and we worked together on this and kind of came up with a new

approach and the approach really emphasizes using priors.

510

00:43:59,776 --> 00:44:04,480

And so we show how you can get a decent estimate of COVID infections.

511

00:44:04,480 --> 00:44:10,502

in this very early period like March of 2020 by using things like expert surveys, right?

512

00:44:10,502 --> 00:44:15,259

Where you simply go and ask a bunch of experts, well, how many infections do you think

there are?

513

00:44:15,981 --> 00:44:19,784

And then you have uncertainty in that estimate, right?

514

00:44:19,784 --> 00:44:25,388

And then you allow that uncertainty to propagate into your model, into the final estimates

of COVID infections.
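
A minimal sketch of the idea of turning expert guesses into a prior and letting that uncertainty flow into a downstream quantity. All numbers are hypothetical, the lognormal form is an assumption, and this only propagates the prior by simulation; it is not the model from the paper, which also conditions on observed data.

```python
import math
import random
import statistics

# Hypothetical expert guesses of total infections in one region (made-up numbers).
expert_guesses = [20_000, 50_000, 80_000, 150_000, 400_000]
confirmed_cases = 4_000  # hypothetical count of confirmed cases

# Summarise the survey as a lognormal prior: experts tend to disagree by
# multiples rather than by fixed amounts, so work on the log scale.
log_guesses = [math.log(g) for g in expert_guesses]
prior_mu = statistics.mean(log_guesses)
prior_sigma = statistics.stdev(log_guesses)

random.seed(1)
infection_draws = [random.lognormvariate(prior_mu, prior_sigma) for _ in range(10_000)]

# Propagate each draw into a derived quantity: the share of infections detected.
detection_draws = sorted(min(1.0, confirmed_cases / d) for d in infection_draws)


def quantile(sorted_values, q):
    """Simple empirical quantile of an already sorted list."""
    return sorted_values[int(q * (len(sorted_values) - 1))]


low, mid, high = (quantile(detection_draws, q) for q in (0.05, 0.5, 0.95))
print(f"share detected: median {mid:.3f}, 90% interval {low:.3f} to {high:.3f}")
```

The wide interval is the point: the disagreement among experts shows up directly in the final estimate instead of being hidden behind a single number.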

515

00:44:25,549 --> 00:44:31,444

And you essentially, can get a pretty good estimate that's still uncertain.

516

00:44:32,070 --> 00:44:42,283

but actually incorporates this information that's not, you know, like, you don't

have tests or, you know, hospitalizations even. But, you know, if you basically

517

00:44:42,283 --> 00:44:51,155

take advantage of the information that you have, you can give a much better answer than just

spitballing or, as many people did, assuming away uncertainty and pretending to be much more

518

00:44:51,155 --> 00:45:00,518

certain than they were. Anyway, that's, I think, some of my recent stuff. I'm always

tinkering, you know, as I'm sure you are, with different things, but...

519

00:45:01,696 --> 00:45:04,107

Yeah, same.

520

00:45:04,107 --> 00:45:12,779

There is always a part of me that is looking forward to the day where I will have

everything figured out and understood.

521

00:45:12,779 --> 00:45:15,466

But that day never comes.

522

00:45:15,466 --> 00:45:22,072

It's like, I will finally understand Gaussian processes.

523

00:45:22,072 --> 00:45:23,872

And then I think I understand.

524

00:45:23,872 --> 00:45:28,133

And then a new use case comes and I'm like, wait, how do I do that?

525

00:45:28,214 --> 00:45:29,804

Wait, why doesn't...

526

00:45:29,804 --> 00:45:31,075

Why doesn't that work?

527

00:45:31,075 --> 00:45:35,177

It's like, my God, I have to learn that new method.

528

00:45:36,038 --> 00:45:38,580

But I guess that's part of the job description.

529

00:45:39,321 --> 00:45:41,882

And that's actually the fun part, I would say.

530

00:45:42,623 --> 00:45:47,436

It's like, you just have to reassure that other voice in your head.

531

00:45:47,436 --> 00:45:48,427

like, that's fine.

532

00:45:48,427 --> 00:45:49,057

That's normal.

533

00:45:49,057 --> 00:45:50,018

That's part of the job.

534

00:45:50,018 --> 00:45:53,390

And that's actually why it's fun and interesting.

535

00:45:53,851 --> 00:45:55,712

But congrats on all those projects.

536

00:45:55,712 --> 00:45:58,674

That sounds really cool and really fascinating.

537

00:45:59,442 --> 00:46:07,687

And interesting that, like, I didn't know about the beta, the ordered beta regression, but

definitely makes sense.

538

00:46:07,908 --> 00:46:12,271

I have some experience with the ordered logistic.

539

00:46:12,931 --> 00:46:22,958

I've used that most recently on a football analytics paper slash project I'm working on.

540

00:46:23,679 --> 00:46:27,921

But the ordered beta, I didn't know about that, but that sounds like fun.

541

00:46:28,430 --> 00:46:33,950

And yeah, I should say it is getting used in industry.

542

00:46:34,170 --> 00:46:42,870

I know there's a guy from Amazon who actually wanted me to make a code change so it could

be deployed somewhere in Amazon.

543

00:46:42,870 --> 00:46:44,699

So I know it's out there doing something.

544

00:46:44,699 --> 00:46:45,590

I don't know.

545

00:46:45,590 --> 00:46:49,210

I guess if your Amazon order doesn't come through, then it's my package.

546

00:46:49,210 --> 00:46:50,710

That was the problem.

547

00:46:50,710 --> 00:46:53,290

yeah, so it's getting...

548

00:46:53,290 --> 00:46:58,394

I'll be honest, I'm really, really happy with how the model is getting used in different

places.

549

00:46:59,302 --> 00:47:01,545

And essentially it really matters for predictions.

550

00:47:01,545 --> 00:47:13,195

Like if you fit your model and you want to predict, okay, given this many orders, what

proportion, or this many ads, what proportion will buy the product?

551

00:47:13,195 --> 00:47:15,577

You want that prediction to respect the bounds.

552

00:47:15,577 --> 00:47:25,115

You don't want, if you use OLS or something, then you could end up predicting that 115 %

of your customers will buy a product.

553

00:47:25,115 --> 00:47:26,765

That doesn't make any sense.
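
[Editor's sketch] The point about bounds can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions: the data are made up, the variable names are hypothetical, and the second fit is a plain straight line on the logit scale, not the ordered beta regression discussed in the episode. It just shows that a straight line fitted to proportions can extrapolate past 100%, while a fit passed through an inverse-logit link cannot.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares intercept and slope for y ~ a + b * x."""
    slope, intercept = np.polyfit(x, y, 1)
    return intercept, slope

def inv_logit(z):
    """Map any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical data: number of ads shown vs. proportion of customers who buy.
ads = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
prop = np.array([0.10, 0.25, 0.45, 0.65, 0.80, 0.90])

# OLS directly on the proportions, then extrapolate to 8 ads.
a, b = fit_line(ads, prop)
ols_pred = a + b * 8.0

# Same straight-line fit, but on the logit scale, mapped back to (0, 1).
a_logit, b_logit = fit_line(ads, np.log(prop / (1.0 - prop)))
bounded_pred = inv_logit(a_logit + b_logit * 8.0)

print(f"OLS prediction at 8 ads:     {ols_pred:.3f}")
print(f"Bounded prediction at 8 ads: {bounded_pred:.3f}")
```

With these made-up numbers the OLS prediction lands above 1, while the bounded prediction stays strictly between 0 and 1.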

554

00:47:27,202 --> 00:47:38,364

You know, the ordered beta regression allows you to take into account that non-linearity in

the outcome, that you have these bounds. And yeah, I will say for ordered logit, again,

555

00:47:38,364 --> 00:47:47,889

you're talking about the Stan documentation, but Michael Betancourt, who's, you know, kind of

a legend in the Stan community, he has an amazing case study on ordered logit. This

556

00:47:47,889 --> 00:47:52,950

stuff does get somewhat technical, but it's really brilliant. I've never seen anyone

557

00:47:53,518 --> 00:48:00,980

sort of go through the model the way he does, but if you can get through his case study,

you really understand ordered logit.

558

00:48:01,159 --> 00:48:10,863

And as he does, in that case study, this is just a case study, he derives a novel

distribution for the priors on the cut points in the ordered logit model.

559

00:48:10,863 --> 00:48:12,433

He just does this as part of the case study.

560

00:48:12,433 --> 00:48:16,584

Like, yeah, here's this new Sysl distribution no one's ever used before.

561

00:48:17,545 --> 00:48:19,705

So yeah, ordered logit's really cool.
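
[Editor's sketch] For listeners who have not met cut points before, here is a minimal sketch of how an ordered logit turns a linear predictor and a set of ordered cut points into category probabilities. It is a generic illustration, not code from the case study mentioned above; the function name and the example values are hypothetical, and no prior on the cut points is shown.

```python
import numpy as np

def ordered_logit_probs(eta, cutpoints):
    """Category probabilities for an ordered logit model.

    eta       : linear predictor (a single float)
    cutpoints : increasing sequence of K - 1 cut points for K categories
    """
    cutpoints = np.asarray(cutpoints, dtype=float)
    # Cumulative probabilities P(y <= k) = inv_logit(c_k - eta).
    cdf = 1.0 / (1.0 + np.exp(-(cutpoints - eta)))
    # Pad with 0 and 1, then difference to get P(y = k) for each category.
    cdf = np.concatenate(([0.0], cdf, [1.0]))
    return np.diff(cdf)

# Hypothetical example: four ordered categories need three cut points.
probs = ordered_logit_probs(eta=0.5, cutpoints=[-1.0, 0.0, 1.5])
print(probs, probs.sum())
```

Moving `eta` up shifts probability mass toward the higher categories, while the cut points decide where the boundaries between categories fall.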

562

00:48:19,705 --> 00:48:21,486

And again, ordered beta,

563

00:48:22,422 --> 00:48:33,211

It's really about thinking out of the box because this was an area, as I mentioned, where

it's a somewhat well-trodden issue.

564

00:48:33,211 --> 00:48:41,390

And that was something where it was sort of really combining two things that are very

different.

565

00:48:41,390 --> 00:48:48,284

So ordered logit is a model for discrete data, beta is for continuous data, and then it was

really combining them that made it work.

566

00:48:50,670 --> 00:48:56,753

People think of statistics as a sort of dry rote, read formulas off of a page.

567

00:48:57,274 --> 00:49:01,456

But in a lot of, I think, actual problems, a lot of it's really creative thinking.

568

00:49:01,456 --> 00:49:03,797

How do you get stuck in a dead end?

569

00:49:03,797 --> 00:49:05,338

How do you find the way out?

570

00:49:05,338 --> 00:49:07,819

And sometimes that's taking a very different approach.

571

00:49:07,920 --> 00:49:09,380

Yeah, definitely.

572

00:49:09,380 --> 00:49:14,133

And so we should add these links to the show notes.

573

00:49:14,133 --> 00:49:19,546

So if you have anything to share regarding your package,

574

00:49:20,088 --> 00:49:29,625

ordered beta, please add that to the show notes for the people because I know a lot of them

are going to want to dig deeper.

575

00:49:29,625 --> 00:49:39,231

And I'll also add the link to Michael's case study about ordered logistic for sure.

576

00:49:39,311 --> 00:49:49,678

My football project doesn't have anything yet that's ready to be shared, but I will do

that as soon as possible.

577

00:49:49,946 --> 00:49:50,907

For sure.

578

00:49:50,907 --> 00:49:53,128

Maybe that's something we'll do.

579

00:49:53,809 --> 00:50:01,375

We'll teach at PyData with Chris Fonnesbeck in PyData New York in November.

580

00:50:01,375 --> 00:50:02,716

Maybe we'll do that.

581

00:50:02,716 --> 00:50:05,228

We'll see how that works.

582

00:50:05,228 --> 00:50:08,430

Maybe at that point, I'll be able to share that and add that to the show notes.

583

00:50:08,430 --> 00:50:16,392

In the meantime, let's add your package or any paper or case study or things like that

that you've...

584

00:50:16,392 --> 00:50:26,286

written or read and think is interesting, or even videos and tutorials, and I'll add

Michael's case study.

585

00:50:26,927 --> 00:50:34,390

And so I know you had a hard stop in a few minutes and I definitely want to talk about

your novel.

586

00:50:34,390 --> 00:50:44,780

I mean, I still have like tons of questions for you and the work you do and how you use

Bayes and so on because honestly, I love all the work you do and I...

587

00:50:44,780 --> 00:50:48,492

We could do like a three hour episode very easily.

588

00:50:49,494 --> 00:50:51,525

But I definitely want to talk about your novel.

589

00:50:51,525 --> 00:50:55,938

So let's do that because you're the first novelist on the show.

590

00:50:56,659 --> 00:51:06,806

So first question, you know, if I were talking to you in the street or in a bar, would be

like, why?

591

00:51:06,806 --> 00:51:10,788

What inspired you to write The Bayesian Hitman?

592

00:51:11,409 --> 00:51:12,390

Yeah.

593

00:51:12,814 --> 00:51:25,934

Well, my interest in writing, and this is the thing that for me, like doing statistics

kind of came a little bit later in life, my interest in writing predated my doing

594

00:51:25,934 --> 00:51:28,754

statistics by a number of years, actually.

595

00:51:28,754 --> 00:51:31,034

And I was always sort of interested in writing.

596

00:51:31,034 --> 00:51:40,614

I got into writing fiction, actually, when I was a diplomat in Saudi Arabia, where, not to put

too fine a point on it, there's not very much to do in Saudi Arabia.

597

00:51:40,614 --> 00:51:43,006

And there's very few creative outlets.

598

00:51:43,138 --> 00:51:45,189

And I used to love doing things in the States.

599

00:51:45,189 --> 00:51:48,730

I used to actually be very much involved in improv theater at one time.

600

00:51:48,730 --> 00:51:51,800

And there was very little of that.

601

00:51:52,041 --> 00:51:53,581

But you can write a novel anywhere.

602

00:51:53,581 --> 00:51:56,022

And so that's actually where I got into writing.

603

00:51:56,022 --> 00:52:00,553

And all of that stuff that I wrote will be forever locked away in my computer and never

released.

604

00:52:00,553 --> 00:52:06,965

But eventually I just kept writing from time to time, even in grad school, just a really

nice outlet.

605

00:52:06,965 --> 00:52:09,245

It's a different way of using your brain.

606

00:52:09,586 --> 00:52:11,406

After I get locked into, you know,

607

00:52:11,406 --> 00:52:15,586

doing research papers and stuff and then fiction is just so different.

608

00:52:16,946 --> 00:52:18,166

At least it should be.

609

00:52:18,166 --> 00:52:19,786

I was posting on Twitter about this.

610

00:52:19,786 --> 00:52:24,186

There's some academic studies that have turned out to be fiction and have had to be

retracted lately.

611

00:52:24,186 --> 00:52:29,646

But in theory, as academics, you're not doing fiction, you're doing real research.

612

00:52:30,526 --> 00:52:39,516

And yeah, the genesis of this novel actually came out of me being on the academic job

market in the fall when I was a grad student, so my last year of grad school.

613

00:52:39,516 --> 00:52:40,998

I kind of had this

614

00:52:41,090 --> 00:52:55,838

I don't know, as you get these sort of visions, I had this idea of someone going to a

university that wasn't their top choice, that was in an area that they didn't like, but

615

00:52:55,838 --> 00:53:01,201

then things being radically different than they expect in a good way.

616

00:53:01,882 --> 00:53:09,226

And that was really where the idea of the novel came from because when I was on the job

market, the thing about the job market

617

00:53:09,226 --> 00:53:16,300

and that I think appeals as a sort of human story is how our lives are just so disrupted

as academics.

618

00:53:16,300 --> 00:53:21,973

You know, people move, you know, across their country, sometimes across the world.

619

00:53:22,753 --> 00:53:25,295

They end up in places that they never thought they would live.

620

00:53:25,295 --> 00:53:27,356

They're culturally very different.

621

00:53:27,696 --> 00:53:30,697

And I thought that's a really great setting for a story.

622

00:53:32,038 --> 00:53:38,998

And yeah, and you know, I think a lot of writing advice when you're starting out that

you'll get is to write what you know and...

623

00:53:38,998 --> 00:53:40,419

And that was kind of what I knew.

624

00:53:40,419 --> 00:53:43,622

That was the world I was living in.

625

00:53:43,622 --> 00:53:49,606

One thing I want to clarify, it's not an autobiographical novel.

626

00:53:50,127 --> 00:53:52,208

And I always worry about that a bit.

627

00:53:52,589 --> 00:53:55,011

But the main character is not me.

628

00:53:55,151 --> 00:53:59,535

And it's actually set in a fictional town with a fictional university.

629

00:53:59,535 --> 00:54:01,797

I did that very intentionally because I didn't want to, like...

630

00:54:01,797 --> 00:54:07,031

I say the main character has, let's say, very uncensored opinions about his institution.

631

00:54:07,031 --> 00:54:08,682

And I didn't want to like...

632

00:54:09,346 --> 00:54:14,008

you know, critique somewhat, you know, an actual university or college.

633

00:54:14,568 --> 00:54:23,572

But the main character is very much informed by my experiences, but also by all my friends

and the things they went through and the places they traveled.

634

00:54:24,052 --> 00:54:35,216

And I think ultimately, as I wrote in the acknowledgements of the book, I really wanted it

to be a novel about academia that was much more realistic, that really got into the

635

00:54:35,216 --> 00:54:39,256

problems people have, issues and the challenges they face, and to try to, you know,

636

00:54:39,256 --> 00:54:50,434

Because I felt a lot of writing about academics, it's always these like mysterious

literature professors that hang around in like, you know, beautiful Ivy League, you know,

637

00:54:50,434 --> 00:54:54,356

places and like solve crimes in their spare time or something.

638

00:54:54,356 --> 00:55:03,052

And I wanted it to be a little more hard hitting and, and also really, you know, about,

you know, life as a research academic and what that's like and the pressures you face.

639

00:55:03,052 --> 00:55:05,764

And so that was really the genesis.

640

00:55:06,104 --> 00:55:09,038

And then the main character, you know,

641

00:55:09,038 --> 00:55:17,818

became about a Bayesian statistician, I don't know, I don't remember distinctly making

that choice, but the main character is a Bayesian statistician, and that became

642

00:55:17,818 --> 00:55:29,118

really fun because I still think, I mean, it's impossible to verify this claim, but I'd

say there's a high posterior probability that it is the only novel that has a Bayesian

643

00:55:29,118 --> 00:55:31,278

statistician as the protagonist.

644

00:55:31,858 --> 00:55:36,662

And that was really fun, because it's fun getting to take an area that

645

00:55:36,662 --> 00:55:42,005

really doesn't appear in fiction and like put it into a story and you know, see what

happens.

646

00:55:42,045 --> 00:55:54,492

Yeah, I mean, that is really interesting to see how, like, you know, the way of thinking that

got you there.

647

00:55:54,492 --> 00:56:04,337

okay, so from like, that was one of the scenarios I had in my head where it's like

actually something you wanted to do already.

648

00:56:06,988 --> 00:56:09,979

So you write of course academic content.

649

00:56:09,979 --> 00:56:12,019

How was the experience?

650

00:56:12,019 --> 00:56:14,240

How different was the experience?

651

00:56:14,240 --> 00:56:20,222

And did you enjoy it more to write the fiction?

652

00:56:20,222 --> 00:56:29,744

Do you feel freer writing fictional work because you you don't have to check everything,

cite everything and things like that?

653

00:56:30,084 --> 00:56:33,685

Or was the experience in the end quite similar?

654

00:56:34,665 --> 00:56:36,386

That's a great question.

655

00:56:37,258 --> 00:56:44,384

I think that, unfortunately, I think at the end, they tend to converge in some ways.

656

00:56:45,245 --> 00:56:50,910

But I think that's partly the, I'd say the initial work is very, very different.

657

00:56:50,910 --> 00:56:56,134

I mean, when you're writing fiction, you really are, you're really trying to feed your creative

instinct.

658

00:56:56,415 --> 00:57:05,161

And hopefully when you're writing an academic paper, that's not the, hopefully you have, like,

data that, you know, constrains, you know, these sorts of things.

659

00:57:05,602 --> 00:57:16,945

But when you're trying to get that story out for the first time, and yeah, that's a very

different, and I don't know how good I am at that, but that's really about just going

660

00:57:16,945 --> 00:57:22,217

wherever the story leads and trying to figure out the next thing.

661

00:57:22,217 --> 00:57:29,829

And it's difficult, but it's definitely a very different kind of skill set or experience

than writing an academic paper.

662

00:57:29,989 --> 00:57:33,630

I'd say that they tend to converge at the end because

663

00:57:34,362 --> 00:57:36,342

and this was my experience.

664

00:57:36,342 --> 00:57:44,569

I noticed your words in my first novel and that made me very nervous that maybe there will

be another novel.

665

00:57:44,569 --> 00:57:48,482

But yeah, it's not easy and it's not easy to finish a novel.

666

00:57:48,482 --> 00:57:59,760

I think, I mean, it's hard to write one for sure, but it's that finishing that really,

that's where you have to come back and make it coherent and edit it.

667

00:57:59,760 --> 00:58:03,262

And my book was in a kind of editing phase for

668

00:58:03,566 --> 00:58:12,346

years probably of just going back and forth and adding things, taking away, trying to make

the story really move at the right pace.

669

00:58:12,346 --> 00:58:18,446

You know, how do you have to, you don't want to have too much detail or too little and things

that become even technical.

670

00:58:18,446 --> 00:58:25,766

Like we have what are called plot holes where, you you said something happened in a

certain place, but that couldn't happen because the character was here.

671

00:58:26,026 --> 00:58:31,078

So then, you know, at the end of the day, you know, when you're finishing a novel really

comes down to a lot of details and

672

00:58:31,182 --> 00:58:36,364

cross-checking things and stuff that's not that different from finishing an academic

paper.

673

00:58:37,024 --> 00:58:44,427

And so, and that's not, you know, I mean, the editing phase is not usually people's favorite phase

unless they're kind of weird.

674

00:58:44,427 --> 00:58:49,190

But yeah, so I would say ultimately they start to converge.

675

00:58:49,190 --> 00:58:52,571

But of course, the writing style is very, very different.

676

00:58:53,151 --> 00:59:00,334

I do think, you know, honestly, you know, my advice would be, especially for academic

researchers, I really encourage you to write.

677

00:59:00,334 --> 00:59:01,574

creatively.

678

00:59:01,854 --> 00:59:04,374

Creative fiction, creative nonfiction, poetry.

679

00:59:04,374 --> 00:59:05,534

No one has to see it.

680

00:59:05,534 --> 00:59:06,624

No one has to know.

681

00:59:06,624 --> 00:59:12,274

You can post it online under a pseudonym if you want, but it actually really does help you

write.

682

00:59:12,274 --> 00:59:15,894

And I would say it has made my academic writing better.

683

00:59:15,894 --> 00:59:24,034

I mean, academic writing at the end of the day, especially after it goes through the peer

review process, you end up responding to reviewers and anything beautiful you made gets

684

00:59:24,034 --> 00:59:25,134

destroyed.

685

00:59:26,414 --> 00:59:29,154

But also, I write a lot of blog posts.

686

00:59:29,154 --> 00:59:36,054

These have almost, my blog posts have far greater reach than my academic writing, I'm sure

of that.

687

00:59:36,054 --> 00:59:39,694

And those are definitely much more informed by my creative writing.

688

00:59:39,694 --> 00:59:47,534

And the nice thing is they're not peer reviewed and I'm able to add tone, I'm able to add

fun things.

689

00:59:47,534 --> 00:59:55,294

And that definitely is much more connected to my fiction writing and issues of clarity.

690

00:59:55,398 --> 01:00:01,660

But I'd say that you really can become a better, it's sort of like cross training as an

athlete.

691

01:00:01,981 --> 01:00:04,122

Trying out different forms is really helpful.

692

01:00:04,122 --> 01:00:12,145

I'm sure that writing my novel, I didn't write it to get ahead, although I've joked with

people that part of my tenure case is writing a novel.

693

01:00:12,245 --> 01:00:14,846

But I don't think it'll actually help.

694

01:00:14,846 --> 01:00:24,340

But I'd say, without maybe meaning to, it definitely has made me a better writer and

that's great for my career at the end of the day.

695

01:00:24,878 --> 01:00:28,958

Yeah, I completely agree with that.

696

01:00:28,958 --> 01:00:32,538

Personally, I hate reading academic papers.

697

01:00:32,638 --> 01:00:38,798

I do it because I have to, but each time I have to do that, I'm like, my God, no.

698

01:00:38,798 --> 01:00:52,938

Actually, something I would like to try is feed a paper to ChatGPT and ask it to

rewrite it from a very, more like a novel or a more exciting tone because honestly, the

699

01:00:52,938 --> 01:00:54,722

writing is just terrible.

700

01:00:54,722 --> 01:00:56,903

Yeah, tell me a story, you know, something like that.

701

01:00:56,903 --> 01:01:09,628

But as you were saying, like, to your point, I really like Richard McElreath's style because,

I talked a bit about that with him at StanCon.

702

01:01:10,148 --> 01:01:14,730

So I won't, you know, divulge all the details because I don't know if he wants to.

703

01:01:14,830 --> 01:01:24,622

But long story short, it's like he also, you know, he, he definitely trained his writing

styles and he's aware of things he wants to.

704

01:01:24,622 --> 01:01:27,142

to how he wants to do it.

705

01:01:27,142 --> 01:01:33,342

And I think that's also a big part of why his book has been so successful.

706

01:01:33,342 --> 01:01:36,182

It's the writing, it's so much more engaging.

707

01:01:36,182 --> 01:01:47,902

And I've never understood that, honestly, from the academic world, where it's like, no, it

seems like to look serious, you have to be as boring as possible.

708

01:01:47,902 --> 01:01:50,342

And that's just terrible.

709

01:01:50,342 --> 01:01:54,142

That doesn't make people wanna read papers.

710

01:01:54,602 --> 01:02:05,385

I'm not saying it should be completely entertaining and not saying we should be

TikTok-ing papers, on the contrary, but people like stories.

711

01:02:05,445 --> 01:02:16,888

And if you can tell stories and at the same time, teach them something or show them a new

method, I think that's much better.

712

01:02:16,888 --> 01:02:17,768

Everybody wins.

713

01:02:17,768 --> 01:02:20,379

And I think it's also much more enjoyable for the writer.

714

01:02:20,379 --> 01:02:23,944

So, you know, why not do a

715

01:02:23,944 --> 01:02:26,615

little bit more of that.

716

01:02:26,776 --> 01:02:36,582

that's why I think it's awesome that Richard writes that way, that you also like to write

that way and you're even writing novels.

717

01:02:36,582 --> 01:02:42,754

I think that's awesome and that's probably going to change gradually.

718

01:02:42,754 --> 01:02:45,197

I know other authors also doing that.

719

01:02:45,197 --> 01:02:49,654

Osvaldo Martin, for instance, who's going to be back on the show next week.

720

01:02:49,654 --> 01:02:53,915

Osvaldo was the first ever guest of Learning Bayesian Statistics.

721

01:02:53,915 --> 01:02:58,596

Episode one was with him five years ago.

722

01:02:59,257 --> 01:03:03,298

And he's been kind of a mentor to me.

723

01:03:03,298 --> 01:03:06,759

And I really like his style of writing, for instance, also.

724

01:03:06,999 --> 01:03:09,419

He's someone who has a lot of humor.

725

01:03:09,600 --> 01:03:13,360

And that, you know, you can see that in the writing.

726

01:03:13,861 --> 01:03:18,528

Also, I like it because he doesn't, you know, drown

727

01:03:18,528 --> 01:03:21,881

you with technical details from the get-go.

728

01:03:21,881 --> 01:03:28,026

His writing is much more applied where it's like, okay, let me tell you about Bayesian

additive regression trees.

729

01:03:28,026 --> 01:03:32,570

Here's the theory you need to know, but not too much.

730

01:03:32,831 --> 01:03:35,853

And then, okay, here is how we do it.

731

01:03:35,853 --> 01:03:37,314

Here are the limitations and so on.

732

01:03:37,314 --> 01:03:40,797

And I think that's much more efficient and also much more engaging.

733

01:03:41,738 --> 01:03:42,679

Totally.

734

01:03:42,679 --> 01:03:46,382

And I think blogs have changed statistics.

735

01:03:47,098 --> 01:03:58,612

maybe even helped create data science because it's this form of publishing that allows us

to just skirt around the whole archaic academic system.

736

01:03:58,612 --> 01:04:08,064

you're just, you know, especially for listeners who aren't, let's say, aren't as plugged

into academia, haven't been through like the paper publishing process.

737

01:04:08,064 --> 01:04:14,112

I mean, reviewers really do, like, you know, even if you try to, I mean,

738

01:04:14,112 --> 01:04:16,164

If you write a paper better, it'll be a better paper.

739

01:04:16,164 --> 01:04:24,770

they really, I mean, I've had reviewers like kill titles because they didn't think that

they were like academic-y enough.

740

01:04:25,732 --> 01:04:30,896

you know, and the title I had to replace it with was definitely like a worse title, right?

741

01:04:30,896 --> 01:04:37,541

So, and I've had reviewers like comments say, your writing style is too informal.

742

01:04:37,541 --> 01:04:40,934

Like, nothing to do with the actual substance of the paper.

743

01:04:40,934 --> 01:04:43,256

Just this doesn't sound, you know,

744

01:04:43,256 --> 01:04:43,656

technical.

745

01:04:43,656 --> 01:04:54,914

So when you're reading an academic paper, and it's turgid, it's like, well, some of that

is, I mean, and also, I mean, some people just struggle to write. For some people,

746

01:04:54,914 --> 01:04:57,236

English is not a language they're very comfortable with.

747

01:04:57,236 --> 01:05:02,260

And so that, you know, not everyone's going to be Richard McElreath, like that

guy has a gift.

748

01:05:02,260 --> 01:05:03,240

Okay.

749

01:05:03,981 --> 01:05:11,476

But I think, too, you know, blogs allow people to just completely circum-, you know, do

an end run around that.

750

01:05:12,094 --> 01:05:22,099

The other person I'd mention, who's a fantastic blogger on stats issues, I don't know if he

would explicitly say he's Bayesian, but a lot of his stuff is, is Andrew Heiss.

751

01:05:22,099 --> 01:05:25,220

He's a fellow political scientist at Georgia State.

752

01:05:26,040 --> 01:05:27,650

And we can put his link in the blog.

753

01:05:27,650 --> 01:05:38,596

And he is, I think, really one of the best sort of stats bloggers out there because he has

a remarkable gift for visualization, but he's also very good at sort of the explanation

754

01:05:38,596 --> 01:05:39,466

side.

755

01:05:40,558 --> 01:05:51,818

And so if there's people listening who have not added him to their list, I mean, he's really

great at, and very applied, you know, like the embedded code chunks in the blog and stuff.

756

01:05:51,818 --> 01:05:54,658

So yeah, I mean, so I think this is all fantastic.

757

01:05:54,918 --> 01:06:04,318

And, you know, the only challenge, and I know as an academic who does some blogging, is, you

know, we're not paid to do it, and it is a public good.

758

01:06:04,838 --> 01:06:09,862

And when I say we're not paid to do it, I mean, theoretically, yes, like,

759

01:06:09,888 --> 01:06:14,679

anything you do as an academic you're quote unquote paid for, but it's not really part of

your evaluation.

760

01:06:14,679 --> 01:06:18,490

Most people ignore it, right, for tenure and for these sorts of things.

761

01:06:18,490 --> 01:06:20,861

And so I'd say it's not incentivized.

762

01:06:20,861 --> 01:06:23,342

It's up to people to do it yourself.

763

01:06:23,582 --> 01:06:32,284

I think my projects have been, I mean, I think some of my blog posts that relate to my

academic work, they help with, let's say, getting citations.

764

01:06:32,304 --> 01:06:37,166

But at the end of the day, it's something that you kind of have to want to do.

765

01:06:38,030 --> 01:06:47,995

But I would say personally, I found it just so rewarding to write that way and to see

stuff get out there in the world.

766

01:06:47,995 --> 01:06:49,385

I'll just throw this in.

767

01:06:49,385 --> 01:06:51,496

It's just a funny tidbit.

768

01:06:51,876 --> 01:07:04,692

My most visited blog post for years now is actually a blog post I wrote about a pregnancy

test called a cell-free pregnancy screening for Down syndrome.

769

01:07:05,614 --> 01:07:14,354

which, if you know anything about Bayesian statistics, right, testing is this, like, core part,

you know, it's like the example all the intro books use, you know, some kind of how many

770

01:07:14,354 --> 01:07:16,674

vampires are in the population or whatever.

771

01:07:16,674 --> 01:07:26,294

And so this blog post came out of actually a very personal story of my wife and I having a

child who was tested or recommended for testing.

772

01:07:26,354 --> 01:07:34,274

And to make a long story short, I was really upset about how these tests were being

interpreted in a way that was like statistically invalid.

773

01:07:34,274 --> 01:07:34,554

right?

774

01:07:34,554 --> 01:07:39,637

Not being aware of prior distributions and how they affect the interpretation of the test.
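
[Editor's sketch] The role of the prior in reading a screening result can be made concrete with Bayes' rule. The sketch below is only an illustration: the sensitivity, specificity, and prevalence values are made-up round numbers, not the published figures for any particular test, and the function name is hypothetical.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(condition | positive test) via Bayes' rule.

    prevalence  : prior probability of the condition in the tested population
    sensitivity : P(positive test | condition)
    specificity : P(negative test | no condition)
    """
    true_positive = prevalence * sensitivity
    false_positive = (1.0 - prevalence) * (1.0 - specificity)
    return true_positive / (true_positive + false_positive)

# Illustrative numbers only: the same test applied to two populations.
ppv_rare = positive_predictive_value(prevalence=1 / 1000,
                                     sensitivity=0.99, specificity=0.999)
ppv_common = positive_predictive_value(prevalence=1 / 100,
                                       sensitivity=0.99, specificity=0.999)

print(f"PPV when 1 in 1000 have the condition: {ppv_rare:.3f}")
print(f"PPV when 1 in 100 have the condition:  {ppv_common:.3f}")
```

With these assumed numbers, a positive result means roughly a 50% chance of the condition when it is rare, and about 91% when it is ten times more common, even though the test itself is identical. That gap is the effect of the prior.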

775

01:07:39,677 --> 01:07:48,081

So I wrote a blog post about this and I tried to make it very, very clear, even to people

of no stats background, right?

776

01:07:48,202 --> 01:07:50,352

And I mean, it's just my personal blog.

777

01:07:50,352 --> 01:07:56,975

I just posted it up there and somehow it's become one of the top Google search results for

this particular test.

778

01:07:56,975 --> 01:07:59,334

It's called MaterniT21.

779

01:07:59,728 --> 01:08:00,629

It's a top five.

780

01:08:00,629 --> 01:08:03,910

So it gets like 100

781

01:08:03,916 --> 01:08:08,450

view, you know, unique views a week, sometimes higher than that of people.

782

01:08:08,450 --> 01:08:11,692

And I've gotten like emails from people all over the world.

783

01:08:12,186 --> 01:08:14,955

And sometimes I have to say, I'm sorry, I'm not a medical doctor.

784

01:08:14,955 --> 01:08:18,738

I'm just commenting on, you know, how you correctly interpret the statistics of a test.

785

01:08:18,738 --> 01:08:23,181

So that's, like, as an academic, I guess, really cool to have that kind of impact.

786

01:08:23,482 --> 01:08:28,186

And I thought that was going to be a post I wrote that would be quickly forgotten, but

ended up.

787

01:08:28,186 --> 01:08:32,779

And it's still, people read it all the time.

788

01:08:32,779 --> 01:08:33,900

And hopefully,

789

01:08:33,966 --> 01:08:36,786

make fewer statistical errors, right?

790

01:08:36,806 --> 01:08:39,026

From using those tests.

791

01:08:39,026 --> 01:08:41,086

Anyway, yeah.

792

01:08:41,186 --> 01:08:43,006

Yeah, yeah, for sure.

793

01:08:43,006 --> 01:08:54,806

That's cool because that was actually going to be a question I had for you, you know, like the

way you saw your novel and your writing in general contribute to public understanding of

794

01:08:54,806 --> 01:08:57,846

Bayesian stats and scientific thinking in general.

795

01:08:58,046 --> 01:08:58,526

Yeah.

796

01:08:58,526 --> 01:09:03,192

But since we're short on time, because I think you have to leave in like 14 minutes.

797

01:09:05,976 --> 01:09:11,900

I'm going to ask you the last two questions that I ask every guest at the end of

the show.

798

01:09:11,900 --> 01:09:16,383

But before that, one last question regarding your novel.

799

01:09:17,284 --> 01:09:22,768

Did you get any feedback already from your academic peers and readers?

800

01:09:22,768 --> 01:09:25,229

And what kind of feedback was it?

801

01:09:26,870 --> 01:09:28,301

Not a ton yet.

802

01:09:28,301 --> 01:09:32,133

I mean, you know, the novel's only been out for a month and we're all busy.

803

01:09:32,940 --> 01:09:36,331

So the people who have read it really like it.

804

01:09:36,331 --> 01:09:46,546

Of course, as a Bayesian statistician who works on, I know that the reported feedback is

not always the same as the true feedback, right?

805

01:09:46,546 --> 01:09:48,416

That's a latent quantity.

806

01:09:48,936 --> 01:10:01,002

So that being said, the way I would interpret people's feedback so far is that, wow, this

novel is actually not that bad.

807

01:10:02,269 --> 01:10:12,510

I mean, it has a sort of claim to fame as being one of the first novels with a Bayesian

statistician as protagonist, but that doesn't mean that it's actually fun to read.

808

01:10:12,510 --> 01:10:16,230

But the people who have gotten through it said, it's actually well-paced.

809

01:10:16,230 --> 01:10:19,890

There's some kind of mystery thriller elements in it.

810

01:10:19,890 --> 01:10:22,010

The plot moves along nicely.

811

01:10:22,010 --> 01:10:22,990

They enjoyed it.

812

01:10:22,990 --> 01:10:27,730

And that's honestly what I want people to take from it at the end of the day.

813

01:10:27,930 --> 01:10:30,982

There are some things in the book that

814

01:10:31,412 --> 01:10:33,433

if you know me, are probably not surprising.

815

01:10:33,433 --> 01:10:39,946

Some things get into some questions about science, what is it about, philosophy of

science, things like this.

816

01:10:40,226 --> 01:10:45,889

But it's not too heavy-handed and it's fun and it's a little bit escapist.

817

01:10:45,889 --> 01:10:52,732

And that's what I wanted was for people who do research to have a fun book to read and

enjoy it.

818

01:10:53,153 --> 01:10:56,694

It doesn't get super into the weeds on Bayesian stats.

819

01:10:56,952 --> 01:11:00,385

There is some, and actually you mentioned Gaussian processes.

820

01:11:00,385 --> 01:11:01,706

That's one of the few.

821

01:11:01,706 --> 01:11:07,791

There is actually a time when the characters sort of make fun of Gaussian process

regression.

822

01:11:07,791 --> 01:11:10,994

So I was very happy that that ended up in the novel.

823

01:11:10,994 --> 01:11:15,818

And maybe I'll get angry emails from people who love Gaussian processes.

824

01:11:15,818 --> 01:11:16,799

Yeah, for me.

825

01:11:16,799 --> 01:11:18,079

Yeah, for sure.

826

01:11:18,861 --> 01:11:19,341

Yeah.

827

01:11:19,341 --> 01:11:20,362

I mean, yeah.

828

01:11:20,362 --> 01:11:25,696

I mean, I'd say if you understand Bayesian statistics, you'll understand more of what the

main character is doing.

829

01:11:26,415 --> 01:11:34,661

I don't write out models in the book, but you'll have a much better sense of the technical

side of what's happening.

830

01:11:34,661 --> 01:11:43,308

in the novel, I tried to make it more fun for people, even people who don't have a

background at all in statistics and stuff like that.

831

01:11:44,249 --> 01:11:45,870

But yeah.

832

01:11:45,911 --> 01:11:48,032

So it's still early for...

833

01:11:48,133 --> 01:11:51,045

There's a lot of people who have the book and are reading it.

834

01:11:51,045 --> 01:11:52,196

Yeah.

835

01:11:56,042 --> 01:12:02,624

If the reported distribution of feedback is the same as the true distribution, so far people

enjoyed it.

836

01:12:02,984 --> 01:12:03,684

Nice.

837

01:12:03,684 --> 01:12:05,175

Yeah, that's awesome.

838

01:12:05,175 --> 01:12:20,369

And well done again for taking the time of doing that because I know how long it takes to

write a book and how much dedication and sacrifice of free time it asks for.

839

01:12:20,369 --> 01:12:22,620

So yeah, thanks a lot for doing that.

840

01:12:22,620 --> 01:12:24,290

I think it's super...

841

01:12:26,059 --> 01:12:28,661

important for science communication.

842

01:12:28,661 --> 01:12:39,110

And I do think we should teach science from much more of a storytelling perspective

because science is done by people.

843

01:12:39,110 --> 01:12:44,735

And this is not just a bunch of dry theorems and papers.

844

01:12:44,735 --> 01:12:47,888

So I think your novel definitely contributes to that.

845

01:12:47,888 --> 01:12:52,982

So thanks a lot. And now...

846

01:12:52,982 --> 01:13:00,507

So I need to ask you the last two questions before you can get out to your next

engagement.

847

01:13:02,629 --> 01:13:08,453

First one: if you had unlimited time and resources, which problem would you try to solve?

848

01:13:08,453 --> 01:13:12,555

And the caveat is that you have to solve it with a Gaussian process.

849

01:13:15,518 --> 01:13:16,482

No, of course not.

850

01:13:16,482 --> 01:13:21,534

It's just, what problem if you had?

851

01:13:21,534 --> 01:13:23,355

unlimited time and resources?

852

01:13:23,515 --> 01:13:25,636

Yeah, unlimited time and resources.

853

01:13:26,477 --> 01:13:32,060

Well, I would buy a football team.

854

01:13:32,060 --> 01:13:33,540

Important for the world.

855

01:13:33,661 --> 01:13:38,663

I have actually thought about this, but I do a lot of work with online surveys.

856

01:13:39,624 --> 01:13:46,928

I do think that we haven't really fully exploited them.

857

01:13:47,708 --> 01:13:51,350

People maybe shouldn't be on social media as much as they are, but they're on it all the time.

858

01:13:51,526 --> 01:13:55,337

And there's incredible possibilities through that for data collection.

859

01:13:55,337 --> 01:14:08,571

And one thing that I think would be really cool would be to do real-time online surveys

across the whole world that happen almost every day.

860

01:14:08,831 --> 01:14:15,433

So this is, yeah, I guess, the social scientist's dream.

861

01:14:15,433 --> 01:14:18,954

But for some of the stuff I study, like

862

01:14:19,998 --> 01:14:29,105

corruption, like how people report issues of corruption or what's happening with their

business or things that look shady.

863

01:14:29,707 --> 01:14:35,532

Having a survey like that that would run around the world all the time every day would be

pretty awesome.

864

01:14:35,532 --> 01:14:37,753

Facebook did this during COVID.

865

01:14:37,753 --> 01:14:47,541

They had a COVID poll and it made me so jealous because they could, because they're Facebook,

they could just have this thing appear on people's feeds.

866

01:14:48,078 --> 01:14:50,749

It would just pop up and say, "Want to take a survey about COVID?"

867

01:14:50,749 --> 01:15:00,603

So there's this incredible data out there of like daily, sometimes it gets down to like

the county or state level of, you know, how many, you know, stuff like, you know, how much

868

01:15:00,603 --> 01:15:02,356

contact do people have with other people?

869

01:15:02,356 --> 01:15:06,326

And we have this information for, like, literally almost the entire world.

870

01:15:06,326 --> 01:15:07,786

It's just stunning.

871

01:15:07,786 --> 01:15:17,390

And so I would love to do, to do that kind of thing about, you know, topics I care about,

like corruption and just see, you know, because in general in academia, we do,

872

01:15:17,422 --> 01:15:20,262

The most we get away with is like a sort of point in time survey.

873

01:15:20,262 --> 01:15:23,842

Like here's what people thought about President Trump at this particular time.

874

01:15:23,842 --> 01:15:27,302

But longitudinal data is so much more interesting.

875

01:15:27,302 --> 01:15:32,442

And there's so many more important questions you can ask when you get into things like

when do people change their minds?

876

01:15:32,442 --> 01:15:33,772

How do they change their minds?

877

01:15:33,772 --> 01:15:36,542

Like even with the current election, which we haven't discussed yet.

878

01:15:36,542 --> 01:15:39,002

So we have to discuss that, right?

879

01:15:39,042 --> 01:15:46,182

But, you know, like there's all this, you know, conversation about, who supports, you

know, Kamala Harris, who supports Donald Trump.

880

01:15:46,184 --> 01:15:49,586

And the thing is we don't really have longitudinal data.

881

01:15:49,586 --> 01:15:54,518

So we really don't know who has changed their mind or not because we don't know.

882

01:15:55,099 --> 01:15:58,320

People answer a survey and they say which one they like at the moment.

883

01:15:58,341 --> 01:16:02,302

But does that mean they really changed their mind or just that's who they saw on TV last

night?

884

01:16:02,302 --> 01:16:05,244

That's the most recent candidate they've heard of.

885

01:16:05,845 --> 01:16:10,477

What we really want to know are people who used to support one candidate, now they support

another.

886

01:16:10,477 --> 01:16:11,748

Those are the really interesting ones.

887

01:16:11,748 --> 01:16:14,209

So that's sort of my dream ambition.

888

01:16:14,209 --> 01:16:15,550

Maybe that's a very

889

01:16:15,690 --> 01:16:20,632

I don't know, uninteresting dream ambition, but that would be one thing I would love to do.

890

01:16:20,632 --> 01:16:22,832

And it really would require unlimited funding.

891

01:16:22,832 --> 01:16:26,853

So if you know of a source of unlimited funding, please put me in touch.

892

01:16:27,714 --> 01:16:28,244

Yeah.

893

01:16:28,244 --> 01:16:36,656

I mean, I'll first use it, and then I'll tell you.

894

01:16:36,656 --> 01:16:37,896

Great answer.

895

01:16:38,117 --> 01:16:39,167

I'm not surprised.

896

01:16:39,167 --> 01:16:43,238

You seem like you're really passionate about what you're doing.

897

01:16:43,406 --> 01:16:50,646

I'm not surprised you came up with a very appropriate answer and of course, a very nerdy

answer.

898

01:16:50,646 --> 01:16:52,346

I was really hoping for that.

899

01:16:52,346 --> 01:16:54,466

That's a prerequisite to be on the show.

900

01:16:54,466 --> 01:16:55,846

I know.

901

01:16:56,426 --> 01:17:06,286

And so second question, if you could have dinner with any great scientific mind, dead,

alive or fictional, who would it be?

902

01:17:06,286 --> 01:17:07,026

Yeah.

903

01:17:07,026 --> 01:17:08,926

So this is a great question.

904

01:17:08,926 --> 01:17:11,466

And one I've had to think about.

905

01:17:11,466 --> 01:17:17,549

But it ended up being actually very clear for me, and that's Leonardo da Vinci.

906

01:17:18,410 --> 01:17:20,891

And I've always heard about him.

907

01:17:20,891 --> 01:17:23,752

I took a trip to Italy a few years ago.

908

01:17:24,033 --> 01:17:26,904

And there's a museum of Leonardo da Vinci.

909

01:17:26,904 --> 01:17:28,936

I believe it's in Rome, but don't quote me on that.

910

01:17:28,936 --> 01:17:32,157

If I remember correctly, it is in Rome.

911

01:17:32,478 --> 01:17:36,160

And it wasn't the biggest museum, but it was so fascinating.

912

01:17:36,160 --> 01:17:40,878

And what I loved about this guy and this kind of relates to our conversation right about

913

01:17:40,878 --> 01:17:42,379

statisticians writing novels.

914

01:17:42,379 --> 01:17:44,421

This guy had no rules, right?

915

01:17:44,421 --> 01:17:52,306

And he would, you know, he'd like wake up one day, he'd paint the Mona Lisa, he'd wake up

the next day, he'd like invent a new way to build a dam.

916

01:17:52,546 --> 01:18:02,934

And no one was there to tell him like, hey, you should, you know, no, no, no, you're an

artist, like you should just stay painting all the time, or, my gosh, you're really good

917

01:18:02,934 --> 01:18:07,647

at, you know, science, like you should, you know, just write scientific treatises.

918

01:18:07,647 --> 01:18:09,698

Like he just decided he was going to do it all.

919

01:18:09,698 --> 01:18:22,824

Now, obviously he was tremendously gifted and maybe there's not another person alive who

has ever been that multi-talented, but I just think that's so fascinating.

920

01:18:23,244 --> 01:18:35,389

His scientific discoveries probably don't measure up to let's say Newton or Bacon or

Einstein, but I think as a person he's fascinating in the way that he's doing art.

921

01:18:37,048 --> 01:18:38,969

He's doing science and the two can blend together.

922

01:18:38,969 --> 01:18:42,712

And so I think for me, hands down, I'd just love to sit down and chat with him.

923

01:18:42,712 --> 01:18:45,234

And where did you get your ideas from, right?

924

01:18:45,234 --> 01:18:54,570

Where did they come from, this steady stream of insights? And you look him up, I mean, his

contributions across fields are staggering, right?

925

01:18:54,570 --> 01:19:01,214

And, yeah, so he gets my vote.

926

01:19:01,435 --> 01:19:03,396

If you can set that up too, that'd be great.

927

01:19:03,396 --> 01:19:06,858

If you know how to bring dead people to life or whatever.

928

01:19:07,342 --> 01:19:14,584

Well, I definitely will, and I will join the dinner because honestly, yeah, I think it's a

great choice.

929

01:19:16,025 --> 01:19:19,746

Yeah, some other people also have made that choice.

930

01:19:19,746 --> 01:19:21,826

So that will be a very interesting dinner.

931

01:19:21,826 --> 01:19:25,767

I thought it was super original, but I guess not.

932

01:19:25,767 --> 01:19:28,248

I mean, it is original.

933

01:19:28,248 --> 01:19:33,249

It's not the bulk of the distribution, but you're not the first one.

934

01:19:33,429 --> 01:19:35,080

Yeah, that's fine.

935

01:19:35,080 --> 01:19:36,360

Yeah, it's a great one.

936

01:19:36,934 --> 01:19:42,336

It would have been very original if you wanted to optimize that to say myself.

937

01:19:42,336 --> 01:19:56,719

If you have the source of unlimited funding, I would definitely be saying that.

938

01:19:57,520 --> 01:20:05,122

If you like Leonardo da Vinci stuff as I do, there is in my native region in France,

939

01:20:05,646 --> 01:20:17,216

There is his last house, which was offered to him, gifted to him by the King of France,

Francis I.

940

01:20:17,978 --> 01:20:25,605

And at the time, the King was in the small city of Amboise, which is in the Loire Valley.

941

01:20:25,605 --> 01:20:29,318

And so if you go to, so I definitely recommend the region.

942

01:20:29,318 --> 01:20:31,160

This is really.

943

01:20:31,160 --> 01:20:31,960

It's really beautiful.

944

01:20:31,960 --> 01:20:37,554

It's like, it's the same vibe as Tuscany since you know Italy, but without the mountains.

945

01:20:37,554 --> 01:20:46,698

But it's great food, great wine, lots of history, lots of castles.

946

01:20:46,759 --> 01:20:57,624

Leonardo da Vinci spent his last years in Amboise and his castle, which is called the Clos

Lucé, is actually a museum now that you can visit.

947

01:20:57,624 --> 01:21:00,806

Lots of his inventions are there.

948

01:21:00,950 --> 01:21:09,555

Even handwritten notes, always very original because he was writing with the left hand

from right to left.

949

01:21:09,555 --> 01:21:12,096

So it's very hard to decipher actually.

950

01:21:12,596 --> 01:21:20,200

There's an amazing park and there is also a bit of, there are some vines because he was

making some wine.

951

01:21:20,401 --> 01:21:22,091

So yeah, definitely recommend that.

952

01:21:22,091 --> 01:21:24,482

That's a really beautiful place.

953

01:21:24,703 --> 01:21:25,973

That sounds fascinating.

954

01:21:25,973 --> 01:21:31,176

And I didn't need another reason to go visit France, but you've given me another one, so I

will.

955

01:21:31,470 --> 01:21:34,292

I'll have another excuse to visit for sure.

956

01:21:34,472 --> 01:21:35,283

Yeah.

957

01:21:35,283 --> 01:21:36,464

Thank you.

958

01:21:36,464 --> 01:21:38,846

Also, it's not a very touristic one.

959

01:21:38,846 --> 01:21:43,939

I mean, it is touristic, but it's mainly European tourism and some Americans.

960

01:21:45,021 --> 01:21:53,477

I mean, now that I've talked on the podcast, of course it's going to become much more

touristic because I'm kind of an influencer, but it's just like...

961

01:21:53,477 --> 01:21:53,938

Yeah.

962

01:21:53,938 --> 01:21:57,345

All these really nerdy people will show up and...

963

01:21:57,421 --> 01:21:58,842

Yeah, you know, trying to.

964

01:21:58,842 --> 01:22:00,222

T-shirts and stuff.

965

01:22:00,222 --> 01:22:00,482

Yeah.

966

01:22:00,482 --> 01:22:03,282

Did you get a Stan T-shirt at the conference or?

967

01:22:03,842 --> 01:22:07,942

No, I don't think there were T-shirts this year but I have some cool stickers.

968

01:22:07,942 --> 01:22:08,582

have some cool stickers.

969

01:22:08,582 --> 01:22:09,182

Okay, okay.

970

01:22:09,182 --> 01:22:10,012

Yeah, they have stickers.

971

01:22:10,012 --> 01:22:10,342

Yeah.

972

01:22:10,342 --> 01:22:19,442

I got a T-shirt back when but I will say the first Stan conference, I wouldn't say it was

the best because I think probably the qualities, you know, like there's so many more people

973

01:22:19,442 --> 01:22:22,382

using Stan, but it was in California on the coast.

974

01:22:22,382 --> 01:22:24,930

Like they had a resort like that was on the Pacific.

975

01:22:24,930 --> 01:22:26,621

It was pretty, it was very pretty.

976

01:22:26,621 --> 01:22:27,492

yeah.

977

01:22:27,492 --> 01:22:27,742

Yeah.

978

01:22:27,742 --> 01:22:29,254

It was really fun.

979

01:22:29,254 --> 01:22:31,035

But yeah.

980

01:22:31,035 --> 01:22:32,966

No, this year it was at Oxford University.

981

01:22:32,966 --> 01:22:35,719

So as expected, it was raining.

982

01:22:37,681 --> 01:22:40,673

But the, but the university is pretty cool to look at.

983

01:22:40,673 --> 01:22:40,923

Yeah.

984

01:22:40,923 --> 01:22:41,184

Yeah.

985

01:22:41,184 --> 01:22:42,755

No, that's, that's absolutely beautiful.

986

01:22:42,755 --> 01:22:52,013

And, and, again, like, you know, you, you go to, to the UK, you expect it to be raining,

you know, so that's why it's like going to France and not expecting some strikes.

987

01:22:52,013 --> 01:22:53,414

It's like,

988

01:22:53,806 --> 01:22:56,306

You're missing part of the experience, you know.

989

01:22:59,006 --> 01:22:59,686

Awesome.

990

01:22:59,686 --> 01:23:01,416

Well, Bob, I need to let you go.

991

01:23:01,416 --> 01:23:06,026

I know you have another engagement, but thank you so much for taking the time.

992

01:23:06,146 --> 01:23:07,626

That was absolutely great.

993

01:23:07,626 --> 01:23:08,666

Awesome conversation.

994

01:23:08,666 --> 01:23:15,886

I have still a gazillion questions for you, but let's do that when your next novel comes

around.

995

01:23:15,886 --> 01:23:18,076

So you told me in about three months, right?

996

01:23:18,076 --> 01:23:19,666

So that's awesome.

997

01:23:19,946 --> 01:23:21,106

Oh my gosh.

998

01:23:22,186 --> 01:23:29,429

And yeah, as usual, I put resources and a link to your website in the show notes for those

who want to dig deeper.

999

01:23:29,429 --> 01:23:32,830

Thank you again, Bob, for taking the time and being on the show.

Speaker:

01:23:32,890 --> 01:23:33,520

Thank you, man.

Speaker:

01:23:33,520 --> 01:23:36,921

I had a great time and I'll definitely get everything else to you.

Speaker:

01:23:36,921 --> 01:23:37,629

But thanks so much.

Speaker:

01:23:37,629 --> 01:23:39,012

It was so fun to have this conversation.

Speaker:

01:23:39,012 --> 01:23:40,039

You asked great questions.

Speaker:

01:23:40,039 --> 01:23:41,533

It made me think a lot.

Speaker:

01:23:41,933 --> 01:23:43,174

So I really appreciate it.

Speaker:

01:23:43,174 --> 01:23:44,834

I hope you have a good week.

Speaker:

01:23:48,686 --> 01:23:52,369

This has been another episode of Learning Bayesian Statistics.

Speaker:

01:23:52,369 --> 01:24:02,868

Be sure to rate, review, and follow the show on your favorite podcatcher, and visit

learnbayesstats.com for more resources about today's topics, as well as access to more

Speaker:

01:24:02,868 --> 01:24:06,961

episodes to help you reach true Bayesian state of mind.

Speaker:

01:24:06,961 --> 01:24:08,903

That's learnbayesstats.com.

Speaker:

01:24:08,903 --> 01:24:13,767

Our theme music is Good Bayesian by Baba Brinkman, feat. MC Lars and Mega Ran.

Speaker:

01:24:13,767 --> 01:24:16,929

Check out his awesome work at bababrinkman.com.

Speaker:

01:24:16,929 --> 01:24:18,102

I'm your host.

Speaker:

01:24:18,102 --> 01:24:18,943

Alex Andorra.

Speaker:

01:24:18,943 --> 01:24:23,202

You can follow me on Twitter at Alex underscore Andorra like the country.

Speaker:

01:24:23,202 --> 01:24:30,573

You can support the show and unlock exclusive benefits by visiting Patreon.com slash

LearnBayesStats.

Speaker:

01:24:30,573 --> 01:24:32,955

Thank you so much for listening and for your support.

Speaker:

01:24:32,955 --> 01:24:35,267

You're truly a good Bayesian.

Speaker:

01:24:35,267 --> 01:24:42,083

Change your predictions after taking information in and if you're thinking I'll be less

than amazing.

Speaker:

01:24:42,083 --> 01:24:45,624

Let's adjust those expectations.

Speaker:

01:24:45,624 --> 01:24:58,593

Let me show you how to be a good Bayesian. Change calculations after taking fresh data in. Those

predictions that your brain is making, let's get them on a solid foundation.
