*Proudly sponsored by **PyMC Labs**, the Bayesian Consultancy. **Book a call**, or **get in touch**!*

In this episode, Jonathan Templin, Professor of Psychological and Quantitative Foundations at the University of Iowa, shares insights into his journey in the world of psychometrics.

Jonathan’s research focuses on diagnostic classification models — psychometric models that seek to provide multiple reliable scores from educational and psychological assessments. He also studies Bayesian statistics, as applied in psychometrics, broadly. So, naturally, we discuss the significance of psychometrics in psychological sciences, and how Bayesian methods are helpful in this field.

We also talk about challenges in choosing appropriate prior distributions, best practices for model comparison, and how you can use the Multivariate Normal distribution to infer the correlations between the predictors of your linear regressions.

This is a deep-reaching conversation that concludes with the future of Bayesian statistics in psychological, educational, and social sciences — hope you’ll enjoy it!

*Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at **https://bababrinkman.com/** !*

**Thank you to my Patrons for making this episode possible!**

*Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca and Dante Gates*.

Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag 😉

**Links from the show:**

- Jonathan’s website: https://jonathantemplin.com/
- Jonathan on Twitter: https://twitter.com/DrJTemplin
- Jonathan on Linkedin: https://www.linkedin.com/in/jonathan-templin-0239b07/
- Jonathan on GitHub: https://github.com/jonathantemplin
- Jonathan on Google Scholar: https://scholar.google.com/citations?user=veeVxxMAAAAJ&hl=en&authuser=1
- Jonathan on Youtube: https://www.youtube.com/channel/UC6WctsOhVfGW1D9NZUH1xFg
- Jonathan’s book: https://jonathantemplin.com/diagnostic-measurement-theory-methods-applications/
- Jonathan’s teaching: https://jonathantemplin.com/teaching/
- Vehtari et al. (2016), Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC: https://arxiv.org/abs/1507.04544
- arviz.plot_compare: https://python.arviz.org/en/stable/api/generated/arviz.plot_compare.html
- LBS #35, The Past, Present & Future of BRMS, with Paul Bürkner: https://learnbayesstats.com/episode/35-past-present-future-brms-paul-burkner/
- LBS #40, Bayesian Stats for the Speech & Language Sciences, with Allison Hilger and Timo Roettger: https://learnbayesstats.com/episode/40-bayesian-stats-speech-language-sciences-allison-hilger-timo-roettger/
- Bayesian Model-Building Interface in Python: https://bambinos.github.io/bambi/

**Abstract**

You have probably unknowingly already been exposed to this episode’s topic – psychometric testing – when taking a test at school or university. Our guest, Professor Jonathan Templin, tries to increase the meaningfulness of these tests by improving the underlying psychometric models, the Bayesian way of course!

Jonathan explains that it is not easy to judge the ability of a student based on exams, since exams contain errors and are only a snapshot. Bayesian statistics helps by naturally propagating this uncertainty to the results.

In the field of psychometric testing, marginal maximum likelihood is commonly used. This approach quickly becomes infeasible, though, when trying to marginalise over multidimensional test scores. Luckily, Bayesian probabilistic sampling does not suffer from this.
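The scaling problem can be sketched with a bit of arithmetic. This is a minimal illustration; the 21-point grid and the dimension counts are assumptions chosen for the example, not numbers from the episode:

```python
# Number of quadrature nodes needed to numerically marginalize a
# likelihood over latent traits: a fixed grid per dimension means the
# total node count grows exponentially with the number of dimensions.
points_per_dim = 21  # illustrative grid size for one latent trait

for dims in (1, 2, 5):
    nodes = points_per_dim ** dims  # full tensor-product grid
    print(f"{dims} dimension(s): {nodes:,} quadrature nodes")
```

With five latent traits, the grid already exceeds four million nodes per likelihood evaluation, while an MCMC sampler only adds parameters linearly with each extra dimension.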

A further reason to prefer Bayesian statistics is that it provides a lot of information in the posterior. Imagine taking a test that tells you what profession you should pursue at the end of high school. The field with the best fit is of course interesting, but the second best fit may be as well. The posterior distribution can provide this kind of information.
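With posterior draws in hand, ranking the fields and seeing how close the runner-up is takes only a few lines. This is a hypothetical sketch: the interest dimensions and the simulated draws are invented for illustration, not real assessment output:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated posterior draws of one student's scores on three interest
# dimensions (labels and numbers are made up for this example).
labels = ["artistic", "numeric", "verbal"]
draws = rng.normal(loc=[1.0, 0.9, -0.5], scale=0.5, size=(4000, 3))

# Posterior probability that each dimension is the student's best fit.
best = draws.argmax(axis=1)
for i, name in enumerate(labels):
    print(f"P({name} is the best fit) = {(best == i).mean():.2f}")
```

Here "artistic" and "numeric" end up close together, which is exactly the kind of information a single point estimate would hide.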

After becoming convinced that Bayes is the right choice for psychometrics, we also talk about practical challenges like choosing a prior for the covariance in a multivariate normal distribution, model selection procedures and more.

In the end we learn about a great Bayesian holiday destination, so make sure to listen till the end!

**Transcript**

*This is an automatic transcript and may therefore contain errors. Please **get in touch** if you’re willing to correct them.*


In this episode, Jonathan Templin, Professor of Psychological and Quantitative Foundations at the University of Iowa, shares insights into his journey in the world of psychometrics.

Jonathan's research focuses on diagnostic classification models, psychometric models that seek to provide multiple reliable scores from educational and psychological assessments.

He also studies Bayesian statistics as applied in psychometrics, broadly. So naturally, we discussed the significance of psychometrics in psychological sciences and how Bayesian methods are helpful in this field.

We also talked about challenges in choosing appropriate prior distributions, best practices for model comparison, and how you can use the multivariate normal distribution to infer the correlations between the predictors of your linear regressions.

This is a deep-reaching conversation that concludes with the future of Bayesian statistics in psychological, educational, and social sciences. Hope you'll enjoy it.

This is Learning Bayesian Statistics, episode 94, recorded September 11, 2023.

Hello, my dear Bayesians! This time, I have the pleasure to welcome three new members to our Bayesian crew: Bart Trudeau, Luis Fonseca, and Dante Gates. Thank you so much for your support, folks. It's the main way this podcast gets funded. And Bart and Dante, get ready to receive your exclusive merch in the coming month. Send me a picture, of course. Now let's talk psychometrics and modeling with Jonathan Templin.

Jonathan Templin, welcome to Learning Bayesian Statistics.

Thank you for having me. It's a pleasure to be here.

Yeah, thanks a lot. Quite a few patrons have mentioned you in the Slack of the show, so I'm very honored to honor their request and have you on the show. And actually, thank you folks for bringing me all of those suggestions and allowing me to discover so many good Bayesians out there in the world, doing awesome things in a lot of different fields using our favorite tool, Bayesian statistics. So Jonathan, before talking about all of those good things, let's dive into your origin story. How did you come to the world of psychometrics and psychological sciences, and how sinuous of a path was it?

That's a good question. So I was an odd student; I dropped out of high school. So I started my college degree in community college; that would be the only place that would take me. I happened to be really lucky to do that, though, because I had some really great professors. And once I discovered that I probably could do school, I took a statistics course, you know, typical undergraduate basic statistics. I found that I loved it. I decided that I wanted to do something with statistics, and then in the process, I took a research methods class in psychology and decided somehow I wanted to do statistics in psychology. So I moved on from community college and went to my undergraduate for two years at Sacramento State in Sacramento, California. I also was really lucky because I had a professor there who said, hey, there's this field called quantitative psychology; you should look into it if you're interested in statistics and psychology. Around the same time, he was teaching me something called factor analysis. I now look at it as more principal components analysis, but I wanted to know what was happening underneath the hood of factor analysis. And so that's where he said, no, really, you should go to graduate school for that. And so that's what started me. I was fortunate enough to be able to go to the University of Illinois for graduate studies. I did a master's and a PhD there, and in the process, that's where I learned all about Bayes. So it was a really lucky route, but it all wouldn't have happened if I didn't go to community college, so I'm really proud to say I'm a community college graduate, if you will.

Nice. So it kind of happened somewhat easily in a way, right? A good meeting at the right time, and boom.

That's right. And the call of the eigenvalue is what really sent me to graduate school. I wanted to figure out what that was about.

Yes, that is a good point.

And so nowadays, what are you doing? How would you define the work you're doing, and what are the topics that you are particularly interested in?

I would put my work into the field of item response theory, largely. I do a lot of multidimensional item response theory. There are derivative fields I think I'm probably most known for, one of which is something called cognitive diagnosis, or diagnostic classification modeling. Basically, it's a classification-based method to try to classify students. I work in the College of Education, so most of this is applied to educational data from assessments, and our goal is, whenever you take a test, to not just give you one score but give you multiple valid scores, to try to maximize the information we can give you. My particular focus these days is in doing so in classroom-based assessments: how do we understand what a student knows at a given point in the academic year and try to help make sure that they make the most progress they can? Not to remove the impact of the teacher, but actually to provide the teacher with the best data to work with the child, to work with the parents, to try to move forward. But all that boils down to interesting measurement and psychometric issues, and interesting ways that we look at test data that come out of classrooms.

Okay. Yeah, that sounds fascinating. Basically trying to give a distribution of results instead of just one point estimate.

That's it. Also, tests have a lot of error. So making sure that we don't overdeliver when we have a test score. Basically understanding what that is and accurately quantifying how much measurement error, or lack of reliability, there is in the score itself.

Yeah, that's fascinating. I mean, we can already dive into that. I have a lot of questions for you, but it sounds very interesting. So what does it look like concretely, these measurement errors and the test scores attached to them, and basically how do you try to solve that? Maybe you can take an example from your work where you are trying to do that.

Absolutely. Let me start with the classical example. If this is too much information, I apologize. But to set the stage: for a long time in item response theory, we understand that a person's latent ability estimate, if you want to call it that, as applied in education, so this latent variable that represents what a person knows, is put onto the continuum where items are. So basically, items and people are sort of ordered. However, the properties of the model are such that how much error there might be in a person's point estimate of their score depends on where the score is located on the continuum. So this is what, you know, theory in the 1970s gave rise to: our modern computerized adaptive assessments and so forth, that sort of pick an item that would minimize the error, if you will; there are different ways of describing what we pick an item for. But that's basically the idea.

And so from the perspective of where I'm at with what I do, a complicating factor in this is that the architecture I just mentioned, that historic version of adaptive assessments, has really been built on large-scale measures. So thousands of students. And really, what happens in the classical sense is you would take a marginal maximum likelihood estimate of certain parameter values from the model. You'd fix those values as if you knew them with certainty, and then you would go and estimate a person's parameter value along with their conditional standard error of measurement. The situations I work in don't have large sample sizes, but in addition to a problem with sort of the asymptotic convergence, if you will, of those models, not only do we not have large sample sizes, we also have multiple scores effectively, multiple latent traits, that we can't possibly handle that way.

So when you look at the same problem from a Bayesian lens, sort of an interesting feature happens that we don't often see. In a frequentist or classical framework, that process of fixing the parameters of the model, the item parameters, to a value, you know, disregards any error in the estimate as well. Whereas if you're in a simultaneous estimate, for instance in a Markov chain where you're sampling these values from a posterior in addition to sampling students, it turns out that error around those parameters can propagate to the students and provide a wider interval around them, which I think is a bit more accurate, particularly in smaller sample size situations. So I hope that's the answer to your question. I may have taken a path that might have been a little different there, but that's where I see the value, at least, in using Bayesian statistics in what I do.

Yeah, no, I love it. Don't shy away from technical explanations on this podcast. That's the good thing of the podcast: you don't have to shy away from it.

It came at a good time. I've been working on some problems like this all day, so I'm probably in the weeds a little bit. Forgive me if I go off the deep end of it.

No, that's great. And we already mentioned item response theory on the show, so hopefully people will refer back to those episodes and that will give them a heads-up. Well, actually you mentioned it, but do you remember how you first got introduced to Bayesian methods and why they stuck with you?

Very, very much. I was introduced because in graduate school, I had the opportunity to work for a lab run by Bill Stout at the University of Illinois, with other very notable people in my career, at least Jeff Douglas and Louis Roussos, among others. And I was hired as a graduate research assistant. And my job was to take a program that was a Metropolis-Hastings algorithm and to make it run. And it was written in Fortran. So basically, it was Metropolis-Hastings, Bayesian, and it was written in a language that I didn't know, with methods I didn't know. And so I was hired and told, yeah, figure it out, good luck. Thankfully, I had colleagues that could help, who actually probably figured it out more than I did. But I was very fortunate to be there, because it was like a trial by fire. I was basically going line by line through that. This was a little bit in the later part of, I think it was the year 2001, maybe a little early 2002. But something instrumental to me at the time were a couple of papers by a couple of scholars in education, at least: Rich Patz and Brian Junker had a paper in 1999, actually two papers in 1999, in the Journal of Educational and Behavioral Statistics; it's like I have that memorized. But in those papers, they had written down the algorithm itself, and it was a matter of translating that to the diagnostic models that we were working on. But that is why it stuck with me: because it was my job, but then it was also incredibly interesting. It was not like a lot of the research that I was reading, and not like a lot of the work I was doing in a lot of the classes I was in. So I found it really mentally stimulating, entirely challenging. It took the whole of my brain to figure out. And even then, I don't know that I figured it out. So that helps answer that question.

Yeah. So basically, it sounds like you were thrown into the Bayesian pool. Like, you didn't have any choice.

I was. And when I became Bayesian, it was nice because at the time, you know, this is 2001, 2002, in education and measurement in psychology, we knew of Bayes certainly; there are some great papers from the nineties that were around. But it wasn't prominent. I was in graduate school, but at the same time I wasn't learning it. I mean, I knew the textbook Bayes, like the introductory Bayes, but definitely not the estimation side. And so, timing-wise, you know, people would look back now and say, okay, why didn't I go grab Stan, or grab... at the time, I think JAGS didn't exist; there was BUGS. And it was basically, you have to, you know, like, roll your own to do anything. So it was good.

No, for sure. Like, yeah, it's like asking Christopher Columbus or...

That's right.

It's a lot more direct. Just hop on the plane and...

Wasn't an option.

Exactly. Good point.

But actually, nowadays, what are you using? Are you still rolling your own sampler like that in Fortran, or are you using some open source software?

I can hopefully say I retired from Fortran as much as possible. Most of what I do is in Stan these days, a little bit of JAGS, but then occasionally I will try to write my own here or there. The latter part I'd love to do more of, because you can get a little highly specialized. I just feel like I lack the time to really deeply do the development work in a way that wouldn't just produce an R package, or some package in Python, that would break all the time. So I'm sort of stuck right now with that, but it is something that I'm grateful for: having the contributions of others to be able to rely upon to do estimation.

Yeah, no, exactly. I mean, so first, Stan, I've heard it's quite good. Of course, it's amazing. A lot of Stan developers have been on this show, and they do absolutely tremendous work. And yeah, as you were saying, why code your own sampler when you can rely on samplers that are actually waterproof, that are developed by a bunch of very smart people who do a lot of math and who do all the heavy lifting for you? Well, just do that. And thanks to that, Bayesian computing and statistics are much more accessible, because you don't have to actually know how to code your own MCMC sampler to do it. You can stand on the shoulders of giants and just use that and superpower your own analysis. So it's definitely something we tell people: don't code your own samplers now. You don't need to do that unless you really, really have to do it. But usually, when you have to do that, you know what you're doing. Otherwise, people have figured that out for you. Just use the automatic samplers from Stan or PyMC or NumPyro or whatever you're using. It's usually extremely robust and checked by a lot of different pairs of eyes and keyboards.

Having that team and, like you said, full of people who are experts in not only mathematics but also computer science makes a big difference.

Yeah.

I mean, I would not be able to use Bayesian statistics nowadays if these samplers didn't exist, right? Because I'm not a mathematician. So if I had to write my own sampler each time, I would just be discouraged even before starting.

Yeah, it's just a challenge in and of itself. I remember the old days where that would be it. That was my dissertation; that was what I had to do. So it was like six months of work on just the sampler. And even then, it wasn't very good. And only then might you actually do the study.

Yeah, exactly. I mean, to me, really, probabilistic programming is one of the superpowers of the Bayesian community, because it really allows almost anybody who can code in R or Python or Julia to just use what's being done by very competent and smart people, and for free.

Right. Yeah, also true. What a great community. I'm really, really impressed with the size and the scope and how things have progressed in just 20 years. It's really something.

Yeah, exactly. And so, actually, do you have an idea why Bayesian statistics are useful in your field? What do they bring that you don't get with the classical framework?

Yeah, in particular, we have a really nasty... If we were to do a classical framework, typically the gold standard in the field I work in is sort of a marginal maximum likelihood, marginal meaning we get rid of the latent variable to estimate models. That process of marginalization is done numerically: we numerically integrate across the likelihood function in most cases; there are some special-case models where we don't have to, but those really are too simplistic to use for what we do. So if we want to do multidimensional versions: if you think about numeric integration, for one dimension you have this sort of discretized set of a likelihood, taking sums across different, what we call quadrature points, of some type of curve. In the multidimensional sense now, going from one to two, you've effectively squared the number of points you have. And that's just two latent variables. So if you want two bits of information from an assessment from somebody, you've just made your marginalization process exponentially more difficult, more time-consuming. But really, the benefit of having two scores is very little compared to having one. So if we wanted to do five or six or 300 scores, that marginalization process becomes really difficult. So from a brute force perspective, if we take a Bayesian sampler perspective, there is not that exponential increase of computation, just a linear increase with the latent variables. And so the number of steps the process has to take for calculation is much smaller. Now, of course, Markov chains have a lot of calculations, so, you know, maybe overall the process is longer, but I found it to be a necessity: Bayesian statistics, to estimate in some form, shows up in this multidimensional likelihood evaluation. People have created sort of hybrid versions of EM algorithms where the E-step is replaced with a Bayesian-type method. But for me, I like the full Bayesian approach to everything. So, just in summary, what Bayes brings from a brute force perspective is the ability to estimate our models in a reasonable amount of time with a reasonable amount of computations. There's the added benefit of what I mentioned previously, which is the small sample size: sort of, I think, a proper accounting, or allowing, of error to propagate in the right way if you're going to report scores and so forth; I think that's an added benefit. But from a primary perspective, I'm here because I have a really tough integral to solve, and Bayes helps me get around it.

Yeah, that's a good point. And yeah, as you were saying, I'm guessing that having priors and generative modeling helps for low sample sizes, which tends to be the case a lot in your field.

Also true. Yeah, the prior distributions can help. A lot of the frustration with multidimensional models in psychometrics, at least in a practical sense: you get a set of data, you think it's multidimensional, and the next step is to estimate a model. In the classic sense, those models sometimes would fail to converge, with very little reason why; oftentimes they just failed to converge. I had a class I taught four or five years ago where I just asked people to estimate five dimensions (I had a set of data for each person), and not a single person could get it to converge with the default options that you'd see in, like, an IRT package. So having the ability to sort of understand where non-convergence is happening, or why that's happening, and which parameters are finding a difficult spot, and then using priors to sort of aid estimation, is one part. But then there's also sort of the idea of Bayesian updating. If you're trying to understand what a student knows throughout the year, Bayesian updating is perfect for such things. You know, you can assess a student in November and update their results with what you have, potentially, from previous parts of the year as well, too. So there's a lot of benefits.
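The Bayesian updating described here has a classic conjugate form. Below is a minimal, hypothetical sketch in which a student's probability of answering items correctly gets a Beta prior that is updated after each classroom assessment; the prior and the assessment results are invented for illustration:

```python
# Beta-Binomial updating of a student's mastery probability across the
# school year. Each assessment's posterior becomes the next prior.
alpha, beta = 1.0, 1.0  # flat Beta(1, 1) prior in September

assessments = [(6, 10), (8, 10)]  # (correct, total): November, then March
for correct, total in assessments:
    alpha += correct             # correct answers update alpha
    beta += total - correct      # incorrect answers update beta
    mean = alpha / (alpha + beta)
    print(f"posterior mean mastery: {mean:.3f}")
```

Because the November posterior is reused as the spring prior, earlier results are never thrown away, which is exactly what makes within-year updating natural in a Bayesian framework.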

I guess I could keep going. I'm talking to a Bayes podcast, so you probably already know most of it.

Yeah. I mean, a lot of people are also listening to understand what Bayes is all about and how it could help them in their own field. So that's definitely useful: if we have some psychometricians in the audience who haven't tried some Bayes yet, well, I'm guessing that would be useful for them. And actually, could you share an example, if you have one, of a research project where Bayesian stats played a crucial role, ideally in uncovering insights that might have been missed otherwise, especially using traditional stats approaches?

Yeah, I mean, honestly, a lot of what we do is just estimating the model itself. It sounds like it should be trivial, but to do so with a full-information likelihood function is so difficult. I would say almost every single analysis I've done using a multidimensional model has been made possible because of the Bayesian analyses themselves. Again, there are shortcut methods, you would call them. I think they are good methods; again, there are people, like I mentioned, doing sort of a hybrid marginal maximum likelihood. There are what we would call limited-information approaches that you might see in programs like Mplus, or there's an R package named lavaan that does such things. But those only use functions of the data, not the full data themselves. I mean, it's still good, but I have this sense that the full likelihood is what we should be using. So to me, just a simple example: I was working this morning with a four-dimensional assessment, you know, a 20-item test, kids in schools. And I would have a difficult time trying to estimate that with a full maximum likelihood method. And so Bayes made that possible.

But beyond that, if we ever want to do something with the test scores afterwards, right? So now we have a bunch of Markov chains of people's scores themselves. This makes it easy to not forget that these scores are not measured perfectly, and to take a posterior distribution and use that in a secondary analysis as well, too. So I was doing some work with one of the Persian Gulf states, where they were trying to do, like, a vocational interest survey. And some of the classical methods for this sort of disregarded any error whatsoever. And they basically said, oh, you're interested in, I don't know, artistic work, or, you know, numeric work of some sort. And they would just tell you, oh, that's it; that's your story. Like, I don't know if you've ever taken one of those. What are you going to do in a career? You're a high school student and you're trying to figure this out. But if you allow that error to propagate, the way Bayesian methods make it very easy to do, you'll see that while that may be the most likely choice of what you're interested in, or the dimensions that may be most salient to you in your interests, there are many other choices that may be close to that as well. And that would be informative as well, too. So we sort of forget, we sort of overstate, how certain we are in results. And I think a lot of the Bayesian methods are built around that. That was actually one project where I did write my own algorithm to try to estimate these things, because it was just a little more streamlined. But it seemed that rather than telling a high school student, hey, you're best at artistic things, what we could say is: hey, yeah, you may be best at artistic, but really close to that is something that's numeric, you know, something along those lines. So while you're strong at art, you're really strong at math too; maybe you should consider one of these two, rather than just go down a path that may or may not really reflect your interests. Hope that's a good example.

Yeah.

450

Yeah, definitely.

451

Yeah, thanks.

452

And I understand how that would be useful

for sure.

453

And how does, I'm curious about the role

of priors in all that, because that's

454

often something that puzzles beginners.

455

And so you obviously have a lot of

experience in the Bayesian way of life in

456

your field.

457

So I'm curious, I'm guessing that you kind

of teach the way to do psychometric

458

analysis in the Bayesian framework to a

lot of people.

459

And I'm curious, especially on the prior

side, and if there are other interesting

460

things that you would like to share on

that, feel free.

461

My question is on the priors.

462

How do you approach the challenge of

choosing appropriate prior distributions,

463

especially when you're dealing with

complex models?

464

Great question. And I'm sure each field does it a little bit differently, as it probably should, because each field has its own data and models and already established scientific knowledge. So that's my way of saying: this is my approach; I'm not 100% confident that it's the approach that everybody should take.

But let me back up a little bit. Generally speaking, many of the students I teach end up in the industry for educational measurement here in the United States. We usually denote our score parameters with theta, so I like to go around saying that, yeah, I'm teaching you how to sell thetas. That's sort of what they do in a lot of these industry settings: they're selling test scores. So if you think that's what you're trying to do, that guides, for me, a set of prior choices that try to do the least amount of speculation.

So what do I mean by that? If you look at a measurement model, like an item response model, there's a set of parameters to it. One parameter in particular, in item response theory, we call the discrimination parameter; in factor analysis, we call it a factor loading; and in linear regression, it would be a slope. This parameter tends to govern the extent to which an item relates to the latent variable. The higher that parameter is, the more that item relates.

Then when we apply Bayes' theorem to get a point estimate of a person's score, or a posterior distribution of that person's score, the contribution of that item is largely reflected by the magnitude of that parameter. The higher that parameter is, the more weight that item has on the distribution, and the more we think we know about a person. So when I set prior choices, what I try to do for that parameter is to set a prior centered at zero, mostly, so that the data does more of the job than the prior. That matters particularly if the score has big meaning to somebody. In the United States, the assessment culture is a little bit out of control, but we have to take tests to go to college, we have to take tests to go to graduate school, and so forth. And of course, if you go and work in certain industries, there are assessments for licensure, right? For instance, I come from a family of nurses, a very noble profession, and to be licensed as a nurse in California, you have to pass an exam. We want the score we provide for that exam to reflect as much of the data as possible, and not the prior choice.

And there are ways people can use priors that are not necessarily to the benefit of empirical science: you can put too much subjective weight onto them. So when I talk about priors, I try to talk about the ramifications of the choice of prior on certain parameters. For that discrimination parameter or slope, I tend to want the data to force it further away from zero, because then I feel I'm being more conservative. On the rest of the parameters, I tend not to use heavy priors; I use very uninformative priors unless I have to.

And then the most complicated prior for what we do, and the one that has historically caused the biggest challenge, although I think it's in a relatively good place these days thanks to research, is the prior that goes on a covariance or correlation matrix. That had been incredibly difficult to estimate back in the day. But now things are much, much easier with modern computing and, actually, modern priors.
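That idea of a zero-centered prior that the data can overwhelm can be sketched with a tiny conjugate example (plain NumPy; the slope, noise level, and prior scale below are all hypothetical, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: true slope 0.8, known noise sd, no intercept.
true_slope, sigma = 0.8, 1.0
x = rng.normal(size=200)
y = true_slope * x + rng.normal(scale=sigma, size=200)

# Conservative prior centered at zero: slope ~ Normal(0, tau^2).
tau = 0.5

# Conjugate normal-normal update for the slope:
post_prec = 1.0 / tau**2 + (x @ x) / sigma**2  # posterior precision
post_mean = (x @ y / sigma**2) / post_prec     # shrunk toward zero
post_sd = post_prec ** -0.5

# With 200 observations, the posterior mean sits near the data's estimate,
# only slightly pulled toward the prior's zero.
print(round(post_mean, 3), round(post_sd, 3))
```

This is the sense in which a tight, zero-centered prior on a discrimination parameter is "conservative": the data have to pull the estimate away from zero.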

Yeah, interesting. Would you like to walk us through that a bit? What are you using these days for priors on correlation or covariance matrices? I do teach those as well, because I love them. Basically, if you're using, for instance, a linear regression and want to estimate not only the relation of the predictors to the outcome, but also the correlation between the predictors themselves, and then use that additional information to make even better predictions of the outcome, you would, for instance, use a multivariate normal prior on the slopes of your linear regression. And a multivariate normal needs a covariance matrix. So what priors do you use on that covariance matrix? That's basically the context for people. Now, John, take it from there: what are you using in your field these days?

Yeah, so going with your example: I have no idea. If you have a set of regression coefficients that you say are multivariate normal, yes, there is a place for a covariance in the prior. I never try to speculate what that is. I don't think I have the human judgment it takes to figure out what your prior belief for that should be. I think you're talking about what would be analogous to the asymptotic covariance matrix: the posterior distribution of these parameters, where you look at the covariance between them, is like the asymptotic covariance matrix in maximum likelihood, and it seems we just rarely ever speculate about the off-diagonal there. There are certainly uses for linear combinations and whatnot, but that's tough. I'm thinking more about when I have a handful of latent variables to estimate; now the problem is that I need a covariance matrix between them, and they're likely to be highly correlated, right? So...

In our field, we tend to see correlations of psychological variables that are 0.7, 0.8, 0.9. These are all academic skills in my field that are coming from the same brain; a child has a lot of reasons why those are going to be highly correlated. And so these days, I love the LKJ prior for this. It makes it easy to put a prior on a correlation matrix and then rescale it if you want.

That's one of the other weird features of the psychometric world: because these latent variables aren't directly observed, to estimate their covariance matrix we have to put certain constraints on some of the item parameters of the measurement model. If we want to estimate the variance of a factor, we have to fix one of the discrimination parameters to a value; otherwise, the model isn't identified. In the work we call calibration, when we're trying to build scores or assessments from data, we instead fix the variance of the factor to one. We standardize the factor, mean zero, variance one, a very simple idea. The models are equivalent in the classical sense, in that the likelihoods are equivalent either way. Once we put priors on, the posteriors aren't entirely equivalent, but that's the typical Bayesian issue with transformations.

But in the case where we want a correlation matrix, prior to the LKJ prior there were all these, as one of my mentors, Rod McDonald, called them, devices: little hacks or tricks we would use to keep the covariance matrix valid so we could sample it. Think about it statistically: to sample it, you might like rejection sampling methods, but if you propose a covariance or correlation matrix, it has to be positive semi-definite (that's a hard constraint), you have to make sure the correlations are bounded, and so forth. The LKJ prior takes care of almost all of that for me, in a way that lets me model the correlation matrix directly, which has really made life a lot easier when it comes to estimation.

Yeah, I'm not surprised it does. That's also the kind of prior I tend to use personally, and that I teach. In this linear regression example, for instance, I'd probably end up using an LKJ prior on the slopes of the linear regression. And for people who have never used an LKJ prior: it works on a decomposition of the covariance matrix, so that we can actually sample it; otherwise, it's extremely hard to sample a covariance matrix directly. The trick makes use of the Cholesky decomposition of the covariance matrix, which allows us to sample the Cholesky factor instead of the full covariance matrix, and that helps the sampling.
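As a small NumPy illustration of that Cholesky trick (the standard deviations and correlation values are made up for the example): if z is standard normal, then L z has covariance L Lᵀ, so sampling only ever touches the Cholesky factor.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical prior scales for three slopes, plus a correlation matrix.
sds = np.array([0.5, 1.0, 2.0])
R = np.array([[1.0, 0.7, 0.3],
              [0.7, 1.0, 0.5],
              [0.3, 0.5, 1.0]])

# Rescale the correlation matrix into a covariance matrix: Sigma = D R D.
Sigma = np.diag(sds) @ R @ np.diag(sds)

# Sample through the Cholesky factor: if z ~ N(0, I), then L @ z ~ N(0, Sigma).
L = np.linalg.cholesky(Sigma)
z = rng.standard_normal((3, 100_000))
draws = (L @ z).T

# The empirical covariance of the draws recovers Sigma.
print(np.cov(draws, rowvar=False).round(2))
```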

Thank you for putting that out there; I'm glad you mentioned it. And basically, the way you would parameterize that, for instance in PyMC, is with pm.LKJCholeskyCov, which you have to parameterize with at least three ingredients. First, the number of dimensions: for instance, if you have three predictors, that would be n=3. Second, the standard deviation you're expecting on the slopes of the linear regression; that's something you're used to, right? If you're using a normal prior on a slope, the sigma of that slope is just the standard deviation you're expecting on that effect for your data and model. And third, you have to specify a prior on the correlation of these slopes, and that's where you get into the covariance part. That parameter is called eta in PyMC's LKJ prior. The bigger eta is, the more suspicious of high correlations your prior will be. If eta equals 1, you're basically expecting a uniform distribution over correlations: minus 1, plus 1, or 0 all have the same weight. If you go to eta equals 8, for instance, you put much more prior weight on correlations close to zero, roughly between minus 0.5 and 0.5, and the prior becomes very suspicious of very big correlations, which I guess makes a lot of sense in social science, for instance. I don't know in your field, but yeah.
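For intuition: in the simplest 2x2 case, the LKJ(eta) prior puts a density on the single correlation r proportional to (1 - r^2)^(eta - 1), so the effect of eta is easy to check numerically (a plain NumPy sketch):

```python
import numpy as np

def lkj_corr_density(r, eta, grid_size=2001):
    """Marginal density of the correlation r under LKJ(eta), 2x2 case.

    The unnormalized density is (1 - r^2)**(eta - 1); here it is
    normalized numerically on a grid over (-1, 1).
    """
    grid = np.linspace(-0.999, 0.999, grid_size)
    weights = (1.0 - grid**2) ** (eta - 1.0)
    norm = weights.sum() * (grid[1] - grid[0])
    return (1.0 - np.asarray(r) ** 2) ** (eta - 1.0) / norm

r = np.array([0.0, 0.9])
flat = lkj_corr_density(r, eta=1.0)   # uniform: same density everywhere
tight = lkj_corr_density(r, eta=8.0)  # concentrated near zero

print(flat)   # both entries ~0.5
print(tight)  # far higher at r=0 than at r=0.9
```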

I typically use the uniform setting, eta equals 1, at least to start with. But yeah, I think that's a great description.

Yeah, I really love these kinds of models because they make linear regression even more powerful. To me, linear regression is so powerful and very underrated. You can go so far with plain linear regression, and it's often hard to really do better; you have to work a lot to beat a really good linear regression.

I completely agree with you. Yeah, I'm 100% right there. And then you actually get into the quadratic or nonlinear forms that map onto linear regression and make it even more powerful. So yeah, it's absolutely wonderful.

Yeah, yeah. And, as Spider-Man's uncle said, with great power comes great responsibility. So you have to be very careful about the priors when you have all those features, because those transformations reshape the parameter space. Same thing if you're using a multivariate normal: it's more complex, so of course you have to think a bit more about your model structure and about your priors. Also, the more structure you add, with the size of the data kept equal, the more risk you have of overfitting, and the less informative power per data point, let's say. That means the priors increase in importance, so you have to think about them more. But you get a much more powerful model in the end, and the goal is to get much more powerful predictions in the end.

I do agree. These weapons are hard to wield; they require time and effort. And on my end, I don't know about you, Jonathan, but they also require a lot of caffeine from time to time.

Maybe.

Yeah. I mean, caffeine, that's the key. You see how I did the segue? I should have a podcast. It's the first time I've done that on the podcast, but there it is. So, I'm a big coffee drinker. I love coffee; I'm a big coffee nerd. But from time to time, I try to decrease my caffeine usage, also because of habituation effects: if I want to keep the effect of a caffeine shot, I sometimes have to decrease my usage. And funnily enough, when I was thinking about that, a small company called Magic Mind sent me an email. They listen to the show, and they were like, hey, you've got a cool show; we'd be happy to send you some bottles for you to try and talk about on the show. And I thought that was fun. So I got some Magic Mind myself and drank it. But I'm not going to lie, Jonathan: I also got Magic Mind to send some samples to Jonathan. And if you are watching the YouTube video, Jonathan is going to try the Magic Mind right now, live. So yeah, take it away, Jon.

Yeah, this is interesting, because you reached out to me for the podcast and I had not met you. But, you know, it's a conversation, it's a podcast, you do great work, so I'll say yes to that. Then you said, how would you like to try Magic Mind? And I thought, being a psych major as an undergraduate, that this is an interesting social psychology experiment, where a random person from the internet says, hey, I'll send you something. So I figured there's a little bit of safety in drinking it in front of you while we're talking on the podcast. Of course, I know you can cut this out if I hit the floor. But here it comes.

So you drank it like a shot?

Yeah, I decided to drink it like a shot, if you will. It actually tasted much better than I expected. It came in a green bottle, and it tasted tangy, so very good. And now the question will be whether I get better at answering your questions by the end of the podcast; then we'd have a nice experiment. But I noticed it has a bit of caffeine, certainly less than a cup of coffee, and at the same time it doesn't seem offensive whatsoever.

Yeah, that's pretty good. I mean, I'm still drinking caffeine, if that's all right. But from time to time, I like to drink it.

My answer to habituation is just: drink more. That's fine.

Yeah, exactly. Or decaf and stuff like that. But I love the idea; the product is cool, and I liked it, so I was like, yeah, I'm going to give it a shot. The way I drank it was basically by making myself a latte: instead of coffee, I would use the Magic Mind, and then put in my milk and milk foam. And that is really good, I have to say.

See how that works. Yeah.

So it's based on... I mean, the thing you taste most is the matcha, I think, and that's what gives it the green color. Usually I'm not a big fan of matcha, but I have to say, I really appreciated this one.

You and me both; I was feeling the same way. When I saw it come in the mail, I was like, ooh, and that added to my skepticism, right? I'm trying to be a good scientist. But it actually tasted surprisingly more like a juice, like a citrus juice, than like matcha. So it was much nicer than I expected.

Yeah, I love that, because me too, I'm obviously extremely skeptical about all this stuff. So I like doing that. It's way better, way more fun, to do it with you or any other nerd from the community than with regular people from the street, because I'm way too skeptical for them; they wouldn't even understand my skepticism.

I agree. I feel like in the scientific community (and I've seen some of the people you've had on the podcast) we're all a little bit skeptical about what we do. So I could bring that skepticism here and feel at home, hopefully. I'm glad you allowed me to do that.

Yeah. And that's the way of life. Thanks for trusting me, because I agree that, seen from a third-party observer, you'd be like: that sounds like a scam. That guy is just inviting me on to sell me something. In a week, he's going to send me an email telling me he's got some financial troubles and I have to wire him $10,000.

Waiting for that. Or, well, what level of paranoia did I have this morning? I was like, well, who are my enemies, and who really wants to do something bad to me, right? I don't believe I'm at that level, so I don't think I have anything to worry about. It seems like a reputable company. So it was amazing. Yeah.

No, that was good. Thanks a lot, Magic Mind, for sending me those samples; that was really fun. Feel free to give it a try, folks, if that sounds like something you'd be interested in. And if you have any other product to send me, send it along; that sounds fun. I mean, I'm not going to say yes to everything, you know; I have standards on the show, especially scientific standards. But you can always send me something, and I will always analyze it.

You know, somehow you could work out an agreement with the World Cup, right? Some World Cup tickets for next time.

True. That would be nice. True. Yeah, exactly.

Awesome. Well, what we just did is actually kind of related, I would say, to another aspect of your work, and that is model comparison. It's, again, a topic students ask about a lot, especially when they come from the classical machine learning framework, where model comparison is just everywhere. So they often ask how they can do that in the Bayesian framework. As usual, I am always skeptical about just doing model comparison and picking your model based on some single statistic. I always say there is no magic bullet in the Bayesian framework, no case where model comparison says something, so for sure that's the best model; that's not how it works. You need a collection of different indicators, including, for instance, LOO, leave-one-out cross-validation, which tells you, yeah, that model is better. But not only that: what about the posterior predictions? What about the model structure? What about the priors? What about just the generative story of the model? So, talking about model comparison, what can you tell us, John, about some best practices for carrying out effective model comparisons?

Who can say what best practice is? I'll just give you what my practice is; I will make no claim that it's best. It's difficult, and I think you hit on all the aspects of it in introducing the topic. If you have a set of models that you're considering, the first thing I like to think about is not the comparison between them so much as how each model would fit a given data set. Posterior predictive model checking, in a Bayesian sense, is where a lot of the work is focused for me. Interestingly, what you choose to check against is a bit of a challenge, particularly in certain fields of psychometrics, at least the ones I'm familiar with. First of all, model fit is a well-researched area in psychometrics in general. There were millions of papers in the 1980s, maybe not millions, but it seems like that many, and it has always been something people have studied. I think recently there's been a resurgence of new ideas in it as well. So it's well-covered territory in the psychometric literature; it's less well covered, at least in my view, in Bayesian psychometrics.

So here's what I've tried to do in my work to see whether a model fits absolutely. One of the complicating factors is that a lot of my data is discrete: items scored correct or incorrect. In that sense, over the last 15 to 20 years there's been some good work in the non-Bayesian world on using what we call limited-information methods to assess model fit.

So instead of looking at model fit to the entire contingency table: if you have a set of binary data, say 10 observed variables, technically you have 1,024 cell probabilities, for all the permutations of ways those variables could be zeros and ones. In principle, model fit should be judged against that 1,024-long vector of probabilities. Good luck with that, right? You're not going to collect enough data to do that. So what a group of scientists, Alberto Maydeu-Olivares, Li Cai, and others, have created are model fit statistics for the lower-order contingency tables: each marginal moment of the data, each mean effectively, and then the two-way tables between all pairs of observed variables.

In work that I've done with a couple of

students recently, we've tried to

865

replicate that idea, but more on a

Bayesian sentence.

866

So could we come up with

867

and M, like a statistic, this is called an

M2 statistic.

868

Could we come up with a version of a

posterior predictive check for what a

869

model says the two-way table should look

like?

870

And then similar to that, could we create

a model such that we know saturates that?

871

So for instance, if we have 10 observed

variables, we could create a model that

872

has all 10 shoes to two-way tables

estimated perfect, what we would expect to

873

be perfect.

874

Now, of course, there's posterior

distributions, but you would expect with

875

you know, plenty of data and, you know,

very diffused priors that you would get

876

point estimates, EAP estimates, and that

should be right about where you can

877

observe the frequencies of data.

878

Quick check.

879

So, um, the idea then is now we have two

models, one of which we know should fit

880

the data absolutely.

881

And one of which we know, uh, we're, we're

wondering if it fits now that the

882

comparison comes together.

883

So we have these two predictive

distributions.

884

Um, how do we compare them?

885

Uh, and that's where, you know,

886

different approaches we've taken.

887

One of those is just simply looking at the

distributional overlaps.

888

We tried to calculate a, we use the

Kilnogorov Smirnov distribution, sort of

889

the sea where moments are percent wise of

the distributions with overlap, because if

890

your model's data overlaps with what you

think that the data should look like, you

891

think the model fits well.

892

And if it doesn't, it should be far apart

and won't fit well.

893

That's how we've been trying to build.
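That overlap idea can be sketched with SciPy's two-sample Kolmogorov-Smirnov test (the draws below are made-up normal samples standing in for real posterior predictive draws of a fit statistic):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Made-up draws: a saturated model's posterior predictive distribution
# vs. two candidate models (placeholders for real model output).
saturated = rng.normal(loc=0.0, scale=1.0, size=5000)
good_fit = rng.normal(loc=0.1, scale=1.0, size=5000)  # heavy overlap
bad_fit = rng.normal(loc=3.0, scale=1.0, size=5000)   # little overlap

# The two-sample KS statistic is the maximum distance between the empirical
# CDFs: near 0 when the distributions overlap, near 1 when they are far apart.
print(stats.ks_2samp(saturated, good_fit).statistic)
print(stats.ks_2samp(saturated, bad_fit).statistic)
```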

It's weird, because it is a model comparison, but one of the models being compared is what we call saturated: it should fit the data the best, and all the other models should be subsumed within it. So that's the approach I've taken recently: posterior predictive checks, but in a model comparison. We could have used, as you mentioned, the LOO statistic, and maybe that's something we should look into too; we haven't yet. But one of my recent graduates, a new assistant professor at the University of Arkansas here in the United States, Jihong Zhang, has done a lot of work on this in his dissertation and other studies.

So that's sort of the approach I take. The other thing I want to mention, though, is that when you're comparing among models, you have to establish that absolute fit first. The way I envision this is that you compare each of your models to this sort of saturated model, you do that for multiple versions of your models, and then you effectively choose among the set of models that fit. But what that absolute fit is, as you mentioned, is nearly impossible to pin down exactly; there are a number of ideas that go into what makes a good-fitting model.

Yeah. And I definitely encourage people to go take a look at the LOO paper; I will put a link to it in the show notes. Also, if you're using ArviZ, whether in Julia or Python, we do have an implementation of the LOO algorithm, so comparing your models is obviously extremely simple: it's just a call to compare, and then you can even plot the result. And, as you were saying, a LOO score doesn't have any meaning by itself, right? The LOO score of one model doesn't mean anything on its own; it's meaningful in comparison to other models. So basically, you have a baseline model that you think is already good enough, and then all the other models have to compare to that one. It could be like the placebo, if you want, or the already existing solution to the problem. Any model that's more complicated should be in competition with that baseline and should have a reason to be used, because otherwise, why are you using a more complicated model if you could just use a simple linear regression? That's what I use most of the time for my baseline: just a simple linear regression. Then do all the fancy modeling you want and compare it to the linear regression, both in its predictions and with the LOO algorithm. And if there is a good reason to make your life more difficult, then use the fancier model. But otherwise, why would you?

And actually, talking about these complexities, something I also see is that many practitioners might be hesitant to adopt Bayesian methods because they perceive them as complex. So I'm wondering: what resources or strategies would you recommend to those who want to learn and apply Bayesian techniques in their research, especially in your field of psychometrics?

Yeah. I think it starts with an understanding of the output, the basics. If you have data and your responsibility is providing analysis for it, find either a package or somebody else's program that makes the coding quick. You mentioned linear regression: if you use brms in R, which translates your model into Stan, you can quickly get a Bayesian result. And I've found that the conceptual consideration of what a posterior distribution actually is, is less complex than all the things we're drilled on in the classical methods, like where the standard error comes from and all those other asymptotic features. In Bayes, it's visible: you can see a posterior distribution, you can plot it, you can almost touch it and feel it, right? It's right there in front of you.

So for me, the first thing I try to get people to is just understanding what the outputs are, what the key parts are. Hopefully that gives them the mental representation of where they're moving toward, and at that point you start to add in all the complexities. But it is incredibly challenging to teach Bayesian methods, and I actually think that the further along a person goes without learning the Bayesian version of things, the harder it gets, because by then you have all these well-established routines, or statistics you're used to seeing, that are not Bayesian and that may or may not have a direct analog in the Bayes world. But that may not be a bad thing.

980

So, um, thinking about it, actually, I'm

going to take a step back here.

981

Can conceptually, I think it's, this is

the challenge, um, we face in a program

982

like I do right here.

983

I'm working right now.

984

I work with, um, nine other tenure track.

985

or Tender to Tender Tech faculty, which is

a very large program.

986

And we have a long-running curriculum, but

sort of the question I like to ask is,

987

what do we do with Bayes?

988

Do we have a parallel track in Bayes?

989

Do we do Bayes in every class?

990

Because that's a heavy lift for a lot of

people as well.

991

Right now, it's, I teach the Bayes

classes, and occasionally some of my

992

colleagues will put Bayesian statistics in

their classes, but it's tough.

993

I think if I were

994

you know, anointed myself king of how we

do all the curriculum.

995

I don't know the answer I'd come to.

996

I go back and forth each way.

997

So, um, I would love to see what a

curriculum looks like where they only

998

started with base and only kept it in

base.

999

Cause I think that would be a lot of fun.

Speaker:
And the thought question I ask myself, that I don't have an answer for, is: would that be a better mechanism to get students up to speed on the models they're using than it would be in other, classical contexts? I don't know.

Speaker:
Yeah. Good point.

Speaker:
Yeah, two things. First, King of Curriculum: amazing title. I think it should actually be renamed to that title on all campuses around the world.

Speaker:
The world's worst kingdom is the curriculum.

Speaker:
Yeah. I mean, that's really good. Like, you go to a party, you know, and: so, what are we doing, King of Curriculum?

Speaker:
So long as the crown is on the head, that's all that matters, right?

Speaker:
That would drop some jaws for sure.

Speaker:
And second, I would definitely like the theory of the multiverse to be true, because that means that in one of these universes, there is at least one where Bayesian methods came first. And I am definitely curious to see what that world looks like and see how...

Speaker:
Yeah, what...

Speaker:
What's that world where people were actually exposed to Bayesian methods first, and maybe to frequentist statistics later? Were they actually exposed to frequentist statistics later? That's the question. No, but jokes aside, I would definitely be curious about that.

Speaker:
Yeah, well, I don't know that I'll have that experiment in my lifetime, but maybe in a parallel universe somewhere.

Speaker:
Before we close up the show, I'm wondering if you have a personal anecdote or example of a challenging problem you encountered in your research or teaching related to Bayesian stats, and how you were able to navigate through it?

Speaker:

Yeah.

Speaker:
I mean, maybe it's too much in the weeds, but there was that first experience, when I was in graduate school trying to learn to code. It was coding a correlation matrix of tetrachoric correlations. And that was incredibly difficult. One day, one of my colleagues, Bob Henson, figured it out, with the likelihood function and so forth. But that was the holdup that we had.

Speaker:
And it's incredible, and I say this because, again, as I mentioned, I don't do a lot of my own package coding or whatnot. But I think you see a similar phenomenon if you misspecify something in your model in general: you get results, and the results are either all over the place or span the entire number line. For me, it was the correlations: the posterior distribution looked like a uniform distribution from negative one to one. That's a bad thing to see, right?

Speaker:
So the anecdote I have with this is, I guess, less "awesome", like when you're like, oh, Bayes did this and we couldn't have done it otherwise, and more about the perseverance that goes into sticking with the Bayesian side. Bayes also provides you the ability to check a little bit of your work, to see if it's completely gone sideways.

Speaker:
Right. So you see a result like that, you have that healthy dose of skepticism, and you start to investigate more. In my case, it took a couple of years of my life, working in concert with other people, as grad students. But once it was fixed, it was almost obvious that it was.

Speaker:
I mean, you went from this uniform distribution across negative one to one to something that looked very much like a posterior distribution we're used to seeing, centered around a certain value of the correlation. And again, for us it was figuring out what the likelihood was, but for most packages, at least, that's not a big deal. I think it's already specified in your choice of model and prior.
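The phenomenon Jonathan describes, a posterior on a correlation that stays uniform over (-1, 1) until the likelihood is right, can be sketched with a toy tetrachoric-correlation model. This is a hypothetical illustration (simulated data, grid approximation, thresholds fixed at zero), not the actual code from his research:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Simulate two binary test items by dichotomizing a latent bivariate
# normal at zero: the latent correlation is the tetrachoric correlation.
rng = np.random.default_rng(1)
rho_true = 0.5
n = 1000
latent = rng.multivariate_normal([0.0, 0.0],
                                 [[1.0, rho_true], [rho_true, 1.0]], size=n)
x, y = (latent > 0).astype(int).T

# 2x2 contingency table of the observed binary responses
n11 = int(np.sum((x == 1) & (y == 1)))
n00 = int(np.sum((x == 0) & (y == 0)))
n10 = int(np.sum((x == 1) & (y == 0)))
n01 = int(np.sum((x == 0) & (y == 1)))

# Grid posterior under a flat prior on rho over (-1, 1). With the correct
# likelihood (bivariate-normal cell probabilities), the posterior is no
# longer uniform: it concentrates near the true correlation.
rhos = np.linspace(-0.95, 0.95, 191)
log_post = np.empty_like(rhos)
for i, r in enumerate(rhos):
    # P(both latents <= 0) = P(both items 0); with thresholds at zero
    # it also equals P(both items 1), and each marginal is 0.5.
    p00 = multivariate_normal.cdf([0.0, 0.0], mean=[0.0, 0.0],
                                  cov=[[1.0, r], [r, 1.0]])
    p10 = 0.5 - p00
    log_post[i] = (n11 + n00) * np.log(p00) + (n10 + n01) * np.log(p10)

post = np.exp(log_post - log_post.max())
post /= post.sum()                      # normalize over the grid
post_mean = float(np.sum(rhos * post))
print(f"posterior mean for rho: {post_mean:.2f}")  # typically near rho_true
```

Plotting `post` against `rhos` shows exactly the before/after contrast from the anecdote: a broken likelihood gives a flat line, while this one gives a peak centered near the correlation, something you can see and almost touch.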

Speaker:
But at the same time, just remember that the frustration part of it, not making it work, is actually really informative. You get that, and you can build on it, and you can sort of check your work as you go forward. I mean, not analytically; brute force, the sampling part. But that's sort of a check on your work.

Speaker:
So, not a great example, not a super inspiring example, but more: perseverance pays off, in Bayes and in life. That's sort of the analogy I get from it.

Speaker:

Yeah.

Speaker:
Yeah, no, for sure. I mean, perseverance is so important, because you're definitely going to encounter issues.

Speaker:
I mean, none of your models is going to work as you thought it would. So if you don't have that drive and that passion for the thing that you're studying, it's going to be extremely hard to get it through the finish line, because it's not going to be easy. It's like choosing a new sport: if you don't like what the sport is all about, you're not going to stick with it, because it's going to be hard. So that perseverance, I would say, comes from your curiosity and your passion for your field and the methods you're using.

Speaker:
And the other thing I was going to add, this is tangential, but let me just add it: if you have the chance to go visit Bayes' grave in London, take it. I had to do that last summer. I was in London, I had my children with me, and we all picked some spot we wanted to go to. And I was like, I'm going to go find and take a picture in front of Bayes' grave.

Speaker:
And it sort of brought up an interesting question: I don't know the etiquette of taking photographs in front of someone's grave site.

Speaker:
But then, ironically, as I was sitting there on the tube leaving, I sat next to a woman, and she had Bayes' theorem on her shirt. It was the Bayes School of Economics, or something like this, in London. I was like, okay, I have reached the Mecca. The perseverance led to a trip, you know, my own version of the trip to London. But definitely worth the time to go.

Speaker:
Once you reach that level of perseverance, you're part of the club, and then you can do things like that: find vacations, you know, holidays, around Bayes, around Bayes' grave.

Speaker:
Yeah, I mean, I am definitely gonna do that. Thank you very much for giving me another idea for a nerd holiday. My girlfriend is gonna hate me, but she always wanted to visit London, so, you know, that's gonna be my bait.

Speaker:
It's not bad to get to; it's off of Old Street, you know, actually well marked. I mean, the grave site's a little weathered, but it's in a good spot, a good part of town, so, you know, not really heavily touristy, amazingly.

Speaker:
Oh yeah, I'm guessing. But, you know, I'm guessing that's the good thing.

Speaker:
Yeah, no, I already know how I'm gonna ask her: honey, when do we go to London?

Speaker:
Perfect.

Speaker:
Let's go check out Bayes' grave.

Speaker:
Yeah, I mean, that's perfect. That's amazing. So hey, you should send me that picture, and that should be your picture for this episode. I always take a picture from guests to illustrate the episode icon, and you definitely need that picture for your icon.

Speaker:
I can do that. I'll be happy to.

Speaker:
Yeah. Awesome. Definitely.

Speaker:
So before asking you the last two questions, I'm just curious how you see the future of Bayesian stats in the context of psychological sciences and psychometrics. What are some exciting avenues for research and application that you envision in the coming years, or that you would really like to see?

Speaker:
Oh, that's a great question. Terrible.

Speaker:
So, you know, interestingly, quantitative psychology has sort of been on a downhill swing for, I don't know, 50 or 60 years; there are fewer and fewer programs, at least in the United States, where people are training.

Speaker:
But despite that, I feel like the use of Bayesian statistics is up in a lot of different other areas. And I think that affords a bit better model-based science. You have to specify a model, you have to have a model in mind, and then you go and do that. I think that benefit makes the science much better. You're not just using sort of what's always been done; you can push the envelope methodologically a bit more.

Speaker:
And another benefit of Bayesian statistics is that now you can code an algorithm that will likely work without having to know, like you said, all of the underpinnings, the technical side of things; you can use an existing package to do so. I like to say that that's going to continue to make science a better practice.

Speaker:
I think the fear that I have is sort of the sea of the large language model based version of what we're doing in machine learning and artificial intelligence. But I will be interested to see how we incorporate a lot of the Bayesian ideas, Bayesian methods, into that as well. I think there's potential. Clearly, people are doing this; I mean, that's what runs a lot of what is happening anyway. So I look forward to seeing that as well.

Speaker:
So I get a sense that what we're talking about is really what may be the foundation for what the future will be. I mean, maybe instead of that parallel universe, if we could come back, or go into the future in our own universe in 50 years, maybe what we will see is a curriculum entirely built on Bayesian methods.

Speaker:
And, you know, I just looked at your topic list; you were recently talking about variational inference and so forth. The use of that in very large models themselves, I think that is very important stuff. So it may just be the thing that crowds out everything else, but that's speculative, and I don't make a living making predictions, unfortunately. So that's the best I can do.

Speaker:

Yeah.

Speaker:
Yeah, yeah. I mean, that's also more of a wishlist question, so that's all good.

Speaker:
Yeah, awesome. Well, Jonathan, amazing. I learned a lot; we covered a lot of topics. I'm really happy. But of course, before letting you go, I'm going to ask you the last two questions I ask every guest at the end of the show.

Speaker:
Number one: if you had unlimited time and resources, which problem would you try to solve?

Speaker:
Well, I would be trying to figure out how we know what a student knows, every day of the year, so that we can best teach them where to go next. That would be it. Right now, there's not only the problem of the technical issues of estimation; there's also the problem of how we best assess them, how much time they spend doing it, and so forth. That, to me, is what I would spend most of my time on.

Speaker:
That sounds like a good project. I love it. And second question: if you could have dinner with any great scientific mind, dead, alive, or fictional, who would it be?

Speaker:
All right. I've got a really obscure choice, right? It's not like I'm picking Einstein or anything. I have two, actually, that I've sort of debated.

Speaker:
One is the economist Paul Krugman, who writes for the New York Times and works at the City University of New York now. You know, a Nobel laureate. I loved his work; his understanding of the interplay between model and data is fantastic. So I would just sit there and have to listen to everything he had to say, I think.

Speaker:
The other is, again, an obscure thing. One of the things I'm fascinated by is weather and weather forecasting, even though, you know, I'm in education and psychological measurement. And there's a guy who started the company called Weather Underground; his name is Jeff Masters. You can read his work on a blog at Yale these days, Climate Connections, or something along those lines.

Speaker:
Anyway, he has since sold the company, but he's fascinating on modeling, you know. Right now we're in the peak of hurricane season in the United States. We see these storms coming off of Africa, or spinning up everywhere, and sort of the interplay between, unfortunately, climate change and other atmospheric dynamics just makes for an incredibly complex system. It's fascinating how science approaches prediction there. So I find that to be great.

Speaker:
But those are the two. I had to think a lot about that, because there are so many choices, but those two people are the ones I read the most, certainly when it's not just in my field.

Speaker:
Nice. Yeah, sounds fascinating. And weather forecasting is definitely incredible, also because the great thing is you have feedback every day. So that's really cool: you can improve your predictions.

Speaker:
Like the missing data problem: you can't sample every part of the atmosphere, so how do you incorporate that into your analysis as well?

Speaker:
No, that's incredible. Model averaging and stuff. Anyway, yeah.

Speaker:
Yeah, that's also a testimony to the power of modeling and parsimony, you know. I worked a lot on electoral forecasting models, and, you know, the classic way people dismiss models in these areas is: well, you cannot really predict what people are going to do at an individual level. Which is true. I mean, you cannot; people have free will, you know. So you cannot predict at an individual level what they are going to do, but you can quite reliably predict what masses are going to do. Basically, with the aggregation of individual points, you can actually do it kind of reliably. And so that's the power of modeling here: you get something where, you know, the model is wrong, but it works, because it simplifies things, but doesn't simplify them to a point where it doesn't make sense anymore.

Speaker:
Kind of like the standard model in physics, where we know it doesn't work, it breaks at some point, but it does a pretty good job of predicting a lot of the phenomena we observe.

Speaker:
So, which do you prefer? Is it free will, or is it random error?

Speaker:
Well, you'll have to come back for another episode on that, because otherwise... yes.

Speaker:
That's a good one. Good point.

Speaker:
Nice. Well, Jonathan, thank you so much for your time. As usual, I will put resources and a link to your website in the show notes for those who want to dig deeper. Thank you again, Jonathan, for taking the time and being on this show.

Speaker:
Happy to be here. Thanks for the opportunity. It was a pleasure to speak with you, and I hope it makes sense for a lot of people. Appreciate your time.