Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!
In this episode, Jonathan Templin, Professor of Psychological and Quantitative Foundations at the University of Iowa, shares insights into his journey in the world of psychometrics.
Jonathan’s research focuses on diagnostic classification models — psychometric models that seek to provide multiple reliable scores from educational and psychological assessments. He also studies Bayesian statistics, as applied in psychometrics, broadly. So, naturally, we discuss the significance of psychometrics in psychological sciences, and how Bayesian methods are helpful in this field.
We also talk about challenges in choosing appropriate prior distributions, best practices for model comparison, and how you can use the Multivariate Normal distribution to infer the correlations between the predictors of your linear regressions.
This is a deep-reaching conversation that concludes with the future of Bayesian statistics in psychological, educational, and social sciences — hope you’ll enjoy it!
Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !
Thank you to my Patrons for making this episode possible!
Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca and Dante Gates.
Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag 😉
Links from the show:
- Jonathan’s website: https://jonathantemplin.com/
- Jonathan on Twitter: https://twitter.com/DrJTemplin
- Jonathan on Linkedin: https://www.linkedin.com/in/jonathan-templin-0239b07/
- Jonathan on GitHub: https://github.com/jonathantemplin
- Jonathan on Google Scholar: https://scholar.google.com/citations?user=veeVxxMAAAAJ&hl=en&authuser=1
- Jonathan on Youtube: https://www.youtube.com/channel/UC6WctsOhVfGW1D9NZUH1xFg
- Jonathan’s book: https://jonathantemplin.com/diagnostic-measurement-theory-methods-applications/
- Jonathan’s teaching: https://jonathantemplin.com/teaching/
- Vehtari et al. (2016), Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC: https://arxiv.org/abs/1507.04544
- arviz.plot_compare: https://python.arviz.org/en/stable/api/generated/arviz.plot_compare.html
- LBS #35, The Past, Present & Future of BRMS, with Paul Bürkner: https://learnbayesstats.com/episode/35-past-present-future-brms-paul-burkner/
- LBS #40, Bayesian Stats for the Speech & Language Sciences, with Allison Hilger and Timo Roettger: https://learnbayesstats.com/episode/40-bayesian-stats-speech-language-sciences-allison-hilger-timo-roettger/
- Bayesian Model-Building Interface in Python: https://bambinos.github.io/bambi/
Abstract
You have probably unknowingly already been exposed to this episode’s topic – psychometric testing – when taking a test at school or university. Our guest, Professor Jonathan Templin, tries to increase the meaningfulness of these tests by improving the underlying psychometric models – the Bayesian way, of course!
Jonathan explains that it is not easy to judge the ability of a student from exams, since exam scores contain error and are only a snapshot. Bayesian statistics helps by naturally propagating this uncertainty to the results.
In the field of psychometric testing, Marginal Maximum Likelihood is commonly used. This approach quickly becomes infeasible, though, when trying to marginalise over multidimensional test scores. Luckily, Bayesian sampling does not suffer from this.
A further reason to prefer Bayesian statistics is that it provides a lot of information in the posterior. Imagine taking a test that tells you what profession you should pursue at the end of high school. The field with the best fit is of course interesting, but the second best fit may be as well. The posterior distribution can provide this kind of information.
After becoming convinced that Bayes is the right choice for psychometrics, we also talk about practical challenges like choosing a prior for the covariance in a multivariate normal distribution, model selection procedures and more.
In the end we learn about a great Bayesian holiday destination, so make sure to listen till the end!
Transcript
This is an automatic transcript and may therefore contain errors. Please get in touch if you’re willing to correct them.
In this episode, Jonathan Templin, professor of Psychological and Quantitative Foundations at the University of Iowa, shares insights into his journey in the world of psychometrics.

Jonathan's research focuses on diagnostic classification models, psychometric models that seek to provide multiple reliable scores from educational and psychological assessments. He also studies Bayesian statistics as applied in psychometrics, broadly. So naturally, we discussed the significance of psychometrics in psychological sciences and how Bayesian methods are helpful in this field.

We also talked about challenges in choosing appropriate prior distributions, best practices for model comparison, and how you can use the multivariate normal distribution to infer the correlations between the predictors of your linear regressions.

This is a deep-reaching conversation that concludes with the future of Bayesian statistics in psychological, educational, and social sciences. Hope you'll enjoy it.
This is Learning Bayesian Statistics, episode 94, recorded September 11, 2023.

Hello, my dear Bayesians! This time, I have the pleasure to welcome three new members to our Bayesian crew: Bart Trudeau, Luis Fonseca, and Dante Gates. Thank you so much for your support, folks. It's the main way this podcast gets funded. And Bart and Dante, get ready to receive your exclusive merch in the coming month. Send me a picture, of course. Now let's talk psychometrics and modeling with Jonathan Templin.
Jonathan Templin, welcome to Learning Bayesian Statistics.

Thank you for having me. It's a pleasure to be here.

Yeah, thanks a lot. Quite a few patrons have mentioned you in the Slack of the show, so I'm very honored to honor their request and have you on the show. And actually, thank you folks for bringing me all of those suggestions and allowing me to discover so many good Bayesians out there in the world, doing awesome things in a lot of different fields, using our favorite tool, Bayesian statistics. So Jonathan, before talking about all of those good things, let's dive into your origin story. How did you come to the world of psychometrics and psychological sciences, and how sinuous of a path was it?
That's a good question. So I was an odd student: I dropped out of high school. So I started my college degree at community college; that would be the only place that would take me. I happened to be really lucky to do that, though, because I had some really great professors. Once I discovered that I probably could do school, I took a statistics course, you know, a typical undergraduate basic statistics course. I found that I loved it. I decided that I wanted to do something with statistics, and then in the process I took a research methods class in psychology, and I decided somehow I wanted to do statistics in psychology. So I moved on from community college and went to my undergraduate for two years at Sacramento State in Sacramento, California. I also was really lucky because I had a professor there who said, hey, there's this field called quantitative psychology; you should look into it if you're interested in statistics and psychology. Around the same time, he was teaching me something called factor analysis. I now look at it as more principal components analysis, but I wanted to know what was happening underneath the hood of factor analysis. And so that's where he said, no, really, you should go to graduate school for that. And so that's what started me. I was fortunate enough to be able to go to the University of Illinois for graduate studies. I did a master's and a PhD there, and in the process, that's where I learned all about Bayes. So it was a really lucky route, but it all wouldn't have happened if I didn't go to community college, so I'm really proud to say I'm a community college graduate, if you will.

Yeah. Nice. So it kind of happened somewhat easily, in a way, right? A good meeting at the right time, and boom.

That's right. And the call of the eigenvalue is what really sent me to graduate school. I wanted to figure out what that was about.

Yes, that is a good point. And so nowadays, what are you doing? How would you define the work you're doing, and what are the topics that you are particularly interested in?
I would put my work into the field of item response theory, largely. I do a lot of multidimensional item response theory. There are derivative fields I think I'm probably most known for, one of which is something called cognitive diagnosis, or diagnostic classification modeling. Basically, it's a classification-based method to try to classify students. I work in the College of Education, so most of this is applied to educational data from assessments, and our goal is, whenever you take a test, to not just give you one score but multiple valid scores, to try to maximize the information we can give you. My particular focus these days is doing so in classroom-based assessments: how do we understand what a student knows at a given point in the academic year, and try to help make sure that they make the most progress they can? Not to remove the impact of the teacher, but actually to provide the teacher with the best data to work with the child, to work with the parents, to try to move forward. But all that boils down to interesting measurement and psychometric issues, and interesting ways that we look at test data that come out of classrooms.

Okay. Yeah, that sounds fascinating. Basically, trying to give a distribution of results instead of just one point estimate.

That's it, and also tests have a lot of error. So it's about making sure that we don't over-deliver when we have a test score: basically understanding what that error is, and accurately quantifying how much measurement error, or lack of reliability, there is in the score itself.

Yeah, that's fascinating. I mean, we can already dive into that. I have a lot of questions for you, but it sounds very interesting. So what does it look like concretely, these measurement errors and the test scores attached to them, and basically how do you try to solve that? Maybe you can take an example from your work where you are trying to do that.
Absolutely. Let me start with the classical example. If this is too much information, I apologize, but to set the stage: for a long time in item response theory, we've understood that a person's latent ability estimate, if you want to call it that, as applied in education (this latent variable that represents what a person knows) is put onto the same continuum where the items are. So basically, items and people are sort of ordered. However, the properties of the model are such that how much error there might be in a person's point estimate of their score depends on where the score is located on the continuum. So this is what theory in the 1970s gave rise to: our modern computerized adaptive assessments and so forth, which sort of pick an item that would minimize the error, if you will; there are different ways of describing what we pick an item for. But that's basically the idea.

And so, from the perspective of where I'm at with what I do, a complicating factor is that the architecture I just mentioned, that historic version of adaptive assessments, has really been built on large-scale measures: thousands of students. And really, what happens in a classical sense is you would take a marginal maximum likelihood estimate of certain parameter values from the model. You'd fix those values as if you knew them with certainty, and then you would go and estimate a person's parameter value along with their standard error, the conditional standard error of measurement. The situations I work in don't have large sample sizes. So in addition to a problem with sort of the asymptotic convergence, if you will, of those models, we also have multiple scores effectively, multiple latent traits, which we can't possibly marginalize over. When you look at the same problem through a Bayesian lens, an interesting feature appears that we don't often see in a frequentist or classical framework: that process of fixing the parameters of the model, the item parameters, to a value disregards any error in those estimates as well. Whereas if you're doing simultaneous estimation, for instance in a Markov chain where you're sampling those values from a posterior in addition to sampling the students' scores, it turns out the error around those parameters can propagate to the students and provide a wider interval around them, which I think is a bit more accurate, particularly in a smaller-sample-size situation. So I hope that's the answer to your question. I may have taken a path that might have been a little different there, but that's where I see the value, at least, in using Bayesian statistics in what I do.
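As an aside for readers: the propagation Jonathan describes can be sketched with a deliberately toy model (one linear "item" with a single discrimination-like parameter `a`, not his actual IRT setup, and all numbers invented). Fixing the item parameter at a point estimate understates the spread of a person's score; sampling it jointly widens the interval:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: one observed score y = a * theta + noise, where `a` plays the
# role of an item's discrimination parameter (made-up values).
a_true, theta_true, sigma = 1.2, 0.8, 0.5
y = a_true * theta_true + rng.normal(0.0, sigma)

n = 50_000
a_hat = 1.1  # point estimate of `a`, e.g. from a separate calibration step

# Classical-style: fix `a` at its point estimate. With a flat prior,
# theta | y, a  ~  Normal(y / a, (sigma / a)^2).
theta_fixed = y / a_hat + rng.normal(0.0, sigma / a_hat, size=n)

# Bayesian-style: also draw `a` from its (assumed) posterior, so its
# uncertainty propagates into the person's score.
a_draws = np.clip(rng.normal(a_hat, 0.25, size=n), 0.2, None)
theta_joint = y / a_draws + rng.normal(0.0, sigma / a_draws)

# The joint posterior for theta is wider: the interval honestly reflects
# that the item parameter was never known exactly.
print(round(float(theta_fixed.std()), 2), round(float(theta_joint.std()), 2))
```

The same effect is what MCMC over the full model gives you automatically, without the plug-in shortcut.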
Yeah, no, I love it. Don't shy away from technical explanations on this podcast. That's the good thing about a podcast: you don't have to shy away from them.

It came at a good time. I've been working on some problems like this all day, so I'm probably in the weeds a little bit. Forgive me if I go off the deep end of it.

No, that's great. And we've already mentioned item response theory on the show, so hopefully people will refer back to those episodes and that will give them a heads-up. Well, actually, you mentioned it, but do you remember how you first got introduced to Bayesian methods, and why did they stick with you?
Very, very much. I was introduced because in graduate school I had the opportunity to work for a lab run by Bill Stout at the University of Illinois, with other very notable people in my career, at least Jeff Douglas and Louis Roussos, among others. I was hired as a graduate research assistant, and my job was to take a program that was a Metropolis-Hastings algorithm and make it run. And it was written in Fortran. So basically, it was Metropolis-Hastings, Bayesian, and it was written in a language that I didn't know, with methods I didn't know. And so I was hired and told: yeah, figure it out, good luck. Thankfully, I had colleagues who could help, and who actually probably figured it out more than I did. But I was very fortunate to be there, because it was like a trial by fire. I was basically going line by line through that code. This was a little bit in the later part of, I think, the year 2001, maybe a little early 2002. But something instrumental to me at the time were a couple of papers by a couple of scholars in education, at least: Rich Patz and Brian Junker had a paper in 1999, actually two papers in 1999, in the Journal of Educational and Behavioral Statistics. It's like I have that memorized. In their papers, they had written down the algorithm itself, and it was a matter of translating that to the diagnostic models we were working on. But that is why it stuck with me: it was my job, but then it was also incredibly interesting. It was not like a lot of the research I was reading, and not like a lot of the work I was doing in a lot of the classes I was in. So I found it really mentally stimulating, entirely challenging. It took the whole of my brain to figure out. And even then, I don't know that I figured it out. So that helps answer that question.
Yeah. So basically, it sounds like you were thrown into the Bayesian pool. Like you didn't have any choice.

I was. Being Bayesian was nice because at the time, you know, this is 2001, 2002, in education, in measurement, in psychology, we knew of Bayes, certainly. There are some great papers from the nineties that were around, but it wasn't prominent. I was in graduate school, but at the same time I wasn't learning it. I mean, I knew the textbook Bayes, like the introductory Bayes, but definitely not the estimation side. And so, timing-wise, people might look back now and say, okay, why didn't I go grab Stan, or grab... at the time, I think JAGS didn't exist; there was BUGS. And it was basically: you have to, you know, roll your own to do anything. So it was good.

No, for sure. Like, yeah, it's like asking Christopher Columbus or...

That's right. It's a lot more direct. Just hop on the plane and...

Wasn't an option.

Exactly. Good point.
But actually, nowadays, what are you using? Are you still writing your own samplers like that in Fortran, or are you using some open-source software?

I can hopefully say I've retired from Fortran as much as possible. Most of what I do is in Stan these days, a little bit of JAGS, but then occasionally I will try to write my own here or there. The latter part I'd love to do more of, because you can get highly specialized. I just lack, I feel, the time to really deeply do the development work in a way that doesn't just end up as an R package, or some package in Python, that would just break all the time. So I'm sort of stuck right now with that, but it is something that I'm grateful for: having the contributions of others to rely upon to do estimation.
Yeah, no, exactly. I mean, so first, Stan: I've heard he's quite good. Of course, it's amazing. A lot of Stan developers have been on this show, and they do absolutely tremendous work. And yeah, as you were saying: why code your own sampler when you can rely on samplers that are actually waterproof, that are developed by a bunch of very smart people who do a lot of math and who do all the heavy lifting for you? Well, just do that. And thanks to that, Bayesian computing and statistics are much more accessible, because you don't actually have to know how to code your own MCMC sampler to do them. You can stand on the shoulders of giants, just use that, and superpower your own analysis. So it's definitely something we tell people: don't code your own samplers now. You don't need to do that unless you really, really have to. And usually, when you have to do that, you know what you're doing. Otherwise, people have figured that out for you. Just use the automatic samplers from Stan or PyMC or NumPyro or whatever you're using. They're usually extremely robust and checked by a lot of different pairs of eyes and keyboards.
Having that team and, like you said, people who are experts not only in mathematics but also in computer science makes a big difference.

Yeah. I mean, I would not be able to use Bayesian statistics nowadays if these samplers didn't exist, right? Because I'm not a mathematician. So if I had to write my own sampler each time, I would just be discouraged even before starting.

Yeah. It's just a challenge in and of itself. I remember the old days where that would be it. That was my dissertation; that was what I had to do. So it was like six months of work on just the sampler. And even then it wasn't very good. And only then might you actually do the study.

Yeah, exactly. I mean, to me, really, probabilistic programming is one of the superpowers of the Bayesian community, because it allows almost anybody who can code in R or Python or Julia to just use what's being done by very competent and smart people, and for free.

Right. Yeah. Also true. What a great community. I'm really, really impressed with the size and the scope and how things have progressed in just 20 years. It's really something.

Yeah. Exactly. And so, actually...
Do you know why, well, do you have an idea why, Bayesian statistics is useful in your field? What does it bring that you don't get with the classical framework?

Yeah. In particular, we have a really nasty one. If we were to use a classical framework, typically the gold standard in the field I work in is sort of a marginal maximum likelihood. By marginal, we mean we get rid of the latent variables to estimate the models. That process of marginalization is done numerically: we numerically integrate across the likelihood function in most cases. There are some special-case models where we don't have to, but they're really too simplistic to use for what we do. So if we want to do multidimensional versions: if you think about numeric integration, for one dimension you have this sort of discretized set of the likelihood, and you take sums across different points, what we call quadrature points, of some type of curve. In the multidimensional sense now, going from one to two dimensions, you effectively square the number of points you have. And that's just two latent variables. So if you want two bits of information from an assessment of somebody, you've just made your marginalization process exponentially more difficult, more time-consuming. But really, the benefit of having two scores is very little compared to having one. So if we wanted to do five or six or 300 scores, that marginalization process becomes really difficult.

So from a brute-force perspective, if we take the Bayesian sampler perspective, there is not that exponential increase of computation; it's a linear increase in the number of latent variables. And so the number of steps the process has to take per calculation is much smaller. Now, of course, Markov chains involve a lot of calculations, so maybe overall the process is longer, but I have found Bayesian statistics to be a necessity: the need to estimate, in some form, this multidimensional likelihood evaluation shows up everywhere. People have created sort of hybrid versions of EM algorithms where the E-step is replaced with a Bayesian-type method. But for me, I like the full Bayesian approach to everything. So, just in summary: what Bayes brings, from a brute-force perspective, is the ability to estimate our models in a reasonable amount of time with a reasonable amount of computation. There's the added benefit of what I mentioned previously, the small sample sizes: sort of, I think, a proper accounting, or allowing, of error to propagate in the right way if you're going to report scores and so forth. I think that's an added benefit. But from a primary perspective, I'm here because I have a really tough integral to solve, and Bayes helps me get around it.
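To make the scaling argument concrete for readers, here is a minimal sketch of how a tensor-product quadrature grid grows with the number of latent dimensions (21 points per dimension is just an illustrative choice, not a claim about any particular IRT program):

```python
# Number of grid points a naive tensor-product quadrature needs to
# marginalize over d latent dimensions.
points_per_dim = 21  # illustrative; real software uses various defaults

def quadrature_grid_size(n_dims: int) -> int:
    # Full tensor-product grid: cost is exponential in the dimension.
    return points_per_dim ** n_dims

for d in range(1, 6):
    print(d, quadrature_grid_size(d))  # 21, 441, 9261, 194481, 4084101
# MCMC, by contrast, samples the latent variables directly, so the cost of
# one update grows roughly linearly with the number of latent dimensions.
```

Five dimensions already means evaluating the likelihood on millions of grid points per iteration, which is the wall Jonathan describes.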
Yeah, that's a good point. And yeah, as you were saying, I'm guessing that having priors and generative modeling helps for low sample sizes, which tend to be the case a lot in your field.

Also true. Yeah. The prior distributions can help. A lot of the frustration with multidimensional models in psychometrics, at least in a practical sense, is this: you get a set of data, you think it's multidimensional, and the next step is to estimate a model. In the classic sense, those models sometimes would fail to converge, with very little reason given why; oftentimes they just fail to converge. I had a class I taught four or five years ago where I just asked people to estimate five dimensions, with a set of data for each person, and not a single person could get it to converge with the default options that you'd see in an IRT package. So having the ability to understand where non-convergence is happening and why, and which parameters are finding a difficult spot, and then using priors to sort of aid the estimation, is one part. But then there's also the idea of Bayesian updating. If you're trying to understand what a student knows throughout the year, Bayesian updating is perfect for such things. You know, you can assess a student in November and update the results that you have from previous parts of the year as well. So there are a lot of benefits. I guess I could keep going, but I'm talking to a Bayes podcast, so you probably already know most of it.
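The updating idea can be sketched with the simplest possible example, a Beta-Binomial model of a single skill (hypothetical numbers; real diagnostic classification models track many skills jointly):

```python
from fractions import Fraction

# Track a student's probability of answering items on one skill correctly,
# updating a Beta prior after each classroom assessment (toy example).
def update(alpha, beta, correct, total):
    # Beta-Binomial conjugate update: posterior after one assessment.
    return alpha + correct, beta + (total - correct)

alpha, beta = 1, 1                         # flat prior at the start of the year
alpha, beta = update(alpha, beta, 7, 10)   # September quiz: 7/10 correct
alpha, beta = update(alpha, beta, 9, 10)   # November quiz: 9/10 correct

posterior_mean = Fraction(alpha, alpha + beta)
print(posterior_mean)  # 17/22, about 0.77
```

Each assessment's posterior simply becomes the prior for the next one, which is exactly the "assess in November, update what you had from earlier in the year" workflow.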
Yeah. I mean, a lot of people are also listening to understand what Bayes is all about and how it could help them in their own field. So that's definitely useful: if we have some psychometricians in the audience who haven't tried Bayes yet, well, I'm guessing that would be useful for them. And actually, could you share an example, if you have one, of a research project where Bayesian stats played a crucial role, ideally in uncovering insights that might have been missed otherwise, especially when using traditional stats approaches?
Yeah. I mean, just honestly, a lot of what we do, just estimating the model itself, sounds like it should be trivial. But to do so with a full-information likelihood function is so difficult. I would say almost every single analysis I've done using a multidimensional model has been made possible because of the Bayesian analyses themselves. Again, there are what you could call shortcut methods. I think they are good methods, but again, there are people, like I mentioned, doing sort of a hybrid marginal maximum likelihood. There are what we would call limited-information approaches that you might see in programs like Mplus, or there's an R package named lavaan that does such things. But those only use functions of the data, not the full data themselves. I mean, it's still good, but I have this sense that the full likelihood is what we should be using.

So to me, just a simple example: I was working this morning with a four-dimensional assessment, you know, a 20-item test, kids in schools. And I would have a difficult time trying to estimate that with a full maximum likelihood method. So Bayes made that possible. But beyond that, what if we ever want to do something with the test scores afterwards, right? Now we have a bunch of Markov chains of people's scores themselves. This makes it easy not to forget that these scores are not measured perfectly, and to take a posterior distribution and use it in a secondary analysis as well.

So I was doing some work with one of the Persian Gulf states, where they were working on a vocational interest survey. And some of the classical methods for this sort of thing disregarded any error whatsoever. They basically said: oh, you're interested in, I don't know, artistic work, or numeric work of some sort. And they would just tell you: oh, that's it, that's your story. I don't know if you've ever taken one of those; what are you going to do in a career? You're a high school student and you're trying to figure this out. But if you allow that error to propagate, the way Bayesian methods make very easy to do, you'll see that while that may be the most likely choice of what you're interested in, or the dimensions that may be most salient to your interests, there are many other choices that may be close to it as well. And that would be informative as well, too. We sort of forget; we sort of overstate how certain we are in results. And I think a lot of the Bayesian methods are built around that.

That was actually one project where I did write my own algorithm to try to estimate these things, because it was just a little more streamlined. But rather than telling a high school student, hey, you're best at artistic things, what we could say is: hey, you may be best at artistic things, but really close to that is something that's numeric, you know, something along those lines. So: while you're strong at art, you're really strong at math too; maybe you should consider one of these two, rather than just going down a path that may or may not really reflect your interests. Hope that's a good example.
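As a reader aside, here is a toy version of that "second-best interest" point: with posterior draws of a student's scores in hand, you can report the probability that each dimension is actually their strongest, instead of a single winner (all numbers are invented for illustration, not from the survey Jonathan describes):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior draws of one student's scores on two interest
# dimensions ("artistic", "numeric"), as an MCMC run would produce.
# Shape: (n_draws, n_dims).
draws = rng.multivariate_normal(
    mean=[1.0, 0.9],                 # artistic only barely ahead of numeric
    cov=[[0.3, 0.1], [0.1, 0.3]],
    size=10_000,
)

# Instead of reporting only the argmax of a point estimate, report the
# posterior probability that each dimension is the student's strongest.
top = draws.argmax(axis=1)
p_artistic = float((top == 0).mean())
p_numeric = float((top == 1).mean())
print(p_artistic, p_numeric)  # both substantial: "artistic" is not a sure call
```

A report like "artistic with probability 0.56, numeric with probability 0.44" tells the student the ranking is genuinely uncertain, which is exactly the information a point-estimate report throws away.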
Yeah. Yeah, definitely. Yeah, thanks. And I understand how that would be useful, for sure. And I'm curious about the role of priors in all that, because that's often something that puzzles beginners. You obviously have a lot of experience in the Bayesian way of life in your field, so I'm guessing that you kind of teach the way to do psychometric analysis in the Bayesian framework to a lot of people. I'm curious especially about the prior side, and if there are other interesting things that you would like to share on that, feel free. My question is on the priors: how do you approach the challenge of choosing appropriate prior distributions, especially when you're dealing with complex models?
464
Great question. And I'm sure each field does it a little bit differently, as it probably should, because each field has its own data and models and already established scientific knowledge. So that's my way of saying: this is my approach, and I'm not 100% confident that it's the approach everybody should take.

But let me back it up a little bit. Generally speaking, many of the students I teach end up in the industry for educational measurement here in the United States. We usually denote our score parameters with theta, and I like to go around saying that I'm teaching you how to sell thetas. That's sort of what they do in a lot of these industry settings: they're selling test scores. So if you think that that's what you're trying to do, I think that guides, for me, a set of prior choices that try to do the least amount of speculation.
So here's what I mean by that. If you look at a measurement model, like an item response model, there's a set of parameters to it. One parameter in particular, in item response theory, we call the discrimination parameter; in factor analysis, we call it a factor loading; in linear regression, it would be a slope. This parameter tends to govern the extent to which an item relates to the latent variable. The higher that parameter is, the more that item relates. Then when we apply Bayes' theorem to get a point estimate of a person's score, or a posterior distribution of that person's score, the contribution of that item is largely reflected by the magnitude of that parameter. The higher the parameter, the more weight that item has on the distribution, and the more we think we know about a person.

So when I look at setting prior choices, what I try to do for that parameter is to set a prior centered at zero, mostly, so that the data do more of the job than the prior, particularly if the score has a big meaning to somebody. Well, in the United States the assessment culture is a little bit out of control, but you know, we have to take tests to go to college, we have to take tests to go to graduate school, and so forth. And of course, if you go and work in certain industries, there are assessments for licensure, right? For instance, I come from a family of nurses, a very noble profession, but to be licensed as a nurse in California, you have to pass an exam. We want to provide a score for that exam that reflects as much of the data as possible, not a prior choice. And there are ways people can use priors that aren't necessarily to the benefit of empirical science; you can put too much subjective weight onto them.
So when I talk about priors, I try to talk about the ramifications of the choice of prior on certain parameters. For that discrimination parameter or slope, I tend to want the data to force it to be further away from zero, because then I feel like I'm being more conservative. For the rest of the parameters, I tend not to use heavy priors in what I do; I tend to use some very uninformative priors unless I have to.

And then the most complicated prior for what we do, and the one that has historically caused the biggest challenge, although I think it's in a relatively good place these days thanks to research and science, is the prior that goes on a covariance or correlation matrix. That had been incredibly difficult to estimate back in the day. But now things are much, much easier with modern computing, and with modern priors actually.
Yeah, interesting. Would you like to walk us a bit through that? What are you using these days as priors on correlation or covariance matrices? Because, yeah, I teach those also, and I love it.

Basically, if you're using, for instance, a linear regression and want to estimate not only the effects of the predictors on the outcome, but also the correlation between the predictors themselves, and then use that additional information to make even better predictions on the outcome, you would, for instance, use a multivariate normal prior on the slopes of your linear regression. And a multivariate normal needs a covariance matrix. So what priors do you use on that covariance matrix? That's basically the context for people. Now, Jonathan, take it from there: what are you using in your field these days?
Yeah, so going with your example, I have no idea. You know, if you have a set of regression coefficients that you say are multivariate normal, yes, there is a place for a covariance in the prior. I never try to speculate what that is. I don't think I have the human judgment it takes to figure out what your prior belief is for that. I think you're talking about what would be analogous to the asymptotic covariance matrix: the posterior distribution of these parameters, where you look at the covariance between them, is like the asymptotic covariance matrix in maximum likelihood, and we just rarely ever speculate off the diagonal, it seems, on that. I mean, there are certainly uses for linear combinations and whatnot, but that's tough.

I'm more thinking about when I have a handful of latent variables to estimate; now the problem is I need a covariance matrix between them, and they're likely to be highly correlated, right? So...
In our field, we tend to see correlations of psychological variables that are 0.7, 0.8, 0.9. These are all academic skills, in my field, that are coming from the same brain. The child has a lot of reasons why those are going to be highly correlated. And so these days, I love the LKJ prior for that. It makes it easy to put a prior on a correlation matrix and then, if you want, to rescale it.

That's one of the other weird features of the psychometric world: because these variables don't exist, to estimate a covariance matrix we have to put certain constraints on some of the item parameters of the measurement model, for instance. If we want to estimate a variance for the factor, we have to fix one of the discrimination parameters to a value; otherwise, it's not identified. In the work we do for calibration, when we're trying to build scores or build assessments and their data, we instead fix the variance of a factor to one. We standardize the factor: mean zero, variance one, a very simple idea. The models are equivalent in a classical sense, in that the likelihoods are equivalent whichever way we do it. When we put priors on them, the posteriors aren't entirely equivalent, but that's a typical Bayesian issue with transformations.

In the sense where we want a correlation matrix, prior to the LKJ prior there were all these things one of my mentors, Rod McDonald, called devices: little hacks or tricks that we would do to keep a covariance matrix sampleable. Think about it statistically: a lot of methods rely on rejection sampling, so if you were to propose a covariance or correlation matrix, it has to be positive semi-definite, and that's a hard requirement; you have to make sure every correlation is bounded, and so forth. But the LKJ prior takes care of almost all of that for me in a way that allows me to just model the correlation matrix directly, which has really made life a lot easier when it comes to estimation.
Yeah, I mean, I'm not surprised it does. That is also the kind of prior I tend to use personally, and that I teach also. In this linear regression example, for instance, I'd probably end up using an LKJ prior on the slopes of the predictors.

And for people who have never used the LKJ prior: it works on a decomposition of the covariance matrix so that we can actually sample it. Otherwise, it's extremely hard to sample a covariance matrix directly. In practice it's combined with an algebraic trick, the Cholesky decomposition of the covariance matrix, which allows us to sample the Cholesky factor instead of the full covariance matrix, and that helps the sampling.
Thank you. Thank you for putting that out there; I'm glad you brought that up.
And basically, the way you would parametrize that, for instance in PyMC, is with pm.LKJCholeskyCov, and you have to parameterize it with at least three things. First, the number of dimensions: for instance, if you have three predictors, that would be n equals 3. Then the standard deviation you're expecting on the slopes of the linear regression; that's something you're used to, right? If you're using a normal prior on a slope, its sigma is just the standard deviation you're expecting on that effect for your data and model. And then you have to specify a prior on the correlation of these slopes, and that's where you get into the covariance part. That parameter is called eta in PyMC's LKJ prior. The bigger eta is, the more suspicious of high correlations your prior will be. If eta equals 1, you're basically expecting a uniform distribution over correlations: minus 1, 1, and 0 all have the same prior weight. And if you go to eta equals 8, for instance, you put much more prior weight on correlations close to zero, with most of the mass between minus 0.5 and 0.5, and the prior would be very suspicious of very big correlations, which I guess makes a lot of sense in, for instance, social science. I don't know in your field, but yeah.
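The eta behavior described here can be sketched numerically (illustrative code, not from the episode; `lkj_marginal` is a hypothetical helper name). Under an LKJ(eta) prior on a d x d correlation matrix, the implied marginal distribution of each single correlation is a Beta(a, a) distribution stretched onto (-1, 1) with a = eta - 1 + d/2, so for d = 2 the density is proportional to (1 - r^2)^(eta - 1): flat when eta = 1, and concentrated near zero for large eta.

```python
import numpy as np

def lkj_marginal(r, eta, d=2):
    """Unnormalized marginal density of one correlation coefficient under an
    LKJ(eta) prior on a d x d correlation matrix. For d = 2 this reduces to
    (1 - r**2) ** (eta - 1), which is flat when eta = 1."""
    alpha = eta - 1.0 + d / 2.0          # Beta(alpha, alpha) rescaled to (-1, 1)
    return (1.0 - r**2) ** (alpha - 1.0)

r = np.linspace(-0.9, 0.9, 181)
flat = lkj_marginal(r, eta=1.0)          # eta = 1: every correlation equally likely
peaked = lkj_marginal(r, eta=8.0)        # eta = 8: mass pulled toward zero
```

In PyMC itself the corresponding prior would be set via `pm.LKJCholeskyCov`, as mentioned in the conversation.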
I typically use eta equals 1, the uniform setting, at least to start with, but yeah, I think that's a great description. Very good description.

Yeah, I really love these kinds of models because they make linear regression even more powerful. To me, linear regression is so powerful and very underrated. You can go so far with plain linear regression, and often it's hard to really do better; you have to work a lot to do better than a really good linear regression.

I completely agree with you. Yeah, I'm 100% right there. And then you actually get into the quadratic or nonlinear forms that map onto linear regression and make it even more powerful. So yeah, it's absolutely wonderful.
Yeah, yeah. And I mean, as Spider-Man's uncle said, with great power comes great responsibility. So you have to be very careful about the priors when you have all those features, like inverse link functions, because they transform the parameter space; and same thing if you're using a multivariate normal, I mean, that's more complex. So of course you have to think a bit more about your model structure and about your priors. Also, the more structure you add, with the size of the data kept equal, the more risk you have of overfitting, and the less informative power you have per data point, let's say. That means the priors increase in importance, so you have to think about them more. But you get a much more powerful model afterwards, and the goal is to get much more powerful predictions afterwards.

I do agree.

These weapons are hard to wield. They require time and effort. And on my end, I don't know about you, Jonathan, but they also require a lot of caffeine from time to time.
Maybe. Yeah.

I mean, so that's the key. You see how I did the segue? I should have a podcast. It's the first time I do that on the podcast, but there it is. So, I'm a big coffee drinker. I love coffee. I'm a big coffee nerd. But from time to time, I try to decrease my caffeine usage, you know, also because you get some habituation effects. So if I want to keep the caffeine-shot effect, well, I sometimes have to decrease my usage. And funnily enough, when I was thinking about that, a small company called Magic Mind came to me. They sent me an email, they listen to the show, and they were like, hey, you've got a cool show, we'd be happy to send you some bottles for you to try and talk about on the show. And I thought that was fun. So I got some Magic Mind myself and drank it, and I also got Magic Mind to send some samples to Jonathan. So if you are watching the YouTube video, Jonathan is going to try the Magic Mind right now, live. So yeah, take it away, Jon.
Yeah, this is interesting, because you reached out to me for the podcast and I had not met you, but you know, it's a conversation, it's a podcast, you do great work, so I'll say yes to that. Then you said, how would you like to try Magic Mind? And I thought, having been a psych major as an undergraduate, this is an interesting social psychology experiment, where a random person from the internet says, hey, I'll send you something. So I thought there's a little bit of safety in drinking it in front of you while we're talking on the podcast. But of course, I know you can cut this out if I hit the floor. So here it comes.

So you're drinking it like a shot?

Yeah, I decided to drink it like a shot, if you will. It actually tasted much better than I expected. It came in a green bottle. It tasted tangy, so very good. And now the question will be whether my answers to your questions get better by the end of the podcast; then we have a nice experiment. But no, I noticed it has a bit of caffeine, certainly less than a cup of coffee. But at the same time, it doesn't seem offensive whatsoever. Yeah, that's pretty good.
Yeah, I mean, I'm still drinking caffeine, if that's all right. But yeah, from time to time, I like to drink it.

My habituation, my answer to that, is just to drink more.

That's fine. Yeah, exactly. Oh yeah, and decaf and stuff like that. But yeah, I love the idea, the product is cool. I liked it. So I was like, yeah, I'm going to give it a shot. And the way I drank it was also basically making myself a latte: instead of coffee, I would use the Magic Mind and then put in my milk and the milk foam. And that is really good, I have to say.

See how that works. Yeah.
So it's based on, I mean, the thing you taste most is the matcha, I think, and that's what gives it the green color. Usually I'm not a big fan of matcha, but I have to say, I really appreciated it.

You and me both, I was feeling the same way. When I saw it come in the mail, I was like, ooh, that added to my skepticism, right? I'm trying to be a good scientist. But yeah, it actually surprised me: it tasted more like a juice, like a citrus juice, than matcha. So it was much nicer than I expected.

Yeah, I love that, because me too, I'm obviously extremely skeptical about all this stuff. So I like doing that. It's way better, way more fun, to do it with you or any other nerd from the community than with regular people from the street, because I'm way too skeptical for them. They wouldn't even understand my skepticism.

I agree. I feel like in a scientific community, and I've seen some of the people you've had on the podcast, we're all a little bit skeptical about what we do. So I could bring that skepticism here and feel at home, hopefully. I'm glad that you allowed me to do that. Yeah.
And that's the way of life. Thanks for trusting me, because I agree that, seen from a third-party observer, you'd be like, that sounds like a scam. That guy is just inviting me on to sell me something. In a week, he's going to send me an email to tell me he's got some financial troubles and I have to wire him $10,000.

Waiting for that, or, what level of paranoia do I have this morning? I was like, well, who are my enemies, and who really wants to do something bad to me? Right? I don't believe I'm at that level, so I don't think I have anything to worry about. It seems like a reputable company. So it was amazing.

Yeah. No, that was good. Thanks a lot, Magic Mind, for sending me those samples, that was really fun. Feel free to give it a try, other people, if that sounded like something you'd be interested in. And if you have any other products to send me, send them to me, I mean, that sounds fun. I'm not gonna say yes to everything, you know, I have standards on the show, especially scientific standards. But you can always send me something, and I will always analyze it.

You know, somehow you could work out an agreement with the World Cup, right? Some World Cup tickets for next time.

True. That would be nice. True. Yeah, exactly.
Awesome. Well, what we just did is actually kind of related, I would say, to another aspect of your work, and that is model comparison. It's again a topic that students ask about a lot, especially when they come from the classical machine learning framework, where model comparison is just everywhere. So often they ask how they can do that in the Bayesian framework. As usual, I am always skeptical about just doing model comparison and picking your model based on a single statistic. I always say there is no one magic bullet in the Bayesian framework, where model comparison says that for sure this is the best model; I wouldn't say that's how it works. You need a collection of different indicators, including, for instance, LOO, which tells you, yeah, that model is better. But not only that: what about the posterior predictions? What about the model structure? What about the priors? What about just the generative story of the model? But talking about model comparison, what can you tell us, Jonathan, about some best practices for carrying out effective model comparisons?
Gauging best practice is hard; I'll just give you what my practice is, and I'll make no claim that it's best. It's difficult, and I think you hit on all the aspects of it in introducing the topic. If you have a set of models that you're considering, the first thing I like to think about is not the comparison between them as much as how each model would fit the data set absolutely. Posterior predictive model checking, in a Bayesian sense, is where a lot of the work for me is really focused.
Interestingly, what you choose to check against is a bit of a challenge, particularly in certain fields in psychometrics, at least the ones I'm familiar with. First of all, model fit is a well-researched area in psychometrics in general. Really, there are millions of papers from the 1980s, maybe not millions, but it seems like that many, and it's always been something that people have studied. I think recently there's been a resurgence of new ideas in it as well. So it's well-covered territory in the psychometric literature. It's less well covered, at least in my view, in Bayesian psychometrics.

So what I've tried to do in my work, to see if a model fits absolutely, is to look at, well, one of the complicating factors is that a lot of my data is discrete: correct and incorrect scored items. In that sense, in the last 15 or 20 years, there's been some good work in the non-Bayesian world about how to use what we call limited-information methods to assess model fit, instead of looking at model fit to the entire contingency table. If you have a set of binary data, let's say 10 variables that you've observed, technically you have 2 to the 10th, or 1,024, different probabilities for the permutations of ways they could be zeros and ones, and model fit should be built toward that 1,024-vector of probabilities. Good luck with that, right? You're not going to collect enough data to do that.
And so what a group of scientists, Alberto Maydeu-Olivares, Li Cai, and others, have created are model fit statistics for lower-level contingency tables: each marginal moment of the data, each mean effectively, and then a two-way table between all pairs of observed variables.
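Those lower-level summaries can be sketched in a few lines (illustrative code, not from the episode; the item responses here are simulated, and the variable names are made up):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(200, 10))   # simulated: 200 examinees, 10 binary items

# The full contingency table has 2**10 = 1024 cells; limited-information
# methods summarize fit with univariate margins and pairwise two-way tables.
margins = data.mean(axis=0)                 # one proportion-correct per item

pair_tables = {}
for i, j in combinations(range(data.shape[1]), 2):
    table = np.empty((2, 2))
    for a in (0, 1):
        for b in (0, 1):
            # Joint proportion of examinees answering item i with a and item j with b
            table[a, b] = np.mean((data[:, i] == a) & (data[:, j] == b))
    pair_tables[(i, j)] = table             # 10 choose 2 = 45 pairwise tables
```

Fit statistics such as M2 are then built from these margins and pairwise tables rather than from all 1,024 cell probabilities.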
In work that I've done with a couple of students recently, we've tried to replicate that idea, but in a more Bayesian sense. In the non-Bayesian world this is called an M2 statistic. Could we come up with a version of a posterior predictive check for what a model says the two-way tables should look like? And then, alongside that, could we create a model that we know saturates those tables? So for instance, if we have 10 observed variables, we could create a model that has all 10-choose-2 two-way tables estimated as what we would expect to be perfect. Of course, there are posterior distributions involved, but you would expect, with plenty of data and very diffuse priors, that you would get point estimates, EAP estimates, right about where the observed frequencies of the data are. A quick check.

So the idea is that now we have two models, one of which we know should fit the data absolutely, and one of which we're wondering whether it fits. Now the comparison comes together: we have these two predictive distributions, so how do we compare them? And that's where we've taken different approaches. One of those is simply looking at the distributional overlap. We tried to calculate, using the Kolmogorov-Smirnov statistic, where, moment-wise or percentile-wise, the distributions overlap, because if your model's predictive distribution overlaps with what you think the data should look like, you think the model fits well. And if it doesn't, they should be far apart and it won't fit well. That's how we've been trying to build it.
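A toy version of that overlap check might look like the following (simulated draws stand in for the two posterior predictive distributions; `scipy.stats.ks_2samp` computes the two-sample Kolmogorov-Smirnov distance):

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
# Simulated posterior predictive draws of some fit summary (e.g. a cell
# proportion from a two-way table) under each model:
saturated = rng.normal(0.30, 0.02, size=2000)   # reference model, known to fit
candidate = rng.normal(0.31, 0.02, size=2000)   # model being checked

distance, _ = ks_2samp(saturated, candidate)
# A small KS distance means the two predictive distributions overlap heavily,
# i.e. the candidate reproduces the summary about as well as the saturated model.
```

In the actual workflow described above, the draws would come from each model's posterior predictive distribution rather than from normal distributions.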
It's weird, because it is a model comparison, but one of the models being compared we know to be what we call saturated: it should fit the data best, and all the other models should be subsumed into it. So that's the approach I've taken recently, posterior predictive checks followed by a model comparison. We could have used, as you mentioned, the LOO statistic, and maybe that's something we should look into also. We haven't yet, but one of my recent graduates, Jihong Zhang, a new assistant professor at the University of Arkansas here in the United States, has done a lot of work on this in his dissertation and other studies.

So that's sort of the approach I take. The other thing I want to mention, though, is that when you're comparing amongst models, you have to establish that absolute fit first. The way I envision this is that you compare each model to this sort of saturated model, you do that for multiple versions of your models, and then you effectively choose amongst the set of models that sort of fit. But what that absolute fit is, like you mentioned, is nearly impossible to tell exactly. There are a number of ideas that go into what makes for a good-fitting model.
Yeah. And I definitely encourage people to go take a look at the LOO paper; I will put a link to it in the show notes. Also, if you're using ArviZ, whether in Julia or Python, we do have an implementation of the LOO algorithm. So comparing your models is obviously extremely simple: it's just a call to compare, and then you can even do a plot of that. And yeah, as you were saying, the LOO score doesn't have any meaning by itself, right? The LOO score of a model doesn't mean anything; it only means something in comparison to other models. So basically, you have a baseline model that you think is already good enough, and then all the other models have to be compared to that one, which could be like the placebo, if you want, or the already existing solution. Any model that's more complicated than that should be in competition with that one and should have a reason to be used, because otherwise, why are you using a more complicated model if you could just use a simple linear regression? That's what I use most of the time for my baseline model. Just use a simple linear regression as the baseline, then do all the fancy modeling you want, and compare that to the linear regression, both in predictions and with the LOO algorithm. And well, if there is a good reason to make your life more difficult, then use it. But otherwise, why would you?
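As a deliberately simplified, non-Bayesian stand-in for that workflow (in practice you would fit both models with log-likelihoods stored and call `arviz.compare` on them; here an in-sample Gaussian log predictive density on toy data just illustrates the baseline-versus-challenger idea, and `gaussian_lpd` is a made-up helper):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-2, 2, size=100)
y = 1.0 + 0.5 * x + 0.2 * x**2 + rng.normal(0.0, 0.3, size=100)

def gaussian_lpd(y, mu):
    """Total Gaussian log predictive density of y around fitted means mu."""
    resid = y - mu
    sigma = resid.std()
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - resid**2 / (2 * sigma**2))

baseline = np.polyfit(x, y, 1)           # simple linear regression baseline
challenger = np.polyfit(x, y, 2)         # fancier quadratic model

lpd_baseline = gaussian_lpd(y, np.polyval(baseline, x))
lpd_challenger = gaussian_lpd(y, np.polyval(challenger, x))
# Only prefer the challenger if it beats the baseline by a margin that
# justifies the added complexity; LOO additionally penalizes overfitting,
# which this in-sample score does not.
```

The point is the comparison discipline, not the specific score: the fancy model has to earn its keep against the simple baseline.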
And actually, talking about these complexities, something I also see is that many practitioners may be hesitant to adopt Bayesian methods because they perceive them as complex. So I'm wondering: what resources or strategies would you recommend to those who want to learn and apply Bayesian techniques in their research, especially in your field of psychometrics?
Yeah. I think starting with an understanding of just the output, the basics. If you have data and your responsibility is providing analysis for it, find either a package or somebody else's program that makes the coding quick. You mentioned linear regression: if you use brms in R, which will translate that into Stan, you can quickly get a Bayesian result. And I've found that the conceptual consideration of what a posterior distribution actually is, is less complex than all the things we're drilled on in the classical methods, like where does the standard error come from, and all these other asymptotic features. In Bayes, it's visible: you can see a posterior distribution, you can plot it, you can almost touch it and feel it, right? It's right there in front of you. So the thing I try to get people to first is just to understand what the outputs are, what the key parts of them are. Then, hopefully, that gives the mental representation of where they're moving toward, and at that point you start to add in all the complexities.

But it is incredibly challenging to teach Bayesian methods, and I actually think the further along a person goes without learning the Bayesian version of things, the harder it gets, because now you have all these well-established, can we say routines, or statistics, that you're used to seeing that are not Bayesian and that may or may not have a direct analog in the Bayes world. But that may not be a bad thing.
Actually, I'm going to take a step back here. Conceptually, I think this is the challenge we face in a program like the one I'm working in right now. I work with nine other tenure-track or tenured faculty, which is a very large program, and we have a long-running curriculum. But the question I like to ask is: what do we do with Bayes? Do we have a parallel track in Bayes? Do we do Bayes in every class? Because that's a heavy lift for a lot of people as well. Right now, I teach the Bayes classes, and occasionally some of my colleagues will put Bayesian statistics in their classes, but it's tough. If I anointed myself king of how we do all the curriculum, I don't know the answer I'd come to; I go back and forth each way. I would love to see what a curriculum looks like that only started with Bayes and only kept it in Bayes, 'cause I think that would be a lot of fun.
And the thought question I ask myself, that I don't have an answer for, is: would that be a better mechanism to get students up to speed on the models they're using than it is in the classical contexts? I don't know.

Yeah. Good point. Two things. First, King of Curriculum, amazing title. I think the position should actually be renamed to that on all campuses around the world.

The world's worst kingdom: the curriculum.

Yeah. I mean, that's really good. Like, you're going to a party, you know, and, so what are we doing, King of Curriculum?

So long as the crown is on the head, that's all that matters, right?

That would drop some jaws for sure. And second, I definitely would like the theory of the multiverse to be true, because that means that in one of these universes there is at least one where Bayesian methods came first. And I am definitely curious to see what that world looks like. What's that world where people were actually exposed to Bayesian methods first, and maybe to frequentist statistics later? Were they actually exposed to frequentist statistics later? That's the question. No, but jokes aside, I would definitely be curious about that.

Yeah, well, I don't know that I'll have that experiment in my lifetime, but maybe in a parallel universe somewhere.

Before we close up the show, I'm wondering if you have a personal anecdote or example of a challenging problem you encountered in your research or teaching related to Bayesian stats, and how you were able to navigate through it?
Yeah.
Speaker:
I mean, maybe it's too much in the weeds,
but that first experience I was in
Speaker:
graduate school trying to learn.
Speaker:
code.
Speaker:
It was coding a correlation matrix of
tetrachore correlations.
Speaker:
And that was incredibly difficult.
Speaker:
One day, one of my colleagues, Bob Henson,
figured it out with the likelihood
Speaker:
function and so forth.
Speaker:
But that was the holdup that we had.
Speaker:
And it's incredible, and I say this
because, again, as I mentioned, I don't
Speaker:
do a lot of my own package coding or
whatnot.
Speaker:
But I think you see a similar phenomenon
if you misspecify something in your model
Speaker:
in general and you get results and the
results are either all over the place or
Speaker:
spread across the entire number line.
Speaker:
For me, the correlation's posterior
distribution looked like a uniform
Speaker:
distribution from negative one to one.
Speaker:
That's a bad thing to see,
right?
Speaker:
So the anecdote I have with this is, I
guess, less like, oh, Bayes
Speaker:
did this and we
couldn't have done it otherwise, and more
Speaker:
the perseverance that goes into
sticking with the Bayesian side, which is,
um, Bayes also provides you the ability to
Speaker:
check a little bit of your work to see if
it's completely gone sideways.
Speaker:
Right.
Speaker:
So, uh, you see a result like that.
Speaker:
You have that healthy dose of skepticism.
Speaker:
You start to investigate more. In my case,
it took a couple of years of my
Speaker:
life, uh, working in concert with other
people, uh, as grad students, but, um,
Speaker:
once it was fixed, it was almost obvious
that it was.
Speaker:
I mean, it was, you went from this uniform
distribution across negative one to one to
Speaker:
something that looked very much like a
posterior distribution that we're used to
Speaker:
seeing, centered around a certain value of the
correlation.
Speaker:
And again, it was, for us, it was figuring
out what the likelihood was, but for most
Speaker:
packages, at least that's not a big deal.
Speaker:
I think it's already specified in your
choice of model and prior.
Speaker:
But at the same time, just remembering
that
Speaker:
Uh, it's sort of the, the frustration part
of it, not making it work is actually
Speaker:
really informative.
Speaker:
Uh, you get that and you, you can build
and you can sort of check your work if you
Speaker:
go forward analytically.
Speaker:
I mean, not analytically, but brute force, the
sampling part; that's sort of a check
Speaker:
on your work.
Speaker:
I'm trying to say, so not a great example,
not a super inspiring example, but more:
Speaker:
perseverance pays off in Bayes and in life.
Speaker:
So it's sort of the analog that I get from
it.
Speaker:
Yeah.
Speaker:
Yeah, no, for sure.
Speaker:
I mean, um,
Speaker:
perseverance is so important because
you're definitely going to encounter
Speaker:
issues.
Speaker:
I mean, none of your models is going to
work as you thought it would.
Speaker:
So if you don't have that drive and that
passion for the thing that you're
Speaker:
studying, it's going to be extremely hard
to just get it through the finish line
Speaker:
because it's not going to be easy.
Speaker:
So, you know, it's like choosing a new
sport.
Speaker:
If you don't like what the sport is all
about, you're not going to stick with it
Speaker:
because it's going to be hard.
Speaker:
So that perseverance, I would say, comes
from your curiosity and your passion for
Speaker:
your field and the methods you're using.
Speaker:
And the other thing I was going to add,
this is tangential, but let me just add
Speaker:
it: if you have the chance to go visit
Bayes' grave in London, take it.
Speaker:
I got to do that last summer.
Speaker:
I just, I was in London, I had my children
with me and we all picked some spot we
Speaker:
wanted to go to.
Speaker:
And I was like, I'm going to go find and
take a picture in front of Bayes' grave.
Speaker:
And it sort of brought up an interesting
question.
Speaker:
Like I don't know the etiquette of taking
photographs in front of a deceased person's grave
Speaker:
site.
Speaker:
This is at least providing it.
Speaker:
But then ironically, as you're sitting
there, as I was sitting there on the tube,
Speaker:
leaving, I sat next to a woman and she had
Bayes' theorem on her shirt.
Speaker:
It was the Bayes School of Economics,
Speaker:
something like that,
Speaker:
in London. I was like, okay,
I have reached the Mecca.
Speaker:
Like the perseverance led to a
trip, you know, my own version of the trip
Speaker:
to London.
Speaker:
Uh, but definitely, uh, definitely worth
the time to go.
Speaker:
If you want to be surrounded, uh, once you
reach that, that level of perseverance,
Speaker:
uh, you're part of the club and then you
can do things like that.
Speaker:
Find vacations around, you know, holidays
around Bayes, Bayes' grave.
Speaker:
Yeah.
Speaker:
I mean.
Speaker:
I am definitely gonna do that.
Speaker:
Thank you very much for giving me another
idea of a nerd holiday.
Speaker:
My girlfriend is gonna hate me, but she
always wanted to visit London, so you
Speaker:
know, that's gonna be my bait.
Speaker:
It's not bad to get to, it's off of Old
Street, you know, actually well marked.
Speaker:
I mean the grave site's a little
weathered, but it's in a good spot, a good
Speaker:
part of town, so you know, not really
heavily touristy, amazingly.
Speaker:
Oh yeah, I'm guessing.
Speaker:
But you know.
Speaker:
I am guessing that's a good thing.
Speaker:
Yeah, no, I already know how I'm gonna ask
her.
Speaker:
Honey, when do we go to London?
Speaker:
Perfect.
Speaker:
Let's go to Bayes'.
Speaker:
Let's go check out Bayes' grave.
Speaker:
Yeah, I mean, that's perfect.
Speaker:
That's amazing.
Speaker:
So hey, I mean, you should send me that
picture and that should be your picture
Speaker:
for this episode.
Speaker:
I always take a picture from guests to
illustrate the episode icon, but you
Speaker:
definitely need that
Speaker:
picture for your icon.
Speaker:
I can do that.
Speaker:
I'll be happy to.
Speaker:
Yeah.
Speaker:
Awesome.
Speaker:
Definitely.
Speaker:
So before asking you the last two
questions, I'm just curious how you see
Speaker:
the future of Bayesian stats in the context
of psychological sciences and
Speaker:
psychometrics.
Speaker:
And what are some exciting avenues for
research and application that you envision
Speaker:
in the coming years or that you would
really like to see?
Speaker:
Oh, that's a great question.
Speaker:
Terrible.
Speaker:
So I, you know, interestingly, in
psychology, you know, quantitative
Speaker:
psychology has sort of been on a downhill
swing for, I don't know, 50 or 60 years;
Speaker:
there's fewer and fewer programs, at least
in the United States, where people are
Speaker:
training.
Speaker:
But despite that, I feel like the use of
Bayesian statistics is up in a lot of
Speaker:
different other areas.
Speaker:
And I think that affords a
bit
Speaker:
better model-based science.
Speaker:
So you have to specify a model, you have
a model in mind, and then you go and do
Speaker:
that.
Speaker:
I think that benefit makes the science
much better.
Speaker:
You're not just using sort of what's
always been done.
Speaker:
You can sort of push the envelope
methodologically a bit more.
Speaker:
And Bayesian
statistics, in one way, another benefit of
Speaker:
them is now you can code an algorithm that
likely will work without having to know,
Speaker:
like you said, all of the underpinnings,
the technical side of things, you can use
Speaker:
an existing package to do so.
Speaker:
I like to say that that's going to
continue to make science a better
Speaker:
practice.
Speaker:
I think the fear that I have is sort of
the sea of large language model-based
Speaker:
versions of what we're doing in machine
learning, artificial intelligence.
Speaker:
But I will be interested to see how we
incorporate a lot of the Bayesian ideas,
Speaker:
Bayesian methods into that as well.
Speaker:
I think that there's potential.
Speaker:
Clearly, people are doing this, I mean,
that's what runs a lot of what is
Speaker:
happening anyway.
Speaker:
So I look forward to seeing that as well.
Speaker:
So I get a sense that what we're talking
about is really what may be the foundation
Speaker:
for what the future will be.
Speaker:
I mean, maybe we will, maybe instead of
that parallel universe, if we could come
Speaker:
back or go into the future just in our own
universe in 50 years, maybe what we will
Speaker:
see is curriculum entirely on Bayesian
methods.
Speaker:
And, you know, I just looked at your
Speaker:
topic list; you were recently talking about
variational inference and so forth.
Speaker:
The use of that in very large models
themselves, I think that is very important
Speaker:
stuff.
Speaker:
So it may just be the thing that crowds
out everything else, but that's
Speaker:
speculative and I don't make a living
making predictions, unfortunately.
Speaker:
So that's the best I can do.
Speaker:
Yeah.
Speaker:
Yeah, yeah.
Speaker:
I mean, that's also more of a wishlist
question.
Speaker:
So that's all good.
Speaker:
Yeah.
Speaker:
Awesome.
Speaker:
Well, John, amazing.
Speaker:
I learned a lot.
Speaker:
We covered a lot of topics.
Speaker:
I'm really happy.
Speaker:
But of course, before letting you go, I'm
going to ask you the last two questions I
Speaker:
ask every guest at the end of the show.
Speaker:
Number one, if you had unlimited time and
resources:
Speaker:
Which problem would you try to solve?
Speaker:
Well, I would be trying to figure out how
we know what a student knows every day of
Speaker:
the year so that we can best teach them
where to go next.
Speaker:
That would be it.
Speaker:
Right now, there's not only the problem of
the technical issues of estimation,
Speaker:
there's also the problem of how do we best
assess them, how much time do they spend
Speaker:
doing it and so forth.
Speaker:
That to me is what I would spend most of
my time on.
Speaker:
That sounds like a good project.
Speaker:
I love it.
Speaker:
And second question, if you could have
dinner with any great scientific mind,
Speaker:
dead, alive, or fictional,
Speaker:
who would it be?
Speaker:
All right.
Speaker:
I got a really obscure choice, right?
Speaker:
It's not like I'm picking Einstein or
anything.
Speaker:
I really, I have like two actually, I've
sort of debated.
Speaker:
One is economist Paul Krugman, who writes
for the New York Times, works at City
Speaker:
University of New York now.
Speaker:
You know, Nobel laureate.
Speaker:
Loved his work; his understanding
of the interplay between model and
Speaker:
data is fantastic.
Speaker:
So I would just
Speaker:
sit there and just listen to
everything he had to say, I think.
Speaker:
The other is, again, an obscure
thing.
Speaker:
One of my things I'm fascinated by is
weather and weather forecasting.
Speaker:
Uh, you know, I'm in education and
psychological measurement.
Speaker:
Uh, and there's a guy who started the
company called Weather Underground.
Speaker:
His name is Jeff Masters.
Speaker:
Uh, you can read his work on a blog at
Yale these days, Yale Climate Connections,
Speaker:
something along those lines.
Speaker:
Anyway, he has since sold the company, but
he's fascinating on modeling, you know.
Speaker:
Right now we're in the peak of hurricane
season in the United States.
Speaker:
We see these storms coming off of Africa
or spinning up everywhere and sort of the
Speaker:
interplay between, unfortunately, the
climate change and then other atmospheric
Speaker:
dynamics.
Speaker:
This just makes for an incredibly complex
system that's just fascinating and how
Speaker:
science approaches prediction there.
Speaker:
So I find that to be great.
Speaker:
But those are the two.
Speaker:
I had to think a lot about that because
there's so many choices, but those two
Speaker:
people are the ones I read the most,
certainly when it's not just in my field.
Speaker:
Nice.
Speaker:
Yeah, sounds fascinating.
Speaker:
And weather forecasting is definitely
incredible.
Speaker:
Also, because the great thing is you have
feedback every day.
Speaker:
So that's really cool.
Speaker:
You can improve your predictions.
Speaker:
Like the missing data problem.
Speaker:
You can't sample every part of the
atmosphere.
Speaker:
So how do you incorporate that into your
analysis as well?
Speaker:
No, that's incredible.
Speaker:
Model averaging and stuff.
Speaker:
Anyway, yeah.
Speaker:
Yeah, that's also a testimony to the power
of modeling and parsimony, you know, where
Speaker:
it's like, because I worked a lot on
electoral forecasting models and, you
Speaker:
know, the classic way people dismiss models
in these areas is:
Speaker:
Well, you cannot really predict what
people are going to do at an individual
Speaker:
level, which is true.
Speaker:
I mean, you cannot, people have free will,
you know, so you cannot predict at an
Speaker:
individual level what they are going to
do, but you can.
Speaker:
quite reliably predict what masses are
going to do.
Speaker:
Yeah, basically, with the aggregation of
individual points, you can actually kind
Speaker:
of reliably do it.
Speaker:
And so the power of modeling here is where
you get something that, yeah, you know,
Speaker:
it's not perfect,
Speaker:
you know, the model is wrong, but it
works because it simplifies
Speaker:
things, but doesn't simplify them to a
point where it doesn't make sense anymore.
Speaker:
Kind of like the standard model in
physics, where we know it doesn't work, it
Speaker:
breaks at some point, but it does a pretty
good job of predicting a lot of phenomena
Speaker:
that we observe.
Speaker:
So, which do you prefer?
Speaker:
Is it free will or is it random error?
Speaker:
Well, you have to come back for another
episode on that because otherwise, yes.
Speaker:
That's a good one.
Speaker:
Good point.
Speaker:
Nice.
Speaker:
Well, Jonathan, thank you so much for your
time.
Speaker:
As usual, I will put resources and a link
to your website in the show notes for
Speaker:
those who want to dig deeper.
Speaker:
Thank you again, Jonathan, for taking the
time and being on this show.
Speaker:
Happy to be here.
Speaker:
Thanks for the opportunity.
Speaker:
It was a pleasure to speak with you and I
hope it makes sense for a lot of people.
Speaker:
Appreciate your time.