Learning Bayesian Statistics

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!

Visit our Patreon page to unlock exclusive Bayesian swag 😉

Takeaways:

  • Designing experiments is about optimal data gathering.
  • The optimal design maximizes the amount of information.
  • The best experiment reduces uncertainty the most.
  • Computational challenges limit the feasibility of BED in practice.
  • Amortized Bayesian inference can speed up computations.
  • A good underlying model is crucial for effective BED.
  • Adaptive experiments are more complex than static ones.
  • The future of BED is promising with advancements in AI.

Chapters:

00:00 Introduction to Bayesian Experimental Design

07:51 Understanding Bayesian Experimental Design

19:58 Computational Challenges in Bayesian Experimental Design

28:47 Innovations in Bayesian Experimental Design

40:43 Practical Applications of Bayesian Experimental Design

52:12 Future of Bayesian Experimental Design

01:01:17 Real-World Applications and Impact

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan, Francesco Madrisotti, Ivy Huang and Gary Clarke.

Links from the show:

Transcript

This is an automatic transcript and may therefore contain errors. Please get in touch if you’re willing to correct them.

Speaker:

Today I am delighted to host Desi Ivanova, a distinguished research fellow in machine

learning at the University of Oxford.

2

00:00:13,564 --> 00:00:24,414

Desi's fascinating journey in statistics has spanned from quantitative finance to the

frontiers of Bayesian experimental design, or BED for short.

3

00:00:24,414 --> 00:00:28,620

In our conversation, Desi dives into the deep

4

00:00:28,620 --> 00:00:33,094

world of BED where she has made significant contributions.

5

00:00:33,094 --> 00:00:43,974

She begins by elucidating the core principles of experimental design, discussing both the

theoretical underpinnings and the complex computational challenges that arise in its

6

00:00:43,974 --> 00:00:44,595

application.

7

00:00:44,595 --> 00:00:54,478

Desi shares insights into the innovative solutions she's developed to make BED more

practical and applicable in real-world scenarios, particularly

8

00:00:54,478 --> 00:00:58,618

highlighting its impact in sectors like healthcare and technology.

9

00:00:58,618 --> 00:01:07,258

Throughout the discussion, Desi also touches on the exciting future of BED, especially in

light of recent advancements in AI and machine learning.

10

00:01:07,258 --> 00:01:18,258

She reflects on the critical role of real-time decision-making in today's data-driven

landscape and how Bayesian methods can enhance the speed and accuracy of such decisions.

11

00:01:18,258 --> 00:01:22,766

This is Learning Bayesian Statistics, episode 117,

12

00:01:22,766 --> 00:01:26,326

recorded September 26, 2024.

13

00:01:42,886 --> 00:01:51,486

Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods,

the projects and the people who make it possible.

14

00:01:51,486 --> 00:01:52,886

I'm your host.

15

00:01:52,886 --> 00:01:53,766

Alex Andorra.

16

00:01:53,766 --> 00:01:59,988

You can follow me on Twitter at alex_andorra, like the country, for any info

about the show.

17

00:01:59,988 --> 00:02:02,429

Learnbayesstats.com is Laplace to be.

18

00:02:02,429 --> 00:02:08,050

Show notes, becoming a corporate sponsor, unlocking Bayesian merch, supporting the show on

Patreon.

19

00:02:08,050 --> 00:02:09,611

Everything is in there.

20

00:02:09,611 --> 00:02:11,521

That's Learnbayesstats.com.

21

00:02:11,521 --> 00:02:22,614

If you're interested in one-on-one mentorship, online courses or statistical consulting,

feel free to reach out and book a call at topmate.io/alex_andorra.

22

00:02:22,614 --> 00:02:25,936

See you around, folks, and best Bayesian wishes to you all.

23

00:02:25,936 --> 00:02:33,040

And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can

help bring them to life.

24

00:02:33,040 --> 00:02:35,021

Check us out at pymc-labs.com.

25

00:02:39,042 --> 00:02:44,367

Hello my dear Bayesians, today I want to welcome our two new patrons in the full posterior

tier.

26

00:02:44,367 --> 00:02:51,113

Thank you so much Ivy Huang and Gary Clarke, your support literally makes this show

possible.

27

00:02:51,114 --> 00:02:51,650

I am

28

00:02:51,650 --> 00:02:56,032

looking forward to interacting with you guys in the LBS Slack channel.

29

00:02:56,032 --> 00:03:01,294

Now, before we start the episode, I have a short story for you guys.

30

00:03:01,455 --> 00:03:11,619

A few years ago, I started learning Bayesian stats by watching all the tutorials I could

find by a teacher I really liked.

31

00:03:11,699 --> 00:03:18,763

That teacher was none other than Chris Fonnesbeck, PyMC's creator and BDFL.

32

00:03:18,763 --> 00:03:21,584

And five years down the road, a very

33

00:03:21,584 --> 00:03:31,264

unpredictable road, I am beyond excited to share that I will now be teaching a tutorial

alongside Chris.

34

00:03:31,264 --> 00:03:36,984

That will happen at PyData New York from November 6 to 8, 2024.

35

00:03:36,984 --> 00:03:40,524

And I would be delighted to see you there.

36

00:03:40,524 --> 00:03:45,614

We will be teaching you everything you need to know to master Gaussian processes with PyMC.

37

00:03:45,614 --> 00:03:50,524

And of course, I will record a few live LBS episodes while I'm there.

38

00:03:50,524 --> 00:03:51,565

But

39

00:03:51,565 --> 00:03:54,115

I'll tell you more about that in the next episode.

40

00:03:54,115 --> 00:04:00,405

In the meantime, you can get your ticket at pydata.org/nyc2024.

41

00:04:00,485 --> 00:04:02,685

I can't wait to see you there.

42

00:04:02,685 --> 00:04:04,838

Okay, on to the show now.

43

00:04:07,906 --> 00:04:12,387

Desi Ivanova, welcome to Learning Bayesian Statistics.

44

00:04:12,727 --> 00:04:14,268

Thank you for having me, Alex.

45

00:04:14,268 --> 00:04:15,688

Pleased to be here.

46

00:04:15,688 --> 00:04:16,388

Yeah, yeah.

47

00:04:16,388 --> 00:04:19,659

Thanks a lot for taking the time, for being on the show.

48

00:04:19,659 --> 00:04:23,510

And thanks to Marvin Schmitt for putting us in contact.

49

00:04:23,670 --> 00:04:32,233

He was kind enough to do that on the BayesFlow Slack, where we interact from time to time.

50

00:04:32,493 --> 00:04:36,620

Today, though, we're not going to talk a lot about amortized Bayesian inference.

51

00:04:36,620 --> 00:04:42,636

We're going to talk mostly about experimental design, Bayesian experimental design.

52

00:04:42,636 --> 00:04:47,060

So BED, or B-E-D, I like the acronym.

53

00:04:47,321 --> 00:04:54,568

But before that, as usual, we'll start with your origin story, Desi.

54

00:04:54,568 --> 00:05:01,044

Can you tell us what you're doing nowadays and also how you ended up working on what

you're working on today?

55

00:05:01,910 --> 00:05:03,291

Yeah, of course.

56

00:05:03,532 --> 00:05:12,469

So broadly speaking, I work in probabilistic machine learning research, where I've worked

on a few different things, actually.

57

00:05:12,469 --> 00:05:16,363

So the audience here would be mostly familiar with Bayesian inference.

58

00:05:16,363 --> 00:05:22,808

So I've worked on approximate inference methods, namely, you know, variational inference.

59

00:05:22,808 --> 00:05:24,279

You mentioned Marvin, right?

60

00:05:24,279 --> 00:05:29,994

So we've actually collaborated with him on some amortized inference work.

61

00:05:30,986 --> 00:05:33,667

I've also done some work in causality.

62

00:05:34,448 --> 00:05:46,292

But my main research focus so far has been in an area called Bayesian experimental design,

as you correctly pointed out, BED for short, a nice acronym.

63

00:05:47,333 --> 00:05:52,295

So BED, Bayesian experimental design was the topic of my PhD.

64

00:05:52,295 --> 00:05:55,636

And yeah, it will be the topic of this podcast episode.

65

00:05:55,636 --> 00:05:58,177

Yeah, really, really keen on discussing.

66

00:05:58,453 --> 00:06:01,014

and very, very close to my heart.

67

00:06:02,015 --> 00:06:03,976

You know, how I ended up here.

68

00:06:04,797 --> 00:06:06,897

That's actually quite random.

69

00:06:07,878 --> 00:06:19,224

So before, before getting into research, right, so before my PhD, I actually worked in

finance for quite a few years as a, as a quantitative researcher.

70

00:06:20,805 --> 00:06:22,446

At some point,

71

00:06:22,894 --> 00:06:33,474

I really started missing sort of the rigor in a sense of, you know, conducting research,

you know, being quite principled about, you know, how we measure uncertainty, how we

72

00:06:33,474 --> 00:06:38,674

quantify robustness of our models and of the systems that we're building.

73

00:06:39,434 --> 00:06:48,094

And right at the height of COVID, I decided to start my PhD back in 2020.

74

00:06:48,854 --> 00:06:49,990

And

75

00:06:50,016 --> 00:06:57,690

Indeed, the area, right, Bayesian experimental design, that was originally not the topic

of my PhD.

76

00:06:58,011 --> 00:07:02,173

I was supposed to work on certain aspects of variational autoencoders.

77

00:07:03,094 --> 00:07:09,397

If you're familiar with these types of models, they're not as popular anymore, right?

78

00:07:09,397 --> 00:07:17,964

So if I had ended up working on variational autoencoders, I guess a lot of my research

would have been, I mean, not wasted, but not as relevant.

79

00:07:17,964 --> 00:07:22,066

not as relevant today as it was, you know, four or five years ago.

80

00:07:22,066 --> 00:07:35,844

And how I ended up working with Bayesian experimental design specifically: basically, I

approached my supervisor a few months before starting my PhD and I said, Hey, can I

81

00:07:35,844 --> 00:07:39,356

read about something interesting to prepare for a PhD?

82

00:07:39,356 --> 00:07:42,538

And he was like, yeah, just start with these papers on Bayesian experimental design.

83

00:07:42,538 --> 00:07:44,399

And that's how it happened.

84

00:07:44,399 --> 00:07:44,820

Really?

85

00:07:44,820 --> 00:07:45,890

Yeah.

86

00:07:47,968 --> 00:07:48,628

Okay, cool.

87

00:07:48,628 --> 00:07:49,769

Yeah, I love these.

88

00:07:49,769 --> 00:08:05,736

I love asking this question because often, you know, with hindsight bias, when you're a

beginner, it's easy to trick yourself into thinking that people who are

89

00:08:05,736 --> 00:08:14,638

experts on a particular topic and know that topic really well, because they did a PhD

on it, like they

90

00:08:14,638 --> 00:08:23,198

have been doing that since they were, I don't know, 18 or even 15, or it was like all

planned and part of a big plan.

91

00:08:23,198 --> 00:08:26,778

But most of the time when you ask people, it was not at all.

92

00:08:26,778 --> 00:08:37,938

And it's the result of experimenting with things and also the result of different people

they have met, and encounters and mentors.

93

00:08:37,938 --> 00:08:42,126

And so I think this is also very valuable to

94

00:08:42,126 --> 00:08:46,146

tell to beginners, because otherwise it can be very daunting.

95

00:08:47,126 --> 00:08:50,175

100%. Yeah, I would 100% agree with that.

96

00:08:50,175 --> 00:08:52,106

And actually experimenting is good.

97

00:08:52,106 --> 00:08:55,646

You know, again, we'll be talking about experimental design, I think.

98

00:08:55,646 --> 00:09:02,826

Yeah, many times, you know, just by virtue of trying something new, you discover, you

know, I actually quite liked that.

99

00:09:02,826 --> 00:09:09,206

And it actually works better, you know, for whatever purpose it might be, it might be your

commute to work, right?

100

00:09:09,206 --> 00:09:10,510

There was this

101

00:09:10,510 --> 00:09:12,370

very interesting research.

102

00:09:12,690 --> 00:09:22,310

You know, when there is like a tube closure, right, if the metro is getting closed, you

know, some people, like 5% of people actually discover an alternative route that actually

103

00:09:22,310 --> 00:09:25,390

is much better for the daily commute.

104

00:09:25,410 --> 00:09:29,770

But they wouldn't have done that had the closure not happened.

105

00:09:29,770 --> 00:09:35,870

So almost like being forced to experiment may lead to actually better outcomes, right?

106

00:09:35,870 --> 00:09:37,950

So it's quite interesting.

107

00:09:38,130 --> 00:09:39,870

Yeah, yeah, no.

108

00:09:39,930 --> 00:09:40,642

I mean,

109

00:09:40,642 --> 00:09:54,792

completely agree with that and that's also something I tell to a lot of people who reach

out to me, you know, wondering how they could start working on Bayesian stats, and often I'm

110

00:09:54,792 --> 00:10:04,279

like, you know, trying to find something you are curious about, interested in and then

start from there because it's gonna be hard stuff and there are gonna be a lot of

111

00:10:04,279 --> 00:10:05,339

obstacles.

112

00:10:05,339 --> 00:10:09,622

So if you're not, you know, really curious about

113

00:10:10,028 --> 00:10:22,028

what you are studying, it's going to be fairly hard to maintain the level of work that you

have to maintain to, in the end, enjoy what you're doing.

114

00:10:22,028 --> 00:10:24,310

So experimenting is very important.

115

00:10:24,310 --> 00:10:25,971

I completely agree.

116

00:10:27,092 --> 00:10:29,754

And actually, do you remember yourself?

117

00:10:30,255 --> 00:10:34,618

So I'm curious, first, how Bayesian is your work?

118

00:10:34,718 --> 00:10:39,872

And also if you remember when you were introduced to Bayesian stats.

119

00:10:42,210 --> 00:10:44,561

When was I introduced to Bayesian stats?

120

00:10:44,561 --> 00:10:49,594

That must have been probably in my undergrad days.

121

00:10:50,295 --> 00:11:01,922

I remember I took some courses on kind of Bayesian data analysis, but then I didn't do any

of that during my time in industry.

122

00:11:04,624 --> 00:11:06,365

Yeah.

123

00:11:06,365 --> 00:11:08,738

And again, as I said, I ended up working on

124

00:11:08,738 --> 00:11:12,320

Bayesian experimental design a little bit randomly.

125

00:11:12,561 --> 00:11:16,825

The work itself is, you know, it does use Bayesian principles quite a lot.

126

00:11:16,825 --> 00:11:20,518

You know, we do Bayesian inference, we do, we start with a Bayesian model, right?

127

00:11:20,518 --> 00:11:22,750

So the modeling aspect is also quite important.

128

00:11:22,750 --> 00:11:29,295

You know, it's very important to have a good Bayesian model for all of these things to

actually make sense and work well in practice.

129

00:11:29,295 --> 00:11:35,380

So I would say overall, the work is quite, quite Bayesian, right?

130

00:11:36,081 --> 00:11:36,781

Yeah.

131

00:11:36,781 --> 00:11:37,342

Yeah.

132

00:11:37,342 --> 00:11:37,782

Yeah.

133

00:11:37,782 --> 00:11:39,062

Yeah, for sure.

134

00:11:39,723 --> 00:11:45,684

So actually, I think that's a good segue to introduce now Bayesian experimental design.

135

00:11:45,684 --> 00:11:55,630

So it's not the first time we talk about it on the show, but it's the first

really dedicated episode about it.

136

00:11:55,630 --> 00:12:05,754

So could you introduce the topic to our listeners and basically explain and define what

Bayesian experimental design is?

137

00:12:07,020 --> 00:12:08,250

Yeah, of course.

138

00:12:09,211 --> 00:12:14,074

So can I actually take a step back and talk a little bit about experimental design first?

139

00:12:14,074 --> 00:12:14,554

Yeah.

140

00:12:14,554 --> 00:12:14,934

yeah.

141

00:12:14,934 --> 00:12:18,776

And then we'll add the Bayesian aspect to it.

142

00:12:19,497 --> 00:12:28,142

So, you know, when I say I work on Bayesian experimental design, most people

immediately think lab experiments, right?

143

00:12:28,142 --> 00:12:35,245

For example, you're in a chemistry lab and you're trying to synthesize a new drug or a new

compound or something.

144

00:12:36,402 --> 00:12:43,128

But actually, you know, the field of experimental design is much, much broader than that,

right?

145

00:12:43,128 --> 00:12:48,753

And to, you know, give a few concrete examples, you can think about surveys, right?

146

00:12:48,753 --> 00:12:52,275

You may need to decide what questions to ask.

147

00:12:52,275 --> 00:13:03,025

Maybe you want to tailor your questions as, you know, the survey progresses so that, you

know, you're asking very tailored, customized questions to each of your survey

148

00:13:03,025 --> 00:13:04,326

participants.

149

00:13:05,719 --> 00:13:08,020

You can think of clinical trials, right?

150

00:13:08,020 --> 00:13:10,991

So how do you dose drugs appropriately?

151

00:13:10,991 --> 00:13:20,746

Or, you know, when should you test for certain properties of these drugs, things like

absorption and so on?

152

00:13:20,746 --> 00:13:30,410

So all of these things can be, you know, cast as an experimental design problem, as

an optimal experimental design problem.

153

00:13:31,154 --> 00:13:42,121

So in my mind, designing experiments really boils down to optimal or at least intelligent

data gathering.

154

00:13:42,201 --> 00:13:43,122

Does that make sense?

155

00:13:43,122 --> 00:13:51,677

So we're trying to kind of optimally collect data in order to kind of learn about the

thing that we want to learn about.

156

00:13:51,677 --> 00:13:55,329

So some underlying quantity of interest, right?

157

00:13:57,754 --> 00:14:11,500

And the Bayesian framework, so the Bayesian experimental design framework specifically

takes an information theoretic approach to what intelligent or optimal means in this

158

00:14:11,500 --> 00:14:12,580

context.

159

00:14:12,941 --> 00:14:17,012

So as I already mentioned, it is a model-based approach, right?

160

00:14:17,012 --> 00:14:26,156

So we start with an underlying Bayesian model that actually describes or simulates the

outcome of our experiment.

161

00:14:27,040 --> 00:14:29,481

And then the optimality part, right?

162

00:14:29,481 --> 00:14:37,264

So the optimal design will be the one that maximizes the amount of information about the

thing that we're trying to learn about.

163

00:14:37,264 --> 00:14:38,895

Yeah.

164

00:14:39,876 --> 00:14:41,376

That makes sense?

165

00:14:41,476 --> 00:14:43,617

I can actually give a concrete example.

166

00:14:43,617 --> 00:14:47,319

Maybe that will make it easier for you and for the listeners, right?

167

00:14:47,319 --> 00:14:54,602

So if you think about, you know, the survey, the survey example, right?

168

00:14:56,010 --> 00:15:14,952

Kind of a simple, but I think very easy to understand, concept is, you know, trying to learn,

let's say, about time-value-of-money preferences of different people, right? Yeah, so what

169

00:15:14,952 --> 00:15:16,703

does that mean? Imagine you're

170

00:15:21,806 --> 00:15:23,076

a behavioral economist, right?

171

00:15:23,076 --> 00:15:27,326

And you're trying to understand some risk preferences, let's say, of people.

172

00:15:27,326 --> 00:15:36,266

Generally, the way that you do that is by asking people a series of questions of the form,

do you prefer some money now, or do you prefer some money later?

173

00:15:36,266 --> 00:15:36,506

Right?

174

00:15:36,506 --> 00:15:41,486

So do you prefer 50 pounds now, or do you prefer 100 pounds in one year?

175

00:15:41,926 --> 00:15:42,486

Right.

176

00:15:42,486 --> 00:15:49,866

And then you can choose, are you going to propose 50 pounds or 60 pounds or 100 pounds

now?

177

00:15:51,052 --> 00:15:53,434

how much money you're going to propose in what time, right?

178

00:15:53,434 --> 00:15:57,979

So you're going to do a hundred pounds in one month or in three months or in one year,

right?

179

00:15:57,979 --> 00:16:00,751

So there is like a few choices that you can make.

180

00:16:01,092 --> 00:16:10,701

And there is a strong incentive to do that with as few questions as possible because you

actually end up paying the money to the participants, right?

181

00:16:12,444 --> 00:16:15,066

So basically,

182

00:16:15,806 --> 00:16:27,215

we can start with an underlying Bayesian model that sort of models this type of

preferences of different human participants in this survey.

183

00:16:27,716 --> 00:16:32,359

There's plenty of such models from psychology, from behavioral economics.

184

00:16:34,641 --> 00:16:38,384

And at the end of the day, what we want to learn is a few parameters, right?

185

00:16:38,384 --> 00:16:41,516

You can think about this model almost like a mechanistic

186

00:16:41,790 --> 00:16:55,502

model that explains how preferences are described by things like, you know, a discount factor

or sensitivity to various other things.

187

00:16:56,985 --> 00:17:05,051

And by asking these series of questions, we're learning about these underlying parameters

in our Bayesian model.
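To make the kind of model Desi is gesturing at concrete, here is a minimal sketch of an intertemporal-choice simulator. The hyperbolic discount rate `k`, the `temperature`, and the logistic choice rule are illustrative assumptions, not a model specified in the episode:

```python
import math
import random

def choice_prob(k, amount_now, amount_later, delay, temperature=10.0):
    """P(participant picks the delayed reward) under hyperbolic discounting.

    The later reward is discounted to amount_later / (1 + k * delay);
    a logistic rule turns the value difference into a choice probability.
    k, temperature, and the functional form are illustrative assumptions.
    """
    discounted_later = amount_later / (1.0 + k * delay)
    return 1.0 / (1.0 + math.exp(-(discounted_later - amount_now) / temperature))

def simulate_answer(rng, k, amount_now, amount_later, delay):
    """Simulate one survey answer: True means 'wait for the later reward'."""
    return rng.random() < choice_prob(k, amount_now, amount_later, delay)

rng = random.Random(0)

# One candidate question (a "design"): 50 pounds now vs 100 pounds in 12 months.
p = choice_prob(k=0.1, amount_now=50, amount_later=100, delay=12)
print(f"P(wait | k=0.1): {p:.3f}")
print("simulated answer:", simulate_answer(rng, 0.1, 50, 100, 12))
```

In a BED loop, inferring a posterior over `k` from answers like these and choosing the next (amounts, delay) question to be maximally informative is exactly the adaptive-survey setting described above.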

188

00:17:06,293 --> 00:17:07,584

Did that make sense?

189

00:17:09,974 --> 00:17:10,475

Yeah.

190

00:17:10,475 --> 00:17:11,115

Yeah.

191

00:17:11,115 --> 00:17:12,456

I understand better now.

192

00:17:12,456 --> 00:17:22,385

And so I'm wondering, it sounds a lot like, you know, also just doing causal modeling,

right?

193

00:17:22,385 --> 00:17:34,375

So you write your causal graph and then, based on that, you can have a generative model, and

fitting the model to data is just one part, but it's not what you start with to

194

00:17:34,375 --> 00:17:35,796

write the model.

195

00:17:36,777 --> 00:17:38,538

How is that related?

196

00:17:39,948 --> 00:17:40,898

Right.

197

00:17:41,999 --> 00:17:55,867

The fields are closely related in the sense that, you know, in order for you

to uncover kind of the true underlying causal graph, let's say if, you know, you start

198

00:17:55,867 --> 00:18:07,833

with some assumptions, you don't know if X causes Y or Y causes X or, you know, or

something else, the way that you need to do this is by intervening in the system.

199

00:18:07,974 --> 00:18:08,304

Right.

200

00:18:08,304 --> 00:18:09,094

So

201

00:18:10,228 --> 00:18:24,923

You can only, in a sense, have causal conclusions if you have rich enough data and by rich

enough data we generally mean experimental or interventional data, right?

202

00:18:24,923 --> 00:18:33,286

So you're totally right in kind of drawing parallels in this, right?

203

00:18:33,286 --> 00:18:34,946

And indeed we may...

204

00:18:35,330 --> 00:18:41,225

design experiments that actually maximize information about the underlying causal graph,

right?

205

00:18:41,225 --> 00:18:51,232

So if you don't know the graph and you want to uncover the graph, you can set up a

Bayesian experimental design framework that will allow you to, you know, select, let's

206

00:18:51,232 --> 00:19:00,759

say, which nodes in my causal graph should I intervene on, with what value should I

intervene on, so that with as few experiments as possible, with as few interventions as

207

00:19:00,759 --> 00:19:01,900

possible,

208

00:19:02,072 --> 00:19:05,604

can I actually uncover the true, the ground truth, right?

209

00:19:05,604 --> 00:19:09,645

The true underlying causal graph, right?

210

00:19:09,946 --> 00:19:17,449

And, you know, kind of the main thing that you're optimizing for is this notion of

information content.

211

00:19:17,449 --> 00:19:24,533

So how much information is each intervention, each experiment actually bringing us, right?

212

00:19:24,873 --> 00:19:25,993

And...

213

00:19:27,346 --> 00:19:33,481

And I think that's part of the reason why I find the Bayesian framework quite appealing as

opposed to, I guess, non-Bayesian frameworks.

214

00:19:33,481 --> 00:19:38,396

You know, it really centers around this notion of information gathering.

215

00:19:38,396 --> 00:19:50,566

And with the Bayesian model, we have a very precise definition of or a precise way to

measure the information content of an experiment.

216

00:19:50,566 --> 00:19:51,506

Right.

217

00:19:52,448 --> 00:19:53,908

If you think about

218

00:19:56,106 --> 00:20:00,799

Imagine again, we're trying to learn some parameters in a model, right?

219

00:20:03,714 --> 00:20:13,992

The natural, again, once we have the Bayesian model, the natural way to define information

content of an experiment is to look at, you know, what is our uncertainty about these

220

00:20:13,992 --> 00:20:16,094

parameters under our prior, right?

221

00:20:16,094 --> 00:20:17,705

So we start with a prior.

222

00:20:18,486 --> 00:20:26,052

We have uncertainty that is embedded or included in our prior beliefs.

223

00:20:26,052 --> 00:20:30,045

We're going to perform an experiment to collect some data, right?

224

00:20:30,045 --> 00:20:33,602

So perform an experiment, collect some data.

225

00:20:33,602 --> 00:20:36,363

And we can update our prior to a posterior.

226

00:20:37,643 --> 00:20:41,804

So that's classic Bayesian inference, right?

227

00:20:41,804 --> 00:20:48,666

And now we can compare the uncertainty of that posterior to the uncertainty of our prior.

228

00:20:49,826 --> 00:20:58,049

And the larger the drop, the better our experiment is, the more informative our experiment

is.

229

00:20:58,389 --> 00:21:00,930

And so the best...

230

00:21:01,026 --> 00:21:09,993

or the optimal experiment in this framework would be the one that maximizes this

information gain.

231

00:21:09,993 --> 00:21:19,660

So the reduction in entropy, we're going to use entropy as a measure of uncertainty in

this framework.

232

00:21:21,722 --> 00:21:27,266

So it is the experiment that reduces our entropy the most.
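In symbols, the criterion being described is usually written as the expected information gain, or EIG. The notation here is the standard one from the BED literature, not from the episode itself: θ for the parameters, ξ for the design, y for the experiment outcome, and H for entropy:

```latex
\operatorname{EIG}(\xi)
  \;=\; \mathbb{E}_{p(y \mid \xi)}\big[\, H[p(\theta)] \;-\; H[p(\theta \mid y, \xi)] \,\big]
```

The optimal design is the ξ that maximizes this quantity, which is equivalently the mutual information between the parameters and the outcome under design ξ.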

233

00:21:29,272 --> 00:21:30,562

Does that make sense?

234

00:21:31,563 --> 00:21:32,503

Yeah.

235

00:21:32,743 --> 00:21:33,423

Total sense.

236

00:21:33,423 --> 00:21:34,483

Yeah.

237

00:21:34,824 --> 00:21:35,684

Total sense.

238

00:21:35,684 --> 00:21:36,534

That's amazing.

239

00:21:36,534 --> 00:21:37,784

I didn't know.

240

00:21:38,005 --> 00:21:43,216

So yeah, I mean, that's pretty natural then to include the causal framework into

that.

241

00:21:43,216 --> 00:21:54,330

And I think that's one of the most powerful features of experimental design, because I

guess most of the time what you want to do when you design an experiment is you want to

242

00:21:54,330 --> 00:21:55,650

intervene.

243

00:21:55,662 --> 00:22:01,202

on a causal graph and see actually if your graph is close to reality or not.

244

00:22:01,322 --> 00:22:02,252

So that's amazing.

245

00:22:02,252 --> 00:22:11,682

And I love the fact that you can use experimental design to validate or invalidate your

causal graph.

246

00:22:11,682 --> 00:22:13,962

That's really amazing.

247

00:22:14,082 --> 00:22:15,022

Correct.

248

00:22:15,022 --> 00:22:15,722

100%.

249

00:22:15,722 --> 00:22:18,522

But I do want to stress that

250

00:22:19,982 --> 00:22:30,752

The notion of causality is not necessary for the purposes of describing what Bayesian

experimental design is.

251

00:22:31,393 --> 00:22:35,097

I'll give you a couple of other examples, actually.

252

00:22:35,097 --> 00:22:39,621

So you may...

253

00:22:45,140 --> 00:22:49,992

You may want to do something like model calibration.

254

00:22:50,113 --> 00:22:56,316

Let's say you have a simulator with a few parameters that you can tweak, right?

255

00:22:57,697 --> 00:23:00,779

So that it, I don't know, produces the best outcomes, right?

256

00:23:00,779 --> 00:23:05,231

Or is optimally calibrated for the thing that you're trying to measure, right?

257

00:23:05,231 --> 00:23:12,315

It is like, again, I don't think you need, you know, any concepts of causality here,

right?

258

00:23:12,315 --> 00:23:14,366

It's like you're turning a few knobs.

259

00:23:14,366 --> 00:23:29,494

And you know, again, you can formulate this as an experimental design problem where you

are trying to calibrate your system with as few knob turns as possible.

260

00:23:29,494 --> 00:23:30,885

Yeah.

261

00:23:35,448 --> 00:23:38,741

Yeah, yeah, yeah.

262

00:23:38,741 --> 00:23:40,422

That makes a ton of sense.

263

00:23:40,622 --> 00:23:51,150

Something I'm curious about hearing you talk is, and that's also something you've worked

extensively on, is the computational challenges.

264

00:23:52,112 --> 00:23:53,122

Can you talk about that?

265

00:23:53,122 --> 00:24:04,906

What are the computational challenges associated with traditional BED, so Bayesian

experimental design, and how do they affect the feasibility

266

00:24:04,906 --> 00:24:07,418

of BED in real-world applications.

267

00:24:07,418 --> 00:24:08,328

Yeah.

268

00:24:09,009 --> 00:24:10,720

Yeah, that's an excellent point.

269

00:24:10,720 --> 00:24:11,120

Actually.

270

00:24:11,120 --> 00:24:15,253

Yeah, I see you read some of my papers.

271

00:24:16,374 --> 00:24:18,836

So, all right.

272

00:24:18,836 --> 00:24:24,870

So all of these kinds of information objectives.

273

00:24:24,870 --> 00:24:33,356

So what I just described, you know, we can look at the information content, we can

maximize information, and so on, like, it's all very natural.

274

00:24:34,080 --> 00:24:38,692

And it's all very mathematically precise and beautiful.

275

00:24:39,314 --> 00:24:46,179

But working with those information-theoretic objectives is quite difficult in practice.

276

00:24:47,101 --> 00:24:58,049

And the reason for that is precisely as you say, they're extremely computationally costly

to compute or to estimate, and they're even more computationally costly to optimize.

277

00:24:58,290 --> 00:25:04,098

And the careful listener would have noticed that I mentioned posterior inference.

278

00:25:04,098 --> 00:25:04,658

Right.

279

00:25:04,658 --> 00:25:13,364

So in order to compute the information content of an experiment, you actually need to

compute a posterior.

280

00:25:14,325 --> 00:25:14,766

Right.

281

00:25:14,766 --> 00:25:17,507

You need to compute a posterior given your data.

282

00:25:18,668 --> 00:25:24,952

Now, where the problem lies is that you need to do this before you have collected your

data.

283

00:25:25,473 --> 00:25:26,214

Right.

284

00:25:26,214 --> 00:25:32,802

Because you are designing the experiment first, and only then will you be performing it and

observing the outcome, and then you can do

285

00:25:32,802 --> 00:25:34,843

the actual posterior update.

286

00:25:35,003 --> 00:25:51,733

Now, what we have to then do is look at our prior entropy minus our posterior entropy and

integrate over all possible outcomes that we may observe under the selected experiment.

287

00:25:51,733 --> 00:25:56,655

And we have to do that for a number of experiments to actually find the optimal one.

288

00:25:57,716 --> 00:26:01,818

So what we end up with is this sort of nesting

289

00:26:01,974 --> 00:26:02,934

of expectations.

290

00:26:02,934 --> 00:26:09,319

So we have an expectation, we have an average with respect to all possible outcomes that

we can observe.

291

00:26:09,319 --> 00:26:20,547

And inside of our expectation, inside of this average, we have this nasty posterior

quantity that, generally speaking, is intractable.

292

00:26:20,687 --> 00:26:30,604

Unless you're in a very specific case where you have a conjugate model, where your

posterior is available in closed form, you actually don't have access to that posterior.

293

00:26:31,362 --> 00:26:38,825

which means that you will need to do some form of approximation, right?

294

00:26:38,825 --> 00:26:45,398

Whether it's exact, like MCMC, or it's going to be a variational posterior computation.

295

00:26:45,398 --> 00:26:48,980

Again, there are many ways of doing this.

296

00:26:48,980 --> 00:26:59,438

The point is that for each design that you may want to try, you need to compute all of

these posteriors.

297

00:26:59,438 --> 00:27:03,511

for every sample of your potential outcome, right?

298

00:27:03,511 --> 00:27:07,024

Of your possible outcome under the experiment.

299

00:27:07,345 --> 00:27:16,193

So what I was just describing is what is known as a doubly intractable quantity, right?

300

00:27:16,193 --> 00:27:25,801

So again, this podcast audience is very familiar with Bayesian inference and how Bayesian

inference is intractable in general.

301

00:27:25,801 --> 00:27:27,266

Now computing...

302

00:27:27,266 --> 00:27:41,426

the EIG, the expected information gain, the objective function that we generally use in Bayesian

experimental design, is what is known as a doubly intractable objective, which is quite

303

00:27:41,426 --> 00:27:44,718

difficult to work with in practice, right?
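The doubly intractable structure he describes, an outer average over possible outcomes with an intractable posterior/marginal quantity inside, can be sketched with a toy nested Monte Carlo EIG estimator. This is an illustrative linear-Gaussian example made up for this transcript, not code from the episode:

```python
import numpy as np

rng = np.random.default_rng(0)

def nested_mc_eig(design, n_outer=500, n_inner=500):
    """Nested Monte Carlo estimate of the Expected Information Gain (EIG)
    for a toy model: theta ~ N(0, 1), y | theta, d ~ N(d * theta, 1).

    EIG(d) = E_{theta, y}[ log p(y | theta, d) - log E_{theta'}[ p(y | theta', d) ] ]
    One expectation nested inside another: this nesting is what makes
    the objective "doubly intractable" in general models.
    """
    theta = rng.normal(size=n_outer)               # outer prior samples
    y = rng.normal(design * theta, 1.0)            # simulated outcomes
    # log-likelihood of each simulated outcome under its own theta
    log_lik = -0.5 * (y - design * theta) ** 2 - 0.5 * np.log(2 * np.pi)
    # inner expectation: marginal likelihood estimated with fresh prior draws
    theta_inner = rng.normal(size=(n_inner, 1))
    inner = np.exp(-0.5 * (y - design * theta_inner) ** 2) / np.sqrt(2 * np.pi)
    log_marg = np.log(inner.mean(axis=0))
    return np.mean(log_lik - log_marg)

# A larger |design| amplifies the signal, so it should be more informative.
eig_small, eig_large = nested_mc_eig(0.1), nested_mc_eig(2.0)
```

Note the cost: every candidate design needs `n_outer * n_inner` likelihood evaluations just to *estimate* the objective, before any optimization over designs even starts.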

304

00:27:46,140 --> 00:27:55,526

Now, what this means for sort of real world applications is that you either need to throw

a lot of compute

305

00:27:56,430 --> 00:27:57,461

at the problem.

306

00:27:57,461 --> 00:28:04,394

Or you need to, you know, sort of give up on the idea of being

Bayesian optimal, right?

307

00:28:04,394 --> 00:28:08,195

You may use some heuristics or something else.

308

00:28:08,195 --> 00:28:19,860

And where this problem really becomes limiting is when we start to think about, you know,

running experiments in real time, for example.

309

00:28:19,860 --> 00:28:24,462

So the survey example that I started with, you know,

310

00:28:24,748 --> 00:28:30,381

you know, asking participants in your survey, do you prefer something now or something

later?

311

00:28:30,601 --> 00:28:44,529

You know, it becomes quite impractical for you to, you know, run all these posterior

inferences and optimize all of these information theoretic objectives in between

312

00:28:44,529 --> 00:28:45,599

questions, right?

313

00:28:45,599 --> 00:28:50,852

So it's a little bit, you know, I asked you the first question, now let me run my MCMC.

314

00:28:50,852 --> 00:28:54,168

Let me optimize some doubly intractable objective.

315

00:28:54,168 --> 00:28:55,869

Can you just wait five minutes, please?

316

00:28:55,869 --> 00:28:58,902

And then I'll get back to you with the next question.

317

00:28:58,902 --> 00:29:04,846

Obviously, it's not something that you can realistically do in practice.

318

00:29:05,708 --> 00:29:18,798

So I think, historically, the computational challenge of the objectives that we use for

Bayesian experimental design has really...

319

00:29:20,166 --> 00:29:26,521

limited the feasibility of applying these methods in kind of real-world applications.

320

00:29:29,014 --> 00:29:36,888

And so, how did you... which innovations, which work did you do on that front

321

00:29:37,169 --> 00:29:40,234

that makes all of that better?

322

00:29:43,062 --> 00:29:43,462

Right.

323

00:29:43,462 --> 00:29:52,755

So there is a few things that I guess we can discuss here.

324

00:29:52,755 --> 00:29:56,986

So number one, I mentioned posterior inference, right?

325

00:29:56,986 --> 00:30:08,509

And I mentioned we have to do many posterior inference approximations for every possible

outcome of our experiment.

326

00:30:09,409 --> 00:30:12,822

Now, I think it was the episode with Marvin.

327

00:30:12,822 --> 00:30:15,925

right, where you talked about amortized Bayesian inference.

328

00:30:16,006 --> 00:30:23,595

So in the context of Bayesian experimental design, amortized Bayesian inference plays a

very big role as well, right?

329

00:30:23,595 --> 00:30:30,983

So one thing that we can do to sort of speed up these computations is to learn a...

330

00:30:35,950 --> 00:30:49,594

to learn a posterior that is amortized over all the different outcomes that we can

observe, right?

331

00:30:49,974 --> 00:30:56,416

And the beautiful part is that we know how to do that really well, right?

332

00:30:56,416 --> 00:31:02,618

So we have all of these very expressive, variational families.

333

00:31:02,860 --> 00:31:14,039

that we can pick from and optimize with data that we simulate from our underlying Bayesian

model.

334

00:31:14,600 --> 00:31:25,689

So this aspect of Bayesian experimental design definitely touches on related fields of

amortized Bayesian inference and simulation-based inference.

335

00:31:25,689 --> 00:31:32,632

So we're using simulations from our model to learn an approximate posterior.

336

00:31:32,632 --> 00:31:44,812

that we can very quickly draw samples from, as opposed to having to run HMC for every

new dataset that we may observe.
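As a toy illustration of the amortization idea he describes, paying a one-off simulation and training cost up front so that inference on any new dataset is a cheap function evaluation, here is a minimal sketch. A conjugate model with a known answer is used so the sketch can be checked, and a least-squares fit stands in for the neural amortizer; all names are made up:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy model: theta ~ N(0, 1), y | theta ~ N(theta, 1).
# Amortization: simulate (theta, y) pairs once, then fit a cheap
# regressor from y to a posterior summary, so at observation time
# inference is a function call rather than a fresh MCMC run per dataset.
theta = rng.normal(size=20000)
y = rng.normal(theta, 1.0)

# Linear least squares as a stand-in for the neural amortizer network.
A = np.vstack([y, np.ones_like(y)]).T
coef, *_ = np.linalg.lstsq(A, theta, rcond=None)

def amortized_posterior_mean(y_obs):
    """Instant 'inference' for any new observation: just evaluate."""
    return coef[0] * y_obs + coef[1]

# For this conjugate model the analytic posterior mean is y / 2,
# so the learned map should recover roughly that.
est = amortized_posterior_mean(1.0)
```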

337

00:31:48,782 --> 00:31:50,342

That makes sense.

338

00:31:50,342 --> 00:31:50,842

Yeah.

339

00:31:50,842 --> 00:31:59,782

And so I will refer listeners to the episode with Marvin, episode 107, where we dive into

amortized Bayesian inference.

340

00:31:59,782 --> 00:32:01,322

I'll put that in the show notes.

341

00:32:01,322 --> 00:32:09,502

I also put for reference three other episodes where we mentioned, you know, experimental

design.

342

00:32:09,502 --> 00:32:17,912

So episode 34 with Lauren Kennedy, 35 with Paul Burkner and 45 with Frank Harrell, that

one.

343

00:32:17,912 --> 00:32:20,603

focuses more on clinical trial design.

344

00:32:20,683 --> 00:32:27,404

But that's going to be very interesting to people who are looking into these topics.

345

00:32:27,945 --> 00:32:39,448

And yeah, so I can definitely see how amortized Bayesian inference can be extremely

useful here, based on everything you said before.

346

00:32:40,028 --> 00:32:47,730

Maybe do you have an example, especially I saw that during your PhD,

347

00:32:48,290 --> 00:32:56,773

You worked on policy-based Bayesian experimental design and you've developed these methods.

348

00:32:56,773 --> 00:33:05,195

Maybe that will give a more concrete idea to listeners about what all of this means.

349

00:33:06,195 --> 00:33:06,956

Exactly.

350

00:33:06,956 --> 00:33:15,480

One way in which we can speed up computations is by utilizing, as I said, amortized

variational inference.

351

00:33:15,480 --> 00:33:23,317

Now this will speed up the estimation of our information theoretic objective, but we still

need to optimize it.

352

00:33:23,618 --> 00:33:28,382

Now, we have to do that after each experiment iteration, right?

353

00:33:28,382 --> 00:33:37,090

So we have collected our first data point, and with that first data point we

need to...

354

00:33:37,886 --> 00:33:45,670

update our model, and under this new, updated model, we need to

kind of decide what to do next.

355

00:33:46,731 --> 00:33:50,163

Now, this is clearly also very computationally costly, right?

356

00:33:50,163 --> 00:33:59,518

The optimization step of our information theoretic objective is quite computationally

costly, meaning that it is very hard to do in real time, right?

357

00:33:59,518 --> 00:34:03,090

Again, going back to the survey example, you still can't do it, right?

358

00:34:03,090 --> 00:34:06,762

You can estimate it a little bit more quickly, but you still can't optimize it.

359

00:34:07,086 --> 00:34:13,040

And this is where a lot of my PhD work has actually been focused on, right?

360

00:34:13,040 --> 00:34:19,155

So developing methods that will allow you to run Bayesian optimal design in real

time.

361

00:34:19,155 --> 00:34:20,736

Now, how are we going to do that?

362

00:34:20,736 --> 00:34:28,301

So there is a little bit of a conceptual shift in the way that we think about designing

experiments, right?

363

00:34:29,102 --> 00:34:35,046

What we will do is rather than choosing

364

00:34:35,838 --> 00:34:43,684

the design, the single design that we're going to perform right now, right in this

experiment iteration.

365

00:34:43,684 --> 00:35:01,976

What we're going to do is learn or train a design policy that will take as an input our

experimental data that we have gathered so far, and it will produce as an output the

366

00:35:01,976 --> 00:35:04,778

optimal design for the next experiment iteration.

367

00:35:05,016 --> 00:35:07,658

So our design policy is just a function, right?

368

00:35:07,658 --> 00:35:15,612

It's just a function that takes past experimental data as an input and produces the next

design as an output.

369

00:35:15,853 --> 00:35:17,293

Does that make sense?

370

00:35:17,514 --> 00:35:18,654

Yeah, yeah, yeah.

371

00:35:18,654 --> 00:35:23,357

I can see what that means.

372

00:35:23,497 --> 00:35:27,830

How do you integrate that though?

373

00:35:27,830 --> 00:35:30,961

You know, like I'm really curious concretely.

374

00:35:31,142 --> 00:35:31,604

Yeah.

375

00:35:31,604 --> 00:35:43,543

what does it look like to integrate all those methods, so amortized Bayesian inference, variational

inference, into the Bayesian experimental design, and then you have the Bayesian model that

376

00:35:43,984 --> 00:35:46,065

underlies all of that.

377

00:35:46,206 --> 00:35:50,309

How do you do that concretely?

378

00:35:50,309 --> 00:35:53,211

Yes, excellent.

379

00:35:53,552 --> 00:35:57,435

When we say the model, I generally mean the underlying Bayesian model.

380

00:35:57,435 --> 00:36:00,697

This is our model that we use to train our

381

00:36:01,189 --> 00:36:03,871

let's say, variational amortized posterior.

382

00:36:03,871 --> 00:36:08,835

And this is the same model that we're going to use to train our design policy network.

383

00:36:08,835 --> 00:36:13,999

And I already said it's a design policy network, which means that we're going to be using,

again, deep learning.

384

00:36:13,999 --> 00:36:26,027

We're going to be using neural networks to actually learn a very expressive function that

will be able to take our data as an input, produce the next design as an output.

385

00:36:26,208 --> 00:36:28,329

Now, how do we do that concretely?

386

00:36:29,922 --> 00:36:46,775

There are, you know, by now a large number of architectures that we can pick from that are

suitable for, you know, the concrete problem that we're considering.

387

00:36:46,916 --> 00:36:59,766

So one very important aspect in everything that we do is that our policy, our neural

network should be able to take variable size data sets as an input.

388

00:37:00,364 --> 00:37:00,644

Right?

389

00:37:00,644 --> 00:37:08,877

Because every time we're calling our policy, every time we want a new design, we will be

feeding it with the data that we have gathered so far.

390

00:37:08,877 --> 00:37:09,737

Right?

391

00:37:09,737 --> 00:37:19,880

And so it is quite important to be able to condition on or take as an input variable

length sequences.

392

00:37:20,621 --> 00:37:21,161

Right?

393

00:37:21,161 --> 00:37:22,922

And so concretely, how can we do that?

394

00:37:22,922 --> 00:37:23,942

Well, you

395

00:37:24,794 --> 00:37:38,540

One kind of standard way of doing things is to basically take our experimental data that

we've gathered so far and embed each data point.

396

00:37:38,540 --> 00:37:42,821

So we have an X for our design, a Y for our outcome.

397

00:37:43,062 --> 00:37:48,924

Take this pair and embed it to a fixed dimensional representation, right, in some latent

space.

398

00:37:49,024 --> 00:37:51,765

Let's say with a small neural network, right?

399

00:37:51,785 --> 00:37:53,726

And we do that for each

400

00:37:53,866 --> 00:37:56,977

individual design outcome pair, right?

401

00:37:56,977 --> 00:38:10,370

So if we have n design outcome pairs, we're gonna end up with n fixed dimensional

representations after we have embedded all of this data.

402

00:38:11,151 --> 00:38:17,493

Now, how can we then produce the next sort of optimal design for the next experiment

iteration?

403

00:38:17,493 --> 00:38:22,894

There is many choices, and I think it will very much depend on the application.

404

00:38:24,035 --> 00:38:30,440

So certain Bayesian models, certain underlying Bayesian models are what we call

exchangeable, right?

405

00:38:30,440 --> 00:38:39,169

So the data, conditional on the parameters, is IID, right?

406

00:38:39,169 --> 00:38:46,606

Which means that the order in which our data points arrive doesn't matter.

407

00:38:46,606 --> 00:38:49,298

And again, the survey example.

408

00:38:49,722 --> 00:38:54,135

is quite a good example of this precisely, right?

409

00:38:54,135 --> 00:39:06,894

Like it doesn't really matter which question we ask first or second, you know, we can

interchange them and the outcomes will be unaffected.

410

00:39:06,954 --> 00:39:18,412

This is very different from time series models where, you know, if we are

choosing the time points at which to take blood pressure, for example,

411

00:39:18,870 --> 00:39:28,175

If we decide to take blood pressure at t equals five, we cannot then go back and take the

blood pressure at t equals two.

412

00:39:28,556 --> 00:39:34,559

So the choice of architecture will very much depend on, as I said, the underlying problem.

413

00:39:34,760 --> 00:39:45,358

And generally speaking, we have found it quite useful to explicitly embed the structure

that is known.

414

00:39:45,358 --> 00:39:56,986

So if we know that our model is exchangeable, we should be using an appropriate

architecture, which will ensure that the order of our data doesn't matter.

415

00:39:56,986 --> 00:40:07,032

If we have a time series, we can use an architecture that takes into account the order of

the data.

416

00:40:07,032 --> 00:40:08,713

So for the first one, we have...

417

00:40:12,088 --> 00:40:21,192

kind of standard architecture such as I don't know how familiar the audience would be, but

you know, in deep learning, there is an architecture called deep sets, right?

418

00:40:21,192 --> 00:40:25,964

So basically, we take our fixed dimensional representations and we simply add them together.

419

00:40:25,964 --> 00:40:27,415

Very simple, right?

420

00:40:27,415 --> 00:40:30,746

Okay, we have our n design outcome pairs.

421

00:40:31,267 --> 00:40:38,179

All of them are in the same fixed dimensional

representation, and we add them together.

422

00:40:38,179 --> 00:40:39,454

Now this is our

423

00:40:39,454 --> 00:40:43,775

representation or a summary of the data that we have gathered so far.

424

00:40:44,135 --> 00:40:52,937

We take that and we maybe map it through another small neural network to produce the next

design.

425

00:40:53,418 --> 00:41:03,240

If we have a time series model, then we can, you know, pass everything through an LSTM or

some form of recurrent neural network to then produce the next design.

426

00:41:03,240 --> 00:41:08,542

And that will keep sort of the order in

427

00:41:09,196 --> 00:41:12,287

and it will take the order into account.
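A minimal sketch of such a deep-sets design policy, with untrained random-weight networks just to show the shapes and the permutation invariance (every function name here is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

def make_mlp(sizes):
    """Random-weight MLP layers (illustrative only -- untrained)."""
    return [(rng.normal(scale=0.3, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    for W, b in params:
        x = np.tanh(x @ W + b)
    return x

# Embedding network: maps one (design, outcome) pair to a latent vector.
embed = make_mlp([2, 32, 16])
# Head network: maps the pooled summary to the next design.
head = make_mlp([16, 32, 1])

def policy(history):
    """Deep-sets design policy: embed each (x, y) pair, sum-pool,
    then map the fixed-size summary to the next design.
    Sum pooling makes the output invariant to the order of the pairs,
    matching an exchangeable model, and handles any history length."""
    if len(history) == 0:
        summary = np.zeros(16)                     # no data gathered yet
    else:
        summary = mlp(embed, np.asarray(history)).sum(axis=0)
    return mlp(head, summary)

d1 = policy([(0.5, 1.2), (1.0, -0.3)])
d2 = policy([(1.0, -0.3), (0.5, 1.2)])            # permuted history
```

For the time-series case he mentions, the sum-pool would be replaced by an order-sensitive module (an LSTM or similar recurrent network) over the embedded pairs, with everything else unchanged.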

428

00:41:13,868 --> 00:41:18,850

Did that answer your question in terms of like how specifically we think about these

policies?

429

00:41:19,070 --> 00:41:19,670

Yeah.

430

00:41:19,670 --> 00:41:20,911

Yeah, that's fascinating.

431

00:41:20,911 --> 00:41:31,075

And so basically, and we talked about that a bit with Marvin already, but the choice of

neural network is very important depending on the type of data, because, you know, many

432

00:41:31,075 --> 00:41:32,856

time series are complicated, right?

433

00:41:32,856 --> 00:41:38,198

Like they already are, even if you're not using a neural network, time is always

complicated to work with.

434

00:41:38,286 --> 00:41:39,997

because there is an autocorrelation, right?

435

00:41:39,997 --> 00:41:43,809

So you have to be very careful.

436

00:41:44,389 --> 00:41:48,571

So basically that means changing the neural network you're working with.

437

00:41:49,692 --> 00:42:05,421

then so concretely, like what, you know, for practitioners, someone who is listening to us

or watching us on YouTube, and they want to start implementing BED in their projects,

438

00:42:05,421 --> 00:42:07,402

what's practical advice

439

00:42:07,402 --> 00:42:11,183

you would have for them to get started?

440

00:42:11,203 --> 00:42:21,166

Like how, why, and also when, you know, because there may be some moments, some cases

where you don't really want to use BED.

441

00:42:21,166 --> 00:42:25,787

And also what kind of packages you're using to actually do that in your own work.

442

00:42:25,787 --> 00:42:30,948

So that's a big question, I know, but like, again, repeat it as you give the answers.

443

00:42:31,369 --> 00:42:33,229

Yeah, yeah, yeah.

444

00:42:33,269 --> 00:42:36,850

Let me start with kind of...

445

00:42:36,872 --> 00:42:50,596

If people are looking to implement BED in their projects, I think it is quite important

to sort of recognize where Bayesian experimental design is applicable, right?

446

00:42:50,596 --> 00:42:57,038

So it can be applied whenever we can construct an appropriate model for our experiments,

right?

447

00:42:57,038 --> 00:43:05,794

So the modeling part, like the underlying Bayesian model, is actually doing a lot of the heavy

lifting in this framework,

448

00:43:05,794 --> 00:43:11,379

simply because this is basically what we're using to assess the quality of the designs, right?

449

00:43:11,379 --> 00:43:17,544

So the model is informing what valuable information is.

450

00:43:17,704 --> 00:43:21,877

And so I would definitely advise not to gloss over that part.

451

00:43:21,877 --> 00:43:30,394

If your model is bad, if your model doesn't represent the data-generating process in

reality, the results might be quite poor.

452

00:43:36,359 --> 00:43:46,636

Now, I think it's also good to mention that you don't need to know the exact probability

distribution of the outcomes of the experiment, right?

453

00:43:46,636 --> 00:43:49,998

So you can, you know, as long as you can simulate, right?

454

00:43:49,998 --> 00:43:58,444

So you can have a simulator-based model that simply samples outcomes of the experiment,

given the experiment.

455

00:43:58,950 --> 00:44:01,502

which I think, you know, it simplifies things a little bit.

456

00:44:01,502 --> 00:44:11,660

You know, you don't have to write down exact probability distributions, but still you need to

be able to sample or simulate this outcome.
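As a sketch of what "you only need to be able to simulate" means in practice, here is a hypothetical simulator-based model: no likelihood is written down, only a function that draws an outcome given parameters and a design, which is enough to generate the prior-predictive rollouts that amortized methods train on. The model, names, and parameter ranges are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_outcome(theta, design):
    """Implicit (simulator-based) model: no closed-form likelihood needed,
    only the ability to draw an outcome given parameters and a design.
    Toy example: a noisy saturating dose-response curve."""
    signal = theta[0] * design / (theta[1] + design)
    return signal + rng.normal(scale=0.1)

def prior_predictive(design, n=1000):
    """Training data for amortized inference / policy learning comes
    from rollouts like this: sample parameters from the prior, then
    simulate the outcome each would produce under the given design."""
    thetas = rng.uniform([0.5, 0.1], [2.0, 1.0], size=(n, 2))
    return np.array([simulate_outcome(t, design) for t in thetas])

ys = prior_predictive(design=0.8)
```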

457

00:44:13,822 --> 00:44:16,654

So that would be step number one, right?

458

00:44:16,654 --> 00:44:24,460

So ensuring that you have a decent model that you can start sort of experimenting with,

you know, in the sense of like...

459

00:44:24,460 --> 00:44:31,752

designing the policies or like training the policies or sort of designing experiments.

460

00:44:34,913 --> 00:44:50,232

The actual implementation aspect in terms of software: unfortunately, Bayesian

experimental design is not as well developed

461

00:44:50,232 --> 00:44:55,894

from a software point of view as, for example, amortized Bayesian inference is, right?

462

00:44:55,894 --> 00:45:06,267

So I'm sure that you spoke about the BayesFlow package with Marvin, which is a really

amazing sort of open source effort.

463

00:45:06,988 --> 00:45:14,690

They have done a great job of implementing many of the kind of standard architectures that

you can basically, you know,

464

00:45:15,416 --> 00:45:22,109

pick whatever works or like pick something that is relatively appropriate for your problem

and it will work out, right?

465

00:45:22,369 --> 00:45:30,092

I think that is like a super powerful framework that includes, you know,

the latest and greatest architectures, in fact.

466

00:45:30,533 --> 00:45:43,318

Unfortunately, we don't have anything like this for Bayesian experimental design yet, but

I am in touch with the BayesFlow guys and I'm definitely looking into

467

00:45:43,424 --> 00:45:48,716

implementing some of these experimental design workflows in their package.

468

00:45:48,716 --> 00:46:02,913

So I have it on my to-do list to actually write a little tutorial in BayesFlow, how you can

use BayesFlow and your favorite deep learning framework of choice, whether it's PyTorch or

469

00:46:02,913 --> 00:46:09,985

JAX or like whatever, TensorFlow, to train sort of a...

470

00:46:11,562 --> 00:46:23,870

a policy, a design policy along with all of the amortized posteriors and all the bells and

whistles that you may need to run, you know, some pipeline like that.

471

00:46:25,132 --> 00:46:25,412

Right.

472

00:46:25,412 --> 00:46:28,934

So I mentioned the modeling aspect, I mentioned the software aspect.

473

00:46:30,275 --> 00:46:33,437

I think thinking about the problem.

474

00:46:35,162 --> 00:46:40,904

in like other aspects, like, are you going to run an adaptive experiment or are you going to

run a static experiment?

475

00:46:40,904 --> 00:46:41,154

Right.

476

00:46:41,154 --> 00:46:45,275

So adaptive experiments are much more complicated than static experiments.

477

00:46:45,275 --> 00:46:51,167

So in an adaptive experiment, you're always conditioning on the data that you have

gathered so far, right?

478

00:46:51,167 --> 00:47:01,489

In a static experiment, you just design a large batch of experiments and then you run it

once, you collect your data and then you do your Bayesian analysis from there.

479

00:47:01,650 --> 00:47:02,090

Right.

480

00:47:02,090 --> 00:47:03,130

And so

481

00:47:04,227 --> 00:47:21,961

I generally always recommend starting with the simpler case, figure out whether the

simpler case works, do the proof of concept on a static or non-adaptive type of Bayesian

482

00:47:21,961 --> 00:47:22,932

experimental design.

483

00:47:22,932 --> 00:47:32,850

And then, and only then, start to think about, let me train a policy or let me try to do

an adaptive experimental design.

484

00:47:33,074 --> 00:47:34,295

pipeline.

485

00:47:34,776 --> 00:47:46,004

I think this is a bit of a common pitfall, if I may say: people tend to jump to

the more complicated thing before actually figuring out kind of the simple case.

486

00:47:46,024 --> 00:47:57,653

Other than that, I think, again, I think it's a kind of an active area of research to, you

know, figure out ways to evaluate your designs.

487

00:47:57,653 --> 00:47:59,438

I think by now we have

488

00:47:59,438 --> 00:48:03,460

pretty good ways of evaluating the quality of our posteriors, for example, right?

489

00:48:03,460 --> 00:48:11,853

You have various posterior diagnostic checks and so on. That doesn't really exist as much

for designs, right?

490

00:48:11,853 --> 00:48:15,634

So what does it mean, you know, I've maximized my information objective, right?

491

00:48:15,634 --> 00:48:18,526

I have collected as much information as I can, right?

492

00:48:18,526 --> 00:48:20,587

According to this information objective.

493

00:48:20,587 --> 00:48:22,988

But what does that mean in practice, right?

494

00:48:22,988 --> 00:48:28,376

Like there is no kind of real world information that I can...

495

00:48:28,376 --> 00:48:29,176

test with, right?

496

00:48:29,176 --> 00:48:33,528

Like if you're doing predictions, you can predict, you can observe, and then compare,

right?

497

00:48:33,528 --> 00:48:39,429

And you can compute an accuracy score or a root mean squared error or like whatever makes

sense.

498

00:48:39,549 --> 00:48:45,771

There doesn't really exist anything like this in design, right?

499

00:48:45,771 --> 00:48:53,153

So it becomes much harder to quantify the success of such a pipeline.

500

00:48:53,153 --> 00:48:55,712

And I think it's a super interesting

501

00:48:55,712 --> 00:48:57,864

area for development.

502

00:48:57,864 --> 00:49:00,346

It's part of the reason why I work in the field.

503

00:49:00,346 --> 00:49:16,048

I think there are many open problems that, if we figure them out, we can advance the

field quite a lot and make data gathering principled and robust and

504

00:49:16,048 --> 00:49:21,562

reliable so that you run your expensive pipeline, but you end up with

505

00:49:22,314 --> 00:49:30,498

you kind of want to be sure that the data that you end up with is actually useful for the

purposes that you want to use it for.

506

00:49:30,498 --> 00:49:32,128

yeah, did that answer the question?

507

00:49:32,128 --> 00:49:38,901

So you have the modeling aspect, you have the software aspect, which we are developing, I

think, you know, we will hopefully eventually get there.

508

00:49:39,862 --> 00:49:42,663

Think about your problem, start simple.

509

00:49:43,703 --> 00:49:45,324

Try to think about diagnostics.

510

00:49:45,324 --> 00:49:50,466

And I think, again, I mentioned, you know, it's very much an open

511

00:49:50,582 --> 00:49:59,506

an open problem, but maybe for your concrete problem at hand, you might be able to kind of

intuitively say, this looks good or this doesn't look good.

512

00:49:59,506 --> 00:50:07,229

Automating this is like, it's a very interesting open problem and something that I'm

actively working on.

513

00:50:07,810 --> 00:50:08,490

Yeah.

514

00:50:08,490 --> 00:50:09,130

Yeah.

515

00:50:09,130 --> 00:50:13,112

And thank you so much for all the work you're doing on that because I think it's super

important.

516

00:50:13,112 --> 00:50:18,314

I'm really happy to see you on the base flow side because yeah, those guys are doing

517

00:50:18,742 --> 00:50:19,943

Amazing work.

518

00:50:19,943 --> 00:50:29,156

There is the new version that's now been merged on the dev branch, which is

backend-agnostic.

519

00:50:29,156 --> 00:50:35,069

So people can use it with their preferred deep learning package.

520

00:50:35,069 --> 00:50:44,913

So I always forget the names, but TensorFlow, PyTorch, and JAX, I'm guessing.

521

00:50:44,913 --> 00:50:48,478

I'm mostly familiar with JAX because that's the one.

522

00:50:48,478 --> 00:50:53,841

and a bit of PyTorch, because those are the ones we're interacting with in PyMC.

523

00:50:54,482 --> 00:50:57,324

This is super cool.

524

00:50:57,324 --> 00:51:01,367

I've linked to the BayesFlow documentation in the show notes.

525

00:51:01,367 --> 00:51:14,277

Is there maybe, I don't know, a paper, a blog post, something like that you can link

people to with a workflow of Bayesian experimental design, and that way people will get an

526

00:51:14,277 --> 00:51:16,098

idea of how to do that.

527

00:51:17,134 --> 00:51:23,398

So hopefully by the time the episode is out, I will have it ready.

528

00:51:23,418 --> 00:51:28,572

Right now I don't have anything kind of practical.

529

00:51:28,572 --> 00:51:38,628

I'm very happy to send some of the kind of review papers that are out there on Bayesian

Experimental Design.

530

00:51:39,650 --> 00:51:46,402

hopefully in the next couple of weeks I'll have the tutorial, like a very basic

introductory tutorial.

531

00:51:46,402 --> 00:51:56,385

you know, we have a simple model, we have our simple parameters, you know, what we want to

learn, here is how you define your posteriors, here is how you define your policy, you

532

00:51:56,385 --> 00:52:02,566

know, and then you switch on BayesFlow, and then, you know, voilà, you

have your results.

533

00:52:02,566 --> 00:52:11,009

So yeah, I'm hoping to get a blog post of this sort done in the next

couple of weeks.

534

00:52:11,009 --> 00:52:13,099

So once it's ready, I will send it to you.

535

00:52:13,099 --> 00:52:14,000

I will share it with you then.

536

00:52:14,000 --> 00:52:14,700

Yeah, for sure.

537

00:52:14,700 --> 00:52:16,106

yeah, for sure.

538

00:52:16,106 --> 00:52:30,674

Can't wait. And Marvin and I are gonna start working on setting up a modeling webinar,

which is, you know, another format I have on the show. So this is like, you know,

539

00:52:30,674 --> 00:52:44,521

Marvin will come on and share his screen and show how to do the amortized

Bayesian inference workflow with BayesFlow, also using PyMC and all that cool stuff, now that

540

00:52:44,521 --> 00:52:45,762

the new API

541

00:52:45,844 --> 00:52:52,819

is merged, we're going to be able to work on that together and set up the modeling

webinar.

542

00:52:52,819 --> 00:52:56,411

So listeners, definitely stay tuned for that.

543

00:52:56,411 --> 00:53:02,345

I will, of course, announce the webinar a bit in advance so that you all have a chance to

sign up.

544

00:53:02,345 --> 00:53:06,728

And then you can join live, ask questions to Marvin.

545

00:53:06,749 --> 00:53:08,230

That's going to be super fun.

546

00:53:08,230 --> 00:53:15,304

And mainly see how you would do amortized Bayesian inference, concretely.

547

00:53:15,746 --> 00:53:16,426

Great.

548

00:53:16,426 --> 00:53:17,107

Amazing.

549

00:53:17,107 --> 00:53:19,238

Sounds fun.

550

00:53:19,238 --> 00:53:21,589

Yeah, that's going to be super fun.

551

00:53:21,830 --> 00:53:31,005

Something I was thinking about is that your work mentions enabling real-time design

decisions.

552

00:53:31,005 --> 00:53:33,366

And that sounds really challenging to me.

553

00:53:33,366 --> 00:53:42,101

So I'm wondering how critical is this capability in today's data-driven decision-making

processes?

554

00:53:42,561 --> 00:53:43,542

Yeah.

555

00:53:44,570 --> 00:53:46,821

I do think it really is quite critical, right?

556

00:53:46,821 --> 00:53:56,715

In most kind of real-world practical problems, you really do need

to make decisions fairly quickly.

557

00:53:56,755 --> 00:53:57,716

Right?

558

00:53:58,116 --> 00:54:10,241

Again, take surveys as an example, you know, anything that involves a human,

and you want to adapt as you're performing the experiment, you kind of need to ensure that

559

00:54:10,241 --> 00:54:11,361

things are

560

00:54:13,058 --> 00:54:15,339

you're able to run things in real time.

561

00:54:15,780 --> 00:54:30,162

And honestly, I think part of the reason why we haven't seen a big explosion of Bayesian

experimental design in practice is partly because we couldn't, until recently, actually run

562

00:54:30,162 --> 00:54:41,836

these things fast enough because of the computational challenges. Now that we know

how to do amortized inference very well, now that we know how to train policies

563

00:54:41,836 --> 00:54:43,677

that will produce designs very well.

564

00:54:43,677 --> 00:54:50,149

I am expecting things to, you know, to improve, right?

565

00:54:50,149 --> 00:54:57,971

And to start to see some of these some of these methods applied in practice.

566

00:54:58,391 --> 00:55:08,874

Having said that, I do think and please stop me if that's totally unrelated, but to make

things

567

00:55:09,772 --> 00:55:24,276

successful in practice, there are a few other things that in my opinion have to be

resolved before, you know, we're confident that we can apply, you know, such black

568

00:55:24,276 --> 00:55:28,527

boxes in a sense, right, because we have all these neural networks all over the place.

569

00:55:28,527 --> 00:55:38,320

And it's not entirely clear whether all of these things are robust to various aspects of

the complexities of the real world.

570

00:55:38,320 --> 00:55:38,680

Right.

571

00:55:38,680 --> 00:55:39,430

So

572

00:55:41,751 --> 00:55:43,911

things like model mis-specification, right?

573

00:55:43,911 --> 00:55:50,793

So is your Bayesian model actually a good representation of the thing that you're trying

to study?

574

00:55:50,793 --> 00:55:52,503

That's a big open problem again.

575

00:55:52,503 --> 00:56:00,386

And again, I'm going to make a parallel to Bayesian inference actually.

576

00:56:00,386 --> 00:56:09,518

For inference purposes, model mis-specification may not be as bad as it is for design

purposes.

577

00:56:09,600 --> 00:56:18,716

And the reason for that is, under some of the assumptions of

course, you will still get valid inferences, or at least you will still be close.

578

00:56:18,716 --> 00:56:25,100

You still get the best that you can do under the assumption of a wrong

model.

579

00:56:25,401 --> 00:56:29,663

Now, when it comes to design, we have absolutely no guarantees.

580

00:56:30,244 --> 00:56:39,430

And oftentimes we end up in very pathological situations where because we're using our

model to inform the data collection.

581

00:56:39,990 --> 00:56:44,812

and then to also evaluate, right, fit that model on the same data that we've gathered.

582

00:56:45,352 --> 00:56:55,236

If your model is misspecified, you might not even be able to detect the misspecification

because of the way that the data was gathered.

583

00:56:55,237 --> 00:56:56,497

It's not IID, right?

584

00:56:56,497 --> 00:57:01,559

Like it's very much a non-IID data collection process.
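The non-IID point can be made concrete with a small sketch. The following Python example is mine, not from the episode: a hypothetical greedy adaptive "experiment" on two coins, where each design choice (which coin to flip next) depends on the posterior from everything observed so far, so the resulting dataset is not an IID sample from any fixed design.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: two coins with unknown biases and Beta(1, 1) priors.
true_bias = [0.3, 0.8]
params = [[1.0, 1.0], [1.0, 1.0]]   # Beta(a, b) posterior per coin
history = []

def beta_var(a, b):
    # posterior variance of a Beta(a, b) distribution
    return a * b / ((a + b) ** 2 * (a + b + 1.0))

for t in range(200):
    # Greedy design: flip whichever coin we are currently most uncertain about.
    # This choice depends on all past outcomes, so the observations are non-IID.
    d = max((0, 1), key=lambda i: beta_var(*params[i]))
    y = int(rng.random() < true_bias[d])   # run the chosen "experiment"
    params[d][0] += y                      # conjugate Bayesian update
    params[d][1] += 1 - y
    history.append((d, y))

est = [a / (a + b) for a, b in params]     # posterior mean bias per coin
```

Because the design rule and the evaluation both use the same (possibly wrong) model, a misspecified model can steer the data collection away from exactly the regions where its mistakes would show up.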

585

00:57:02,340 --> 00:57:07,882

And so I think when we talk about practical things,

586

00:57:07,882 --> 00:57:20,907

we really, really need to start thinking about how are we going to make our systems or the

methods that we develop a little bit more robust to misspecification.

587

00:57:20,907 --> 00:57:23,639

And I don't mean we should solve model misspecification.

588

00:57:23,639 --> 00:57:28,851

I think that's a very hard task that is basically unsolvable, right?

589

00:57:28,851 --> 00:57:31,462

Like it is solvable under assumptions, right?

590

00:57:31,462 --> 00:57:37,100

If you tell me what your misspecification is, you know, we can improve things, but in

general, this is not

591

00:57:37,100 --> 00:57:42,294

something that we can sort of realistically address uniformly.

592

00:57:44,416 --> 00:57:58,448

But yeah, so again, going back to practicalities, I do think it's of crucial importance to

sort of make our pipelines and diagnostics sort of robust to some forms of

593

00:57:58,448 --> 00:57:59,929

mis-specification.

594

00:58:00,910 --> 00:58:01,890

Yeah.

595

00:58:02,031 --> 00:58:03,642

Yeah, yeah, for sure.

596

00:58:03,642 --> 00:58:05,914

And that's where also

597

00:58:06,380 --> 00:58:13,013

I really love amortized Bayesian inference because it allows you to do simulation-based

calibration.

598

00:58:13,013 --> 00:58:27,449

And I find that especially helpful and valuable when you're working on developing a model

because even before fitting to data, you already have more confidence about what your

599

00:58:27,449 --> 00:58:34,162

model is actually able to do and not do and where the possible pain points would be.

600

00:58:34,162 --> 00:58:35,390

And I find that.

601

00:58:35,390 --> 00:58:36,830

super helpful.
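As a rough illustration of what simulation-based calibration does, here is a minimal sketch — my own toy example using a conjugate normal model rather than BayesFlow or PyMC: draw a parameter from the prior, simulate data, run inference, and record the rank of the true parameter among posterior draws. For a well-calibrated model-plus-inference pair, those ranks should be uniform.

```python
import numpy as np

rng = np.random.default_rng(0)

def sbc_ranks(n_sims=2000, n_obs=10, n_post=99):
    # Toy model (hypothetical): theta ~ N(0, 1), y_i | theta ~ N(theta, 1).
    # The exact conjugate posterior is N(n * ybar / (n + 1), 1 / (n + 1)).
    ranks = []
    for _ in range(n_sims):
        theta = rng.normal(0.0, 1.0)                  # 1. draw from the prior
        y = rng.normal(theta, 1.0, size=n_obs)        # 2. simulate data
        post_mean = n_obs * y.mean() / (n_obs + 1.0)  # 3. "inference" (exact here)
        post_sd = (1.0 / (n_obs + 1.0)) ** 0.5
        draws = rng.normal(post_mean, post_sd, size=n_post)
        ranks.append(int((draws < theta).sum()))      # 4. rank of true theta
    return np.array(ranks)

ranks = sbc_ranks()
# If model and inference agree, ranks are uniform on {0, ..., 99};
# systematic skew or U-shapes flag miscalibration before touching real data.
```

In an amortized workflow the inference step is a trained network instead of a closed-form posterior, which is what makes running thousands of these simulations cheap.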

602

00:58:39,612 --> 00:58:52,186

And actually talking about all that, I'm wondering where you see the future of Bayesian

experimental design heading, particularly with advancements in AI and machine learning

603

00:58:52,186 --> 00:58:53,197

technologies.

604

00:58:53,197 --> 00:58:53,787

Wow.

605

00:58:53,787 --> 00:58:54,537

Okay.

606

00:58:54,537 --> 00:58:57,238

So I do view this

607

00:58:57,784 --> 00:58:58,594

type of work.

608

00:58:58,594 --> 00:59:06,799

So this type of research is a little bit orthogonal to all of the developments in sort of

modern AI and machine learning.

609

00:59:06,799 --> 00:59:17,844

And the reason for this is that we can literally borrow the latest and greatest

development in machine learning and plug it into our pipelines.

610

00:59:18,295 --> 00:59:22,627

If there is a better architecture to do X, right?

611

00:59:22,627 --> 00:59:26,810

Like we can take that architecture and, you know, utilize it for our purposes.

612

00:59:26,810 --> 00:59:27,660

So I think

613

00:59:27,660 --> 00:59:37,893

you know, when it comes to the future of Bayesian experimental design, given, you know,

all of the advancements, I think this is great because it's kind of helping the field even

614

00:59:37,893 --> 00:59:38,523

more, right?

615

00:59:38,523 --> 00:59:50,456

Like we have more options to choose from, we have better models to choose from, and kind

of the data gathering aspect will always be there, right?

616

00:59:50,456 --> 00:59:56,258

Like we will always want to collect better data for the purposes of, you know, our data

analysis.

617

00:59:57,858 --> 01:00:13,302

And so, you know, the design aspect will still be there and with the better models, we'll

just be able to gather better data, if that makes sense.

618

01:00:13,322 --> 01:00:15,263

Yeah, that definitely makes sense.

619

01:00:15,343 --> 01:00:16,763

Yeah, for sure.

620

01:00:16,763 --> 01:00:18,074

And that's interesting.

621

01:00:18,074 --> 01:00:24,706

Yeah, I didn't anticipate that kind of answer, but that's okay.

622

01:00:24,706 --> 01:00:25,926

I definitely

623

01:00:26,644 --> 01:00:27,815

see what you mean.

624

01:00:27,815 --> 01:00:36,759

Maybe before... like, yeah, sorry, but even if you think, you know, now everybody's

gonna have their AI assistant, right?

625

01:00:36,800 --> 01:00:46,945

Now, wouldn't it be super frustrating if your AI assistant takes three months to figure

out what you like for breakfast?

626

01:00:47,046 --> 01:00:51,408

And like, it's experimenting or like, it's just randomly guessing.

627

01:00:51,638 --> 01:00:54,080

do you like fish soup for breakfast?

628

01:00:54,080 --> 01:00:54,850

Like,

629

01:00:54,850 --> 01:00:59,993

How about I prepare you a fish soup for breakfast? Or it proposes you something like

that, right?

630

01:00:59,993 --> 01:01:06,396

And so I think again, like this personalization aspect, right?

631

01:01:06,396 --> 01:01:12,539

Like again, kind of sticking to, I don't know, personal AI assistants, right?

632

01:01:13,801 --> 01:01:19,033

The sooner or the quicker they are able to learn about your preferences, the better that

is.

633

01:01:19,033 --> 01:01:21,095

And again, you know, we're learning about preferences.

634

01:01:21,095 --> 01:01:23,968

Again, I'm gonna refer back to the...

635

01:01:23,968 --> 01:01:28,640

you know, the time value of money preference learning, like it is just a more complicated

version of that.

636

01:01:28,721 --> 01:01:29,241

Right.

637

01:01:29,241 --> 01:01:41,048

And so if your latest and greatest AI assistant is able to learn and customize itself to

your preferences much more quickly than otherwise, you know, that's a huge win.

638

01:01:41,048 --> 01:01:41,338

Right.

639

01:01:41,338 --> 01:01:47,652

And I think this is precisely where all these sort of principled data gathering techniques

can really shine.

640

01:01:48,733 --> 01:01:51,390

Once we figure out, you know, the

641

01:01:51,390 --> 01:01:57,903

sort of issues that I was talking about. I mean, that makes sense.

642

01:01:57,903 --> 01:02:13,099

Maybe to play us out, I'm curious if you have some applications in mind,

practical applications of BED that you've encountered in your research, particularly in

643

01:02:13,099 --> 01:02:19,362

the fields of healthcare or technology that you found particularly impactful.

644

01:02:19,362 --> 01:02:20,302

Right.

645

01:02:20,746 --> 01:02:21,947

Excellent question.

646

01:02:21,947 --> 01:02:29,113

And actually, I was going to mention that aspect as, you know, where you see the future of

Bayesian experimental design.

647

01:02:30,114 --> 01:02:43,666

Part of kind of our blind spots, if I may refer to that as sort of blind spots, is that in

our research so far, we have very much focused on developing methods, developing

648

01:02:43,666 --> 01:02:47,689

computational methods to sort of make some of these

649

01:02:48,898 --> 01:02:55,843

Bayesian experimental design pipelines actually feasible to run in practice.

650

01:02:56,023 --> 01:03:08,212

Now, we haven't really spent much time working with practitioners, and this is a gap that

we're actively trying to sort of close.

651

01:03:09,013 --> 01:03:13,727

In that spirit, we have a few applications in mind.

652

01:03:13,727 --> 01:03:15,357

We sort of want to pursue them,

653

01:03:16,120 --> 01:03:19,361

particularly in the context of healthcare, as you mentioned.

654

01:03:19,361 --> 01:03:24,613

So clinical trials design is a very big one.

655

01:03:24,714 --> 01:03:43,782

So again, things like getting to the highest safe dose as quickly as possible, or, again,

being personalized to the human, given their context, given their various characteristics,

656

01:03:43,930 --> 01:03:50,202

is one area where we're looking to sort of start some collaborations and explore this

further.

657

01:03:50,202 --> 01:04:06,357

Our group in Oxford has a new PhD student joining who will be working in collaboration

with biologists to actually design experiments for something about cells.

658

01:04:06,357 --> 01:04:08,958

I don't know anything about biology, so.

659

01:04:09,998 --> 01:04:15,580

I'm not the best person to actually describe that line of work.

660

01:04:15,841 --> 01:04:23,324

But hopefully there will be some concrete, exciting applications in the near future.

661

01:04:23,324 --> 01:04:25,404

So that's applications in biology.

662

01:04:25,404 --> 01:04:37,369

And finally, you know, there's constantly lots of different bits and pieces like, you

know, people from chemistry saying, hey, I have this thing, can we...

663

01:04:39,646 --> 01:04:45,118

Can we work on maybe, you know, performing or, like, setting up a Bayesian experimental design

pipeline?

664

01:04:45,118 --> 01:04:51,540

I think the problem that we've had so far or I've had so far is just lack of time.

665

01:04:51,540 --> 01:04:54,551

There's just so many things to do and so little time.

666

01:04:54,991 --> 01:05:04,294

But I am very much actively trying to find time in my calendar to actually work on a few

applied projects because I do think, you know,

667

01:05:04,614 --> 01:05:08,706

It's all like developing all these methods is great, right?

668

01:05:08,706 --> 01:05:10,827

I mean, it's very interesting math that you do.

669

01:05:10,827 --> 01:05:13,019

It's very interesting coding that you do.

670

01:05:13,019 --> 01:05:19,492

But at the end of the day, you kind of want these things to make someone's life better,

right?

671

01:05:19,492 --> 01:05:34,510

Like a practitioner that will be able to save some time or save some money or, you know,

make their data gathering, and therefore, you know, their downstream analysis, much better

672

01:05:34,530 --> 01:05:38,013

and more efficient thanks to some of this research.

673

01:05:38,013 --> 01:05:42,586

So I hope that answered your question in terms of concrete applications.

674

01:05:42,586 --> 01:05:44,437

I think we'll see more of that.

675

01:05:44,437 --> 01:05:53,203

But so far, you know, the two things are, yeah, clinical trial design that we're exploring

and some of this biology cell stuff.

676

01:05:53,203 --> 01:05:56,575

Yeah, yeah, no, that's exciting.

677

01:05:56,575 --> 01:05:59,237

That's... I mean, I'm definitely looking forward to it.

678

01:05:59,237 --> 01:06:01,098

That sounds absolutely fascinating.

679

01:06:01,098 --> 01:06:03,870

Yeah, if you can make that

680

01:06:04,278 --> 01:06:08,839

happen in important fields like that, that's going to be extremely impactful.

681

01:06:08,839 --> 01:06:10,830

Awesome, Desi.

682

01:06:10,830 --> 01:06:14,091

I've already...

683

01:06:14,091 --> 01:06:22,033

Do you have any sort of applications in mind that you think Bayesian experimental design

might be suitable for?

684

01:06:22,033 --> 01:06:27,805

I know you're quite experienced in various modeling aspects, Bayesian modeling.

685

01:06:27,805 --> 01:06:30,625

So, yeah, do you have anything in mind?

686

01:06:31,025 --> 01:06:33,086

Yeah, I mean a lot.

687

01:06:33,580 --> 01:06:37,151

So marketing, I know, is already using that a lot.

688

01:06:37,251 --> 01:06:37,721

Yeah.

689

01:06:37,721 --> 01:06:40,172

Clinical trials for sure.

690

01:06:40,292 --> 01:06:54,896

Also, now that I work in sports analytics, well, definitely sports, you know, you could

include that into the training of elite athletes and design some experiments to actually

691

01:06:54,896 --> 01:07:03,788

test causal graphs and see if pulling that lever during training is actually something

that makes a difference during

692

01:07:04,379 --> 01:07:07,221

the professional games that actually count.

693

01:07:07,221 --> 01:07:15,348

yeah, I can definitely see that having a big impact in the sports realm, for sure.

694

01:07:15,348 --> 01:07:15,858

Nice.

695

01:07:15,858 --> 01:07:24,245

Well, if you're open to collaborations, we can do some designing of experiments once you

build up your sports models.

696

01:07:24,245 --> 01:07:24,916

Yeah.

697

01:07:24,916 --> 01:07:25,877

Yeah.

698

01:07:25,877 --> 01:07:26,237

Yeah.

699

01:07:26,237 --> 01:07:32,488

I mean, as soon as I work on that, I'll make sure to reach out because that's going to be...

700

01:07:32,488 --> 01:07:36,419

It's definitely something I'll work on and dive into.

701

01:07:36,419 --> 01:07:41,100

So that's gonna be fascinating to work on that with you for sure.

702

01:07:42,161 --> 01:07:43,981

Sounds fun, yeah.

703

01:07:44,181 --> 01:07:46,002

Yeah, exactly.

704

01:07:46,102 --> 01:07:47,702

Very, very exciting.

705

01:07:48,283 --> 01:07:49,593

Well, thanks, Desi.

706

01:07:49,593 --> 01:07:50,763

That was amazing.

707

01:07:50,763 --> 01:07:52,514

I think we covered a lot of ground.

708

01:07:52,514 --> 01:07:56,045

I'm really happy because I had a lot of questions for you.

709

01:07:56,125 --> 01:08:00,106

But thanks a lot for keeping your answers very...

710

01:08:00,822 --> 01:08:05,806

focused and not getting distracted by all my digressions.

711

01:08:06,407 --> 01:08:12,512

Of course, I have to ask you the last two questions I ask every guest at the end of the

show.

712

01:08:12,512 --> 01:08:18,796

So first one, if you had unlimited time and resources, which problem would you try to

solve?

713

01:08:19,587 --> 01:08:21,279

It's a really hard one.

714

01:08:21,279 --> 01:08:24,812

Honestly, because I know you ask those questions,

715

01:08:24,812 --> 01:08:26,353

I was like, what am I going to say?

716

01:08:26,353 --> 01:08:27,654

I honestly don't know.

717

01:08:27,654 --> 01:08:29,205

There's so many things.

718

01:08:29,666 --> 01:08:30,546

But

719

01:08:31,530 --> 01:08:50,876

Again, I think it would be something of high impact for humanity in general. Probably

something in climate change would be what I would dedicate my unlimited time and

720

01:08:50,876 --> 01:08:52,086

resources to.

721

01:08:52,287 --> 01:08:53,808

That's a good answer.

722

01:08:54,068 --> 01:08:55,790

That's definitely a popular one.

723

01:08:55,790 --> 01:09:01,394

So you're in great company and I'm sure the team already working on that is going to be

very...

724

01:09:01,538 --> 01:09:03,659

happy to welcome you.

725

01:09:04,179 --> 01:09:05,330

No, let's hope.

726

01:09:05,330 --> 01:09:09,321

I think we need to speed up the solutions.

727

01:09:09,982 --> 01:09:11,832

Like seeing what is happening, right?

728

01:09:11,832 --> 01:09:16,284

I think it's rather unfortunate.

729

01:09:17,325 --> 01:09:24,668

And second question, if you could have dinner with any great scientific mind, dead, alive

or fictional, who would it be?

730

01:09:25,028 --> 01:09:26,108

Yeah.

731

01:09:26,389 --> 01:09:28,629

So I think it would be Claude Shannon.

732

01:09:29,050 --> 01:09:30,028

So, you know,

733

01:09:30,028 --> 01:09:35,000

the godfather of information theory or like actually the father of information theory.

734

01:09:37,382 --> 01:09:49,949

Again, partly because a lot of my research is inspired by information theory principles in

Bayesian experimental design, but also outside of Bayesian experimental designs.

735

01:09:49,949 --> 01:09:58,074

It sort of underpins a lot of the sort of modern machine learning development, right?

736

01:09:58,074 --> 01:09:58,954

And

737

01:09:59,373 --> 01:10:14,313

What I think would be really quite cool is if I were to have dinner with him and

basically tell him, like, hey, look at all these language

738

01:10:14,313 --> 01:10:15,953

models that we have today.

739

01:10:15,953 --> 01:10:21,373

Like Claude Shannon was the person that invented language models back in 1948, right?

740

01:10:21,373 --> 01:10:23,406

So that's many years ago.

741

01:10:23,406 --> 01:10:26,826

And like, literally not even having computers, right?

742

01:10:26,826 --> 01:10:34,246

So he calculated things by hand and produced output that actually looks like

English, right?

743

01:10:34,246 --> 01:10:35,646

In 1948.
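For a sense of what that 1948 computation looked like, here is a sketch, with an illustrative corpus of my own rather than Shannon's actual frequency tables, of the word-level bigram approximation he sampled by hand: count which words follow which, then repeatedly draw the next word from the successors of the current one.

```python
import random

# Illustrative corpus (not Shannon's data); a word-level bigram model of it.
corpus = ("the head and in frontal attack on an english writer that the "
          "character of this point is therefore another method").split()

bigrams = {}
for w1, w2 in zip(corpus, corpus[1:]):
    bigrams.setdefault(w1, []).append(w2)   # word -> observed successors

def generate(start, n, seed=0):
    # Sample a chain: each word drawn from the successors of the previous one,
    # the same table-lookup procedure Shannon carried out without a computer.
    rng = random.Random(seed)
    out = [start]
    while len(out) < n:
        successors = bigrams.get(out[-1])
        if not successors:
            break                           # dead end: word never seen mid-corpus
        out.append(rng.choice(successors))
    return " ".join(out)

text = generate("the", 8)
```

Even this tiny model produces locally English-looking word sequences, which is exactly the effect Shannon demonstrated.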

744

01:10:35,646 --> 01:10:44,196

And so I think, you know, a brilliant mind like him, you know, just seeing kind of the

progress that we've made since then.

745

01:10:44,196 --> 01:10:49,086

And like, we actually have language models on computers that behave like humans.

746

01:10:49,086 --> 01:10:53,346

I'd be super keen to hear, like, what is next from him.

747

01:10:53,346 --> 01:10:57,470

And I think he will have some very interesting answers to that.

748

01:10:57,470 --> 01:11:08,088

What is the future for information processing and the path to artificial, I guess people

call it artificial general intelligence.

749

01:11:08,088 --> 01:11:12,601

So what would be the path to AGI from here onwards?

750

01:11:13,883 --> 01:11:15,364

Yeah, for sure.

751

01:11:15,364 --> 01:11:18,387

That'd be a fascinating dinner.

752

01:11:18,387 --> 01:11:20,067

Make sure it comes like that.

753

01:11:22,130 --> 01:11:22,510

Awesome.

754

01:11:22,510 --> 01:11:22,978

Well.

755

01:11:22,978 --> 01:11:24,419

Desi, thank you so much.

756

01:11:24,419 --> 01:11:27,881

I think we can call it a show.

757

01:11:28,642 --> 01:11:29,823

I learned so much.

758

01:11:29,823 --> 01:11:38,689

I'm sure my listeners did too, because, as you showed, this is a topic that's

very much on the frontier of science.

759

01:11:38,689 --> 01:11:42,131

So thank you so much for all the work you're doing on that.

760

01:11:42,651 --> 01:11:49,516

And as usual, I put resources and a link to your website in the show notes for those who

want to dig deeper.

761

01:11:49,516 --> 01:11:52,908

Thank you again, Desi, for taking the time and being on this show.

762

01:11:53,208 --> 01:11:54,659

Thank you so much for having me.

763

01:11:54,659 --> 01:11:56,277

It was my pleasure.

764

01:11:59,980 --> 01:12:03,691

This has been another episode of Learning Bayesian Statistics.

765

01:12:03,691 --> 01:12:14,184

Be sure to rate, review, and follow the show on your favorite podcatcher, and visit

learnbayesstats.com for more resources about today's topics, as well as access to more

766

01:12:14,184 --> 01:12:18,255

episodes to help you reach a true Bayesian state of mind.

767

01:12:18,255 --> 01:12:20,216

That's learnbayesstats.com.

768

01:12:20,216 --> 01:12:25,077

Our theme music is Good Bayesian by Baba Brinkman, feat. MC Lars and Mega Ran.

769

01:12:25,077 --> 01:12:28,238

Check out his awesome work at bababrinkman.com.

770

01:12:28,238 --> 01:12:29,430

I'm your host.

771

01:12:29,430 --> 01:12:30,481

Alex Andorra.

772

01:12:30,481 --> 01:12:34,594

You can follow me on Twitter at alex_andorra, like the country.

773

01:12:34,594 --> 01:12:41,899

You can support the show and unlock exclusive benefits by visiting patreon.com slash

LearnBayesStats.

774

01:12:41,899 --> 01:12:44,281

Thank you so much for listening and for your support.

775

01:12:44,281 --> 01:12:46,573

You're truly a good Bayesian.

776

01:12:46,573 --> 01:12:50,085

Change your predictions after taking information in.

777

01:12:50,085 --> 01:12:53,388

And if you're thinking I'll be less than amazing.

778

01:12:53,388 --> 01:12:56,940

Let's adjust those expectations.

779

01:12:56,974 --> 01:13:09,882

me show you how to be a good Bayesian, change calculations after taking fresh data in. Those

predictions that your brain is making, let's get them on a solid foundation.

Previous post