*Proudly sponsored by **PyMC Labs**, the Bayesian Consultancy. **Book a call**, or **get in touch**!*

In this episode, Dmitry Bagaev discusses his work in Bayesian statistics and the development of RxInfer.jl, a reactive message passing toolbox for Bayesian inference.

Dmitry explains the concept of reactive message passing and its applications in real-time signal processing and autonomous systems. He discusses the challenges and benefits of using RxInfer.jl, including its scalability and efficiency in large probabilistic models.

Dmitry also shares insights into the trade-offs involved in Bayesian inference architecture and the role of variational inference in RxInfer.jl. Additionally, he discusses his startup Lazy Dynamics and its goal of commercializing research in Bayesian inference.

Finally, we also discuss the user-friendliness and trade-offs of different inference methods, the future developments of RxInfer, and the future of automated Bayesian inference.

Coming from a very small town in Russia called Nizhnekamsk, Dmitry currently lives in the Netherlands, where he did his PhD. Before that, he graduated from the Computational Science and Modeling department of Moscow State University.

Beyond that, Dmitry is also a drummer (you’ll see his cool drums if you’re watching on YouTube), and an adept of extreme sports, like skydiving, wakeboarding and skiing!

*Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at **https://bababrinkman.com/** !*

**Thank you to my Patrons for making this episode possible!**

*Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser and Julio*.

Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag 😉

**Takeaways:**

– Reactive message passing is a powerful approach to Bayesian inference that allows for real-time updates and adaptivity in probabilistic models.

– RxInfer.jl is a toolbox for reactive message passing in Bayesian inference, designed to be scalable, efficient, and adaptable.

– Julia is a preferred language for RxInfer.jl due to its speed, macros, and multiple dispatch, which enable efficient and flexible implementation.

– Variational inference plays a crucial role in RxInfer.jl, allowing for trade-offs between computational complexity and accuracy in Bayesian inference.

– Lazy Dynamics is a startup focused on commercializing research in Bayesian inference, with the goal of making RxInfer.jl accessible and robust for industry applications.

**Links from the show:**

- LBS Physics & Astrophysics playlist: https://learnbayesstats.com/physics-astrophysics/
- LBS #51, Bernoulli’s Fallacy & the Crisis of Modern Science, with Aubrey Clayton: https://learnbayesstats.com/episode/51-bernoullis-fallacy-crisis-modern-science-aubrey-clayton/
- Dmitry on GitHub: https://github.com/bvdmitri
- Dmitry on LinkedIn: https://www.linkedin.com/in/bvdmitri/
- RxInfer.jl, Automatic Bayesian Inference through Reactive Message Passing: https://rxinfer.ml/
- Reactive Bayes, Open source software for reactive, efficient and scalable Bayesian inference: https://github.com/ReactiveBayes
- LazyDynamics, Reactive Bayesian AI: https://lazydynamics.com/
- BIASlab, Natural Artificial Intelligence: https://biaslab.github.io/
- Dmitry’s PhD dissertation: https://research.tue.nl/en/publications/reactive-probabilistic-programming-for-scalable-bayesian-inferenc
- *Effortless Mastery*, by Kenny Werner: https://www.amazon.com/Effortless-Mastery-Liberating-Master-Musician/dp/156224003X
- The Book of Why, by Judea Pearl: https://www.amazon.com/Book-Why-Science-Cause-Effect/dp/046509760X
- Bernoulli’s Fallacy, by Aubrey Clayton: https://www.amazon.com/Bernoullis-Fallacy-Statistical-Illogic-Science/dp/0231199945
- Software Engineering for Science: https://www.amazon.com/Software-Engineering-Science-Chapman-Computational/dp/1498743854

**Transcript**

*This is an automatic transcript and may therefore contain errors. Please **get in touch** if you’re willing to correct them.*


In this episode, Dmitry Bagaev discusses his work in Bayesian statistics and the development of RxInfer.jl, a reactive message passing toolbox for Bayesian inference. Dmitry explains the concept of reactive message passing and its applications in real-time signal processing and autonomous systems. He discusses the challenges and benefits of using RxInfer.jl, including its scalability and efficiency in large probabilistic models. Dmitry also shares insights into the trade-offs involved in Bayesian inference architecture and the role of variational inference in RxInfer.jl. Additionally, he discusses his startup Lazy Dynamics and its goal of commercializing research in Bayesian inference. Finally, we also discuss the user-friendliness and trade-offs of different inference methods, the future developments of RxInfer, and the future of automated Bayesian inference.

Coming from a very small town in Russia called Nizhnekamsk, Dmitry currently lives in the Netherlands, where he did his PhD. Before that, he graduated from the Computational Science and Modeling department of Moscow State University. Beyond that, Dmitry is also a drummer (you'll see his cool drums if you're watching on YouTube) and an adept of extreme sports like skydiving, wakeboarding, and skiing.

Learning Bayesian Statistics, episode 100, recorded January 25, 2024.

Dmitry Bagaev, welcome to Learning Bayesian Statistics.

Thanks. Thanks for inviting me to your great podcast. Really, I feel very honored.

Yeah, thanks a lot. The honor is mine. It's really great to have you on the show. So many questions for you, and we're also going to be able to talk again about Julia, so that's super cool. And I want to thank, of course, Albert Podusenko for putting us in contact. Thanks a lot, Albert, it was a great idea. I hope you will love the episode. Well, I'm sure you're going to love Dmitry's part, and mine is always... more in the air, right? And well, Dmitry, thanks again, because I know you're a bit sick, so I appreciate it even more. So let's start by basically defining what you're doing nowadays, and also how you ended up doing what you're doing.

Yes. So I'm currently working at Eindhoven University of Technology, in BIASlab. I just recently finished my PhD in Bayesian statistics, essentially, so now I supervise students and work on some of the projects there. BIASlab itself is a group in the university that primarily works on real-time Bayesian signal processing, and we do research in that field. The slogan of the lab, let's say, is "natural artificial intelligence", and it's phrased specifically like that because there cannot be natural artificial intelligence, so it's a play on words. The lab is basically trying to develop automated control systems and novel signal processing applications, and it's basically inspired by neuroscience.

We also opened a startup with my colleagues, which is called Lazy Dynamics. The idea is basically to commercialize the research in the lab, but also to find new funding for new PhD students for the university. But we're still quite young, less than one year old, and we are currently in search of clients and potential investors. But yeah, my main focus still remains being a postdoc at the university.

Yeah, fascinating. So many things already. Um, maybe: what do you do in your postdoc?

So my primary focus is supporting the toolbox that we wrote in our lab, of which I am the primary author. We call this toolbox RxInfer, and it was an essential part of my PhD project. Basically, I love to code, so my scientific career was always more or less aligned with software development. The RxInfer project was a really big project, and many other projects in BIASlab depend on it. It requires maintenance: bug fixing, adding new features, performance improvements. And we currently have several sub-projects that we develop alongside RxInfer. That's just the main focus for me. Besides that, I also supervise students on this project.

Yeah, yeah. Of course. That must also take quite some time, right?

Yes, exactly. Yeah.

Yeah, super cool.

So let me start basically by diving a bit more into the concepts you've just named, because you've already talked about a lot of the things you work on, which is a lot, as I guess listeners can hear. So first, let's try and explain the concept of reactive message passing in the context of Bayesian inference, for listeners who may not be familiar with it, because I believe it's the first time we really talk about that on the show. So yeah, talk to us about that. Also because, from what I understand, it's really the main focus of your work, be it through RxInfer.jl or Lazy Dynamics or BIASlab. So let's start by laying out the landscape here about reactive message passing.

Yes, good.

So yeah, RxInfer is what we call a reactive message passing based Bayesian inference toolbox. Basically, in the context of Bayesian inference, we usually work with probabilistic models. A probabilistic model is usually a function of some variables, and some variables are observed. We want to infer a probability distribution over the unobserved variables. What is interesting about that is that if we have a probabilistic model, we can actually represent it as a graph. For example, we can factorize our probabilistic model into a set of factors, such that each node is a factor and each edge is a variable of the model, more like a hidden state, and some of them are observed or not. And message passing by itself is a very interesting idea of solving Bayes' rule for a probabilistic model defined in terms of the graph. It does it by sending messages between nodes in the graph, along edges. It's quite a big topic, actually, but the essential thing to understand here is that we can do that: we can reframe Bayes' rule as something that has these messages in the background. Reactive message passing is a particular implementation of this idea. Because in traditional message passing, we usually have to define an order of messages: in what order do we compute them? That may be very crucial, for example, if the graph structure has loops, so there are structural dependencies in the graph. Reactive message passing basically says: okay, no, we will not do that. We will not specify any order. Instead, we will react on data. The order of message computations becomes essentially data-driven, and we do not enforce any particular order of computation.
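The reframing Dmitry describes, Bayes' rule rewritten as a product of messages arriving at a variable, can be sketched for a single discrete variable. This is a conceptual illustration with made-up numbers, not RxInfer's API:

```python
import numpy as np

# Tiny illustration: Bayes' rule for one discrete hidden variable x,
# expressed as a product of incoming "messages" from the factors that
# touch x. All numbers here are hypothetical.

prior = np.array([0.5, 0.5])            # message from the prior factor p(x)
likelihood = np.array([[0.9, 0.1],      # p(y | x): rows index x, columns index y
                       [0.2, 0.8]])

y_observed = 0                          # we observe y = 0

# Message from the likelihood factor toward x, given the observation:
msg_from_likelihood = likelihood[:, y_observed]

# The posterior over x is the normalized product of incoming messages.
unnormalized = prior * msg_from_likelihood
posterior = unnormalized / unnormalized.sum()

print(posterior)    # p(x | y = 0)
```

On a larger graph the same local product-and-normalize step runs at every variable, with messages flowing along the edges instead of being read off directly.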

OK, so if I try to summarize, that would be something like: usually, when you work on a Bayesian model, you have to specify the graph and the order of the graph, in which direction the nodes are going. In reactive message passing, it's more like a non-parametric version, in a way, where you just say there are these things, but you're not specifying the directions, and you're just trying to infer that through the data. How wrong is that characterization?

Not exactly like that. So, indeed, the graphs that we work with don't have any direction in them, right? Because messages can flow in any direction. The main difference here is that reactive message passing reacts on changes in data and updates posteriors automatically. There is no particular order in which we update the posteriors. For example, if we have some variables in our model, like A, B, C, we don't know which will be updated first and which will be last. It basically depends on our observations. It works like this: as soon as we have a new observation, the graph reacts to this observation and updates the posteriors as soon as it can, without explicitly specifying this order.
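A toy sketch of that "react, don't schedule" idea. This is conceptual only: RxInfer builds reactivity on streams of messages, which a plain Python callback does not capture, and the Beta-Bernoulli model below is a hypothetical example chosen because its conjugate update can fire the moment each datum arrives, in any order:

```python
import random

# A posterior that updates itself whenever an observation arrives,
# with no enforced processing order.

class ReactiveCoin:
    def __init__(self, a=1.0, b=1.0):
        self.a, self.b = a, b           # Beta(a, b) posterior state

    def on_data(self, y):               # invoked whenever a datum arrives
        if y == 1:
            self.a += 1
        else:
            self.b += 1

    def mean(self):
        return self.a / (self.a + self.b)

observations = [1, 0, 1, 1, 0, 1]

# Two different "arrival orders" of the same data:
left = ReactiveCoin()
for y in observations:
    left.on_data(y)

shuffled = observations[:]
random.shuffle(shuffled)
right = ReactiveCoin()
for y in shuffled:
    right.on_data(y)

# The final posterior is the same either way, so nothing forced us to
# fix an update schedule in advance.
assert left.mean() == right.mean()
print(left.a, left.b)   # Beta(5, 3): four heads and two tails on top of the prior
```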

And why would you do that? Why would that be useful?

It's a very good question. Because in BIASlab, we essentially try to work with autonomous systems, and autonomous systems have to work in the field, right? In a real-world environment, let's say. And a real-world environment is extremely unpredictable. To be more clear, let's say we try to develop a drone which tries to navigate the environment. It has several sensors, and we want to build a probabilistic model of the environment, such that the drone can act in this environment, and its sensors have some noise in them. So, essentially, we cannot predict in what order the data will be arriving, right? Because you may have a video signal, you may have an audio signal, and the devices that record video also have an unpredictable update rate. Usually it's maybe 60 frames per second, but it may change. So instead of fixing the algorithm and saying, okay, we wait for a new frame from the video, wait for a new frame from the audio, then we update, then we wait again, instead of doing that, we simply let the system react to new changes and update the posteriors as soon as possible. And then, based on the new posteriors, we act as soon as possible. This is kind of the main idea of reactive implementations. In traditional software for Bayesian inference, for example, we just have a model and a data set; we feed the data set to the model, we get the posterior, and then we analyze the posterior. And that also works really great, right? But it doesn't really work in the field, where you don't have time to synchronize your data set and you need to react as soon as you can.

Okay, okay, I see. So that's where, basically, this kind of reactive message passing is extremely useful: when you receive data in real time that you don't really know the structure of.

Yes, we work primarily with real-time signals. Yes.

Okay, very interesting. Actually, do you have any examples, any real-life examples that you've worked on, where this is extremely useful, with RxInfer.jl or just in general with this kind of reactive message passing?

Yes. So I myself usually do not work on applications; my primary focus lies in the actual Bayesian inference engine. But in our lab there are people who work, for example, on audio signals. You may want, for example, to create a probabilistic model of the environment to be able to denoise speech, or it may be a position tracking system or a planning system in real time. In our lab, we also very often refer to the term active inference, which basically defines a probabilistic model not only of your environment, but also of your actions, such that you can infer the most optimal course of actions. And this might be useful in control applications, also for the drone, right? So we want to infer not only the position of the drone based on the sensors that we have, but also how it should act to avoid an obstacle, for example.

I see. Yeah, OK, super interesting. So basically, any case where you have really high uncertainty, right, that kind of stuff. OK, yes, super interesting.

And so what prompted you to create a tool for that? What inspired you to develop RxInfer.jl? And maybe also tell us how it differs from traditional Bayesian inference tools, be it in Python or in R or even in Julia. If I'm a Julia user and I'm used to using a probabilistic programming language in Julia, then what's the difference with RxInfer?

This is a good question, but there are two questions in one. About inspiration: so I joined BIASlab in 2019, without really understanding what it was going to be about, and without really understanding how difficult it would be. The inspiration for me came from the project that I started my PhD on. Basically, the main inspiration in our lab is the so-called free energy principle, which tries to explain how natural biotic systems behave. It defines the so-called Bayesian brain hypothesis and the free energy principle: any biotic system defines a probabilistic model of its environment and tries to infer the most optimal course of action to survive, essentially. But all of this is based on Bayesian inference as well. It's a very good idea, but at the end it all boils down to Bayesian inference. And basically, if you look at how biotic systems work, we note that they have very specific properties. They do not consume a lot of power: it has been shown that our brain consumes about 20 watts of energy, so it's an extremely efficient device, if we can say so. It does not even compare with supercomputers. It's also scalable, because we live in a very complex environment with many variables. We act in real time, and we are able to adapt to the environment. And we are also kind of robust to what is happening around us: if something new happens, we are able to adapt to it instead of just failing. And this is kind of the idea. So the inspiration for this Bayesian inference toolbox was that it needs to be scalable, real-time, adaptive, robust, super efficient, and also low-power. These are the main ideas behind the RxInfer project.

And here we come to the second part of the question: how does it differ? Because this is exactly where we differ, right? Other solutions in Python or in Julia are also very cool. There are actually a lot of cool libraries for Bayesian inference, but most of them have a different set of trade-offs or requirements. And maybe I should be super clear: we are not trying to be better, but we are trying to have a different set of requirements for the Bayesian inference system.

Yeah, you're working on a different set of needs, in a way.

Yes, yes. And it's application-driven.

Yeah, you're trying to address another type of applications.

Exactly. And if we directly compare to other solutions, they are mostly based on sampling, like HMC or NUTS, or maybe they are black-box methods like ADVI (automatic differentiation variational inference) or BBVI. They are great methods, but they tend to consume a lot of computational power or energy, right? They do a very expensive simulation; it may run for hours, maybe even days in some situations. They are great, but you cannot really apply them in these autonomous systems where you need to... Like, if we're again talking about audio, it's 44 kilohertz, so we really need to perform Bayesian inference at an extremely fast rate, and those methods are not really applicable in this situation.

Yeah, fascinating. And we'll get back to the computation part a bit later. Maybe first I'd like to ask you: why did you do it with Julia? Why did you choose Julia for RxInfer, and what advantages does it offer for your applications of Bayesian inference?

The particular choice of Julia was actually driven by the needs of BIASlab at the university, because all the research we do in our lab is done in Julia, and that decision was made by our professor many, many years ago. Interestingly enough, our professor doesn't really code. But Julia is a really great language, so if I were to choose myself, I would still choose Julia. It's a great language, and it's fast, and our primary concern is efficiency. Python can also be fast if you know how to use it, if you use NumPy or some specialized libraries, but with Julia it's really easy. It is easier. In some situations, of course, you need to know a bit more. My background is in C and C++, and I understand how compilers work, for example, so maybe for me it's a bit easier to write performant Julia code. But in general, it's just a really nice, fast language. And it also develops fast, in the sense that new versions of Julia come out every several months, and it really gets better with each release.

Another thing which is actually very important for us as well is macros, macros in Julia. For people who are listening: macros basically allow us to apply arbitrary code transformations to existing code, and they also allow you to create a sublanguage within a language. Why that is particularly useful for us is that specifying probabilistic models in Bayesian inference is a bit hard, or tedious. We don't want to directly specify these huge graphs. Instead, what we did, and what Turing also did, and many other libraries in Julia, is come up with a domain-specific language for specifying probabilistic programs. And it's extremely cool. It's much, much simpler to define a probabilistic program in Julia than in Python, in my opinion. And I really like this feature of Julia.

Yeah, basically this building-block aspect of the Julia language. Yeah, yeah, I've heard that.

There are other aspects of Julia I can mention. By the way, maybe I can also make an announcement regarding Julia: the next JuliaCon is happening in Eindhoven, the city where I currently live. And it's going to be very cool: it's going to be in the PSV stadium, the football stadium. A technical conference about a programming language is going to be in a stadium.

So, another aspect of Julia is this notorious dynamic multiple dispatch. And it was extremely useful for us, in particular for the reactive message passing implementation. Because, again, if we think about how this reactiveness works and how we compute these messages on the graph: in order to compute a message, we wait for inputs, and then, when all inputs have arrived, we have to decide how to compute the message. And computation of the message is essentially solving an integral. But if we know the types of the arguments, and if we know the type of the node, it might be that there is an analytical solution for the message, so it's not really necessary to solve a complex integral. And we do that with multiple dispatch in Julia. So multiple dispatch in Julia helps us pick the most efficient message update rule on the graph, and it's basically built into the language. It's also possible to emulate it in Python, but in Julia it's just fast and built-in, and it works super nicely.
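Since Dmitry mentions that multiple dispatch can be emulated in Python, here is a minimal sketch of the idea: a rule table keyed by the types of the node and its incoming messages, so an analytical update is chosen when one is registered. The node name, rule, and numbers are hypothetical, not RxInfer's actual rule table:

```python
from dataclasses import dataclass

@dataclass
class Normal:
    mean: float
    var: float

RULES = {}

def rule(node, *arg_types):
    """Register a message update rule for a (node, argument types) pair."""
    def register(fn):
        RULES[(node, arg_types)] = fn
        return fn
    return register

def message(node, *args):
    """Dispatch on runtime types; fall back if no analytical rule exists."""
    fn = RULES.get((node, tuple(type(a) for a in args)))
    if fn is None:
        raise NotImplementedError("no analytical rule: fall back to a numeric integral")
    return fn(*args)

# Analytical rule: the product of two Gaussian messages on an equality
# node is again Gaussian, with precision-weighted mean. No integral needed.
@rule("equality", Normal, Normal)
def _(a, b):
    w1, w2 = 1 / a.var, 1 / b.var
    var = 1 / (w1 + w2)
    return Normal((w1 * a.mean + w2 * b.mean) * var, var)

out = message("equality", Normal(0.0, 1.0), Normal(2.0, 1.0))
print(out)   # Normal(mean=1.0, var=0.5)
```

In Julia the lookup step disappears: defining methods of one generic function for different argument types gives the same behavior natively, and the compiler can resolve many of these dispatches statically.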

Yeah, super cool. Yeah, for sure, super interesting points. And I'm very happy, because it's been a long time since we've had a show with some Julia practitioners, so it's always very interesting to hear what's going on in that field. And yeah, I would be convinced just by coming to the PSV Eindhoven stadium. You don't have to tell me more. I'll be there. Let's do a live show in the stadium.

Yes, I will be there. Yeah.

Yeah, that sounds like a lot of fun. And actually, I'm myself an open-source developer, so I'm very biased to ask you that question: what were some of the biggest challenges you faced when you developed RxInfer, and how did you overcome them? I guess that's the main thing you do when you're an open-source developer: putting out fires.

This is an amazing question, I really like it. And I even have some of the answers in my PhD dissertation, so I'll probably just quote it, though I don't remember exactly how I framed it. I took it from the book called Software Engineering for Science. Basically, it says that people usually underestimate how difficult it is to create software in a scientific research area. And the main difficulty is that there are no clear guidelines to follow. It's not like designing a website, with clear framework rules, where you just need to divide tasks between people in a team. No: new insights in the area we work in happen every day, and the requirements for the software may change every day. It's really hard to come up with a specific design before we start developing, because requirements change over time: you may create some software for research purposes and then find out something super cool which works better, or faster, or scales better, and then you realize that you actually have to start over, because it's just better. We just found out something cooler. It also means that a developer must invest time into the research itself. So it's not only about coding: you should understand how it all works from the scientific point of view, from a mathematical point of view. And sometimes, if it's cutting-edge research, there are no books about how it works, right? So we must invest time in reading papers, and also be able to write good code, which is fast and efficient. And all of these problems also occurred when we developed RxInfer. Even though I'm the main author, a lot of people have helped me, and I'm very thankful for that. For RxInfer in particular, I also needed to learn a very big part of statistics, because when I joined the lab I actually didn't have a lot of experience with Bayesian inference, with graphs, or with message passing. So I really needed to dive into this field, and many people helped me to understand how it works. A lot of my colleagues spent their time explaining. And on top of this whole stack of difficulties, at the end we would like the software to be easy to use, user-friendly. So we already have these difficulties: we don't know how to design it, we have to invest time into reading papers. But at the end, we want to have functional software that is easy to use, addresses different needs, and allows you to find new insights. The software should be designed such that it does not impose a lot of constraints on what you can do with it, because scientific software is about finding new insights, not about running some predefined set of algorithms. You want to find something new, essentially, and the software should help you with that.

Yeah, yeah, for sure. That's a good point.

What would you say are the key challenges in achieving scalability and efficiency in this endeavor, and how does RxInfer address them?

Basically, we are talking in the context of Bayesian inference, and the key challenge is that Bayes' rule doesn't scale, right? The formula looks very simple, but in practice, when we start working with large probabilistic models, blind application of Bayes' rule doesn't scale, because it has exponential complexity with respect to the number of variables. And RxInfer tries to tackle this by having essentially two main components in the recipe, maybe three, let's say three. So, first of all, we use factor graphs to specify the model, so we work with factorized models. We work with message passing, and message passing essentially converts the exponential complexity of Bayes' rule to linear, but only for highly factorized models. And "highly factorized" here is a really crucial component, but many models are indeed highly factorized. It means that variables do not directly depend on all other variables; they directly depend on maybe a very small subset of variables in the model.
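The exponential-to-linear point can be checked on the simplest factorized model, a chain. This is a hypothetical toy, not RxInfer code: forward messages compute the last marginal in O(n) steps, while brute force sums the joint over all 2^n configurations, and both give the same answer:

```python
import itertools
import numpy as np

n = 10
p0 = np.array([0.6, 0.4])                    # p(x1), hypothetical numbers
T = np.array([[0.7, 0.3], [0.2, 0.8]])       # p(x_{i+1} | x_i)

# Message passing: pass one forward message along the chain, O(n) work.
msg = p0
for _ in range(n - 1):
    msg = msg @ T                            # marginalize out one variable

# Brute force: sum the full joint over all 2^n configurations.
brute = np.zeros(2)
for states in itertools.product([0, 1], repeat=n):
    prob = p0[states[0]]
    for a, b in zip(states, states[1:]):
        prob *= T[a][b]
    brute[states[-1]] += prob

assert np.allclose(msg, brute)
print(msg)    # the marginal p(x_n), obtained two ways
```

At n = 10 brute force already touches 1,024 terms against the chain's 9 matrix-vector products; the gap grows exponentially with n.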

And the third component here is variational inference, because it allows us to trade off computational complexity against accuracy. If the task is too difficult or doesn't scale, what variational inference gives you is the ability to impose a set of constraints on your problem, because it reframes the original problem as an optimization task, and we can optimize subject to certain constraints.

For example, we may say that this variable is distributed as a Gaussian. That may not be true in reality, and we lose some accuracy, but in the end it allows us to solve the equations faster. And we can impose more and more constraints if we don't have enough computational power or if we have a large model, or we may relax constraints if we do have enough computational power, and then we gain accuracy. So we have this sort of slider, which allows us to scale better.
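As a hypothetical illustration of that slider (not RxInfer's actual machinery), one can force a skewed exact posterior, a Gamma here, to be Gaussian by matching its first two moments: the approximation is decent near the bulk and drifts in the tail, which is exactly the accuracy being traded away.

```python
import math

# Hypothetical sketch of the variational "slider": constrain a variable to be
# Gaussian by matching the first two moments of a skewed exact posterior
# (a Gamma here). We lose tail accuracy but gain cheap closed-form math.

a, scale = 3.0, 2.0                      # pretend Gamma(a, scale) is the exact posterior
mean, var = a * scale, a * scale**2      # moments of that Gamma

def gamma_pdf(x):
    return x**(a - 1) * math.exp(-x / scale) / (math.gamma(a) * scale**a)

def gauss_pdf(x):                        # the constrained Gaussian q
    return math.exp(-(x - mean)**2 / (2 * var)) / math.sqrt(2 * math.pi * var)

print(f"matched moments: mean={mean}, var={var}")
print(f"near the bulk: exact={gamma_pdf(6.0):.4f}  q={gauss_pdf(6.0):.4f}")
print(f"in the tail:   exact={gamma_pdf(0.5):.4f}  q={gauss_pdf(0.5):.4f}")
```

The mismatch at x = 0.5 is visibly larger than at x = 6.0, which is the accuracy cost of the Gaussian constraint.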

But here's the thing, right? We can always come up with such a large model, with so many variables and such difficult relationships between them, that it still will not scale. And this is fine. But RxInfer tries to push that boundary, scaling Bayesian inference to large models.

And actually, you're using variational inference quite a lot in this endeavor, right? So can you discuss the role of variational inference here in RxInfer, and maybe any innovations that you've incorporated in this area?

So the role, as I kind of touched upon a little bit, is that it acts as a slider, right, controlling the complexity and the accuracy of your inference result. This is the main role. Of course, for some applications this might be undesirable; for some applications you may want a perfect posterior estimation. But for some applications it's not a very big deal. Again, we are talking about different needs for different applications here.

And the innovation that RxInfer brings, I think, is that it's one of the few implementations of variational inference as message passing. Variational inference is usually implemented as a black-box method that takes a function, like a probabilistic model function, and maybe does some automatic differentiation or some extra sampling under the hood. Message passing by itself has a very long history, but I think people mistakenly think it's limited to something like the sum-product algorithm. Actually, variational inference can also be implemented as message passing, and it's quite good. So it widens the applicability of message passing algorithms.

And also, as we already talked a little bit about this reactive nature of the inference procedure, it's maybe even the first reactive variational inference engine, designed to work with infinite data streams. So it continuously updates the posterior, continuously does minimization; it does not stop. As soon as new data arrive, we basically update our posteriors.
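The streaming idea can be sketched with a conjugate toy model in Python (hypothetical, not RxInfer's reactive engine): the posterior is refreshed the moment each observation arrives, in O(1), without storing or revisiting past data.

```python
import random

# Hypothetical sketch of streaming Bayesian updating: a Beta-Bernoulli model
# whose posterior is refreshed as each observation arrives, with no need to
# restart or revisit past data. True success rate here is 0.7.

random.seed(42)
alpha, beta = 1.0, 1.0           # Beta(1, 1) prior on the success rate

def on_data(y):                  # called as each new observation streams in
    global alpha, beta
    alpha += y                   # conjugate update: O(1) per datum,
    beta += 1 - y                # the inference never stops or restarts

for _ in range(1000):            # an (in principle endless) data stream
    on_data(1 if random.random() < 0.7 else 0)

posterior_mean = alpha / (alpha + beta)
print(f"posterior mean after 1000 observations: {posterior_mean:.3f}")
```

A reactive engine generalizes this pattern beyond conjugate pairs, but the shape of the loop, update on arrival and keep going, is the same.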

But in between these data windows, we can spend more computational resources to find a better variational approximation. All the other solutions, let's say, that also do variational inference basically require you to wait for the data and then feed in the data, or wait for the entire data set, feed in the data set, and then you have the result; then you analyze the result, and then you repeat. So RxInfer works a bit differently in that regard.

Yeah. Fascinating. And I'm guessing you have some examples of that up on the RxInfer website? Maybe we can add a link to that in the show notes, for people who are interested to see how you would apply that in practice.

So, it does not really require reactivity, but because it's kind of easy to use and fast, students can do some homework for signal processing applications with it. As I already mentioned, we work with audio signals and with control applications. I don't really have a particular example of RxInfer being used in the field or by industry; it's primarily a research tool currently, but we want to extend it. It's still a bit more difficult to use than, let's say, Turing, which is also written in Julia, because message passing is maybe a bit more difficult to use, and it is not as universal as HMC and NUTS; it still requires some approximation methods.

Yeah. So we still use it as a research tool currently, but we have some ideas in the lab on how to expand the set of probabilistic models we can run inference on. And yes, indeed, in our documentation we have quite a lot of examples, but these examples are, I would say, educational in most cases, at least in the documentation. So we are at the stage where we have a lot of ideas for how to improve the inference and make it faster, such that we can actually apply it to real tasks: real drones, real robots, real speech denoising, or something similar.

Yeah, definitely. That would be super interesting, I'm guessing, for people who are into these topics and also just want to check it out. I have been checking out your website recently to prepare for the episode. Actually, can you now... So you've shared the overview of the theory, how that works, and what RxInfer does in that regard. Can you share what you folks are doing with Lazy Dynamics, how that's related to all this, and how it fits into this ecosystem?

So yeah, Lazy Dynamics: we created this company to commercialize the research that we do at our lab, to basically find funding to make RxInfer better and ready for industry. Because currently, let's say, RxInfer is a great research tool for our purposes, right? But industry needs more properties in addition to the ones I have already mentioned. For example, a Bayesian inference engine must be extremely robust, right? It is not allowed to fail if we really work in the field. And this is not really a research question; it's more about the implementation side, things like good code coverage and great documentation. And this is what we also want to do with Lazy Dynamics.

We want to take this next step and create a great product for other companies, especially ones that can rely on RxInfer, maybe in their research or maybe even in the field. And maybe we create some sort of tool set around RxInfer that will allow you to debug the performance of your probabilistic model or your probabilistic inference, right? That's also not about research; it's about making it more accessible to other people, finding bugs or mistakes in their model specification, making it easier to use.

587

Or maybe, for example, we could.

588

come up with some sort of a library of

models, right?

589

So you would want to build some autonomous

system and it may require a model for

590

audio recognition, it may require a model

for video recognition.

591

And this kind of set of models, they can

be predefined, very well tested, have a

592

great performance, super robust.

593

And basically Lazy Dynamics may provide an

access to this kind of a library.

594

right?

595

So, and for this kind of, because this is

not a research related questions, it's, it

596

must be done in a company with like a very

good programmers and very good code

597

coverage and documentation.

598

But for research purposes, Ericsson-Fer is

already a great toolbox.

599

And basically many students in our lab,

they already use it.

But, yeah, because we are all sitting in the same room, let's say on the same floor, we can kind of brainstorm, find bugs, fix them on the fly, and keep working like that. But if we want RxInfer to be used in industry, it really needs to be a professional toolbox with professional support.

Yeah. Yeah, I understand, that makes sense. I don't know when you sleep, though, between the postdoc, the open-source project, and the company.

So yeah, it's a great comment, but yeah, it's hard.

Yeah, hopefully we'll get you some sleep in the coming months.

To get back to your PhD project, because I found it very interesting: your dissertation will be in the show notes. But something I was also curious about is that in this PhD project, you explore different trade-offs for Bayesian inference architecture. You've mentioned that a bit already, but I'm really curious about it. So could you elaborate on these trade-offs and why they are significant?

Yes, we already touched a little bit on that. So the main trade-offs here are computational load, efficiency, adaptivity, and low power consumption. And another aspect, actually, which we didn't talk about yet, is structural model adaptation. These are the requirements that we favor in RxInfer, and these are the requirements that were central to my PhD project.

And all of these properties do not just come from a vacuum; they come from real-time signal processing applications on autonomous systems. We don't have a lot of battery power, and we don't have very powerful CPUs on these autonomous devices, because essentially what we want to do is run very difficult, large probabilistic models on a Raspberry Pi. And a Raspberry Pi doesn't even have a GPU. We can buy some small sort of GPU and put it on the Raspberry Pi, but still, the computational capabilities are very, very limited on edge devices.

One may say, let's just do everything in the cloud, which is actually a very valid argument. But in some situations the latencies are just too big, and also we may not have access to the internet in some areas. We still want to create these adaptive Bayesian inference systems, like a drone that may explore some area, maybe in the mountains or somewhere without internet, so we cannot really process anything in the cloud. So it must work as efficiently as possible on a very, very small device that doesn't have a lot of power and doesn't have a lot of battery, and it still should work in real time. Yeah, I think these are mostly the main trade-offs.

In terms of how we do it, we use variational inference and we sacrifice accuracy for scalability. Reactive message passing allows us to scale to very large models because it works on factor graphs.

655

Yeah.

656

And I think that's, these are very

important points to make, right?

657

Because always when you work and you build

an open source,

658

package you have to trade off to make.

659

So that means you have to choose whether

you're going to a general package or a

660

more specified one.

661

And that will dictate in a way your trade

off.

662

In RxInfer, it seems like you're quite

specified, specialist of message passing

663

inference.

664

So the cool thing here is that I'm

665

choices because you're like, no, our main

use case is that.

666

And so we can use that.

667

And the devirational inference choice, for

instance, is quite telling because in your

668

case, it seems to be really working well,

whereas we could not do that in PMC, for

669

instance.

670

If we remove the ability to use HMC, we

would have quite a drop in the user

671

numbers.

672

So yeah, that's always something I'm.

673

Try to make people aware of when they are

using open source packages.

674

You can do everything.

Yeah, exactly. Exactly. Actually, when I have the need, I really enjoy working with HMC or NUTS-based methods, because they just work, like magic. But here's the trade-off, right? They work magically in many situations, but they're slow in some sense. Let's say they're not slow, but they're slower than message passing. So there is this trade-off. User-friendliness is a really, really important part of this equation.

Yeah, and what do you call user-friendliness in your case?

So what I refer to as user-friendliness here is that a user can specify a model, press a button with HMC, and it just runs, and the user gets a result. Yes, the user needs to wait a little bit more, but anyway, the user experience is great: just specify a model, just run inference, just get your result.

With RxInfer, it's a bit less easy, because in most cases message passing works such that it favors analytical solutions on the graph. And if an analytical solution for a message is not available, the user must specify an approximation method. It actually can also be HMC, just in case. But RxInfer does not currently define a default approximation method, so if a user specifies a complex probabilistic model, it will probably throw an error saying: okay, I don't know how to solve this, please specify what I should do here and there. And for a new user, it might be a bit unintuitive how to do that, what to specify. With HMC there's no need to do it, it just works; but with RxInfer, it's not that easy yet. That's what I was referring to as user-friendliness.

Yeah, that makes sense. And again, the interesting thing here is that the definition of user-friendliness is going to depend on what you're trying to optimize, right? What kind of use case you're optimizing for.

Yes.

Actually, what's the future for RxInfer? What are the future developments or enhancements that you are planning?

So, we have already touched a little bit on the Lazy Dynamics side, which tries to make a real commercial product out of RxInfer, with great support. That is one side of the future, but we also have a research side of the project. And the research side includes structural model adaptation, which in my opinion is quite cool. What it basically means, in a few words, is that in the future we may be able to change the structure of the model on the fly, without stopping the inference procedure.

You may need this for several reasons. For example, the computational budget changes and we are no longer able to run inference on such a complex model, so we want to reduce the complexity of the model; we want to change the structure, maybe put in some less demanding factor nodes. And we want to do it on the fly, without actually stopping the inference, because for sampling-based methods, if we change the model, we are basically forced to restart; it's quite difficult to reuse the previous results if the structure of the model changes. With factor graphs, it's actually possible.

Another reason why we would need that in the field: imagine different sensors, so we have different observations, and one sensor all of a sudden just burns out, or glitches, or something like that. So essentially we no longer have that sort of observation, and we need to change the structure of our model to account for this glitch or breakage of the sensor. And this is also where reactive message passing helps us, because, since we do not enforce a particular order of updates, we simply stop reacting to that observation, because it's no longer available, and we also change the structure of the model to account for that.

Another thing for the future of RxInfer, in terms of research, is that we want to be able to natively support different update rates for different variables. What I mean by that is: imagine an audio recognition system, let's say, or an audio enhancement system, where you have modeled the environment of a person who is talking among several other people. Their speech signal arrives at a rate of 44 kilohertz, if we are talking about a typical microphone. But their environment, where they are currently sitting, doesn't really change that fast, because they may be sitting in a bar, and it will still be a bar an hour later. So there's no need to infer that information as often as their speech; it changes very rarely. So we have different sets of variables that may change at different timescales, and we want to support this natively in RxInfer. That way we can also make it easier for the inference engine, so it does not spend computational resources on variables that are not updating fast.
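A minimal sketch of per-variable update rates (hypothetical, not RxInfer's actual scheduler): a fast "speech" state refreshed at the audio sample rate and a slow "environment" state refreshed roughly once per second, so most ticks skip the slow variable entirely.

```python
# Hypothetical sketch of per-variable update rates: the fast state is
# refreshed on every audio sample, the slow state only once per second,
# so the engine skips work on variables that barely change.

SAMPLE_RATE = 44_100             # audio samples per second
ENV_RATE = 1                     # environment updates per second

fast_updates = slow_updates = 0

for tick in range(SAMPLE_RATE * 5):           # five seconds of simulated stream
    fast_updates += 1                          # speech state: every sample
    if tick % (SAMPLE_RATE // ENV_RATE) == 0:
        slow_updates += 1                      # environment state: ~once a second

print(fast_updates, slow_updates)             # 220500 5
```

Over five simulated seconds, the fast variable is updated 220,500 times and the slow one only 5 times, which is the computational saving the native support would automate.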

We also want to be able to support non-parametric models in RxInfer, and this includes things like Gaussian processes. We currently have a PhD student in our lab who is working a lot on that, and he is making great progress. It's not available in the current version of RxInfer, but he has experiments and it all works nicely, so at some point it will be integrated into the public version.

And, yeah, there's just, you know, maintenance, fixing bugs, this kind of stuff, and improving the documentation. The documentation currently needs improvement, because we have quite a few features and additions that we have already integrated into the framework, and we happily use them ourselves in our lab for our research, but they are maybe poorly documented, let's say. So other people can in theory use this functionality, but because they cannot come to my desk in the office at Eindhoven University of Technology, they cannot ask how to use it properly. So we should just put it into the documentation, so other people can use it as well.

Yeah, yeah. Yeah, that makes sense. That's a nice roadmap for this year. And looking ahead, what's your vision, let's say, for the future of automated Bayesian inference in the way you do it, especially in complex models like yours? What would you like to see in the coming years? And also, what would you like to not see?

A good question.

So in my opinion, the future of automated Bayesian inference is very bright. A lot of great people are working on this, and more people keep coming, right? There are so many toolboxes in Python and Julia, like PyMC and Turing, and ours in Julia, and there are some in C++, like Stan. So many implementations, and it's only getting better every year.

But I think the future is that there will be several application areas, like autonomous systems in our case, or maybe something else, and these packages will basically not really compete. In a sense they will, but for different applications you will choose a different solution, because each of them will be great in its own application. I'm not sure there will be one super ultra-cool method that solves all problems of all applications in Bayesian inference. Maybe there will be, who knows. But in my opinion, there will always be these trade-offs in different applications, and we'll just use different methodologies. Yeah.

Yeah, that makes sense, in a way. I like your point here, about all these different methods cooperating in a way, because they are addressing different workflows or different use cases. So yeah, definitely, I think we'll have stuff to learn from one type of application to the other.

I like this analogy: we don't cut bread with a fork, but that doesn't make a fork a useless tool; we can use a fork for something else. And we don't eat soup with a knife, but that doesn't make a knife a useless tool. These are tools that are great, but for their own purposes. So RxInfer is a good tool for real-time signal processing applications, and Turing in Julia is a great tool for other applications. We'll just live together and learn from each other.

Yeah. Fascinating. I really love that.

And well, before closing up the show, because I don't want to take too much of your time: a question I really like asking from time to time is whether you have any favorite type of model that you always like to use and want to share with listeners.

You mean a probabilistic model?

Sure, or it can be a different kind of model. But yeah, a probabilistic model.

Actually, yeah, as I mentioned a little bit, I do not really work from the application point of view; I really work on the compiler for Bayesian inference. So I don't really have a favorite model, let's say. It's hard to say.

Yeah, that's interesting, because that's always an interesting position to me: you work on basically making the modeling possible, but you're usually not one of the people using that modeling platform yourself.

Exactly. Yes.

Yeah. That's always something really fascinating to me, because I'm kind of on the bridge, but a bit more on the applied modeling side of things. So I'm really happy that there are people like you who make my life easier, and even possible. So thank you so much.

That's cool.

Awesome. Dmitry, that was super cool, thanks a lot. Before letting you go, though, as usual, I'm going to ask you the last two questions I ask every guest at the end of the show. First one: if you had unlimited time and resources, which problem would you try to solve?

Yes, I thought about this question; it's kind of an interesting one. And I thought it would be cool, if we had an infinite amount of time, to try to solve some sort of unsolvable paradox, because right now we have limited time. One of the areas I never worked in, but am really fascinated by, is astronomy. And one of the things in astronomy which I find interesting (maybe it's not really a paradox, but anyway) is the Fermi paradox, which, in a few words, tries to explain the discrepancy between the lack of evidence for other civilizations and the apparently high likelihood of their existence, right? So this is maybe a problem I would work on: if I had an infinite amount of resources, I could just fly into space and try to find them.

That sounds like a fun endeavor. Yeah, for sure. I'd love the answer to that paradox. And for people interested in the physics side of things, there is a whole bunch of physics-related episodes of this show, so for sure refer to those; I'll put my whole playlist of physics episodes in the show notes. And I know you're also a big fan of Aubrey Clayton's book, Bernoulli's Fallacy, so I also put the episode with Aubrey Clayton in the show notes for people who have missed it. If you have missed it, I really recommend it; that was a really good episode.

No, I know. I know. I know this episode.

Yeah, awesome.

Well, thanks for coming on the show, Dmitry. Awesome. Well, thanks a lot, Dmitry, that was really a treat to have you on. I'm really happy, because I had so many questions, but you helped me navigate them. I learned a lot, and I'm sure listeners did too. As usual, I put resources and a link to your website in the show notes for those who want to dig deeper. Thank you again, Dmitry, for taking the time and being on this show.

Yeah, thanks for inviting me. It was a pleasure to talk to you. Really, super nice and super cool questions. I liked it.