Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!
In this episode, Dmitry Bagaev discusses his work in Bayesian statistics and the development of RxInfer.jl, a reactive message passing toolbox for Bayesian inference.
Dmitry explains the concept of reactive message passing and its applications in real-time signal processing and autonomous systems. He discusses the challenges and benefits of using RxInfer.jl, including its scalability and efficiency in large probabilistic models.
Dmitry also shares insights into the trade-offs involved in Bayesian inference architecture and the role of variational inference in RxInfer.jl. Additionally, he discusses his startup Lazy Dynamics and its goal of commercializing research in Bayesian inference.
Finally, we also discuss the user-friendliness and trade-offs of different inference methods, the future developments of RxInfer, and the future of automated Bayesian inference.
Coming from a very small town in Russia called Nizhnekamsk, Dmitry currently lives in the Netherlands, where he did his PhD. Before that, he graduated from the Computational Science and Modeling department of Moscow State University.
Beyond that, Dmitry is also a drummer (you’ll see his cool drums if you’re watching on YouTube), and an adept of extreme sports, like skydiving, wakeboarding and skiing!
Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !
Thank you to my Patrons for making this episode possible!
Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser and Julio.
Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag 😉
Takeaways:
– Reactive message passing is a powerful approach to Bayesian inference that allows for real-time updates and adaptivity in probabilistic models.
– RxInfer.jl is a toolbox for reactive message passing in Bayesian inference, designed to be scalable, efficient, and adaptable.
– Julia is a preferred language for RxInfer.jl due to its speed, macros, and multiple dispatch, which enable efficient and flexible implementation.
– Variational inference plays a crucial role in RxInfer.jl, allowing for trade-offs between computational complexity and accuracy in Bayesian inference.
– Lazy Dynamics is a startup focused on commercializing research in Bayesian inference, with the goal of making RxInfer.jl accessible and robust for industry applications.
Links from the show:
- LBS Physics & Astrophysics playlist: https://learnbayesstats.com/physics-astrophysics/
- LBS #51, Bernoulli’s Fallacy & the Crisis of Modern Science, with Aubrey Clayton: https://learnbayesstats.com/episode/51-bernoullis-fallacy-crisis-modern-science-aubrey-clayton/
- Dmitry on GitHub: https://github.com/bvdmitri
- Dmitry on LinkedIn: https://www.linkedin.com/in/bvdmitri/
- RxInfer.jl, Automatic Bayesian Inference through Reactive Message Passing: https://rxinfer.ml/
- Reactive Bayes, Open source software for reactive, efficient and scalable Bayesian inference: https://github.com/ReactiveBayes
- LazyDynamics, Reactive Bayesian AI: https://lazydynamics.com/
- BIASlab, Natural Artificial Intelligence: https://biaslab.github.io/
- Dmitry’s PhD dissertation: https://research.tue.nl/en/publications/reactive-probabilistic-programming-for-scalable-bayesian-inferenc
- Effortless Mastery, by Kenny Werner: https://www.amazon.com/Effortless-Mastery-Liberating-Master-Musician/dp/156224003X
- The Book of Why, by Judea Pearl: https://www.amazon.com/Book-Why-Science-Cause-Effect/dp/046509760X
- Bernoulli’s Fallacy, by Aubrey Clayton: https://www.amazon.com/Bernoullis-Fallacy-Statistical-Illogic-Science/dp/0231199945
- Software Engineering for Science: https://www.amazon.com/Software-Engineering-Science-Chapman-Computational/dp/1498743854
Transcript
This is an automatic transcript and may therefore contain errors. Please get in touch if you’re willing to correct them.
In this episode, Dmitry Bagaev discusses his work in Bayesian statistics and the development of RxInfer.jl, a reactive message passing toolbox for Bayesian inference. Dmitry explains the concept of reactive message passing and its applications in real-time signal processing and autonomous systems. He discusses the challenges and benefits of using RxInfer.jl, including its scalability and efficiency in large probabilistic models. Dmitry also shares insights into the trade-offs involved in Bayesian inference architecture and the role of variational inference in RxInfer.jl. Additionally, he discusses his startup Lazy Dynamics and its goal of commercializing research in Bayesian inference. Finally, we also discuss the user-friendliness and trade-offs of different inference methods, the future developments of RxInfer, and the future of automated Bayesian inference.

Coming from a very small town in Russia called Nizhnekamsk, Dmitry currently lives in the Netherlands, where he did his PhD. Before that, he graduated from the Computational Science and Modeling department of Moscow State University. Beyond that, Dmitry is also a drummer (you'll see his cool drums if you're watching on YouTube) and an adept of extreme sports like skydiving, wakeboarding, and skiing.

This is Learning Bayesian Statistics, episode 100, recorded January 25, 2024.
Dmitry Bagaev, welcome to Learning Bayesian Statistics.

Thanks. Thanks for inviting me to your great podcast. Really, I feel very honored.

Yeah, thanks a lot. The honor is mine. It's really great to have you on the show. So many questions for you, and we're also going to be able to talk again about Julia, so that's super cool. And I want to thank, of course, Albert Podusenko for putting us in contact. Thanks a lot, Albert, it was a great idea. I hope you will love the episode. Well, I'm sure you're going to love Dmitry's part, and mine is always... more in the air, right? And well, Dmitry, thanks again, because I know you're a bit sick. So I appreciate it even more.
And so let's start by basically defining what you're doing nowadays, and also how you ended up doing what you're doing, basically.

Yes. So I'm currently working at Eindhoven University of Technology, in BIASlab. I just recently finished my PhD in Bayesian statistics, essentially, so now I supervise students and work on some of the projects there. BIASlab itself is a group in the university that primarily works on real-time Bayesian signal processing, and we do research in that field. The slogan of the lab, let's say, is "natural artificial intelligence", and it's phrased specifically like that because there cannot be natural artificial intelligence, so it's a play on words. The lab is basically trying to develop automated control systems and novel signal processing applications, and it's basically inspired by neuroscience.

We also opened a startup with my colleagues, which is called Lazy Dynamics. The idea is basically to commercialize the research done in the lab, but also to find new funding for new PhD students at the university. But it's still quite young, less than one year old, and we are currently in search of clients and potential investors. But yeah, my main focus still remains being a postdoc at the university.
Yeah, fascinating. So many things already. Maybe, what do you do in your postdoc?

So my main focus, primarily, is supporting the toolbox that we wrote in our lab, of which I am the primary author. We call this toolbox RxInfer, and it is an essential part of my PhD project. Basically, I love to code, so my scientific career was more or less always aligned with software development. RxInfer was a really big project, and many other projects in BIASlab depend on it. It requires maintenance: bug fixing, adding new features, performance improvements. And we currently have several sub-projects that we develop alongside RxInfer. That's the main focus for me. Besides that, I also supervise students on this project.

Yeah, of course. That must also take quite some time, right?

Yes, exactly. Yeah.
Yeah, super cool. So let me start basically by diving a bit more into the concepts you've just named, because you've already talked about a lot of the things you work on, which is a lot, as I guess listeners can hear. So first, let's try and explain the concept of reactive message passing in the context of Bayesian inference, for listeners who may not be familiar with it, because I believe it's the first time we really talk about that on the show. So yeah, talk to us about that. Also because, from what I understand, it's really the main focus of your work, be it through RxInfer.jl or Lazy Dynamics or BIASlab. So let's start by laying out the landscape here about reactive message passing.
Yes, good. So yeah, RxInfer is what we call a reactive message passing based Bayesian inference toolbox. Basically, in the context of Bayesian inference, we usually work with probabilistic models. A probabilistic model is usually a function of some variables, and some of those variables are observed. We want to infer a probability distribution over the unobserved variables. What is interesting about that is that if we have a probabilistic model, we can actually represent it as a graph: we can factorize our probabilistic model into a set of factors, such that each node in the graph is a factor and each edge is a variable of the model, mostly hidden states, some of which are observed and some not. And message passing by itself is a very interesting idea for solving Bayes' rule for a probabilistic model defined in terms of a graph. It does so by sending messages between nodes in the graph, along its edges. It's actually a very big topic, but the essential thing to understand is that we can do that: we can reframe Bayes' rule as something that has these messages in the background. Reactive message passing is a particular implementation of this idea.
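To make this concrete, here is a minimal sketch (in Python, for illustration only; this is not RxInfer's implementation) of a single message on a tiny two-variable factor graph: the factor p(b|a) sends variable b a message that sums out a, and that message is exactly the marginal that direct application of Bayes' rule would give.

```python
# A minimal sketch of sum-product message passing on a two-variable chain
# p(a, b) = p(a) * p(b | a), with binary a and b. The marginal of b is
# obtained by a message from the factor p(b | a) along the edge for b.
# Toy illustration only, not RxInfer code.

p_a = [0.6, 0.4]                      # prior factor p(a)
p_b_given_a = [[0.9, 0.1],            # p(b | a=0)
               [0.2, 0.8]]            # p(b | a=1)

# Message from the factor node p(b|a) towards variable b:
# sum out a, weighting by the incoming message p(a).
msg_to_b = [sum(p_a[a] * p_b_given_a[a][b] for a in range(2))
            for b in range(2)]

# Brute-force marginal from the full joint, for comparison.
joint = [[p_a[a] * p_b_given_a[a][b] for b in range(2)] for a in range(2)]
brute = [joint[0][b] + joint[1][b] for b in range(2)]

print(msg_to_b)   # the marginal p(b)
```

On larger graphs the same local computation is repeated along every edge, which is what makes the scheme composable.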
In traditional message passing, we usually have to define an order for the messages: in what order do we compute them? That may be very crucial, for example, if the graph structure has loops, so there are structural dependencies in the graph. Reactive message passing basically says: okay, no, we will not do that. We will not specify any order. Instead, we will react to data. The order of message computations becomes essentially data driven, and we do not enforce any particular order of computation.
OK, so if I try to summarize, that would be something like: usually, when you work on a Bayesian model, you have to specify the graph and the order of the graph, in which direction the nodes are going. In reactive message passing, it's more like a non-parametric version, in a way, where you just say there is this stuff, but you're not specifying the directions, and you're just trying to infer that through the data. How wrong is that characterization?
Not exactly like that. So indeed, the graphs that we work with don't have any direction in them, because messages can flow in any direction. The main difference here is that reactive message passing reacts to changes in the data and updates posteriors automatically. There is no particular order in which we update the posteriors. For example, if we have some variables in our model, like A, B, C, we don't know which will be updated first and which last; it basically depends on our observations. It works like this: as soon as we have a new observation, the graph reacts to this observation and updates the posteriors as soon as it can, without explicitly specifying this order.
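A toy sketch of this data-driven ordering, with conjugate Beta-Bernoulli updates standing in for real message passing; the variable names and the update function below are hypothetical, for illustration only, and not RxInfer's API.

```python
# A toy sketch of the "react to data" idea: a posterior is refreshed the
# moment an observation for its variable arrives, in whatever order the
# stream delivers them. No fixed update schedule is imposed.
# (Conjugate Beta-Bernoulli updates stand in for real message passing.)

posteriors = {"A": (1, 1), "B": (1, 1)}   # Beta(alpha, beta) per variable
update_log = []                            # records the data-driven order

def on_observation(var, value):
    """React to one observation: refresh only the affected posterior."""
    alpha, beta = posteriors[var]
    posteriors[var] = (alpha + value, beta + (1 - value))
    update_log.append(var)

# Observations arrive interleaved and unpredictably ordered.
stream = [("B", 1), ("A", 0), ("B", 1), ("B", 0), ("A", 1)]
for var, value in stream:
    on_observation(var, value)

print(posteriors, update_log)
```

Note that `update_log` is determined entirely by the arrival order of the data, which is the point: the algorithm itself imposes no schedule.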
And why would you do that? Why would that be useful?

That's a very good question. In BIASlab, we essentially try to work with autonomous systems, and autonomous systems have to work in the field, in a real-world environment, let's say. And a real-world environment is extremely unpredictable. To be more concrete, let's say we try to develop a drone which navigates its environment. It has several sensors, and we want to build a probabilistic model of the environment, such that the drone can act in that environment, and its sensors have some noise in them. Essentially, we cannot predict in what order the data will arrive. You may have a video signal, you may have an audio signal, and the devices that record video, let's say, also have an unpredictable update rate; usually it's maybe 60 frames per second, but it may change. So instead of fixing the algorithm and saying, okay, we wait for a new frame from the video, we wait for a new frame from the audio, then we update, then we wait again, we simply let the system react to new changes and update the posteriors as soon as possible. And then, based on the new posteriors, we act as soon as possible. This is kind of the main idea of reactive implementations. In traditional software for Bayesian inference, for example, we just have a model and a data set, we feed the data set to the model, we get the posterior, and then we analyze the posterior. That also works really great, right? But it doesn't really work in the field, where you don't have time to synchronize your data set and need to react as soon as you can.
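The contrast between the two modes can be sketched with a conjugate Beta-Bernoulli model: batch inference processes the whole data set at once, while a streaming update folds in one observation at a time. For conjugate models both routes give the same final posterior, which is what lets a reactive engine keep posteriors current as data trickle in (illustrative Python, not RxInfer code).

```python
# Batch vs. streaming inference on a conjugate Beta-Bernoulli model.

data = [1, 0, 1, 1, 0, 1]

# Batch: one update from the full data set at once.
batch = (1 + sum(data), 1 + len(data) - sum(data))   # Beta(alpha, beta)

# Streaming: yesterday's posterior becomes today's prior.
a, b = 1, 1
for y in data:
    a, b = a + y, b + (1 - y)

print(batch, (a, b))   # identical posteriors
```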
Okay, okay, I see. So that's where, basically, this kind of reactive message passing is extremely useful: when you receive data in real time whose structure you don't really know.

Yes, we work primarily with real-time signals. Yes.

Okay, very interesting. Actually, do you have any examples, any real-life examples that you've worked on, or where you know this is extremely useful to work with RxInfer.jl, or just in general with this kind of reactive message passing?

Yes.
So I myself usually do not work on applications; my primary focus lies in the actual Bayesian inference engine. But in our lab there are people who work, for example, on audio signals. So you may want, for example, to create a probabilistic model of the environment to be able to denoise speech, or it may be a position tracking system or a planning system working in real time. In our lab we also very often refer to the term active inference, which basically means a probabilistic model not only of your environment but also of your actions, such that you can infer the most optimal course of actions. This might be useful in control applications, also for the drone: we want to infer not only the position of the drone, based on the sensors that we have, but also how it should act to avoid an obstacle, for example.
I see. Yeah, OK, super interesting. So basically, any case where you have really high uncertainty, that kind of stuff. OK, yes, super interesting. And so, what prompted you to create a tool for that? What inspired you to develop RxInfer.jl? And maybe also tell us how it differs from traditional Bayesian inference tools, be it in Python or in R or even in Julia. If I'm a Julia user, used to probabilistic programming languages in Julia, then what's the difference with RxInfer?
This is a good question. Well, there are two questions in one. About the inspiration: I joined BIASlab in 2019 without really understanding what it was going to be about, and without really understanding how difficult it really is. The inspiration for me came from the project that I started my PhD on. Basically, the main inspiration in our lab is the so-called free energy principle, which tries to explain how natural biotic systems behave. It builds on the so-called Bayesian brain hypothesis, and it basically says that any biotic system defines a probabilistic model of its environment and tries to infer the most optimal course of action, to survive, essentially. And all of this is based on Bayesian inference as well. It's a very good idea, but at the end, it all boils down to Bayesian inference.
And basically, if you look at how biotic systems work, you notice that they have very specific properties. They do not consume a lot of power: it has been estimated that our brain consumes about 20 watts of energy, so it's an extremely efficient device, if we can say that. It does not even compare with supercomputers. It's also scalable, because we live in a very complex environment with many variables. We act in real time. We are able to adapt to the environment. And we are also quite robust to what is happening around us: if something new happens, we are able to adapt to it instead of just failing. And this is kind of the idea. So the inspiration for this Bayesian inference toolbox was that it needs to be scalable, real-time, adaptive, robust, super efficient, and also low power. These are the main ideas behind the RxInfer project.
And here we come to the second part of the question: how does it differ? Because this is exactly where we differ. Other solutions in Python or in Julia are also very cool; there are actually a lot of cool libraries for Bayesian inference, but most of them have a different set of trade-offs and requirements. And let me be super clear: we are not trying to be better. We are trying to serve a different set of requirements for a Bayesian inference system.

Yeah, you're working on a different set of needs, in a way.

Yes, yes. And it's application-driven.

Yeah, you're trying to address another type of application.

Exactly.
And if we directly compare with other solutions, they are mostly based on sampling, like HMC or NUTS, or maybe they are black box methods like ADVI, automatic differentiation variational inference. These are great methods, but they tend to consume a lot of computational power, or energy. They do a very expensive simulation; it may run for hours, maybe even days in some situations. They are great, but you cannot really apply them in these autonomous systems where you need to... like, if we're again talking about audio, it's 44 kilohertz, so we need to perform Bayesian inference extremely fast, and sampling methods are not really applicable in this situation.
Yeah, fascinating. Well, we'll get back to the computation part a bit later. Maybe first I'd like to ask you: why did you do it in Julia? Why did you choose Julia for RxInfer, and what advantages does it offer for your applications of Bayesian inference?
The particular choice of Julia was actually driven by the needs of BIASlab at the university, because all the research that we do in our lab is done in Julia, and that decision was made by our professor many, many years ago. Interestingly enough, our professor doesn't really code. But Julia is a really great language. If I were to choose myself, I would still choose Julia. It's a great language. It's fast, and our primary concern is efficiency. Python can also be fast if you know how to use it, if you use NumPy or some specialized libraries, but with Julia it's really easy. It is easier. In some situations, of course, you need to know a bit more: my background is in C and C++, and I understand how compilers work, for example, so maybe for me it's a bit easier to write performant Julia code. But in general, it's just a really nice, fast language. And it also develops fast, in the sense that new versions of Julia come out every few months, and it really gets better with each release.
Another thing which is actually very important for us is macros. What are macros in Julia? For people who are listening: macros basically allow us to apply arbitrary code transformations to existing code, and they also allow you to create a sublanguage within the language. Why that is particularly useful for us is that specifying probabilistic models for Bayesian inference is a bit hard, or tedious; we don't want to directly specify these huge graphs. Instead, what we did, and what Turing and many other libraries in Julia also did, is come up with a domain-specific language for specifying probabilistic programs. And it's extremely cool. It's much, much simpler to define a probabilistic program in Julia than in Python, in my opinion. And I really like this feature of Julia.

Yeah, this basically building-block aspect of the Julia language. Yeah, yeah, I've heard that.
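Python has no macros, so it cannot reproduce what Julia's macro-based modeling DSLs do, but the underlying idea of an embedded sublanguage that records a model as a plain data structure can be hinted at with ordinary classes. Everything below (`RV`, `Model`, `declare`) is hypothetical, for illustration only, and not RxInfer's API.

```python
# A minimal sketch of an embedded modeling mini-language: declarations
# build an inspectable data structure (a factor list) instead of running
# computations directly. Julia macros achieve this far more directly.

class RV:
    """A random variable node; `obs` marks observed variables."""
    def __init__(self, name, obs=False):
        self.name, self.obs = name, obs

class Model:
    """Collects (variable, distribution) factors as they are declared."""
    def __init__(self):
        self.factors = []
    def declare(self, rv, dist):
        self.factors.append((rv.name, dist))
        return rv

m = Model()
theta = m.declare(RV("theta"), ("Beta", 1, 1))
y = m.declare(RV("y", obs=True), ("Bernoulli", "theta"))

print(m.factors)
# The model is now plain data that an inference engine could
# turn into a factor graph.
```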
There are other aspects of Julia I can mention. By the way, maybe I can also make an announcement regarding Julia: the next JuliaCon is happening in Eindhoven, the city where I currently live. And it's going to be very cool: it's going to be in the PSV stadium, the football stadium. Right, a technical conference about a programming language taking place in a stadium.
So, another aspect of Julia is this notorious dynamic multiple dispatch. It was extremely useful for us, in particular for the reactive message passing implementation. Because, again, if we think about how this reactiveness works and how we compute these messages on the graph: in order to compute a message, we wait for inputs. And then, when all inputs have arrived, we have to decide how to compute the message. Computing a message is essentially solving an integral. But if we know the types of the arguments, and if we know the type of the node, it might be that there is an analytical solution for the message, so it's not really necessary to solve a complex integral. And we do that with multiple dispatch in Julia. Multiple dispatch helps us pick the most efficient message update rule on the graph, and it's basically built into the language. It's also possible to emulate it in Python, but in Julia it's just fast and built in, and it works super nicely.
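The dispatch idea can indeed be emulated in Python with a type-keyed rule table: if both incoming messages are Gaussian, use the closed-form product; otherwise fall back to a generic numeric routine. Julia's multiple dispatch does this natively; the names below are illustrative and not RxInfer's actual rules.

```python
# Emulating dispatch-based selection of message update rules: a
# closed-form rule is chosen when the argument types allow it, with a
# generic (expensive) fallback otherwise.

from dataclasses import dataclass

@dataclass
class Gaussian:
    mean: float
    var: float

def product_gaussian(p, q):
    """Closed-form (precision-weighted) product of two Gaussian densities."""
    prec = 1 / p.var + 1 / q.var
    mean = (p.mean / p.var + q.mean / q.var) / prec
    return Gaussian(mean, 1 / prec)

def product_numeric(p, q):
    """Generic fallback: would integrate numerically (omitted here)."""
    raise NotImplementedError("no analytical rule; integrate numerically")

RULES = {(Gaussian, Gaussian): product_gaussian}

def combine(p, q):
    # Look up the most specific rule for these argument types.
    rule = RULES.get((type(p), type(q)), product_numeric)
    return rule(p, q)

out = combine(Gaussian(0.0, 1.0), Gaussian(2.0, 1.0))
print(out)   # Gaussian(mean=1.0, var=0.5)
```

In Julia the lookup is resolved by the compiler rather than a dictionary, which is why it is both convenient and fast.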
Yeah, super cool. Yeah, for sure. Super interesting points. And I'm very happy, because it's been a long time since we've had a show with some Julia practitioners, so it's always very interesting to hear what's going on in that field. And yeah, I would be convinced just by coming to the PSV Eindhoven stadium. You don't have to tell me more. I'll be there. Let's do a live show in the stadium.

Yes, I will be there. Yeah.

Yeah, that sounds like a lot of fun.
And actually, I'm myself an open source developer, so I'm very biased to ask you that question: what were some of the biggest challenges you faced when you developed RxInfer, and how did you overcome them? I guess that's the main thing you do when you're an open source developer: putting out fires.

This is an amazing question. I really like it.
So, I even have some of the answers in my PhD dissertation, and I'll probably just quote it, although I don't remember exactly how I framed it. I took it from the book called Software Engineering for Science. Basically, it says that people usually underestimate how difficult it is to create software in a scientific research area. The main difficulty is that there are no clear guidelines to follow. It's not like designing a website, with clear framework rules, where you just need to divide tasks between people in a team. No: new scientific insights in the area we work in happen every day, and the requirements for the software may change every day. It's really hard to come up with a specific design before you start developing, because requirements change over time: you may create some software for research purposes, and then you find out something super cool which works better, or faster, or scales better, and then you realize that you actually have to start over, because this new thing is just better; you just found out something cooler.
It also means that a developer must invest time into the research itself. It's not only about coding: you should understand how it all works from the scientific point of view, from a mathematical point of view. And sometimes, if it's cutting-edge research, there are no books about how it works, so you must invest time in reading papers, and also be able to write good code, which is fast and efficient. All of these problems also occurred when we developed RxInfer. Even though I'm the main author, a lot of people have helped me, and I'm very thankful for that. For RxInfer in particular, I also needed to learn a very big part of statistics, because when I joined the lab I actually didn't have a lot of experience with Bayesian inference, with graphs, or with message passing. So I really needed to dive into this field, and many people helped me to understand how it works; a lot of my colleagues spent their time explaining things to me.
And on top of this stack of difficulties, in the end we would like the software to be easy to use, user friendly. So we already have these difficulties: we don't know how to design it upfront, and we have to invest time into reading papers. But in the end we want to have functional software that is easy to use, addresses different needs, and allows you to find new insights. The software should be designed such that it does not impose a lot of constraints on what you can do with it, because scientific software is about finding new insights, not about running some predefined set of algorithms. You want to find something new, essentially, and the software should help you with that.
Yeah, yeah, for sure.
441
That's a good point.
442
What do you think, what would you say are
the key challenges in achieving
443
scalability and efficiency in this
endeavor and how does RxInfair address
444
this?
445
Basically, we are talking in the context
of Bayesian inference and the key
446
challenge in
447
the base rule doesn't scale, right?
448
It's, the formula looks very simple, but
in practice, then we start working with
449
large probabilistic models.
450
Just blind application of base rule
doesn't scale because it has exponential
451
complexity with respect to the number of
variables.
And RxInfer tries to tackle this by having essentially two main components in the recipe, or maybe three, let's say three. First of all, we use factor graphs to specify the model, so we work with factorized models. Second, we work with message passing, and message passing essentially converts the exponential complexity of Bayes' rule to linear, but only for highly factorized models. "Highly factorized" is a really crucial qualifier here, but many models are indeed highly factorized: it means that variables do not directly depend on all other variables, but only on maybe a very small subset of the variables in the model.
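A small numeric check of this complexity claim, assuming a simple chain of binary variables: brute-force marginalization sums over all 2^n joint configurations, while a forward message-passing sweep costs linear in n, and both yield the same marginal.

```python
# Exponential vs. linear marginalization on a chain x1 -> x2 -> ... -> xn
# of binary variables: brute force touches all 2**n configurations, while
# message passing performs n - 1 cheap local updates.

from itertools import product

n = 10
prior = [0.7, 0.3]
trans = [[0.9, 0.1], [0.4, 0.6]]       # p(x_{k+1} | x_k)

# Brute force: sum the joint over every configuration (2**n terms).
brute = [0.0, 0.0]
for cfg in product([0, 1], repeat=n):
    p = prior[cfg[0]]
    for k in range(n - 1):
        p *= trans[cfg[k]][cfg[k + 1]]
    brute[cfg[-1]] += p

# Message passing: one forward sweep along the chain.
msg = prior[:]
for _ in range(n - 1):
    msg = [sum(msg[i] * trans[i][j] for i in range(2)) for j in range(2)]

print(brute, msg)   # identical marginals for x_n
```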
And the third component is variational inference, because it allows us to trade off computational complexity against accuracy. If the task is too difficult or doesn't scale, what variational inference gives you is the ability to impose a set of constraints on your problem, because it reframes the original problem as an optimization task, and we can optimize up to a certain constraint. For example, we may say that a certain variable is distributed as a Gaussian. That may not be true in reality, and we lose some accuracy, but in the end it allows us to solve some equations faster. We can impose more and more constraints if we don't have enough computational power or if we have a large model, or we may relax constraints if we do have enough computational power, and gain accuracy. So we have this sort of slider which allows us to scale better.
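A worked instance of such a constraint, under standard textbook assumptions: approximating a correlated 2D Gaussian with a fully factorized (mean-field) Gaussian. For Gaussian targets the mean-field fixed point is known in closed form, q_i = N(mu_i, 1/Lambda_ii) with Lambda the precision matrix, so the means are recovered exactly but the variances are underestimated; that lost variance is the accuracy traded away for the cheaper factorized form.

```python
# The mean-field "slider" on a correlated 2D Gaussian target: the
# factorized approximation keeps the means but shrinks the variances
# from 1.0 down to 1/Lambda_ii.

rho = 0.8
cov = [[1.0, rho], [rho, 1.0]]         # target covariance (means are zero)

# Precision matrix Lambda = inverse of the 2x2 covariance.
det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
lam = [[cov[1][1] / det, -cov[0][1] / det],
       [-cov[1][0] / det, cov[0][0] / det]]

meanfield_var = [1 / lam[0][0], 1 / lam[1][1]]   # q_i variances
true_var = [cov[0][0], cov[1][1]]                # exact marginal variances

print(meanfield_var, true_var)   # variances shrink from 1.0 to 0.36
```

The stronger the correlation (the larger rho), the more variance the factorized approximation gives up, which is the accuracy/complexity slider in action.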
But here's the thing: we can always come up with a model so large, with so many variables and such difficult relationships between them, that it still will not scale. And this is fine. But RxInfer tries to push this boundary, to scale Bayesian inference to large models.
And actually, you're using variational inference quite a lot in this endeavor, right? So can you discuss the role of variational inference here in RxInfer, and maybe any innovations that you've incorporated in this area?
So the role, which I already touched upon a little bit, is that it acts as a slider, right, controlling the complexity and the accuracy of your inference result. This is the main role. Of course, for some applications this might be undesirable. For some applications, you may want to have a perfect posterior estimation. But for some applications, it's not a very big deal. Again, we are talking about different needs for different applications here.
And the innovation that RxInfer brings, I think, is that it's one of the few implementations of variational inference as message passing, because variational inference is usually implemented as a black-box method that takes a probabilistic model function and maybe does some automatic differentiation or some extra sampling under the hood. And message passing by itself has a very long history, but I think people mistakenly think that it's limited to the sum-product algorithm. But actually, variational inference can also be implemented as message passing, and it's quite good. So it opens up the applicability of message passing algorithms.
And also, as we already talked a little bit about this reactive nature of the inference procedure, it's also maybe even the first reactive variational inference engine, which is designed to work with infinite data streams. So it continuously updates the posterior, continuously does the minimization. It does not stop. As soon as new data arrive, we basically update our posteriors. And in between these data windows, we can spend more computational resources to find a better approximation for the variational inference.
But all other solutions, let's say, that also do variational inference, they basically require you to wait for the data, then feed in the data, or wait for the entire data set, feed in the data set, and then you have the result, then you analyze the result, and then you repeat. So RxInfer works a bit differently in that regard.
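The streaming idea can be sketched with a minimal illustrative example in Python (a hypothetical conjugate model chosen for clarity, not RxInfer's reactive machinery): the posterior is refreshed the moment each observation arrives, rather than after a full dataset is collected.

```python
import numpy as np


def updated(mu, v, x, noise_var=1.0):
    """Fold one observation x ~ N(theta, noise_var) into a N(mu, v) posterior."""
    new_v = 1.0 / (1.0 / v + 1.0 / noise_var)
    new_mu = new_v * (mu / v + x / noise_var)
    return new_mu, new_v


rng = np.random.default_rng(1)
mu, v = 0.0, 100.0                   # vague prior on the hidden mean theta
stream = rng.normal(3.0, 1.0, 500)   # stands in for an infinite sensor stream

for x in stream:                     # a fresh posterior exists after every sample
    mu, v = updated(mu, v, x)

print(mu, v)
```

Because each update is cheap and self-contained, the loop never has to stop: there is always a current posterior, and idle time between arrivals can be spent refining the approximation.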
Yeah. Fascinating. And I'm guessing you have some examples of that up on the RxInfer website? Maybe we can put a link to that in the show notes, for people who are interested to see how you would apply that in practice?
So it does not really require reactivity, but because it's kind of easy to use and fast, students can do some homework for signal processing applications. What I already mentioned is that we work with audio signals and with control applications. I don't really have a particular example of RxInfer being used in the field or by industry. So it's primarily a research tool currently, but we want to extend it.
It's still a bit more difficult to use than, let's say, Turing, which is also written in Julia, because message passing is maybe a bit more difficult to use, and it is not as universal as HMC and NUTS; it still requires some approximation methods.
Yeah. So we still use it as a research tool currently, but we have some ideas in the lab for how to expand the set of probabilistic models we can run inference on. And yes, indeed, in our documentation we have quite a lot of examples, but these examples are, I would say, educational in most of the cases, at least in the documentation. So we are at the stage where we have a lot of ideas for how we can improve the inference and make it faster, such that we can actually apply it to real tasks: real drones, real robots, real speech denoising or something similar.
Yeah, definitely. That would be super interesting, I'm guessing, for people who are into these topics and also just want to check it out. I have been checking out your website recently to prepare for the episode. Actually, can you now... So you've shared the overview of the theory, how that works, what RxInfer does in that regard. Can you share what you folks are doing with Lazy Dynamics, and how that's related to that? How does that fit into this ecosystem?
So yeah, Lazy Dynamics, we created this company to commercialize the research that we do at our lab, to basically find funding to make RxInfer better and ready for industry. Because currently, let's say, RxInfer is a great research tool for our purposes, right? But industry needs some more properties, in addition to the ones I have already mentioned.
Right? For example, the Bayesian inference engine must be extremely robust, right? It is not allowed to fail if we really work in the field. And this is not really a research question; it's more about the implementation side, right? It's about things like good code coverage and great documentation. And this is what we also want to do with Lazy Dynamics.
We want to take this next step and create a great product for other companies, especially ones that can rely on RxInfer in their research, or maybe even in the field, right? And maybe we create some sort of a tool set around RxInfer that will allow you to debug the performance of your probabilistic model or your probabilistic inference, right? It's also not about research. It's about making it more accessible to other people: finding bugs or mistakes in their model specification, making it easier to use.
Or maybe, for example, we could come up with some sort of a library of models, right? So you would want to build some autonomous system, and it may require a model for audio recognition, it may require a model for video recognition. And this set of models can be predefined, very well tested, with great performance, super robust. And basically Lazy Dynamics may provide access to this kind of library.
Right? And because these are not research-related questions, it must be done in a company, with very good programmers and very good code coverage and documentation. But for research purposes, RxInfer is already a great toolbox, and many students in our lab already use it. But yeah, because we are all sitting in the same room, let's say on the same floor, we can kind of brainstorm, find bugs, fix them on the fly, and keep working like that. But if we want RxInfer to be used in industry, it really needs to be a professional toolbox with professional support.
Yeah, I understand, that makes sense. I don't know when you sleep, though, between the postdoc, the open source project and the company.

Yeah, it's a great comment, but yeah, it's hard.

Yeah, hopefully we'll get you some sleep in the coming months.
To get back to your PhD project, because I found it very interesting. Your dissertation will be in the show notes. But something I was also curious about is that in this PhD project, you explore different trade-offs for Bayesian inference architecture. You've mentioned that a bit already, but I'm really curious about it. So could you elaborate on these trade-offs and why they are significant?
Yes, we already touched a little bit on that. So the main trade-offs here are computational load, efficiency, adaptivity, and power consumption. And another aspect, which we actually didn't talk about yet, is structural model adaptation. These are the requirements that we favor in RxInfer, and the requirements that were central to my PhD project. And all of these properties, they are not just coming from a vacuum. They are coming from real-time signal processing applications on autonomous systems.
We don't have a lot of battery power. We don't have very powerful CPUs on these autonomous devices, because essentially what we also want to do is run very difficult, large probabilistic models on a Raspberry Pi. And a Raspberry Pi doesn't even have a GPU. We can buy some small sort of GPU and put it on the Raspberry Pi, but still, the computational capabilities are very, very limited on edge devices.
For example, one may say, let's just do everything in the cloud, which is a very valid argument, actually. But in some situations the latencies are just too big. And also, maybe we don't have access to the internet in some areas, but we still want to create these adaptive Bayesian inference systems, like a drone that may explore some area, maybe in the mountains or somewhere we don't really have internet, so we cannot really process anything in the cloud. So it must work as efficiently as possible on a very, very small device that doesn't have a lot of power, doesn't have a lot of battery, and it should still work in real time. Yeah, I think these are mostly the main trade-offs.
In terms of how we do it, we use variational inference, and we trade off accuracy for scalability. Reactive message passing allows us to scale to very large models because it works on factor graphs. Yeah.
And I think these are very important points to make, right? Because whenever you build an open source package, you have trade-offs to make. So you have to choose whether you're going for a general package or a more specialized one. And that will dictate, in a way, your trade-offs. With RxInfer, it seems like you're quite specialized, specialists in message passing inference. So the cool thing here is that you can make those choices, because you're like, no, our main use case is that, and so we can optimize for that. And the variational inference choice, for instance, is quite telling, because in your case it seems to be really working well, whereas we could not do that in PyMC, for instance. If we removed the ability to use HMC, we would have quite a drop in user numbers. So yeah, that's always something I try to make people aware of when they are using open source packages.
You can't do everything.

Yeah, exactly. Exactly. So actually, when I have a need, I really enjoy working with HMC- or NUTS-based methods, because they just work, like magic. But here's the trade-off, right? They work magically in many situations, but they're slow in some sense. Let's say they're not slow, but they're slower than message passing. So here is this trade-off. So user-friendliness is a really, really important key in this equation.
Yeah, and what do you call user-friendliness in your case?

What I refer to as user-friendliness here is that a user can specify a model, press a button with HMC, and it just runs and the user gets a result. Yes, the user needs to wait a little bit more, but anyway, the user experience is great. Just specify a model, just run inference, just get your result.
With RxInfer, it's a bit less easy, because in most cases message passing works like that: it favors analytical solutions on the graph. And if an analytical solution for a message is not available, a user must specify an approximation method. It actually can also be HMC, just in case. But still, RxInfer does not really specify a default approximation method. Perhaps we could define a fine default approximation, but because it does not define one currently, if a user specifies a complex probabilistic model, it will probably throw an error saying, okay, I don't know how to solve this, please specify what I should do here and there. And for a new user, it might be a bit unintuitive how to do that, what to specify. So for HMC, there's no need to do it, it just works. But with RxInfer, it's not that easy yet. That's what I was referring to as user-friendliness.
Yeah, that makes sense. And again, here the interesting thing is that the definition of user-friendliness is going to depend on what you're trying to optimize, right? What kind of use case you're trying to optimize for.

Yes.

Actually, what's the future for RxInfer? What are the future developments or enhancements that you are planning?
So, we have already touched a little bit on the Lazy Dynamics side, which tries to make a real commercial product out of RxInfer, with great support. This is one side of the future, but we also have a research side of the project. And the research side of the project includes structural model adaptation, which in my opinion is quite cool.
So what it basically means, in a few words, is that we may be able, in the future, to change the structure of the model on the fly, without stopping the inference procedure. And you may need that for several reasons. For example, the computational budget changes, and we are no longer able to run inference on such a complex model. So we want to reduce the complexity of the model. We want to change the structure, maybe put in some less demanding factor nodes. And we want to do it on the fly, without actually stopping the inference, because for sampling-based methods, if we change the model, we are basically forced to restart, because it's quite difficult to reuse the previous result if the structure of the model changes. But with message passing on factor graphs, it's actually possible.
Another reason why we would need that in the field: imagine different sensors, so we have different observations, and one sensor all of a sudden just burns out, or glitches, or something like that. So essentially, we no longer have this sort of observation. So we need to change the structure of our model to account for this glitch or breakage of the sensor. And this is also where reactive message passing helps us: because we do not enforce a particular order of updates, we simply stop reacting to this observation, because it's no longer available, and we also change the structure of the model to account for that.
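The sensor-dropout scenario above can be sketched with a hypothetical toy in Python (illustrative only, not RxInfer's graph machinery): each sensor contributes one factor to the running posterior, and when a sensor dies mid-stream we simply stop folding in its factor and keep going, with no restart.

```python
import numpy as np


def fuse(mu, v, obs, noise_var):
    """Conjugate Gaussian update for one sensor reading."""
    new_v = 1.0 / (1.0 / v + 1.0 / noise_var)
    new_mu = new_v * (mu / v + obs / noise_var)
    return new_mu, new_v


rng = np.random.default_rng(3)
true_state = 5.0
mu, v = 0.0, 50.0  # prior over the hidden state

for t in range(200):
    mu, v = fuse(mu, v, rng.normal(true_state, 1.0), 1.0)   # sensor A, always on
    if t < 100:                                             # sensor B "burns out" at t = 100
        mu, v = fuse(mu, v, rng.normal(true_state, 2.0), 4.0)

print(mu, v)
```

The running posterior carries over untouched when sensor B disappears; in a sampling-based pipeline, the same structural change would typically force a restart from scratch.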
Another thing for the future of RxInfer, in terms of research, is that we want to natively support different update rates for different variables. What I mean by that is: imagine an audio recognition system, let's say, or an audio enhancement system, and you have modeled the environment of a person who is talking among several other people. Let's say their speech signals arrive at a rate of 44 kilohertz, if we are talking about a typical microphone. But their environment, where they are currently sitting, doesn't really change that fast, because they may sit in a bar, and it will still be a bar an hour later. So there's no need to infer this information as often as their speech; it changes very rarely. So we have different sets of variables that may change at different time scales, and we want to support this natively in RxInfer. That also makes it easier for the inference engine, so it does not spend computational resources on variables which are not updating fast.
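A minimal sketch of the two-timescale idea, in illustrative Python (hypothetical update rules and rates, not the RxInfer scheduler): the fast "speech" variable is refreshed on every audio sample, while the slow "environment" variable is refreshed only once per thousand samples.

```python
import numpy as np

rng = np.random.default_rng(2)
samples = rng.normal(0.0, 1.0, 44_100)  # one second of audio at 44.1 kHz

speech_est, env_est = 0.0, 0.0
speech_updates = env_updates = 0

for t, x in enumerate(samples):
    # Fast variable ("speech"): refreshed on every sample.
    speech_est = 0.9 * speech_est + 0.1 * x
    speech_updates += 1
    # Slow variable ("environment"): refreshed once per 1000 samples.
    if t % 1000 == 0:
        env_est = 0.99 * env_est + 0.01 * speech_est
        env_updates += 1

print(speech_updates, env_updates)
```

Over one second, the fast variable is updated 44,100 times and the slow one only a few dozen times, which is exactly the compute saving that native multi-rate support would give the inference engine.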
We also want to be able to support non-parametric models in RxInfer, and this includes Gaussian processes. We currently have a PhD student in our lab who is working a lot on that, and he has made great progress. It's not available in the current version of RxInfer, but he has experiments, and it all works nicely. At some point it will be integrated into the public version.
And yeah, there's also just, you know, maintenance and fixing bugs and this kind of stuff, and improving the documentation. The documentation currently needs improvement, because we have quite some features and additions that we have already integrated into the framework, and we happily use them ourselves in our lab for our research, but they are maybe poorly documented, let's say. So other people in theory can use this functionality, but because they cannot come to my desk in the office at the Eindhoven University of Technology, they cannot ask how to use it properly. So we should just put it into the documentation, so other people can use it as well.
Yeah, yeah. That makes sense. That's a nice roadmap for this year. And looking ahead, what's your vision, let's say, for the future of automated Bayesian inference the way you do it, especially in complex models like yours? What would you like to see in the coming years? Also, what would you like to not see?
A good question. So in my opinion, the future is very bright, the future of automated Bayesian inference. A lot of great people are working on this, and more people are coming, right? So many toolboxes in Python and Julia, like PyMC and Turing, and there's Stan in C++. So many implementations, and it's only getting better every year. Right.
But I think, in my opinion, the future is that there will be several applications, like in our case autonomous systems, or maybe something else. And these packages will basically not really compete. Or in a sense they will, but for different applications you will choose a different solution, because all of them will be kind of great for their own application. But I'm not sure there will be a super ultra-cool method that solves all problems of all applications in Bayesian inference. Maybe we'll have one, who knows. But in my opinion, there will always be these trade-offs in different applications, and we'll just use different methodologies.
Yeah. Yeah, that makes sense. In a way, I like your point here, that all these different methods cooperate, in a way, because they are addressing different workflows or different use cases. So yeah, definitely, I think we'll have stuff to learn from one type of application to the other.
I like this analogy: we don't cut bread with a fork, but it doesn't really make a fork a useless tool. We can use a fork for something else. And we don't eat soup with a knife, but that doesn't make a knife a useless tool. So these are tools that are great for their own purposes. RxInfer is a good tool for real-time signal processing applications, and Turing in Julia is a great tool for other applications. So we'll just live together and learn from each other.
Yeah. Fascinating. I really love that. And well, before closing up the show, because I don't want to take too much of your time: a question I really like asking from time to time is whether you have any favorite type of model that you always like to use and want to share with listeners?
You mean a probabilistic model?

Sure, or it can be a different kind of model, for sure. But yeah, a probabilistic model.

Yeah, I actually mentioned a little bit that I do not really work from the application point of view. I really work on the compiler for Bayesian inference. So I don't really have a favorite model, let's say. It's hard to say.
Yeah, that's interesting, because that's always an interesting position to me: you really work on basically making the modeling possible, but usually you're not one of the people using that modeling platform yourself.

Exactly. Yes.

Yeah. That's always something really fascinating to me. Because me, I'm kind of on the bridge, but a bit more on the applied modeling side of things. So I'm really happy that there are people like you who make my life easier, and even possible. So thank you so much.
That's cool. Awesome.

Dmitry, that was super cool. Thanks a lot. Before letting you go, though, as usual, I'm going to ask you the last two questions I ask every guest at the end of the show. First one: if you had unlimited time and resources, which problem would you try to solve?
Yes, I thought about this question. It's kind of an interesting one. And I thought it would be cool, if we had an infinite amount of time, to try to solve some sort of unsolvable paradox, because normally we only have limited time. So one of the areas I never worked in, but am really fascinated by, is astronomy. And one of the paradoxes in astronomy which I find interesting, though maybe it's not really a paradox, is the Fermi paradox, which, in a few words, tries to explain the discrepancy between the lack of evidence for other civilizations, even though apparently there is a high likelihood of their existence. Right? So this is maybe the problem I would work on: if I had an infinite amount of resources, I could just fly into space and try to find them.
That sounds like a fun endeavor. Yeah, for sure. I'd love the answer to that paradox. And for people interested in the physics side of things, there is a whole bunch of physics-related episodes of this show, so for sure, refer to those. I'll put them in the show notes, my whole playlist of physics episodes.
Yeah, I know. And I know also you're a big fan of Aubrey Clayton's book, Bernoulli's Fallacy. So I also put the episode with Aubrey Clayton in the show notes for people who have missed it. If you have missed it, I really recommend it. That was a really good episode.
No, I know. I know. I know this episode.

Yeah, awesome. Well, thanks for listening to the show. Thanks a lot, Dmitry. That was really a treat to have you on. I'm really happy, because I had so many questions, but you helped me navigate that. I learned a lot, and I'm sure listeners did too. As usual, I put resources and a link to your website in the show notes for those who want to dig deeper. Thank you again, Dmitry, for taking the time and being on this show.

Yeah, thanks for inviting me. It was a pleasure to talk to you. Really super nice and super cool questions. I liked it.