Learning Bayesian Statistics

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

This is another installment in our neuroscience modeling series! This time, I talked with Konrad Kording, about the role of Bayesian stats in neuroscience and psychology, electrophysiological data to study what neurons do, and how this helps explain human behavior.

Konrad studied at ETH Zurich, then went to UC London and MIT for his postdocs. After a decade at Northwestern University, he is now Penn Integrated Knowledge Professor at the University of Pennsylvania.

As you’ll hear, Konrad is particularly interested in the question of how the brain solves the credit assignment problem and similarly how we should assign credit in the real world (through causality). Building on this, he is also interested in applications of causality in biomedical research.

And… he’s also a big hiker, skier and salsa dancer!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Nathaniel Neitzke, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Raul Maldonado, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Trey Causey, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony and Joshua Meehl.

Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag 😉


Please note that this is an automatic transcript and may contain errors. Feel free to reach out if you would like to correct them.


Well, welcome to Learning Bayesian Statistics. Thanks for having me. Yeah, thanks a lot for taking the time. I'm sorry for my voice today, to you and to all the listeners, but I have a big cold. So that's how it's gonna be today, with a big nose and hopefully not a lot of coughing, but we'll see how it goes. Fortunately, this is a remote interview, so I can assure the listeners that Konrad is actually fine, and I'm not going to contaminate him.


Wonderful. Thank you.


Yeah. And actually, I have a lot of things I want to talk about with you, Konrad, and it's going to be super fun, with lots of very cool neuroscience and psychology coming our way. But as usual, before we dive into all that, let's start with your origin story. So can you tell us how you came to the world of neuroscience and psychology, and how sinuous it was?


Well, originally, I studied physics. And in fact, I got my degrees in physics. I have a PhD in physics. And very early on, I started to be interested in areas beyond physics. So as an undergraduate, I took all kinds of biology courses, and that was very important to me, because it didn't feel like physics had the kinds of problems that I was interested in. In fact, it goes back even farther. When I was a high school student, I participated with some friends in the national science competition in Germany. And we did biology: we simulated the way trees grow. And in a way, this simulating of biological systems never went away from me. I always remained excited about it. Starting like a year into studying physics in Heidelberg, I got into biology. About a year later, I asked the physicists if they'd be fine with me using physics applied to biology for my master's thesis, and they were very clear that that wasn't going to happen. And then I defected to Zurich and went on to simulating neurons at the Institute of Neuroinformatics, where I then also did my PhD.


Okay, I see so that was quite yeah quite rent in interaction with the world of neuroscience in the end.


Yeah, yeah. Absolutely.


And you're originally from,


From Switzerland? No, I'm German originally. I just went to study in Switzerland. I started studying in Germany, in Heidelberg, and that's I grew up in shambles. But I enjoy a great deal is


where? Yeah, nice. And Heidelberg, is that the city of Immanuel Kant, or am I misremembering?


That may very well be true. I never got to know him. I know Kant, unfortunately, and my Kant reading was very painful. I mean, I don't know how your reading was, but I remember struggling with it. And


It was awful. Oh, yeah. Absolutely awful. I had to read that when I was like 20, for my undergrad studies, and didn't understand anything. Even now I still have to rely on people who can understand him to try and understand what he was saying. So for sure.


Wonderful. And even speaking German doesn't make things much easier. Sometimes it's harder to understand something that's been translated, but it is hard, even in the German original, to read Kant.


Well, that's something good to hear. And something I found interesting also: you told me you're a salsa dancer, right? Yeah. So which kind of salsa are you dancing?


Okay, this is a wonderful question, because most people don't even know that there are differences. So we are very much into On2 salsa dancing, New York City style. So we make the big step on the second beat, whereas a lot of people make it on the first, and that's always difficult if you're dancing with someone who likes to dance on the other one. But given that I'm a lead, it's at least a little bit easier, because I can at least do something about it. I mean, when it comes to what I like, I like a lot of the classical things. I like Tito Puente. I like Celia Cruz. But I like dancing all the Latin dances. It's just so much fun.


Yeah, I agree. Same here, I really love dancing salsa and bachata.


Salsa is great. Yeah. Merengue is not so great. But bachata and salsa, that's what I get excited about.


Yeah, I love them too. It's a great, yeah, great exercise of, like, physicality, and also kind of meditative. So it's really interesting, and it gets you moving when you spend your days on a computer. So to me it's really something that's complementary to the kind of activities we both have.


But it's also like just a wonderful way of celebrating life. Like there's something about like the energy and the excitement and kind of being together and celebrating the good things in life.


Yeah, yeah, no, totally, completely agree. So yeah, I was excited because it's quite rare that I get a guest who is also into Latin dances. So I was like, oh, I need to talk about that on the show. That was completely a personal crusade.


You're missing out on something important.


Well, I hope that one day we'll be able to do a Bayesian salsa workshop. Maybe that would be a fun thing where


Something you might not know: I'm part of the motor control community. So I worked a lot on how people move. And if you go to the top motor conferences, like Neural Control of Movement, there's a whole bunch of professors that are really into salsa, and I have wonderful memories of going dancing salsa with, like, a small crew of some 10 professors. So there is a kind of affinity to that.


Yeah, I mean, I can definitely see that. I mean, it's also something that kind of amazes me with dance: when you start, you don't really know anything, and I think for me it was extremely bad, because in France people don't really dance. They just move with the music. So it was kind of hard for me at the beginning. But something that always amazes my nerd brain is that you've got very delineated and precise steps. So very discrete, you know, steps, but once you put them together, it looks like a wonderful continuous function. And it seems like it's a complete flow. It's incredible. It looks like, you know, a perfect Gaussian process line, but actually it's made of really small step sizes that, put together, look continuous but actually are not. And I really love that; basically the art is in combining those steps in as fluid and continuous a flow as possible. And I mean, that's really the nerd way of looking at it, but I really love it, and that's why I'm saying that math and science and art are not that far from each other.


Yeah, they're not.


So, well, I mean, we could do the whole episode on that, but that need not be. So let me get back on track with a question I always ask my guests at the beginning. And that's whether you remember how you first got introduced to Bayesian methods, and how frequently you use them today.


Yeah, I think I remember that quite well. So when I was a PhD student in Zurich, I started thinking in a Bayesian way, but I didn't quite know that I was. And then I participated in this workshop in America and I met Bruno Olshausen. And I'm the kind of person who kind of invites himself to parties, and in that case I invited myself to Bruno Olshausen's lab. I was like, Bruno, I think your stuff is super cool. Can I come visit the lab? And he was like, sure you can. And at that meeting, in fact, Horace Barlow was there as well. It was an incredible experience. And in that environment, Bruno kind of convinced me about the utility of thinking in Bayesian terms, and that's when I started being interested in formalizing things. So then later, when it was time to choose where I would go for my postdoc, I joined Daniel Wolpert in London. And there I then got a much more formal training in Bayesian statistics. Zoubin Ghahramani was there, and Peter Dayan, and of course multiple others that really helped me become precise about what I mean by being a Bayesian.


I see, okay. Yeah. So it's really, again, a question of meeting people at a certain time in your career. Yeah, 100%. Yeah. And now, actually, can you tell people, basically, how you would define the work that you're doing nowadays, and also the topics that you are particularly interested in?


Well, right now, it's very hard to know what I'm interested in, and it changes a lot. Right now, I continue being very interested in how brains work and how we could find out how brains work. I'm also very much interested in machine learning and neural networks, specifically as model systems of how brains work. And I'm also, and that is much rarer as a neuroscientist, very much interested in causality: how can we find out that one thing makes another thing happen? But more generally, I'm broadly interested in brains and minds and how they work and how they interact.


Yeah, I mean, that is fascinating. And I've already done a few episodes on that. Recently on the show, episode 77 was with Pascal Wallisch, diving into priors, how people perceive priors, and how that taints, literally, the way they view the world. So of course, we talked about the work he did with Crocs and socks and the famous dress. And episode 81 actually just aired today, with Alan Stocker, where we talked a lot about visual perception and how it relates to priors. So yeah, that series,


You know, Alan Stocker has an office right over there. And we essentially shared an office when I was a PhD student in Zurich. So his desk was maybe 10 meters away from mine.


So I'm getting the whole Zurich neuroscience department on the show, basically. But I'm really happy to have that kind of series. These are topics I recently discovered, and I really like them; it's absolutely fascinating. It helps me understand myself and other people better, so I guess that's why I like these topics so much. Actually, if there is one, you know, big question from your field that you'd like to see the answer to before you die, do you already know? Like, if you could have that wish, what would you say?


So understanding causality, even in a single nervous system, from my perspective would be a big thing. So let me unpack what I'm saying with it. Neuroscience is full of experiments where, say, we put an electrode into a neuron and we show some stimuli. And we can then say how the neuron correlates with the stimuli that we show, or you can even say it's causal if I choose stimuli randomly from some set. I see what's happening there. But the kind of interaction that happens in the brain is one neuron making another neuron fire. We don't know how these neurons making one another fire gives rise to computation and recognition and Bayesian statistics. And our experiments don't get at that, because if I want to know how one neuron influences another neuron, I kind of need to reach in and tap the first neuron, because correlation isn't causation. And so being able to really simulate even a single simple nervous system, from my perspective, would be a really, really big step. And I do hope to see that in my lifetime, and I'm starting to work very hard to convince people that we should be doing that.


Well, I hope this podcast episode is going to help. You can basically send the episode and be like, hey, here's why you should fund that work. Wonderful. So, to start diving a bit more precisely into what you're doing: you're doing a lot of stuff, but let's start with something that you call, I don't know if you call it or if the field calls it, the credit assignment problem. And in particular, you're interested in how the brain solves that problem. So yeah, can you tell us what this is about? I have no idea. Yeah,


It's a really big problem that occurs in learning. So imagine you do something wrong. Like, imagine I do something wrong. That happens a lot. For example, in salsa, let's say my foot is where it shouldn't be and my wife steps on it. I made a mistake. It's clear that I want to learn from that; I don't want to make that mistake again. And with that comes a problem, which is: which changes in Konrad's brain would make him better at that specific problem in the future? So at some level it's like, whose fault was it, and how would we need to change that part of Konrad's brain so that he doesn't step where his wife needs to step at that point in time? So this is a very generic problem, which is: I have an information processing system, my brain; a mistake happens; which changes would have made the mistake not happen? And that's the credit assignment problem. Like, who was it? It's like a detective: whose responsibility was that? And now, you see, this is the place where my machine learning hat meets my neuroscientist hat. So it's a general problem; you can't avoid it. There's some mistake, and changing some pieces could help rectify that mistake. And in machine learning, the way we do it, we use gradient descent, where we're basically saying: we are calculating, for each weight in our system, how much this weight influenced the output, and then we move all the weights in the right direction. So gradient descent, what we do in machine learning, is one way of solving the credit assignment problem. But credit assignment could go a different way. You could say maybe it's no gradient at all; we're just figuring out who's the worst neuron in the brain, and then we're telling that worst neuron in the brain to not do that again. So there are all kinds of algorithms that one could use, but it's an unavoidable problem that any learning system has, including those that are or become Bayesian.
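
To make the gradient-descent version of credit assignment concrete, here is a minimal sketch. It assumes a single linear unit trained on squared error; the function name `credit_assignment_step`, the numbers, and the learning rate are all hypothetical illustrations, not something taken from the conversation or from any particular paper.

```python
import numpy as np

def credit_assignment_step(weights, inputs, target, learning_rate=0.1):
    """One gradient-descent step on squared error for a single linear unit."""
    output = weights @ inputs
    error = output - target
    # d(0.5 * error**2) / d(weight_i) = error * input_i.
    # This is each weight's share of the blame for the mistake.
    blame = error * inputs
    return weights - learning_rate * blame, error

weights = np.array([0.5, -0.2, 0.1])   # hypothetical starting weights
inputs = np.array([1.0, 2.0, 0.0])     # the third input was silent
target = 1.0

errors = []
for _ in range(50):
    weights, error = credit_assignment_step(weights, inputs, target)
    errors.append(abs(error))

print(weights, errors[0], errors[-1])
```

The weight attached to the silent input receives no blame and never changes, while the weight on the strongest input moves the most. That is the sense in which the gradient answers "whose fault was it" for every weight at once.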


Okay, and is that related to, basically, the bias that we tend to have, which is that when we succeed at something, for instance, we attribute the success to our skills, but when we fail at the same task, we attribute the failure not to us lacking the skills but to the circumstances or to someone else?


It's a great question, and I can't answer that. So credit assignment, I can say a lot about, because it's precisely defined at the level of the computing system. How we feel about it is kind of at the level of cognition, if you want, and I'm 100% with you that there are these biases where we believe that if something goes wrong it's someone else's fault, and if it succeeds, it's us doing that. For that, I can easily tell an evolutionary story where you can say that seeing others as being responsible might be good for you. But I don't think I can tell a mechanistic story about it. Like, clearly these cognitive effects are real, but I couldn't say how they come about mechanistically.


Okay, yeah, because I'm guessing you would need to do the experiments for that, if that even would be possible.


Yeah. And we don't even know what to really look for. If you have these biases at the cognitive level, there's an infinite set of potential brains that could implement those biases. And we don't have the experiments at this point in time to really see how the biases come about. And it's possible it's an interaction of everything, in which case there can never be a story that we as humans could actually tell about it.


Yeah, I see. Okay, kind of depressing. And so, I think you kind of said it in defining the credit assignment problem, how it relates to how we should assign credit in the real world. But I think it's also related quite a lot to causality, right? Quite directly, actually. So can you tell us more about that, and maybe whether you are doing some work around that in your lab?


Yeah. So let's talk about the relationship there. So if you're wrong, it means that something doesn't go as well as it could, and then you're adapting something inside. What you want is that the changes that you make in the brain ultimately give rise to better behavior in the future. So this is a causal question: which change would make that thing happen? Now, what we do in machine learning is we take the inside of the brain, if you want, the neural network. The neural network itself is continuous, so we can calculate everything continuously and we can do gradient descent. The outside world is often discrete: I do step on someone's toes or I don't step on someone's toes. So the outside world isn't differentiable in the way the inside world is. So usually we then have these systems that kind of have a special thing on the inside and a special thing on the outside. But in either case, what we want to estimate is how a change will make future things be better. We generally assume that the future will be just like the present. So then it becomes a question of, if I do something wrong, which pieces in my brain gave rise to it, so that I can make the changes so that it will not happen again if I were confronted again with the same situation?


I see. Yes. So that's definitely some causal language here. And, like, what's the current state of the work here that you're doing in your lab?


Yeah, so we ask a lot how brains can, in a way, figure out causality. One approach is, for example, you can say: if I spike, how do you find out what the causal effect of the spike is? So, like, I dance salsa, I have an extra spike in some neuron. Does that make me better or worse? This is very difficult to calculate. But you can still say, well, what would be the strategies that we could have there? So let's say the overall reward, like how happy I am when I'm dancing, maybe it's represented by dopamine or something. So I dance. One possibility for how I could learn to do better is I could say the neurons in my brain do experiments: sometimes they produce a few extra spikes, sometimes they produce a few fewer spikes. And then they see whether, when they produce extra spikes in that situation, it leads to more reward, and if they produce fewer, whether it leads to less reward. And ultimately, if a neuron's activity in that situation is correlated with reward, it means that it should be more active in that situation. And then you can say that this way neurons could ultimately learn to do the right thing, and there's a paper by Ila Fiete and Sebastian Seung that proposes that songbird learning could be working like that. But the problem is you need to have lots of noise in the system to find out what works and what doesn't work. So what we did is we said, well, that's actually not true, because you can say: if I compare times where the neuron spikes and times where the neuron doesn't spike, those could be very different. Like, a neuron often doesn't spike maybe while I sleep, and comparing those two doesn't even make sense. But what you can do is you can say, well, instead of comparing spiking with non-spiking, let's compare the times where the neuron almost spiked and the times where the neuron just barely spiked.
You can say, if the neuron is very far from the threshold, let's just not learn at all. And in that case, you can say that if we're very close to the threshold, it's as if it's random. And if it's random, then we can find out what's good and what's not good. So, together with Ben Lansdell, we wrote a paper that I think just came out in PLOS Computational Biology that then proposes how neurons could actually be doing that. And what's interesting is it leads you down a route where things are relatively biologically realistic. All you need is that if you're really far from the threshold, you don't have plasticity. But guess what: you're hyperpolarized then, and therefore there will be very little calcium in the cell, and you just don't have any plasticity if you don't have any calcium at all. So it's reasonably compatible with that. So what we have with that is a learning rule that, in a way, combines causal inference and gradient descent along with realistic biophysical properties.
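
The "almost spiked versus barely spiked" comparison can be illustrated with a toy simulation. This is only a sketch under strong assumptions, and it is not the model from the paper mentioned above: the neuron is reduced to a threshold on a noisy drive, a shared input confounds spiking and reward, and every name and number (`true_effect`, `confound_gain`, `window`) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials = 200_000
threshold = 0.0
true_effect = 1.0      # hypothetical causal effect of one spike on reward
confound_gain = 2.0    # hypothetical direct effect of the shared input on reward

shared_input = rng.normal(size=n_trials)    # drives the neuron AND the reward
private_noise = rng.normal(size=n_trials)
drive = shared_input + private_noise        # stand-in for the membrane potential
spiked = drive > threshold

reward = (true_effect * spiked
          + confound_gain * shared_input
          + rng.normal(size=n_trials))

# Naive comparison: all spiking trials versus all non-spiking trials.
naive_estimate = reward[spiked].mean() - reward[~spiked].mean()

# Near-threshold comparison: barely spiked versus almost spiked.
window = 0.1
barely_spiked = spiked & (drive < threshold + window)
almost_spiked = ~spiked & (drive > threshold - window)
near_threshold_estimate = (reward[barely_spiked].mean()
                           - reward[almost_spiked].mean())

print(f"true effect:             {true_effect:.2f}")
print(f"naive estimate:          {naive_estimate:.2f}")
print(f"near-threshold estimate: {near_threshold_estimate:.2f}")
```

In this toy setting the naive difference is inflated because spiking trials also tend to have a large shared input, whereas trials just above and just below the threshold have nearly the same input, so their reward difference stays close to the effect of the spike itself. A narrower window reduces the remaining bias at the cost of using fewer trials.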


I see. And why would gradient descent help you here more than another method of modeling?


So gradient descent is a method that tells you what the causal influence of every internal variable on the output is, and then it does learning in that direction. So why is gradient descent helpful? You can say, if I want to learn anything, I need to change my brain. I need to change my weights, if you want. Now, there's an infinite set of possible ways of doing that. I could find the neuron that is most important and only adapt that neuron. I could adapt all neurons by the same amount. Or I could have every neuron go down its gradient. So what's special about gradient descent? It turns out that if I want to produce a certain fixed amount of improvement, of all learning rules that I could have, gradient descent is the one that gets me that fixed amount of getting better with minimal change otherwise. Now you can say, well, but why would we care about other changes? Well, if we change a lot of properties of the brain that we don't need to change, by doing something that's very different from gradient descent, then it's like we introduce noise in the brain. Introducing noise will have the effect that it makes you worse at other things. Being worse at other things is a problem because it's interference between different tasks. And so the thing that is special, why I like gradient descent as a way of thinking about brains, is that gradient descent is, of all learning rules, the one that least messes up our brain when we learn new things.
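
The "fixed improvement with minimal change" claim can be written out for small steps. This is a sketch under assumptions that are not stated in the conversation: a differentiable loss $L(w)$, a first-order (small-step) approximation of the improvement, and Euclidean length as the measure of how much the weights change. The symbols $L$, $w$, $\Delta w$, $\varepsilon$ and $\lambda$ are introduced here for illustration only.

```latex
\begin{aligned}
&\min_{\Delta w}\ \tfrac{1}{2}\lVert \Delta w \rVert^2
\quad \text{subject to} \quad \nabla L(w)^\top \Delta w = -\varepsilon \\
&\mathcal{J}(\Delta w, \lambda) = \tfrac{1}{2}\lVert \Delta w \rVert^2
  + \lambda\left(\nabla L(w)^\top \Delta w + \varepsilon\right) \\
&\frac{\partial \mathcal{J}}{\partial \Delta w} = \Delta w + \lambda \nabla L(w) = 0
\;\Rightarrow\; \Delta w = -\lambda \nabla L(w) \\
&\nabla L(w)^\top \Delta w = -\lambda \lVert \nabla L(w) \rVert^2 = -\varepsilon
\;\Rightarrow\; \lambda = \frac{\varepsilon}{\lVert \nabla L(w) \rVert^2} \\
&\Delta w^\star = -\frac{\varepsilon}{\lVert \nabla L(w) \rVert^2}\, \nabla L(w)
\end{aligned}
```

So, to first order, the smallest weight change that buys an improvement of $\varepsilon$ points along the negative gradient. Any other update with the same improvement has an extra component orthogonal to the gradient, which changes the weights without reducing the loss.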


Okay, I see. And so I'm curious about the technical side of those models now. What do those kinds of models look like? Are those generative Bayesian models? Are they more akin to deep learning models? Yeah, can you tell us a bit more about that?


Yes. It's complicated. So in general, that class of models that focuses on credit assignment is generally related to deep learning. Now, these deep learning models relate in two different ways to Bayesian statistics. One of them is that you can say what happens internally in both schools can be viewed as being something Bayesian, where you can say the neurons maybe estimate some weights, we have some prior knowledge about the weights, and we can think about it in those terms. But the other one is, if you have a system that learns with gradient descent, say a neural network, it becomes indistinguishable from a Bayesian system as soon as you've had enough training data. Now, why? The best, most efficient behavior in most situations is Bayesian. That means that if I give you enough training data, you will learn to perfectly approximate a Bayesian system. And there's work from multiple groups that looked at that. And it turns out that it's very easy to learn to be Bayesian. Let's say you don't need to be born Bayesian. Because you're in a world characterized by uncertainty, as long as you learn from feedback with credit assignment, you will eventually become Bayesian; your behavior will look Bayesian. Which leads us to a misinterpretation of Bayesian statistics, often by Bayesian psychophysics people. So in neuroscience there's this branch of people that focus on Bayesian ideas: can we understand brains from Bayesian perspectives? What they look at is behavior that shows that human behavior is Bayesian, and I have contributed a lot to that field. All kinds of human behaviors approximate Bayes, the optimal Bayesian estimation, in a really good way.
And they say, oh, but if behavior is Bayesian, then shouldn't the brain kind of have Bayes built in at some level? But if it's easy for the learning system to learn Bayesian statistics, it means that it doesn't need to be built in; it's maybe so easy to learn that there's no point in even building it in.


Yeah, I mean, yeah, because you could also see Bayesian thinking as kind of an emergent phenomenon. So it doesn't have to be in the brain to actually be used. Like, if basically what you need is the ability to take in data and change your mind based on that new data, well, then you don't need the actual ability of knowing how to derive things from Bayes' formula to do it, right? You have the building blocks already and you just need to


Yeah, that's right. Now, let's think through what's really happening. Let's look at Bayesian cue combination. You can say I have some prior Gaussian somewhere in space. That's what I did first, for some of my postdoc work. I expect, say, maybe my hand to be in some location, or a ball to be in some location, or something like that. So I have a prior Gaussian. I have an observation: I see the ball fly, or I see my hand in the periphery, or something like that. It has some uncertainty; the likelihood function probably will look like a Gaussian. Now, what's the effect? The effect is that the optimal behavior that minimizes variance, or whatever metric you want to have there, is going to be a combination of my prior belief about the location and my observation, and it will rely more on a signal the less uncertain that signal is. So if I see my hand really well, maybe my prior isn't very important. If I know very well where my hand is, then maybe my vision isn't important, and you can interpolate between them. Now, we're talking about a linear combination of two things, and we're talking about a linear combination where the importance that you give each one will depend on those two relevant uncertainties. That is a very simple function. In fact, a very simple function that has two parameters I can directly learn, so it's not a hard problem to learn.
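
The linear combination described here is the standard result for a Gaussian prior combined with a Gaussian likelihood. A minimal sketch follows; the function name `combine_gaussian_cues` and the example numbers are made up for illustration.

```python
def combine_gaussian_cues(prior_mean, prior_var, obs, obs_var):
    """Posterior for a Gaussian prior combined with one Gaussian observation."""
    # The observation gets more weight the less uncertain it is
    # relative to the prior.
    weight_on_obs = prior_var / (prior_var + obs_var)
    posterior_mean = weight_on_obs * obs + (1 - weight_on_obs) * prior_mean
    posterior_var = prior_var * obs_var / (prior_var + obs_var)
    return posterior_mean, posterior_var, weight_on_obs

# Equal uncertainty: the estimate lands halfway between prior and observation.
print(combine_gaussian_cues(prior_mean=0.0, prior_var=1.0, obs=2.0, obs_var=1.0))

# Sharp vision: the estimate follows the observation almost entirely.
print(combine_gaussian_cues(prior_mean=0.0, prior_var=1.0, obs=2.0, obs_var=0.01))

# Very blurry vision: the estimate stays close to the prior.
print(combine_gaussian_cues(prior_mean=0.0, prior_var=1.0, obs=2.0, obs_var=100.0))
```

For a fixed pair of uncertainties, the whole computation is a gain on the observation plus an offset, which is why it comes down to two learnable parameters.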


Yeah, yeah, that's super interesting. And it's a bit like physics: it can get quite philosophical quite fast when you go to the frontier of that kind of science. But


It's empirically not. Like, I can build a neural network with like 10 neurons, and it will do perfect cue combination, Bayesian cue combination. You can, like, run the same experiment on that little neural network that's just been trained to do good estimates, and it will, for all practical purposes, look like you or me when I do the psychophysics experiment.
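
Here is a rough sketch of that point, simplified even further than a 10-neuron network: a single linear unit (a gain and an offset) trained by gradient descent on feedback alone, with no prior or likelihood written into it. The task statistics and variable names are hypothetical, and full-batch gradient descent is used only to keep the example deterministic.

```python
import numpy as np

rng = np.random.default_rng(1)
prior_mean, prior_sd, obs_sd = 1.0, 1.0, 0.5   # hypothetical task statistics
n_samples = 20_000

# The world: true positions drawn from the prior, seen through noisy vision.
true_position = rng.normal(prior_mean, prior_sd, size=n_samples)
observation = true_position + rng.normal(0.0, obs_sd, size=n_samples)

# The learner only ever sees observations and feedback about its error.
gain, offset = 0.0, 0.0
learning_rate = 0.1
for _ in range(2000):
    estimate = gain * observation + offset
    error = estimate - true_position
    # Gradient of half the mean squared error with respect to each parameter.
    gain -= learning_rate * np.mean(error * observation)
    offset -= learning_rate * np.mean(error)

# What the Bayesian cue-combination formula prescribes for this task.
bayes_gain = prior_sd**2 / (prior_sd**2 + obs_sd**2)
bayes_offset = (1 - bayes_gain) * prior_mean

print(f"learned gain {gain:.3f}, Bayesian gain {bayes_gain:.3f}")
print(f"learned offset {offset:.3f}, Bayesian offset {bayes_offset:.3f}")
```

The learned gain and offset end up close to the Bayesian values even though nothing Bayesian was built in, because minimizing squared error over a linear estimator has the same solution as the posterior mean for this Gaussian task.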


Yeah, and if it's the same data, then like the conclusion should be the same all the time, right? Even though you start with different neural networks at the beginning and the end. The conclusion should be the same.


That's right. Now, the big difference is that when we build a Bayesian system, we build in knowledge about the world. Like, if I write, say, a cue combination paper, I say, well, look, I believe that there's Gaussian noise on this, and I believe there's Gaussian noise on that. And now we build a normative model, we compare the normative model's performance with human behavior, they're very similar, and we write a paper about that. But the alternative is, if my description of the world is actually right, a neural network will learn to do absolutely the same thing, because that's the right solution for that. So the difference is rather that in one case it lives in the data, and in the other case it lives in us thinking. If we're Bayesian, then it happens in our head: we think about the world in a certain way, and we build that into our model, and then the rest is just math. In deep learning, what we do is rather, instead of thinking so much about the world, we're like, here's lots of data; it still contains the same information. It will need much more data, but it will come up with the same solution if that is the right solution. And out of that, you can also see the failure modes of Bayesian thinking. If I make the world more complicated, I can be like, okay Alex, how do you think about me localizing a ball in space? And you'll be like, yeah, let's come up with some Gaussian and so forth. And then I'll be like, how do you answer a question about how I should grab my bottle here? Now that is much more complicated. You'll be like, okay, where's the Gaussian here? There's a structure, there's a bottle. You'd be using your prior knowledge about bottles, and you might be like, okay, here's a data set of bottles, but I'll be like, well, but it could maybe spill some water, it could be something else.
In the deep learning way of thinking, you'd just throw a lot of data at it. Whereas as a Bayesian, you're kind of forced to write down things, and the writing down of things gets to be very complicated in a complicated world. Yeah.


Yeah, for sure, it can get pretty complicated pretty fast. So one thing I like and find really interesting in the work you folks are doing in your lab is that you have that kind of micro-level studies where you use electrophysiological data to study what neurons do. So I think that's the example you just talked about, right? And then you have these more macro-level kinds of studies where you try to explain human behavior, which, as you're describing on your lab's website, is basically studying what all neurons do together. Yeah, and that's always fascinating to me, because in a way it's really weird: these neurons, at least from what we know, don't have a consciousness of existing, but then if you put neurons together, at least for Homo sapiens, that macro level has a consciousness of existing. So to me it's super weird. But that's kind of an aside, because I don't want to torture you too much with this question. Basically, can you take an example to illustrate the kind of study that you're doing more at the macro level, and maybe relate it to the micro-level kind of things?


Okay, there's a big gap in between, but it relates to the discussion we had earlier. So let's break down the macro and the micro. At the macro level, I look at you doing something and I can apply Bayesian statistics. The worse Alex's vision is, the more he should rely on what he feels and the less on what he sees. And we can play that game: I'm going to give you some glasses that distort your vision a little bit, and you'll rely less on vision. I give you a set that is really very blurry, and you'll be using vision only a tiny bit. And I can make you blind, and you'll rely entirely on your prior. Okay, that's the first thing. The logic of the macro part is wonderfully clean. It's of the following nature: here's a problem that humans have to solve; here's how the optimal solution to that problem would look. Now let's compare the optimal solution to how humans do it. If we find them to be similar, we conclude that humans solve the problem that occurs in the real world in a good way. Okay, wonderful. I'm very happy with that logic. And in that logic, if you listen to it, there's no commitment to how the brain does it. Nothing. For all we know, if I wear that macro hat, it could be that it happens by you literally doing equations in your head, or it could happen by divine intervention, or it could happen through the interactions of lots of neurons. Logically, I'm not making a commitment to any of them. So that's the macroscopic level. And then there's the microscopic level, where you might want to ask: well, how do neurons interact to make those things happen? A lot of the clean thought in Bayesian statistics is at that macro level. The macro level's logic is usually super clean. And I remember an experience that I had with the late David Knill. I think we had dinner together.
David Knill was one of the really, really great Bayesian psychophysics people, who had tremendous influence on me and the entire field. And I remember going to dinner with him. At that point in time I was very much into the micro approach: how do neurons do it? And he explained to me: look, the micro approach is really complicated and maybe impossible. You want to be clear about which hat you're wearing. If you wear the macro hat, the normative thinking hat, you don't need to subscribe to any specific view of how the brain works. You're just saying: that's a real-world problem, there's an evolutionary or learning reason why you should be good at that problem, so let's compare being good at the problem with how you actually do it. It's the cleanest modeling framework that exists in human behavior or brain science. So that's the macro view. And a lot of people in that area will say the macro view is logically possible and the micro view is maybe very far away from it, or impossible. So let's talk about the micro view. The macro view asks: what's the problem, what does the optimal solution look like, let's compare it with people. The micro view is rather: I have neurons in my eye, and those neurons send signals to the lateral geniculate nucleus, and they do so in a somewhat complicated way. The LGN communicates with primary visual cortex, again in a way that might be complicated. Primary visual cortex goes back to the LGN, but also goes forward to, maybe, area V2. There are millions upon millions of neurons in primary visual cortex, and they do incredibly nonlinear, interesting things. Every paper on V1 discovers new interesting things happening in V1. Do we understand how they jointly produce the fact that we successfully move our hand to grab things? No, we have no idea how that works.
So in that bottom-up view, the micro approach as you'd call it, there's this bridging problem. I can tell you a lot about neurons in the retina, and I can tell you a lot about muscle cells, but I have a really, really big difficulty with what happens in between. Now, let's make the link to Bayesian statistics as tight as we can. I should mention here that, together with Lee Miller, we did a lot of experiments, and got money from NIH, to basically ask what happens in the brain while we deal with uncertainty, while we do Bayesian things. And what we found is that lots of things happen and none of them is simple. Take the micro view in, let's say, the years 2000 to 2002. Back then, a lot of people were interested in how humans can do Bayesian things, and they came up with simple ideas: we just need a small number of rules, and that's going to allow the brain to do Bayesian statistics. It's just that the brain does Bayesian statistics, which we know from behavior, but it doesn't do it in the kind of simple way that we as neuroscientists want to have. Therefore, all of the Bayesian ideas of how the brain could be doing it kind of died over the last decade or so. It was very, very popular. If you went to Cosyne, there were lots of papers in that area. They just don't tend to have stood the test of time. People were like, yeah, the brain could be doing Bayesian statistics this way, or Bayesian statistics that way. It's just that the brain somehow does it, but not in one of those ways. You can say that the same criticism is, of course, also true for credit assignment ways of thinking. Now, there's one big difference, which is that this idea of learning by taking the parameters of the system and finding ways of improving them is the only idea that has ever worked in machine learning. So I feel much more positive about that
than about, say, the brain doing Bayesian integration via a probabilistic population code or a sampling code.
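The macro-level prediction from the glasses example above (the blurrier the vision, the less it should count) has a closed form when both the prior and the visual likelihood are assumed Gaussian. The sketch below only illustrates that textbook formula; the function name and the noise values are illustrative assumptions, not taken from any experiment mentioned in the episode.

```python
import math


def vision_weight(sigma_vision: float, sigma_prior: float) -> float:
    """Weight on what is seen in the posterior mean (Gaussian prior and likelihood)."""
    return sigma_prior**2 / (sigma_prior**2 + sigma_vision**2)


def posterior_mean(seen: float, prior_mean: float,
                   sigma_vision: float, sigma_prior: float) -> float:
    """Optimal estimate: a reliability-weighted average of observation and prior."""
    w = vision_weight(sigma_vision, sigma_prior)
    return w * seen + (1 - w) * prior_mean


# Progressively worse "glasses": sharp, slightly distorting, very blurry, blind.
for sigma_vision in (0.5, 1.0, 3.0, math.inf):
    w = vision_weight(sigma_vision, sigma_prior=1.0)
    estimate = posterior_mean(seen=2.0, prior_mean=0.0,
                              sigma_vision=sigma_vision, sigma_prior=1.0)
    print(f"visual noise {sigma_vision:>4}: weight on vision {w:.2f}, estimate {estimate:.2f}")
```

Nothing in this calculation says how neurons would implement it, which is the point of the macro hat: the prediction can be compared with behavior without any commitment to mechanism.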


Yeah, thanks for this long and detailed answer. Lots of things to think about. In a way, it's good that it's not that easy to uncover, right? Otherwise I guess it would be boring. But I'm guessing it can get quite complicated. So a more general question I have here is: how do you keep track of all of that, all that moving science, and not get bogged down in either too narrow a view or a kind of nihilistic view, where you would think, oh, that's too hard, we're not going to learn anything actionable about how the brain works?


So let's produce a level of clarity about what we mean by that. What do you mean by "understand how the brain works"? Because that simple sentence is hiding a lot of the things that are really important. In your sense, what does it mean for us to understand how the brain works?


Yeah, and it's also so weird to ask that question, because your brain is asking the question. It feels like I'm asking the question and answering it at the same time.


So let's dig a little bit. Almost every neuroscientist will say that they're studying how the brain works. If you push neuroscientists on what it means for them to understand how the brain works, you're mostly drawing a blank, because we're not asking ourselves that question. So let me highlight how I mean different things depending on which hat I wear. Let me wear the macroscopic Bayesian hat. If I wear that hat, my answer to how the brain works is: the brain solves the problems that the world gives it in a good way. My question is, what is the set of things for which it does it in a good way, and what's the set of things for which it doesn't? So in that sense I ask how the brain works, but I don't ask the "how" in the sense of what mechanisms give rise to it. I'm asking "how" in terms of the computation: does it do the optimal thing? That's what I do if I wear the macroscopic Bayesian hat. There's another view, which asks how the brain computes this. You could say it goes from the retina, it goes to V1, it goes on from there: how does the local computation give rise to what the macroscopic person describes? Now, that version of "how does the brain work" doesn't necessarily have an answer, and I like to be clear about that. Why is it possible that there's no answer? Well, the brain has 10 to the 15 parameters. It's like 10 to the 10 neurons that interact with one another. If I could give you all those 10 to the 15 weights, I'd be like, Alex, here are the 10 to the 15 weights that define Konrad, and I give you the full simulation. I give you a big hard drive, and it simulates Konrad, and it talks like Konrad and makes the same bad jokes. It loves dancing salsa, or at least pretends to love dancing salsa. What would you do with it? You would have no idea, because what would you do with those 10 to the 15 parameters? Now, you can simulate it.
You can do all the experiments that you would want to do on a human, to ask if they're Bayesian, on that simulation, but it doesn't produce an understanding. And in fact it's possible that there is no strong compression of that, where you could say: yeah, I can make it 10 times smaller, now I have 10 to the 14 parameters, and it's 99.9% as good as Konrad; now we can get another factor of 100, and it's 90% similar to Konrad. And even that is not really an understanding. So it's possible that the micro understanding framework will entirely fail. Now, if the micro understanding part fails, it's still possible that we can understand learning. Let's say it's possible that credit assignment is somewhat simple, for example because it does gradient descent or something like that, something complicated but not super complicated. If you have a simple learning system, after learning it will still be very complicated, because the world around us is complicated. Now, our listeners can't see it, but you can see that I have blue hair. After our interview, Alex, the fact that I have blue hair is going to be smeared over millions of synapses in your brain. How can you describe it afterwards? Even if what you do is something very simple, maybe you're just doing Hebbian learning, the result of Hebbian learning in a world that is full of arbitrary stuff is arbitrarily complicated. So if I'm into credit assignment, I might say: look, I don't know how Konrad's final brain works with its 10 to the 15 parameters, because he's seen so many totally arbitrary things. I can't compress it, because Konrad's brain is full of perfectly arbitrary, weird things that no one has any idea how they got there, because to know that I'd basically need the whole video of how Konrad got here.
But it's still possible that the learning mechanism is very simple. That's why Blake Richards and myself are pushing this credit-assignment-centric view, where we say: look, we're kind of convincing ourselves that the brain after learning is too complicated for us to understand, but the way we learn might still be possible to understand. Now you can ask, is this a defeatist attitude? No. When we ask the question of how the brain works, we do that with a certain set of assumptions, and we go with a definition of what we mean by it. It's possible that you can't understand the brain after learning, but you can understand learning. It's also possible that you can't understand learning either, because in reality you don't just learn, you learn to learn, and through learning to learn, the way you learn now is also arbitrarily complicated. I'm not saying that we definitely have...


Yeah, it's super fascinating, and thanks for those clarifications. Time is running by, and there is a topic I want to talk about with you, because all these discussions about the brain are making me think about brain diseases, things like Parkinson's and Alzheimer's, into which we don't have a lot of insight, from what I understand. I don't know if you are personally working on that, but I know that you are working on biomedical research, and of course you're trying to do some causal inference there, because that's the name of the game. So can you take an example of how you do that in that field?


So wearing my medical machine learning hat, I'm again a very different person. In the things we've talked about so far, I'm trying to understand, I'm trying to see the logic of how things work. In other work, I wear a pure engineer's hat. Say I want to give people who have lost an arm a prosthetic arm that is really agile, that works great. For that, I need to be able to go into brains and tell what they want to do as a function of time. And of course we're doing experiments on monkeys, because the human experiments are just way more complicated. If I wear that engineer hat, I don't need to understand how the monkey's brain works. All I need to do is figure out what the monkey wants to do at every point in time. It's a very different problem that I'm trying to solve at that point. And for that, there's no commitment to how it works; that's just a distraction. I just want to get out the information, do good machine learning, and then use it to maybe control a prosthetic device.


Oh, I see. Okay. And do you have an example for us, and maybe something we can put in the show notes for people who want to dig deeper on that front? Because that's fascinating.


Yeah, and I can give you a Bayesian and a non-Bayesian version. So we have a package that's sitting on GitHub. That package uses just modern machine learning, recurrent neural networks and things like that. We should build transformers into it at some point, but it's from a few years ago. You could call it street-fighting machine learning: you're not trying to understand anything, you're just trying to get a really good prediction of what the monkey wants to do as a function of time. It's a nice GitHub package called Neural Decoding, and a lot of labs use it for prediction questions. And what do we take from there? It really helps to do proper modern machine learning. It works much better than the linear decoding that much of the field did in the past. So that's one thing. Here's a completely different one.


Definitely. If you can put that in the show notes, I'm sure listeners will be happy to check it out.


Absolutely. And now let's talk about a Bayesian version of that. This comes from a project where we said: you move your hand through space, and if I have a decoder, I want to use the kinds of ways you move through space as a prior and the data we get from neurons as a likelihood. I want to combine a model of how you move with the neural data. The paper is called mixture of trajectory models, where we basically say there are certain movements with certain statistics, so we have prior knowledge about those, but we also have neural data, and we want to use the neural data to find out which trajectory the animal wants to do at that point in time. So in that sense we have a prior that comes from just a lot of movement data and a likelihood that comes from the neural data. And guess what? The Bayesian approach is much worse than the non-Bayesian approach. Why is that the case? Because with the other approach, we can use a lot of data and use the data effectively. Do you know what we call the bitter lesson in machine learning?
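The structure described above, a prior over movements taken from movement data combined with a likelihood from the neural data, can be sketched with a few candidate trajectories. This is a toy illustration of the idea only, not the model from the paper: the three templates, the prior probabilities, the Gaussian readout noise and the function name are all assumptions made for the example.

```python
import numpy as np

# Hypothetical candidate reaches: 1-D hand position over five time steps.
templates = np.array([
    [0.0, 1.0, 2.0, 3.0, 4.0],      # reach right
    [0.0, -1.0, -2.0, -3.0, -4.0],  # reach left
    [0.0, 0.0, 0.0, 0.0, 0.0],      # stay put
])

# Prior: how often each movement shows up in previously recorded movement
# data (made-up numbers).
prior = np.array([0.6, 0.3, 0.1])


def posterior_over_trajectories(neural_readout, noise_sd):
    """Bayes' rule over the templates, assuming Gaussian noise on the readout."""
    squared_error = ((neural_readout - templates) ** 2).sum(axis=1)
    log_posterior = np.log(prior) - 0.5 * squared_error / noise_sd**2
    log_posterior -= log_posterior.max()  # numerical stability before exp
    posterior = np.exp(log_posterior)
    return posterior / posterior.sum()


# A position readout decoded from neurons (assumed values), close to "stay put".
readout = np.array([0.2, -0.3, 0.4, 0.1, -0.2])

# Very noisy neural data: the posterior stays close to the movement prior.
print(posterior_over_trajectories(readout, noise_sd=100.0))
# Reliable neural data: the likelihood overrides the prior.
print(posterior_over_trajectories(readout, noise_sd=0.5))
```

Every line of the prior and the likelihood here is something a person had to write down, which is exactly the kind of hand-built structure the bitter lesson argument is about.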


The bitter lesson?


The bitter lesson, as in it's not sweet, it's bitter. It's the biggest generalizing finding I know of in machine learning, which is this. You give me a new problem, let's say decoding hand movements based on monkey data. In any new problem, when we start, simple algorithms like linear regression will do badly. And then, as humans, we're scientists. I get paid to be a scientist. I come up with ideas, and I build them, and my students do it, and they're brilliant, and it improves things. So we go from a simple model, we build in all those clever things, and we do better. Then time passes. Computers get bigger, we get more data, we get data from enough monkeys. And in the end, a general purpose algorithm, not that unlike the linear regression we might have started with, wins: once we have enough data, that data with a general purpose learning algorithm will work better than anything I build. Bayesian statistics, for me, as a field, is very much associated with people saying "here's my model". That means they're always at that middle stage. In machine learning, we find that whatever we solve by human cleverness will eventually be killed by general purpose algorithms, which means that all our brilliance is in the end in vain, because in the end enough data wins out against us being clever. And so the bitter lesson, as it's called, is the deepest meta-insight that exists in the machine learning field. Let's briefly talk about where that's coming from. Let's talk about cue combination. There's a ball, I want to localize it. I have a prior, I assume it's a Gaussian, and I have a likelihood function, I assume it's a Gaussian, and I model each of them. But guess what, in reality it's not quite a Gaussian. Now you can say, okay, let's fix that. Let's give it four parameters.
It's a bit like a Gaussian with some kurtosis, and we stay Bayesians. Well, even there, if you give me enough data, that's not true. And on top of it, I assume that the prior and the likelihood are independent. In reality, the way the ball looks changes where I should be expecting the ball, and all of a sudden it falls apart. All the assumptions we make are, at first, approximations that are really quite good, and then if you keep digging, they're not. Which means that if you give me enough data, I can basically get all those things right. And I can get the things right that you as a human can't even formulate. You might miss certain things. If I asked you, okay Alex, let's write down everything you know about baseballs, you'll write down a few things, and you don't know that getting slightly dirty on the ground slightly changes the way the ball flies through the air. That is ultimately something that can be used and wouldn't be otherwise. So this way of thinking cleverly gives us understanding, which is wonderful, but it also gives us a failure mode, which is that our understanding will never be complete. And if you give me enough data, I can use the data to build something that is better.


Yeah, I see. And the question is always: how much data is enough data to get to that state? I'm a bit surprised that you can get so much data, actually. That's something I wasn't expecting from that field.


Well, on almost all things you can progressively get unlimited data. Take perception. In the past we could maybe use a single video; now we can use a million hours of video. And of course we're better now, because a million hours of video is a lot of data, a lot more than a human being can write into even a very, very clever model. And that is happening in all kinds of domains. Take baseball: we now know the pitches and bats of every professional baseball game played in the entire world. And there's a fun Bayesian paper by Justin from the lab, who nicely shows that baseball players can be well modeled as Bayesian decision makers. But guess what, the players will also watch lots of videos so that they have good priors, and they maybe do something much more like machine learning, where they're like: okay, let's look at a million videos of what happened there.


And yes, that's actually fascinating. I recently read a book by, I think it's David Epstein, The Sports Gene. I'm going to put that in the show notes. The book is a bit old now, I think it's from 2013, so I'm guessing the science has evolved quite a bit. But in The Sports Gene he talks about baseball, and how Major League Baseball batters could be confounded if you throw them a softball instead of a classic baseball: their priors are so ingrained and so tied to the baseball that they would just perform like normal players when facing a softball. So that's very cool. The book is really fascinating and really well written, extremely good scientific journalism. I wish it was always the case. If you can also put that paper in the show notes for the listeners, that'd be awesome. Okay, wonderful. So, before letting you go, because you need to go in a few minutes, you told me, right? How long do you still have?


Maybe we should wrap up in five minutes or so.


Perfect. I think we've covered quite a lot. You've already been very generous with your time, and I could still talk with you for another hour, it's so fascinating, but we have to wrap up because you're a busy man. So, before the last two questions I ask every guest at the end of the show: is there a topic I didn't ask you about that you'd like to mention?


No, I think we talked about all the relevant topics. It was a fun discussion.


Okay, cool. Awesome. Now is the time for the last two questions, which I ask every guest at the end of the show. First one: if you had unlimited time and resources, which problem would you try to solve?


If I had unlimited time and resources, I would start figuring out how, mechanistically, nervous systems work, by doing large-scale perturbations and finding out how every input of every neuron influences the output of that neuron.


I'm not surprised by your answer. I could guess from your passion that it would be something related to what you already do.


Actually, let me briefly interrupt you, because this is a message that I think is very important for everyone to hear. As scientists, we kind of leave a lot of money on the table, if you want, because we could be in industry, we could be producing products, but we decide not to do that because we're so curious. We want to know how the world works. And what that means is that at some level we need to work on those things. What we get for staying academics is more freedom, and that freedom is useless unless we work on the things that we really want to work on, the exciting things. Scientists focus so much on the things that are important for careers, and I get it, it's important to survive and have a career and all that. But the careers are secondary. The careers are the things that we have so that we can ask the questions about the universe that we really want to ask. I think every scientist needs to rethink on a regular basis: is the thing they're working on really the thing they should be working on, the thing that would be so cool and ultimately make a big difference? So of course I'm working on the things that I find most interesting right now, because otherwise I would be doing the wrong thing. Then I should work for Google instead, or something like that.


Yeah, thanks for this. For sure, and I'm sure listeners have heard your passion and dedication. So, last question, and then it's going to be time to close up. If you could have dinner with any great scientific mind, dead, alive or fictional, who would it be?


That is very, very difficult. I don't know. I like engineering, so maybe Iron Man. But I also like society, so maybe Jeremy Bentham. Jeremy Bentham would be a wonderful choice. He was very influential in thinking about welfare and utilities, and I think Bentham and I would have wonderful discussions on how society should be structured. I think Bentham would be a pretty good guest for me.


Yeah, love it. Let's have a nice dinner between the four of us, definitely. Cool. So let's do a fake goodbye now, and then we'll stop the recording and I'll tell you what to do with the Audacity track. Okay?


Okay, wonderful. Okay, fake goodbye.


Well, thanks a lot, Konrad. That was absolutely awesome. Thank you, thanks for having me. As usual, I'll put resources, a link to your website, and the different studies we've talked about in the show notes for those who want to dig deeper. Thank you again, Konrad, for taking the time and being on the show. Great, thank you.


