#71 Artificial Intelligence, Deepmind & Social Change, with Julien Cornebise

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

This episode will show you different sides of the tech world. The one where you research and apply algorithms, where you get super excited about image recognition and AI-generated art. And the one where you support social change actors — aka the “AI for Good” movement.

My guest for this episode is, quite naturally, Julien Cornebise. Julien is an Honorary Associate Professor at UCL. He was an early researcher at DeepMind where he designed its early algorithms. He then worked as a Director of Research at ElementAI, where he built and led the London office and “AI for Good” unit.

After his theoretical work on Bayesian methods, he had the privilege to work with the NHS to diagnose eye diseases; with Amnesty International to quantify abuse on Twitter and find destroyed villages in Darfur; with Forensic Architecture to identify teargas canisters used against civilians.

Other than that, Julien is an avid reader, and loves dark humor and picking up his son from school at the “hour of the daddies and the mommies”.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, Adam Bartonicek, William Benton, James Ahloy, Robin Taylor, Thomas Wiecki, Chad Scherrer, Nathaniel Neitzke, Zwelithini Tunyiswa, Elea McDonnell Feit, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Joshua Duncan, Ian Moran, Paul Oreto, Colin Caprani, George Ho, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Raul Maldonado, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Luis Iberico, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Aaron Jones, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, David Haas, Robert Yolken, Or Duek, Pavel Dusek and Paul Cox.

Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag 😉

Links from the show:

Julien’s website: https://cornebise.com/julien/
Julien on Twitter: https://twitter.com/JCornebise
Julien on LinkedIn: https://www.linkedin.com/in/juliencornebise/
Julien on Scholar: https://scholar.google.co.uk/citations?user=6fkVVz4AAAAJ&hl=en&oi=ao
Stable Diffusion is a really big deal: https://simonwillison.net/2022/Aug/29/stable-diffusion/
LBS #21, Gaussian Processes, Bayesian Neural Nets & SIR Models, with Elizaveta Semenova: https://learnbayesstats.com/episode/21-gaussian-processes-bayesian-neural-nets-sir-models-with-elizaveta-semenova/
pymc.find_constrained_prior function: https://www.pymc.io/projects/docs/en/stable/api/generated/pymc.find_constrained_prior.html#pymc.find_constrained_prior
LBS #50, Ta(l)king Risks & Embracing Uncertainty, with David Spiegelhalter: https://learnbayesstats.com/episode/50-talking-risks-embracing-uncertainty-david-spiegelhalter/
LBS #67 Exoplanets, Cool Worlds & Life in the Universe, with David Kipping: https://learnbayesstats.com/episode/67-exoplanets-cool-worlds-life-in-universe-david-kipping/

Abstract

by Christoph Bamberg

Julien Cornebise goes on a deep dive into deep learning with us in episode 71. He calls himself a “passionate, impact-driven scientist in Machine Learning and Artificial Intelligence”. He holds an Honorary Associate Professor position at UCL, was an early researcher at DeepMind, went on to become Director of Research at ElementAI and worked with institutions ranging from the NHS in Great Britain to Amnesty International.

He is a strong advocate for using Artificial Intelligence and computer engineering tools for good and cautions us to think carefully about who we develop models and tools for. Ask the questions: What could go wrong? How could this be misused? The list of projects where he used his computing skills for good is long and diverse: with the NHS he developed methods to measure and diagnose eye diseases. For Amnesty International he helped quantify the abuse female journalists receive on Twitter, based on a database of tweets labeled by volunteers.

Beyond these applied projects, Julien and Alex muse about the future of structured models at a time when deep learning approaches are more and more popular, and about the fascinating potential of these new approaches. He advises anyone interested in these topics to be comfortable with experimenting by themselves and potentially breaking things in a non-consequential environment.

And don’t be too intimidated by more seasoned professionals, he adds, because they probably have imposter syndrome themselves, which is a sign of being aware of one’s own limitations.

Automated Transcript

Please note that the following transcript was generated automatically and may therefore contain errors. Feel free to reach out if you’re willing to correct them.

Transcript

[00:00:00] This episode will show you different sides of the tech world: the one where you research and apply algorithms, where you get super excited about image recognition and AI-generated art, and the one where you support social change actors, aka the AI for Good movement. My guest for this episode is, quite naturally,

Julien Cornebise. Julien is an Honorary Associate Professor at UCL. He was an early researcher at DeepMind, where he designed its early algorithms. He then worked as a Director of Research at Element AI, where he built and led the London office and the AI for Good unit. After his theoretical work on Bayesian methods, he had the privilege to work with the NHS to diagnose eye diseases; with Amnesty International, to quantify abuse on Twitter and to find destroyed villages in Darfur;

with Forensic Architecture, to identify tear [00:01:00] gas canisters used against civilians; and with many other organizations. Other than that, Julien is an avid reader and loves dark humor and picking up his son from school at the hour of daddies and mummies, as he says. This is Learning Bayesian Statistics, episode 71, recorded September 6th, 2022.

Welcome to Learning Bayesian Statistics, a fortnightly podcast on Bayesian inference: the methods, the projects, and the people who make it possible. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. For any info about the podcast, learnbayesstats.com is the place to be. Show notes, becoming a corporate sponsor, supporting LBS on Patreon and unlocking Bayesian merch, everything is in there.

That's learnbayesstats.com. If, with all that info, a model is still [00:02:00] resisting you, or if you find my voice especially smooth and want me to come and teach Bayesian stats in your company, then reach out at alex.andorra at pymc-labs.io or book a call with me at learnbayesstats.com. Thanks a lot, folks, and best Bayesian wishes to you all.

Let me show you how to be a good Bayesian, change your predictions after taking information in. And if you're thinking I'll be less than amazing, let's adjust those expectations. What's a Bayesian? It's someone who cares about evidence and doesn't jump to assumptions based on intuitions and prejudice. A Bayesian makes predictions on the best available info and adjusts the probability, 'cause every belief is provisional.

And when I kick a flow, mostly I'm watching eyes widen, maybe 'cause my likeness lowers expectations of tight rhyming. How would I know unless I'm rhyming in front of a bunch of blind men, dropping placebo-controlled science like I'm Richard Feynman? Hello, my dear Bayesians! Yep, [00:03:00] it's my favorite time again, the one where I get to thank my brand new supporters on Patreon.

So, very officially, thank you so much Or Duek, Pavel Dusek and Paul Cox for joining the Full Posterior and Good Bayesian tiers. You made my day! And make sure to send a picture when you get your exclusive LBS merch. Okay, now let's dive into the science of our generation with Julien Cornebise.

Julien Cornebise, welcome to Learning Bayesian Statistics. I said

so apparently it's French. I'm sorry, guys. Hi everyone. Thanks for having me here today. It was you. Yeah, Julien, uh, thank you so much for taking the time. Very happy to have you on the show. I have so many questions for you. I have the feeling I'm gonna learn a lot. So let's [00:04:00] dive in, as usual, with your origin story.

How did you come to the math and stats worlds? Oh boy. Well, let me try and live up to the high bar you've set. Basically, I trained originally as a computer science engineer, a coder. I've been coding since I was 12, uh, was into x86 assembly, removing shareware limitations back in high school.

So I decided I wanted to go that way. I went to an engineering school in France, specialized in computer science, went to algorithm competitions, the ACM ICPC, loved coding, loved formalizing problems, uh, problem solving. But then I realized, hey, this is all really fun, but the world is really noisy and stochastic.

I'm gonna need more math and probabilities and statistics if I want to apply all these really cool algorithms in the real world. So I decided, in parallel, to do a master's in mathematical statistics, which required speeding up on the math a lot. That's [00:05:00] the year I started coffee. And then I followed with a PhD that actually merged my algorithmic side and my mathematical side.

So computational statistics, in sequential Monte Carlo. And that's, uh, that's how I ended up in the field. So not entirely straight, but not completely sinuous either. Yeah, you are on the side of the people who started coding, like, extremely early. Not as young as I would've wanted, but it was fun. I like coding, and I keep thinking of math and, um, statistics from a mostly algorithmic viewpoint.

As a result. That's interesting. Now, I think you are the second French person to come on the show. The first one was Rémi and, uh, he also started his programming career very early. So I kinda feel like the black sheep here: I started when I was 27, so, you know. Well, actually, congrats to you. You know, it's much harder to do it that way than when you're just run in really young.

So, yeah, well done. [00:06:00] Yeah, thankfully it's not like, uh, you know, dancing. You can still be a professional programmer if you start at 27, whereas if you wanna be a professional dancer, you cannot start at 27. Oh, you're ruining my hopes now, you're breaking my heart, Alex. Cool. So that's math, et cetera. Uh, for Bayesian methods in particular, do you remember how you first got introduced to them?

Yeah, actually it was during university, where I got involved in a project with a professor there. We were working on instant coffee, and detecting fraud in instant coffee. It was a joint project with, uh, Nestlé, the food manufacturer, and they were trying to establish norms to prevent, well, to detect when instant coffee has been enriched with extra par or extra sugar.

So we had to do mixtures of Gaussians on the population, a 2D population where you measure the glucose and the [00:07:00] xylose content in, um, off-the-shelf, store-bought instant coffee, and try to figure out: oh, where is the main mode, which we gather are the honest producers, and where are the other modes, and where do you put the limits to quantify what should be labeled as coffee and what shouldn't?

So it was really fun, actually, to go and take, you know, just a random, usual object, put it through, and discover a lot about the world behind it, of trying to dro coffee, and all that through, you know, a two-dimensional vector and, uh, a bunch of Gaussians and, uh, Bayesian analysis.
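The mixture analysis described here can be illustrated with a small Python sketch. It is not the project's code: the conversation mentions a Bayesian analysis, whereas this sketch swaps in plain expectation-maximization (EM) for brevity, restricts each Gaussian to axis-aligned variances, and runs on made-up glucose and xylose values. The names `fit_two_modes` and `simulate_samples` are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_two_modes(points, n_iter=200):
    """EM for a two-component mixture of axis-aligned 2D Gaussians.

    points: list of (glucose, xylose) pairs.
    Returns (weights, means, variances), one entry per component.
    """
    # Start one component at the lowest-xylose sample, one at the highest.
    means = [list(min(points, key=lambda p: p[1])),
             list(max(points, key=lambda p: p[1]))]
    variances = [[1.0, 1.0], [1.0, 1.0]]
    weights = [0.5, 0.5]
    n = len(points)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for g, x in points:
            dens = [weights[k]
                    * normal_pdf(g, means[k][0], variances[k][0])
                    * normal_pdf(x, means[k][1], variances[k][1])
                    for k in range(2)]
            total = sum(dens) + 1e-300  # guard against division by zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            for d in range(2):
                means[k][d] = sum(r[k] * p[d] for r, p in zip(resp, points)) / nk
                var = sum(r[k] * (p[d] - means[k][d]) ** 2
                          for r, p in zip(resp, points)) / nk
                variances[k][d] = max(var, 1e-6)  # avoid collapsing to a point
    return weights, means, variances


def simulate_samples(n_honest=200, n_suspect=50, seed=0):
    """Made-up data: these means and spreads are NOT real coffee measurements."""
    rng = random.Random(seed)
    honest = [(rng.gauss(2.0, 0.3), rng.gauss(0.3, 0.1)) for _ in range(n_honest)]
    suspect = [(rng.gauss(3.5, 0.5), rng.gauss(1.5, 0.3)) for _ in range(n_suspect)]
    return honest + suspect


weights, means, variances = fit_two_modes(simulate_samples())
main = max(range(2), key=lambda k: weights[k])
print("main mode weight:", round(weights[main], 2), "mean:", means[main])
```

The heavier component plays the role of the main mode. Where to put the limit between modes remains a judgment call that this sketch does not attempt.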

Was it, uh, like, at university, or was it already one of your first, uh, jobs? No, no, it was in university, as a side project for fun with a teacher with whom I got along really well. I was like, hey, do you want to collaborate on that? Sure, bring it on. Well, that sounds like fun. And then, I'm skipping ahead here, but, I mean, you've worked at DeepMind for, uh, four years, [00:08:00] and that was in the two thousands, if I remember correctly.

Yeah. 2012 I joined, actually. Oh, okay. So 2010s, damn. We can already say the 2010s, that they are done already. We're in the roaring twenties. Or the crashing twenties, depends. Yeah, exactly. The sneezing twenties, if you will. So yeah, basically you worked at DeepMind for four years, and in particular you were focused on health research.

I'm curious about that. Like, can you tell us why and how Bayesian stats were helpful here at DeepMind? Yes. So I stayed four years at DeepMind. When I joined, we were 36 employees. I was a fierce researcher there. So we were a really tiny startup, and I stayed four years, up to, you know, after the Google acquisition, and when I left in late 2016, we were 400 employees.

The first two years, actually, I was in the fundamental research team. I was trying to bring some Bayesian love to convolutional neural networks and to deep learning. Try to see if we could, you know, bring some, uh, some uncertainty [00:09:00] measurement there and be, uh, a little bit more statistically minded in the approach.

And then the Google acquisition went through and, uh, well, the DeepMind founders and I spoke, and, well, now we have the resources to go into healthcare, which is something that was close to my heart. I had done internships and consulting there, in vaccine development and, uh, experimental design and sequential experimental design, in parallel with my studies.

So now we have the resources to go into healthcare, and that's where I transitioned a hundred percent to the applied part, to create the DeepMind Health research team, and mostly with the Veterans Affairs, the DeepMind, uh, partnership with the Moorfields Eye Hospital, and a few others. Where was Bayes helpful there?

There's quite a few parts. Obviously DeepMind is a deep learning and reinforcement learning shop, and Bayesian deep learning was still a very nascent research field at the time. However, what did carry through, even when not using a formal Bayesian method, was the [00:10:00] focus of uncertainty. Focus on uncertainty.

Sorry. When you go into healthcare, you have to be extremely aware of the risks and the probabilities of things going well or going wrong. You need to quantify that. You also need to think in terms of decision making. So in some of the work we did, decision support to clinicians in ophthalmology, there's a whole matrix where you weigh the different risks and different types of risk.

And whether you've got, you know, a false positive versus a false negative, or the different types or degrees of disease, and how much worse one error is than another. And this actually ties back to the very roots of, you know, Bayesian analysis, which is deeply rooted in, uh, decision making. I love Jim Berger's book, and I had the pleasure,

once, of working in Jim Berger's lab in North Carolina. And Jim Berger's book from 19, what, 89, Statistical Decision Theory, which is [00:11:00] extremely Bayesian in its thinking. I think it's probably Bayesian Statistical Decision Theory; if Jim hears it, I hope he forgives me for butchering the name of his book. So yeah, there's very much the same emphasis on thinking about the risk and about the cost of your decisions.

And that pervades throughout healthcare. I can definitely guess that healthcare, um, requires a lot of uncertainty estimation, and also that decision making and cost functions are extremely important, right? Because if the cost is, well, someone can die, of course it's a way, way higher cost than in most optimization problems that you can have.
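The idea of weighing a false positive against a false negative can be made concrete with a tiny expected-loss calculation. This is a minimal sketch with an invented loss table: the actions, the states and the 20-to-1 ratio are hypothetical and are not taken from any clinical system discussed in the episode.

```python
# Hypothetical loss table: LOSS[action][true_state].
# Missing a disease is assumed to be 20 times worse than an unnecessary referral.
LOSS = {
    "refer":   {"disease": 0.0,  "healthy": 1.0},   # false positive costs 1
    "dismiss": {"disease": 20.0, "healthy": 0.0},   # false negative costs 20
}


def expected_loss(action, p_disease):
    """Average loss of an action when the disease probability is p_disease."""
    row = LOSS[action]
    return p_disease * row["disease"] + (1.0 - p_disease) * row["healthy"]


def best_action(p_disease):
    """Pick the action with the smallest expected loss."""
    return min(LOSS, key=lambda action: expected_loss(action, p_disease))


for p in (0.01, 0.10, 0.50):
    print(p, best_action(p))
```

With these made-up costs, referring becomes the better choice as soon as the disease probability exceeds 1/21, far below 50%. That asymmetry is what a cost matrix buys over a plain "most likely class" rule.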

Absolutely. And also, you know, when you train on data from clinicians, not all clinicians agree. So there is some disagreement even in the data; even in a supervised training problem, there is uncertainty in the very labels you're getting. How do you handle that? So there's, you know, there's different [00:12:00] ways, and for me, that's still the very Bayesian way of thinking there.

Hmm. And I like to think, you know, I did my PhD when the, you know, the war between Bayesians and frequentists was still, you know, boiling. I really prefer to think in terms of, hey, it's all about taking a probabilistic view of things. I did my PhD on Bayesian stats, but in a way I was probably the most frequentist Bayesian, in that I was working on Monte Carlo and convergence theorems for Monte Carlo, which is where you can afford to be entirely frequentist, as you're doing central limit theorems on, uh, the number of samples you simulate.

So I sometimes feel an imposter syndrome of, oh, am I a real Bayesian? Because during my PhD, I was actually doing frequentist thinking applied to Bayesian algorithms. So yeah, I really think about it in terms of a probabilistic view on modeling, on statistics, on, you know, algorithms generally.
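The remark about central limit theorems on the number of simulated samples can be sketched as follows. This is an illustrative importance sampling estimator with a CLT-based error bar, not anything from the PhD work mentioned: the target N(0, 1), the proposal N(0, 2^2) and the test function x^2 are assumptions chosen so that the true answer, 1, is known.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a univariate normal with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def importance_sampling(f, n, seed=0):
    """Estimate E[f(X)] for X ~ N(0, 1) using draws from a wider N(0, 2^2).

    Returns the estimate and its CLT-based standard error.
    """
    rng = random.Random(seed)
    terms = []
    for _ in range(n):
        x = rng.gauss(0.0, 2.0)                                 # draw from the proposal
        w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)   # importance weight
        terms.append(w * f(x))
    estimate = sum(terms) / n
    sample_var = sum((t - estimate) ** 2 for t in terms) / (n - 1)
    std_error = math.sqrt(sample_var / n)  # shrinks like 1 / sqrt(n)
    return estimate, std_error


est, se = importance_sampling(lambda x: x * x, n=20000)
print(f"estimate = {est:.3f} +/- {1.96 * se:.3f} (true value is 1)")
```

The error bar here is a purely frequentist statement about the simulation: it describes how the estimate would vary over repeated runs of the sampler, even when the quantity being estimated is a Bayesian posterior expectation.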

Yeah. And I'm curious now: the people you were [00:13:00] working with, and who were not statisticians, how good of a grasp did they have of the probabilistic way of thinking? And I'm curious how intuitive that was for them, or how challenging, maybe, it was. It was quite fantastic, actually. We were, you know, very lucky to work with clinicians.

Sorry, I mean, you'll edit the gap, you'll delete the hesitations. Sorry. Pearse Keane, from, uh, University College London, who's a clinician, uh, with a deep, deep interest in machine learning. And, you know, he kept asking for more and more details and understanding more and more of, um, the algorithms that we were developing.

We were working very closely together, to show him every step of: okay, here's what we're developing, here's what it knows, what it doesn't know. He was giving us constant feedback: oh, that's really exciting. Ooh, that part, eh, maybe not that useful. Or, ooh, we have this other type of data. Would that be helpful to you?

Or, ooh, actually, you know, with [00:14:00] this kind of problem, be careful of this and this direction, or this and these issues with the data. That was really a privilege, to work so closely with a domain expert, and I think it was a really big factor in the success. Now, to answer your question about how familiar he was with the probabilistic view, or the Bayesian view, or even the machine learning view: well, it's incredible how much he is into the topic.

He's actually the one who reached out to DeepMind originally, saying: hey, I've seen your algorithms out there. Hmm. We have these problems here, and we have this anonymized data here. Could we do some research together? Is there something we could do there? So, yeah, you know, great, great, great savviness.

And of course, you know, he wouldn't follow an equation, at least not when we started together. The same way I wouldn't follow, you know, one of the, you know, that he was giving there. But his intuition and the natural grasp of uncertainty that he sees every day in [00:15:00] diagnosis was, you know, absolutely there.

So that was on the part of the clinicians. Mm-hmm. On your part, I'm wondering, during this whole, you know, experience and these projects, was there a main difficulty that you encountered, and what did you learn from it? Let's say, on the positive side, I learned a lot. 'Cause there were many opportunities to learn, and difficulties encountered.

And actually, well, you know, I could dive into the quality of the data or the, you know, the absolute imperative for ensuring proper anonymization. What this really drove home for me is the whole work that goes around the science into having an impact with your study. You know, we were working on this project back in 2015.

We knew that this new imagery, imagery [00:16:00] technique, which is called optical coherence tomography, OCT for short, which is a 3D scan of your eye, is very cheap: an imager goes for 30,000 pounds, compared to the millions of pounds that go into a regular MRI scanner, for example. So we knew this would be hitting the high street, in every optician, every glasses seller, everywhere in the country, in the UK and in France.

Fast forward to 2020, and that's indeed the case. We've got, in bus stops, the advertisement for the local optician, Specsavers, for example, the chain: hey, come get your OCT. They're getting a 3D image of the back of your eye rather than just a single photograph. So we knew this modality was coming. We had the algorithms ready, and we developed the algorithms

to aid decision, and to aid early diagnosis and triage, based on this extremely rich new modality of data. And we've got, you know, a Nature paper for that. Hooray. As a researcher, I can tick that box: a Nature paper, check. But the [00:17:00] reality is that this didn't make it to a product. All these algorithms that we have, they are not used in the optician's feature suite, in that we don't have the impact that we set out to have originally.

And for me, the lesson there is really about thinking about: okay, what is the organizational environment where you do your work, and, within this organization and with the different players that are involved there, what is the path to actually having your algorithms used by people, and who is in that path, and what are the incentives there?

So it's almost, damn, it's almost as if politics and organizations were, uh, were important. Who knew? It's really interesting because, you know, the flip side is that this is also how you get to be able to work on such projects: because organizations get created that allow you to mobilize [00:18:00] talent and resources to get to work on that and, oh yeah,

manage the visibility that makes people, clinicians, come to you and work on this. That's the flip side there. And learning to navigate that has been a really, you know, a really big lesson from these projects, and something, you know, I keep learning every day, for sure. And, I mean, that does resonate with my personal experience also.

And something that I was interested in when I, you know, read about your profile, and why I felt it would be interesting to have you on the show, is that you had this focus on health, like, AI for health, and then afterwards, after DeepMind, you're still focused on AI for good, as we could say.

Like, actually, I think you still work regularly with Amnesty International. And so I was curious, because these are fields where, you know, the association with statistics doesn't jump to mind, and, even to the contrary, I could guess that [00:19:00] people not familiar with statistics and modeling, you know, could see that with a negative eye, you know?

Mm-hmm. Like, oh, are stats gonna, like, take the human factor out of everything? You know, uh, things like that, that I'm sure you hear a lot, as I do. And so can you tell us what that work looks like, actually, and how that's helpful? Well, absolutely. I mean, for the term AI for Good, you know, we use that because it's what the United Nations use for this whole stream of work, but we've got to be very, very careful about it, because when you say AI for Good, you can quickly fall into tech savior syndrome, tech

solutionism. Yeah. And also that means that if you don't work in that field, that means you're doing AI for bad. That's to be tweak, you know? Yeah. Well, the other thing is that AI, machine learning, any technology, heck, even a stick, is deeply dual use. You know, a stick, you can use it to lever a rock and help your pal that's stuck under it.

Or you can, you know, hit your pal on the head with it: not your pal so much anymore. So there's this deep [00:20:00] dual use, and I knew that on my scale, there's not much, at an individual scale, that I can do against the negative uses of, uh, of AI, you know, besides making all the ethical reviews around what I do. But what I can do is make sure that

these tools, whether they be statistics, whether they be machine learning, whether they be anything, get used to help the people who know the real problems. And these people, you find them, well, in hospitals, you find them at Amnesty International, you find them in NGOs. You know, they are those who really know what's going on and what needs to be solved.

So my work there is really: how can I help those, and what do they need? And in the case of Amnesty, you know, how does statistics work with Amnesty? Well, you know, Amnesty, they're fantastic campaigners and they do really amazing qualitative work. They [00:21:00] go visit the victims of, uh, human rights abuse. They get their stories, they understand the issues, the different incentives.

What I provide, me and others who work with them, is the quantitative aspect. So one example, that is probably the project of my career that I'm the proudest of, is the Troll Patrol project with Amnesty, which was to quantify the amount of abuse against women on Twitter, and especially women journalists and politicians. And I worked with, you know, human rights experts who had documented this, interviewed journalists and politicians about the abuse they were receiving.

And we set up a whole study based on crowdsourcing, on getting tweets looked at by thousands of volunteers who were labeling them and saying: oh, well, this is abusive, in this specific sense that is defined for the study; or this is not abusive, but still problematic; or this one is fine. All these being defined terms, [00:22:00] defined by sociologists working with Amnesty International, and us providing the statistical analysis behind that.

You know, how do you even sample that? How do you analyze the results of a crowd? What are the ways you've gotta be careful? What are the results, and how much can you trust these results? And adding these numbers behind the story helped characterize the type of abuse they were getting, and who it was targeted at.
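The question of how much a crowd's result can be trusted can be sketched with a simple Beta-Binomial calculation. This is a minimal illustration on invented votes, not the methodology published for the study: it assumes three volunteers per tweet, a majority vote, a uniform prior on the abusive proportion, and it ignores sampling design and annotator reliability. The names `majority_label` and `posterior_interval` are hypothetical.

```python
import random


def majority_label(votes):
    """votes: list of 0/1 volunteer labels; True if a strict majority says abusive."""
    return sum(votes) * 2 > len(votes)


def posterior_interval(n_abusive, n_total, level=0.95, draws=20000, seed=0):
    """Posterior mean and credible interval for the abusive proportion.

    Uses a uniform Beta(1, 1) prior, so the posterior is
    Beta(1 + n_abusive, 1 + n_total - n_abusive).
    """
    rng = random.Random(seed)
    a = 1 + n_abusive
    b = 1 + n_total - n_abusive
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lower = samples[int((1.0 - level) / 2.0 * draws)]
    upper = samples[int((1.0 + level) / 2.0 * draws) - 1]
    return a / (a + b), lower, upper


# Invented example: 100 tweets, each labeled by three volunteers.
tweets = [[1, 1, 0]] * 7 + [[0, 0, 1]] * 93
n_abusive = sum(majority_label(votes) for votes in tweets)
mean, lower, upper = posterior_interval(n_abusive, len(tweets))
print(f"{n_abusive}/{len(tweets)} abusive, "
      f"posterior mean {mean:.3f}, 95% interval [{lower:.3f}, {upper:.3f}]")
```

Reporting the interval alongside the point estimate is one way to state how much a headline number can be trusted. A fuller analysis would also model disagreement between volunteers rather than collapsing it into a majority vote.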

There's a really prominent journalist who then emailed the organizer and said: oh, thank you. I mean, sometimes I don't, you know, there's obvious violence, direct violence threats, but there's this ES on Twitter, and now, with the results, I understand why: because of this level of problematic, aggressive content that is not abusive in the strictest sense of the word, as in, that doesn't violate the abuse definition in the Twitter terms of service, but we see the barrage there.

All this quantification really helped, and actually, when the report was released, I think it was the 19th of December, 2018, when this report came out [00:23:00] from Amnesty, uh, with my team, which had both the qualitative and the quantitative aspect, that got in many newspapers. And, uh, Twitter got renamed the Harvey Weinstein of social media, and that was in the full beginning of the Me Too movement, and that had the effect to, overnight,

get Twitter stock to crash by 5 billion, a 16% loss in Twitter stock price. Which, again, I'm not there to deplete Twitter's stock price. You know, that's not what we set out for. And they recovered within a few weeks. But it translated the problem of abuse online into a monetary unit that executives at Twitter and in other companies can really understand and really resonate with.

So this is, for me, you know, a great personal success, of having this tangible impact by, again, bringing the quantitative tools to activists, the masters of qualitative thinking: the numbers. In another project with Amnesty, [00:24:00] we counted again, with crowdsourcing. Uh, so, you know, I've just reused the same old, same old thinking, and, uh, yeah, we counted the number of

video surveillance cameras, public video surveillance cameras, in New York City, thanks again to the thousands of volunteers who labeled these images. And these numbers, and the statistical analysis I did with a few others, helped get, you know, the numbers that both went into visualization, to make people realize how much surveillance you might be under,

but also went into legal proceedings of Amnesty against the New York Police Department, which, in August this year, Amnesty won, to force the New York Police Department to publish more about their surveillance and their surveillance capabilities. So again, it's just a matter of bringing stats to help the activists in this case, or the doctors

in other cases. It's very much the same thing there. Yeah, I love that because, um, I mean, I do think that these kinds of analyses are gonna [00:25:00] be done anyways. And so they might as well be done in a scientific way, so that the numbers then mean something and can be used for further action. And that's something that's usually not very well done in the political realm.

So, um, I'm always all for more robust and serious statistical work getting percolated into the political science topics. And there's one thing that's really exciting when you, you know, do your analysis in the most rigorous possible way in such an application, in that you also publish the methodology, and you're like: oh.

Usually, you know, if I write a paper, an academic paper, and there is a flaw in it, mm, the reviewer will bash me, or I look like a fool. Well, here, actually, mm, if there is a flaw in it, the whole, you know, the whole campaign can be derailed, in that, you know, the NYPD or Twitter is able to say: oh look, Amnesty has made a fool of themselves.

They've grossly overinflated this or that [00:26:00] number. That would not look good. So it gives you extra motivation to be extremely rigorous, and publish your methodology in full detail, and make sure you dot every i and cross every t, like you would do in other studies, but there, you know exactly why you're doing it.

Actually, now that you talked a bit about, um, AI topics and things like that: there's a very recent topic that you just told me about before we recorded, and I'd like to actually talk about it now, because I think it's actually interesting, especially in relation to, uh, Bayesian stats. So there is this new model called Stable Diffusion that you just told me about before we started the show.

So can you introduce listeners to what that is, and why it was such a wow factor for you? Yeah. I was speechless for three hours this morning. I was going through blog post after blog post and experimenting with it myself. So, Stable Diffusion... I guess it's good we didn't record this morning, right?

Because... yeah, well, exactly, because a speechless [00:27:00] guest, in my experience, is not a very good guest. They would be useless, right? Well, Stable Diffusion is a generative model for images. Enter your text, describe what you want it to generate, what you want it to draw, and it draws it in various, you know, types of drawing that you might ask for, including photorealistic.

The precision of the drawings, the amount of understanding of the language you use, is just mind-blowing. What's even more important is that... well, actually, yeah, let's dwell a little on the technical prowess there. This is technology that even, you know, two years ago, I was like, you know, there's absolutely no way this can ever be done.

You know, when I joined DeepMind originally in 2012, we were trying to predict which action to take in an Atari video game. So we were doing essentially logistic regression on 30,000 input variables representing the pixels in an image. And even there I was like, "Well, [00:28:00] 30,000, we're pushing it," and that's just to analyze the image.

Now we're actually generating full resolution images on completely abstract topics. Uh, you can ask it to be, you know, you can be extremely descriptive and say, Oh, I want a city in the sky with a beautiful lit stare sky in the cities, most of these buildings. And you can be extremely descriptive like that.

Or as I, as I experimented with it, you can just ask something much more abstract like, Imagine the most horrific fear inducing picture for a human, and it generated an actually really scary face of some health. Zombie health, uh, well, whatever. And all this is not in realistic, you know, in realistic ways.

Back in 2013, with some colleagues, we were trying to generate faces. Then we were getting a modicum of success on 20-by-20-pixel grayscale faces. And now we have this fully generating not just faces, as has been [00:29:00] seen with GANs, generative adversarial networks, a couple of years ago, but full scenes of absolutely any type of description.

And I must say this is not something I would know how to generate with a hierarchical Bayesian model. There is a connection, in that oftentimes in Bayesian studies, you know, we write generative models: we take a generative approach and describe the mechanism by which the data might arise.

So we are creating new observations. Now, this is it, but without such an explicit model, just with gigantic, extremely deep neural networks. We're talking, you know, several billions of parameters. So it's not the "20 dimensions, I'm starting to be in high dimension" importance sampling type of studies like I used to do years ago; we're talking inference in a 20-billion-parameter space. There is a lot of optimization at the core.

Yeah. So how is that done? You looked a bit into that.

I [00:30:00] started to.

Can you tell the listeners basically what's the difference with the classic Bayesian models, for instance, that people are used to? And basically, how is that done? You said it's a deep neural network, something like that.

So, and this is how we turn the podcast into a 20-hour course. Brace yourself. Yeah. Well, in a nutshell, in deep learning generally there is a move away from having an explicit model with explicit unknowns and more towards "here's a massive stack of operations, matrix operations done in a row, and keep iterating", with, in a few cases, a bit of inductive bias, which is fancy for saying, "Oh, we inject a little bit of what we know of the real world."

So in a convolutional neural network working with images, we inject a little bit of a notion of locality, but not that much really. And then we just try to minimize a certain loss, let's say the L2 loss between your [00:31:00] examples and your reproductions. And here that's exactly what you do. You put the text in the form of embeddings, you know, a vector representation of your text.

That's on the input; at the output, you've got the image. And you just have this gigantic series of operations, which are parameterized by weights, we're talking billions of weights, and you run an optimizer in this billion-dimensional space of parameters to find one that does a really good function approximation.

I'm oversimplifying, and I'm sure my deep learning colleagues will forgive me for that, but in a nutshell that's the idea. So think of it as a really, really deep logistic regression, but where you have so many more layers of logistic regression, one on top of another, if I want to be overly simplistic. At this stage they probably don't forgive me anymore, but that's the idea. Now, in terms of Stable Diffusion, there is a tie with Bayesian methods, in that they're based on diffusion processes such as the ones that inspire some of the [00:32:00] Monte Carlo algorithms, such as Langevin algorithms.
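Editor's illustration: the "run an optimizer over the weights to minimize an L2 loss" idea described above can be sketched on a toy problem. This is a minimal sketch under stated assumptions, not how Stable Diffusion is trained: it fits two parameters instead of billions, uses plain gradient descent, and the data, learning rate, and function names are all made up for the example.

```python
# Toy sketch: fit two weights by gradient descent on an L2 (mean squared error) loss.
# Data, learning rate, and names are hypothetical; real models have billions of weights.

def l2_loss(w, b, xs, ys):
    """Mean squared error between predictions w*x + b and targets y."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_by_gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Run plain gradient descent on the parameters (w, b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the loss.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]  # targets generated with w=2, b=1
w, b = fit_by_gradient_descent(xs, ys)
print(w, b, l2_loss(w, b, xs, ys))
```

The same loop, scaled up to many stacked layers and with gradients computed by automatic differentiation, is the general shape of the optimization described in the conversation.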

So there are similar mathematical concepts behind it. Now, I don't want to go further because I haven't read much of the paper, since it came out just, you know, recently, a few days ago. But what's really important, when we talk about the impact you can have, is that this generative model, Stable Diffusion, has been released in its entirety.

The whole trained model has been open sourced. Anyone can download the trained model, run the model for any purpose whatsoever, build a business out of it, run it on their laptop, generate whatever beautiful art, whatever illustration for a magazine, whatever fake image they want. You know, everything is there for the doing.

Unlike previous generative models, which were (a) not as good and (b) carefully controlled, such as OpenAI's DALL-E 2, which was in beta at first, for quite a few months, for a few [00:33:00] select people, and only recently came out as software as a service. So you can ask it to generate, but you do not get access to the weights.

Here, the whole thing is there for anyone to download and to use, which has meant that in six days people have already built, you know, Photoshop plugins based on that, and new features in software based on that. Because there is also a way to do image-to-image: do a very crude sketch of what you want, add a description of what you would like it to look like, and boom, here you get it.

It is really stunning. Some people are winning talent competitions, like, you know, an artistic state fair, in digital art, without knowing how to draw, thanks to that. But you've also got illustrators who are starting to say, "Hey, hold on." The Economist or The Atlantic, I believe, just ran a story with a generated image rather than commissioning a graphic artist.

"What about our jobs?" So now we're in a place where... well, you know, they kept saying, "Oh, disruptive technology is coming, yes, but, [00:34:00] you know, humans will get the creative jobs instead, so don't really worry about it. It'll be a hard transition, but there's room for the human there."

Well, now we are actually having creativity done by algorithms. And you could still argue, well, actually you have a human describing and doing the imagination of what they want to see and guiding the algorithm. So yeah, the centaur hypothesis still holds true, but I don't think many of us were imagining this to this extent.

And actually it gives us... Yeah. Sorry, I could ramble for hours about that. Please do stop me.

No, that's interesting. And yeah, so first, thanks for giving that description of the model on the fly like that. I think listeners will appreciate hearing more about this distinction between, basically, a Bayesian hierarchical model, for instance, and a deep neural network as we're talking about here. If you are also interested in neural networks done the Bayesian way, I interviewed Liza Semenova, I think it was episode [00:35:00] 21. Very good episode. We also talked about GPs, because it turns out GPs and neural networks are related, and it seems to be a law of the universe that, in the end, everything is a GP.

So I'll put a link to this episode in the show notes. So thanks for that. And in general, I'm curious about what you think this changes. You talked a bit about that from a societal standpoint, and I find that super interesting because, with technology, there is always, you know, a voice in the back of my head when I hear people making huge predictions about what this will change.

Especially when it's like, "Well, jobs will be automated, but creative jobs are okay, blah, blah, blah." And, yeah, we don't really know that. The history of technology is very hard to predict. You know, it's a bit like a financial crisis, right? You know there will be one, but you don't [00:36:00] know where and you don't know why.

Mm-hmm.

So, yeah, you talked about that a bit already. Also from a statistical perspective, I'm interested here in your mathematical and statistical background: what do you think that could change from the modeling perspective? Do you think that means structured models will become less important, and it will be more that kind of very free model, in a way?

Or do you think both will actually work together, or that they actually answer different needs?

As a good Bayesian, I'll tell you that I know that I don't know. My posterior is an extremely, extremely vague posterior distribution, so I'll be careful about making, you know, big predictions. I can speak about what I observe already, which is that deep learning requires at the moment a huge amount of data.

You could imagine generating models for, pertains for even for observations [00:37:00] for coffee, uh, you know, level of Zillows and glucose to go back to our original discussion. But there's not that much data to train on. And we're talking here hundreds of millions of images to train on to get this kind of model, or more. I should check the numbers, so forgive me for the imprecision, but we're in this order of magnitude, you know, maybe up to a factor of 10.

So it takes a really, really, really large data set, which we don't have in other fields, so there is a huge number of fields where we can't apply these methods. At the moment, there is a strong advocacy for trying to go towards more data efficiency, and that can be done in several ways, some of which involve incorporating more structure, known structure about the problem, into the modeling phase. And, well, one of the best ways to do that is precisely hierarchical models or explicit modeling. So in a way, that's where there is ample room to work: when you have much more tailored and smaller data sets.

It's [00:38:00] also when you want to have a clear, you know, we were talking about safety and quantification of uncertainty: if you want to have some measure of uncertainty, it's at the moment still harder to do with neural nets, if only because we don't know what the failure modes are. Deep learning is very empirical about these beasts that we train.

I mean, I think this model took something like $600,000 to train, which is still rather small, surprisingly, compared to other models. But, you know, $600,000 of compute is not something where you can say, "Oh yeah, let's sample it 10 times to get a lot of different samples."

I mean, you can, but it depends on your bank account, I guess.

Yeah, exactly. Exactly.

Well, if you were to tell me that I can, I'm gonna check my bank account real quick.

But yeah, so there's a much more empirical approach. The mathematical analysis is also much harder. You know, there are no convergence theorems on [00:39:00] neural nets.

You know, I used to say, "Oh, that's not proper." Well, heck, I'm seeing that it works. We just can't quantify how much it works. And, you know, if it works, don't knock it; it does things that I would not know how to do. But that means we don't know how to quantify the limits of it and the risks we are taking with them.

So there will be some need for very critical thinking around these models. And, I dunno, I'm sure some of your listeners are aware of, for example, the fiasco around Timnit Gebru being fired from Google for criticizing and studying closely the limits of large language models, of which Stable Diffusion is a kind of offshoot.

So it is a very touchy topic, and that's where we go back again to the deep link between our technology, what we do as statisticians, and how it gets used, and what it gets used for, and who uses it, and [00:40:00] who owns it. And for me, that is really the key. I'd rather do a model that's owned by Amnesty International than one owned by a large organization whose priorities can change over time and whose incentives are not necessarily aligned with the general common good.

And that's where it gets really, really important. I believe that as scientists, as statisticians, we have a responsibility for our tools and for how they're used. And we have a duty to think about, "Hey, what could go wrong if it's being used?" And of course, with "what could go wrong," you might never work on anything, so there's a balance there. But you have a duty to at least ask yourself the questions.

Yeah. Super interesting. And so for listeners, I put into the show notes the blog post you sent me about Stable Diffusion, which explains and goes into the details of the model, and people can try it.

Yeah, Simon Willison's blog post is really eye-opening, and do follow the links in there [00:41:00] towards a lot of examples of the use of Stable Diffusion.

A funny thing, actually: from a technical point of view, we're seeing the rise of prompt engineering. There's a whole field now: in which words do you formulate your request so that your model gives you the prettiest output? Which is something I would never have imagined. This is purely, you know, what text do you give?

Yeah. Sounds like choosing your priors.

Yeah. I mean, yes, but with even less math in it. So I'm like, why did I go through all these years of learning math to get there? But you know what? It works, so don't knock it. It's quite remarkable. At the same time, it feels weird, because we make models that we then have to learn how to use. Isn't that supposed to be the other way around, like, we build it, therefore we know how it works? Well, not quite anymore. I find that absolutely fascinating, if a little scary.

Yeah, and that makes me think it'd be awesome for choosing your priors. Imagine you could prompt, you know. [00:42:00] Instead of saying "I want a normal distribution with mu equals two and sigma equals something," which you understand, I understand, but non-statisticians don't, imagine business people could prompt it and say, "Well, I want a distribution that looks like, you know, a bell curve, and usually it's around two, and it can go up to that number," and then the computer could tell them, "Okay, so you want a normal distribution with mu equals two and sigma equals this."

Automated prior elicitation.
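Editor's illustration: the "bell curve around two that can go up to some number" request can be turned into normal parameters with a closed-form calculation. This is a minimal sketch using only the Python standard library; it does not use PyMC, and the function name `normal_prior_from_interval`, the interval, and the 95% mass are assumptions chosen for the example.

```python
from statistics import NormalDist


def normal_prior_from_interval(lower, upper, mass=0.95):
    """Return (mu, sigma) of the normal that puts `mass` of its probability
    in [lower, upper], centered on the middle of the interval."""
    if not 0.0 < mass < 1.0:
        raise ValueError("mass must be strictly between 0 and 1")
    if lower >= upper:
        raise ValueError("lower must be smaller than upper")
    # z such that a standard normal has `mass` probability in [-z, z].
    z = NormalDist().inv_cdf(0.5 + mass / 2.0)
    mu = (lower + upper) / 2.0
    sigma = (upper - lower) / (2.0 * z)
    return mu, sigma


# "Usually it's around two, and it can go from about zero up to about four."
mu, sigma = normal_prior_from_interval(0.0, 4.0, mass=0.95)
prior = NormalDist(mu, sigma)
covered = prior.cdf(4.0) - prior.cdf(0.0)
print(mu, sigma, covered)
```

For distributions without a closed form, such as the gamma discussed next in the conversation, the same idea needs a numerical optimizer rather than a formula.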

Yeah, I love that. And that's something we are really curious about, and we think a lot about it at PyMC Labs and in the PyMC team and community in general, because when you teach people, and also when you work with clients, eliciting the priors is always something that can be complicated and intimidating for beginners.

And so we're always trying to find easier ways. We have that new function now in PyMC, which [00:43:00] finds the priors based on your constraints.

Mm-hmm, nice.

But you still have to tell it the function you want. Like, you tell it, "I want a gamma distribution with 95% of the probability mass between this and that."

And then PyMC tells you, "Okay, so you want a gamma with alpha equals blah blah and beta equals another thing." But then the next step would be that you don't even need to say, "I want a gamma."

And then you're gonna put all of us Bayesian statisticians out of a job.

Don't say so. Well, you'll still need to parameterize the model afterwards.

So, you know, the model structure, you still need it. That's the hard part. But then if you could parameterize the priors like that, that'd be awesome, because then you don't need to know you want a gamma. And in a way you don't really need to know you want a gamma, right? You just need a function with the right constraints.

But whether it's called gamma or anything, people don't really care. What I'd love to see is more automated prior sensitivity analysis.

Oh yeah.

Which, you know, we all say, "Oh yeah, we should, you know, [00:44:00] we should do it." But it's really tedious to do. But now with, you know, frameworks like PyMC, with all the automatic differentiation being entirely automated...

I mean, couldn't we have this done automatically, and see how much we depend on the shape of our priors and on the, you know, prior information that we have put there? And of course, how do you do that properly without double dipping in your data? But yeah, we have the tools. Bayesian inference has also benefited massively from the growth of, you know, neural nets, for example, in that it has led to the development of toolboxes for tensor calculus that are then widely used in PyMC and the like, and that makes it easier than ever to be a statistician and modeler: you don't have to code your own sampler anymore. You're actually much better off using one of the existing ones, which are getting extremely, extremely efficient.
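Editor's illustration: the prior sensitivity analysis wished for above amounts to refitting the same model under several priors and comparing the posteriors. This is a minimal sketch, assuming a conjugate normal model with known noise so the posterior has a closed form and no sampler is needed; the data, the prior widths, and the function names are all hypothetical.

```python
def posterior_normal_known_sigma(data, prior_mu, prior_sigma, noise_sigma):
    """Conjugate update for the mean of a normal likelihood with known noise.
    Returns the posterior mean and posterior standard deviation."""
    n = len(data)
    prior_precision = 1.0 / prior_sigma ** 2
    data_precision = n / noise_sigma ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    sample_mean = sum(data) / n
    post_mu = post_var * (prior_precision * prior_mu + data_precision * sample_mean)
    return post_mu, post_var ** 0.5


def prior_sensitivity(data, prior_mu, prior_sigmas, noise_sigma):
    """Refit the same model under several prior widths and collect the posteriors."""
    return {
        s: posterior_normal_known_sigma(data, prior_mu, s, noise_sigma)
        for s in prior_sigmas
    }


data = [1.8, 2.4, 2.1, 2.9, 2.3]  # made-up observations
results = prior_sensitivity(data, prior_mu=0.0, prior_sigmas=[0.1, 1.0, 10.0], noise_sigma=1.0)
for prior_sigma, (post_mu, post_sd) in results.items():
    print(f"prior sd {prior_sigma:>4}: posterior mean {post_mu:.3f}, sd {post_sd:.3f}")
```

With a tight prior centered on zero the posterior mean stays near zero, and with a wide prior it moves to the sample mean; how far the answer travels across reasonable priors is the sensitivity being asked about.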

Yeah, we could continue talking about that, but time is running by, and I want to get your thoughts about something that you're also very passionate about, [00:45:00] which is teaching.

Mm-hmm.

I'm wondering, well, what motivates that passion, first, and also what are the most important skills that you are trying to instill in your students?

So, regrettably, I'm not teaching much these days anymore. My honorary professorship at UCL is more on the research side. I do a lot of mentoring of young researchers, though, you know, through work with them on different projects. And, well, it's mostly a matter of paying it forward.

You know, I've been really lucky to find multiple professors along the way who introduced me to algorithms, who believed in me, and who spent way more time with me than they, you know, ever could have afforded. Really, I'm grateful. I can't thank them enough, but I can make sure that the next generation benefits from the same. So that's a big part of it. And also, I just like to transmit the [00:46:00] passion.

I like, you know, to see someone light up with, "Oh yeah, that is a really nice trick." You know, the realization. That's something I really enjoy.

Yeah, and I definitely agree: having passionate professors is one of the best things that can happen to you. You know, having that passion inside you really, really helps people, well, get more passionate too. And again, as I always say, science is done by people and it's inherently human, contrary to what a lot of people tell you and think. So, yeah, that's awesome. And so, talking about people: I was asking you, what are the main skills that you are trying to instill in your students?

The ability to go and fetch information from multiple different sources, from multiple different angles, and really quickly get to the core of it.

I mean, I always remember when my PhD advisor just took a paper and scanned through it in 30 seconds [00:47:00] to get the point, because he was just jumping from equation to equation and ignoring the text around. And, well, that left me like, "Wow, what happened there?" But it's mostly because, you know, he had read tons of papers in the domain and knew how to read really quickly, in the sense of parsing the content and extracting the information really quickly.

And that's what you get to do: not being intimidated by paper X or Y, not being over-focused on one single paper. You know, go cross-reference the documents, get the information from different angles, and that is how you get to really learn by yourself. Then, you know, as a teacher, I can show you that some things exist.

I can introduce you to a field, but, you know, it's up to you to run into it and go completely wild exploring it. And if you know how to quickly scan for information and quickly find it, then you'll run all the faster and explore all the deeper.

For sure. That would've impressed me too, going through a paper in 30 seconds.

Yeah, I was blown away. [00:48:00] And then, funnily enough, just a couple of years ago, someone sent me a paper, and I went through it as quickly as I could read, and I was like, "Yeah, no, it's worth digging into it deeper." "How did you assess a paper in 30 seconds?" No, I didn't read it entirely in 30 seconds, nor did I understand it in 30 seconds.

But by then, you know, you know enough to realize: okay, is this good quality? Does it have the different markings of someone who knows what they're doing? Is it worth digging into? And it's when she asked me, "Well, how did you do that?" that I remembered that professor. And, you know, it's just a skill that you acquire really quickly, even during a PhD. You just go and dive and scan, learn to scan, see enough of them, and it comes pretty quickly.

If we go a bit more practical, do you have any practical advice on what to do and what not to do for people who would like to start a career in this field of AI, machine learning, or Bayesian stats?

I would go for... well, there are many. I don't want to sound like an old fart, and honestly the field is very different now than [00:49:00] when I started. You know, when I started, machine learning conferences were 400 people; now we have 20,000 attendees and a lottery to be able to get in.

My advice might be somewhat old school in that sense. Generally, I find it extremely good for people to experiment by themselves a lot. It's something that you can do in any computational field. You know, you try it and see if it crashes. That is much harder to do in, say, theorem proving, where, well, actually now there are some ways to verify proofs automatically, but they're not really within easy reach, so you don't see your algorithm crashing live. So, a lot of experimenting by yourself, or with friends, but, you know, actually coding it. There's a lot of serendipity in that field. Another thing is that with statistics and with machine learning, there are so many different places to apply them that you are bound to find some if you let the serendipity happen.

I mean, how did I get to work with [00:50:00] Amnesty? A friend told me, "Hey, I hear you're leaving DeepMind and that you want to work with NGOs. There's a meetup on Tuesday with some NGO that's presenting. You should maybe go have a look." And, wow, there was Milena Marin from Amnesty, head of the Decoders project there, their crowdsourcing project, presenting what they do.

And I realized: oh, you've got data. You have an expert who has trained a crowd to do a task. Maybe the crowd can train a neural net. Maybe I can help. I've got time and skills. So I just went to chat with her at the end. Serendipity plays a huge role. Now, I say that, but I realize that I'm extremely privileged: well, I live in a big city.

I'm a white male in my thirties. I had time at that moment. You know, all conditions that make my advice possibly not as general as I wish it could be. But yeah, the core is: try these methods by yourself. Don't hesitate to go and make it crash, and, you know, search for information everywhere.

Just read a lot. Play a lot with it. And don't hesitate to ask. I mean, [00:51:00] well, actually, here's one key piece of advice: don't be impressed. Remember that the person in front of you, who seems like a renowned professor who knows a lot of stuff, or a super expert, actually probably, deep down, feels a very strong dose of imposter syndrome.

Or if they don't... let me put it this way: having imposter syndrome, for me, is the mark of someone who actually knows what they're doing, because it also means that they know that there's a lot they don't know. And so, yeah, realize that the person you're seeing speaking is speaking on their topic of choice, and they're a precise expert, but when they hear you speaking about what you do, they're probably equally lost, or they were when, you know, they were at your stage.

So don't hesitate to go and speak and find the people. Some will, you know, some will brush you off, some will actually really want to share their knowledge and will remember being in your shoes. And yeah, go for that. I completely agree. And that does really resonates with me and that's actually an advice that I give all the time to people.

That's what I did. And I think [00:52:00] it's very, Both very revision and also very rewarding. The one thing I would say though is that if you are, um, an airplane pilot aspiring to be an airplane pilot, just go for it and don't be afraid to make it crash is not a good career advice. Please do not do that well, and that's why, you know, you experiment in, you know, this same way, there's all this discussion again about social impact, around move fast and break things, which that's why you want to make sure where you apply it Exactly.

So know that you're in training, pick the place where you, where you crash. Do that on the simulator . Okay. And so, yeah, actually I'm curious, This is a, a topic i, I really love talking about and about science communication. And how to com, like how to help the general public understand more about, um, scientific methods and seems like you do that a lot.

So I'm curious, from your experience, what do you think are the best ways to communicate [00:53:00] about science and scientific reasoning?

Well, I have no clue what the best way is. There are masterful, you know, presenters of science out there. I'm thinking of Devi Sridhar, for example, an epidemiologist in the UK, who during COVID gave a masterful example of communicating.

You look at David Spiegelhalter, a big Bayesian statistician himself; well, he's Professor of the Public Understanding of Risk. He is amazing and, you know, is one of the best scientific communicators I've ever met. So I dunno that I have the best way. What I have done is see a few things that work when I presented them, which is, first, the passion. It's really about that.

Be passionate. Tell a story that you feel strongly about. You know, I used to do slides with a lot of math and a lot of equations in them, and, you know, as a junior I felt like, "Oh, I need to really impress people with all my equations, so that they can see that I [00:54:00] know what I'm doing."

Well, these days my slides are mostly pictures, because I'm trying to get the audience's attention, to make them want to follow what is being said, to feel committed to what I'm trying to explain. When you think about it, the measure of uncertainty, the measure of risk, these are deeply fascinating topics, and things that everyone, you know, realizes. You cross the street.

Well, if it's a busy street, I'm gonna check twice before crossing. If, you know, it's pretty quiet, I'm not gonna check as much. So this is something people can really relate to. We all do that, deeply. If I present it in the way of math... I need the math to make sure of what I'm doing.

I need the math to build it, but to communicate it, I almost need to, you know... if I say, "No, there's a lot of math there," they're gonna be scared. Whereas, no: the math formalizes intuition. Some of the best scientific talks I went to, even, [00:55:00] had this professor who made you go through these really intricate formulas as if they were obvious.

And you went through the talk with him, and you were like, "Yeah, I followed every step." Well, when you went back to it, you realized, "Well, I think he skipped a few that I really can't figure out," but in the moment it was really clear. And when it's for a general audience, don't dumb it down, because people, you know, know if you're BS-ing them: you're not carrying the real intent. But share the passion and share the story. Storytelling is extremely important there, and that's what you want to get through. And if someone wants a mathematical detail and is curious to know more, you know, open the door at the end; say, "Yeah, really, I'm always happy to discuss the more technical details and point you in the right direction."

But you need that story first and foremost. And for that, you know, capture the audience by being passionate, and they'll forgive any blunder you might make or an imperfection in your slides. You know, go for the passion.

Completely agree. And that's something I talked about already [00:56:00] quite a lot on the podcast.

I'm not gonna reiterate, but yeah, for people interested, episode 50 was with David Spiegelhalter, so I'll put that in the show notes if you want to listen or re-listen to it. Also, in episode 67, David Kipping was here. He has a fantastic YouTube channel about astrophysics and cosmology and the probability of life in the universe, things like that.

And the episode was, was really awesome. So I, I put that in the show notes also because we talked about exactly that kind of things. Like basically telling a story about science and making sure that people know that it's, it's a human story and it's not a, something that. Comes out of nowhere with just dry equations.

Okay. So before closing the show and asking you the last two questions that I ask everybody at the end of the show, I would actually like to pick your brain about one last question, which is: what do you think are the [00:57:00] biggest hurdles right now in the Bayesian workflow, especially in your field? What do you think is something that could be done differently, and probably better, in the Bayesian workflow?

I guess it depends where you apply it. You know, if you apply it to, say, a lot of these models, well, there's the challenge of how you make Bayes work at such scales, which is, you know, pretty tricky. And again, there's a very strong field of Bayesian deep learning there, which has many different ideas.

And some of these ideas... I mean, there was a really great paper by Balaji Lakshminarayanan (I'm sorry, Balaji, I'm butchering your last name), for example, looking at different ways to do ensembling and getting a measure of uncertainty without being Bayesian at all, purely by, you know, training your neural nets different times, under certain assumptions on the loss function.

It has to be a proper scoring rule. And it performs really well. And it's very simple, and it says, "Oh, we're getting uncertainty without being Bayesian." Being Bayesian or not, I [00:58:00] don't care. I want the uncertainty; whether we call it Bayesian or not, I want this probabilistic thinking. Right now, there's a massive hurdle because this kind of quantification of risk is not quite there.
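Editor's illustration: the ensembling idea described above can be sketched as follows. Several independently trained models each output a predictive mean and variance, and the ensemble is treated as an equally weighted mixture of those Gaussians. This is a minimal sketch in the spirit of that idea, not a reproduction of the paper: the numbers are made up, no network is trained here, and the function names are hypothetical. The Gaussian negative log-likelihood is included as one example of a proper scoring rule.

```python
import math


def gaussian_nll(y, mu, var):
    """Negative log-likelihood of y under Normal(mu, var); a proper scoring rule."""
    return 0.5 * (math.log(2.0 * math.pi * var) + (y - mu) ** 2 / var)


def combine_ensemble(predictions):
    """Combine (mean, variance) pairs from ensemble members into the mean and
    variance of the equally weighted Gaussian mixture."""
    m = len(predictions)
    mean = sum(mu for mu, _ in predictions) / m
    second_moment = sum(var + mu ** 2 for mu, var in predictions) / m
    return mean, second_moment - mean ** 2


# Hypothetical outputs of three separately trained models for one input.
disagreeing = [(1.0, 0.5), (2.0, 0.5), (3.0, 0.5)]
agreeing = [(2.0, 0.5), (2.0, 0.5), (2.0, 0.5)]

mean_d, var_d = combine_ensemble(disagreeing)
mean_a, var_a = combine_ensemble(agreeing)
print(mean_d, var_d)  # same mean, larger variance: members disagree
print(mean_a, var_a)  # same mean, variance equal to each member's own
```

The extra variance in the first case comes purely from disagreement between members, which is how an ensemble can report uncertainty without an explicit posterior.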

And it's not done routinely. So that, I think, is a big hurdle there, to be honest. Then obviously there is a challenge that the community is also somewhat different in deep learning. There's a huge community that came from computer science. Many statistics departments have, okay, I'm not gonna make friends here, but they have missed the boat on machine learning: "Oh no, we're more econometrics," or "We're more, you know, classical stats."

And yes, because these are methods that work when you are in a massive engineering domain, there is this gap between the departments and the backgrounds, which, if we can resolve it, is going to really help. I mean, at UCL we used to have the CSML, the Computational Statistics and Machine Learning grouping, which [00:59:00] involved the computer science department, the statistics department, and the Gatsby Neuroscience Unit, which is a fantastic machine learning lab. Now, sadly, the CSML is not super active these days, but I think it was a fantastic way to bring these different notions together.

And with the nature of deep learning, it's very easy to go and experiment and be empirical. It's harder to get the theory that goes with it, because we don't even know how to analyze these behaviors. You know, MCMC, okay: you pick up your copy of the Meyn and Tweedie book, 1991, on the behavior of Markov chains, and then you pick up the book by Robert, which applies this to MCMC, and okay, you've got a solid understanding. Not so much in deep learning. That's where there is a gap, which is a challenge for applying the Bayesian workflow in deep learning at the moment.

Okay. Yeah, yeah, definitely. Interesting. And then I can refer people to episode 68 with Kevin Murphy.

Kevin Murphy. Oh, yes. Oh, Kevin's book is awesome. The only problem is that his book is now three volumes, uh, which is taking [01:00:00] a lot of time to read, but I keep going back to it all the time. Let's close the show. You've already been very generous with your time, Julien. It's been a real pleasure.

Before that, as usual, I'm gonna ask you the two questions I ask every guest at the end of the show. So, if you had unlimited time and resources, which problem would you try to solve? Full spoiler, I'm really glad you sent me the questions yesterday night, so I could think about it, because, uh, yeah. Well, I used to work at DeepMind, you know, where the idea was: solve intelligence, and then once you solve that, you can solve everything else.

I've moved a little further away from that, because I see tech now less as a solution. To quote Kentaro Toyama, who is a professor working on the use of tech for development and for good: tech is not a solution, it's just a magnifier of human forces. So the problem I would really try to solve with unlimited resources and unlimited time would be how to get more empathy, and to [01:01:00] get people to feel safer to express empathy and encourage empathy, and drive crowds toward being more empathetic to each other.

Because once we have that, we change the human forces, and then, okay, all tech now can really, really help us rather than destroy us. And second question: if you could have dinner with any great scientific mind, dead, alive, or fictional, who would it be? I did not quite think about the fictional part at first.

I don't know whether I would make it a dinner, or just being able to follow them unseen while they work, to see how they work and observe them. You know, kinda like a stalker? Well, yeah. Let's say a ghost. A ghost. That's creepy for sure. Yeah, let's not stalk the big scientists. No, I mean, really, a personal childhood hero.

Leonardo da Vinci. Oh yeah. To ask him, you know, how it felt to work across so many different fields, and to have that freedom and to [01:02:00] carve out that freedom. These days it's getting much harder, because for an academic career you need to be super specialized in something. Hey, even for an industrial career: "What are you the go-to guy for?" is the question you see in, say, Google performance reviews for promotion, or used to see when I was there. Now, I really hate that.

I really love to work on multiple things. Statistics, machine learning, you know, they can be useful in so many different things. And the startup I'm advising, Shift Lab, we're taking that stance too. We're hiring, by the way, so hit me up. But the real thing here is I'm lucky right now to have built a job for myself where I can work on these many, many different fields. I want to see more of that. And I'd love to, you know, speak with a scientist like da Vinci, who's been exploring in so many domains.

Great choice. And you would probably have the dinner, or the stalking, happening in Florence or Rome. Yeah, that does sound super cool.

Yeah, that'd be nice. [01:03:00] Okay. Well, Julien, thanks a lot. Thank you. This was really cool. I loved it because we were able to talk about a lot of different topics, just as da Vinci would have. So I think we are pretty much on our way here. Don't ask me to paint anything. Yeah, well, you can ask, uh, Stable Diffusion.

There we go. Problem solved. By the way, you'll see on Twitter, I just posted an image. I asked, uh, Stable Diffusion with the prompt "Alex, Julien and Iron Man recording a podcast in space". So, for those of you on Twitter, I put the tweet in the show notes, because I think it's worth seeing. And so yeah, as usual, I put resources and a link to your website in the show notes for those who wanna dig deeper.

Thank you again, Julien, for taking the time and being on this show. Thank you for having me, and thanks to everyone for listening. This has been another episode of [01:04:00] Learning Bayesian Statistics. Be sure to rate, review, and subscribe to the show on your favorite podcatcher, and visit learnbayesstats.com for more resources about today's topics, as well as access to more episodes that will help you reach true Bayesian state of mind.

That's learnbayesstats.com. Our theme music is Good Bayesian, by Baba Brinkman, feat. MC Lars and Mega Ran. Check out his awesome work at bababrinkman.com. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. You can support the show and unlock exclusive benefits by visiting patreon.com/learnbayesstats.

Thanks so much for listening and for your support. You're truly a good Bayesian. Change your predictions after taking information in, and if you're thinking I'll be less than amazing, let's adjust those expectations. Let me show you how to be a good Bayesian, change calculations after taking fresh data in. Those predictions that your [01:05:00] brain is making,

let's get them on a solid foundation.

