#78 Exploring MCMC Sampler Algorithms, with Matt D. Hoffman

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

Matt Hoffman has already worked on many topics in his life – music information retrieval, speech enhancement, user behavior modeling, social network analysis, astronomy, you name it.

Obviously, picking questions for him was hard, so we ended up talking more or less freely — which is one of my favorite types of episodes, to be honest.

You’ll hear about the circumstances Matt would advise picking up Bayesian stats, generalized HMC, blocked samplers, why do the samplers he works on have food-based names, etc.

In case you don’t know him, Matt is a research scientist at Google. Before that, he did a postdoc in the Columbia Stats department, working with Andrew Gelman, and a Ph.D at Princeton, working with David Blei and Perry Cook.

Matt is probably best known for his work in approximate Bayesian inference algorithms, such as stochastic variational inference and the no-U-turn sampler, but he’s also worked on a wide range of applications, and contributed to software such as Stan and TensorFlow Probability.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Thomas Wiecki, Chad Scherrer, Nathaniel Neitzke, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Joshua Duncan, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Raul Maldonado, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, David Haas, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Trey Causey, Andreas Kröpelin, Raphaël R, Nicolas Rode and Gabriel Stechschulte.

Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag 😉

Links from the show:

Matt’s website: http://matthewdhoffman.com/
Matt on Google Scholar: https://scholar.google.com/citations?hl=en&user=IeHKeGYAAAAJ&view_op=list_works
The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo: https://www.jmlr.org/papers/volume15/hoffman14a/hoffman14a.pdf
Tuning-Free Generalized Hamiltonian Monte Carlo: https://proceedings.mlr.press/v151/hoffman22a/hoffman22a.pdf
Nested R-hat: Assessing the convergence of Markov chain Monte Carlo when running many short chain: http://www.stat.columbia.edu/~gelman/research/unpublished/nestedRhat.pdf
Automatic Reparameterisation of Probabilistic Programs: http://proceedings.mlr.press/v119/gorinova20a/gorinova20a.pdf

Abstract

written by Christoph Bamberg

In this episode, Matt D. Hoffman, a Google research scientist discussed his work on probabilistic sampling algorithms with me. Matt has a background in music information retrieval, speech enhancement, user behavior modeling, social network analysis, and astronomy.

He came to machine learning (ML) and computer science through his interest in synthetic music and later took a Bayesian modeling class during his PhD.

He mostly works on algorithms, including Markov Chain Monte Carlo (MCMC) methods that can take advantage of hardware acceleration, believing that running many small chains in parallel is better for handling autocorrelation than running a few longer chains.

Matt is interested in Bayesian neural networks but is also skeptical about their use in practice.

He recently contributed to a generalised Hamilton Monte Carlo (HMC) sampler, and previously worked on an alternative to the No-U-Turn-Sampler (NUTS) called MEADS. We discuss the applications for these samplers and how they differ from one another.

In addition, Matt introduces an improved R-hat diagnostic tool, nested R-hat, that he and colleagues developed.

Automated Transcript

Please note that the following transcript was generated automatically and may therefore contain errors. Feel free to reach out if you’re willing to correct them.