ISL Colloquium


Kernel Thinning and Stein Thinning

Lester Mackey – Principal Researcher, Microsoft Research

Thu, 9-Mar-2023 / 4:00pm / Packard 202

The talk will only be streamed on Zoom. Sign up at https://stanford.zoom.us/meeting/register/tJckfuCurzkvEtKKOBvDCrPv3McapgP6HygJ to receive the Zoom link.

Abstract

This talk will introduce two new tools for summarizing a probability distribution more effectively than independent sampling or standard Markov chain Monte Carlo thinning:

  1. Given an initial n-point summary (for example, from independent sampling or a Markov chain), kernel thinning finds a subset of only √n points with comparable worst-case integration error across a reproducing kernel Hilbert space.
  2. If the initial summary suffers from biases due to off-target sampling, tempering, or burn-in, Stein thinning simultaneously compresses the summary and improves the accuracy by correcting for these biases.
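
To make the two quantities in the list above concrete, here is a minimal NumPy sketch, not the speaker's implementation. The first part computes the worst-case integration error over an RKHS unit ball (the maximum mean discrepancy, MMD) for plain standard thinning; it only measures the baseline and does not implement kernel thinning itself. The second part is a small greedy Stein thinning loop in the spirit of "Optimal Thinning of MCMC Output". Everything specific here is an assumption made for illustration: the function names (`gauss_kernel`, `mmd`, `stein_kernel`, `greedy_stein_thin`, `ksd`), the one-dimensional standard normal target, the Gaussian and inverse multiquadric base kernels with unit scale, and the synthetic "chain" whose first half is off-target burn-in.

```python
import numpy as np

def gauss_kernel(a, b, bandwidth=1.0):
    # Gaussian base kernel on 1-D points (assumed choice for the MMD demo).
    d = a[:, None] - b[None, :]
    return np.exp(-d**2 / (2.0 * bandwidth**2))

def mmd(a, b):
    # Worst-case integration error between the empirical distributions of
    # a and b over the unit ball of the Gaussian-kernel RKHS.
    val = gauss_kernel(a, a).mean() - 2.0 * gauss_kernel(a, b).mean() + gauss_kernel(b, b).mean()
    return float(np.sqrt(max(val, 0.0)))

def score(x):
    # Score (gradient of the log density) of the assumed target N(0, 1).
    return -x

def stein_kernel(x, y):
    # Langevin Stein kernel built from the IMQ base kernel (1 + (x - y)^2)^(-1/2).
    r = x[:, None] - y[None, :]
    u = 1.0 + r**2
    sx, sy = score(x)[:, None], score(y)[None, :]
    return u**(-1.5) - 3.0 * r**2 * u**(-2.5) + (sx - sy) * r * u**(-1.5) + sx * sy * u**(-0.5)

def ksd(x):
    # Kernel Stein discrepancy between the points x and the target.
    return float(np.sqrt(max(stein_kernel(x, x).mean(), 0.0)))

def greedy_stein_thin(x, m):
    # Greedily pick m indices (repeats allowed); each step minimizes
    # k_P(x, x) / 2 + sum over already chosen x_j of k_P(x_j, x).
    kp = stein_kernel(x, x)
    objective = np.diag(kp) / 2.0
    chosen = []
    for _ in range(m):
        i = int(np.argmin(objective))
        chosen.append(i)
        objective = objective + kp[i]
    return np.array(chosen)

rng = np.random.default_rng(0)

# Part 1: MMD between an n-point i.i.d. sample and its standard-thinned subset.
n = 400
m = int(np.sqrt(n))                      # sqrt(n) = 20 retained points
iid_sample = rng.normal(0.0, 1.0, n)
standard_subset = iid_sample[:: n // m]  # keep every (n / m)-th point
mmd_standard = mmd(iid_sample, standard_subset)

# Part 2: a synthetic chain whose first half is biased burn-in around 3.
chain = np.concatenate([rng.normal(3.0, 1.0, n // 2), rng.normal(0.0, 1.0, n // 2)])
standard_thinned = chain[:: n // m]
stein_idx = greedy_stein_thin(chain, m)
stein_thinned = chain[stein_idx]

print("MMD(full, standard thinning):", mmd_standard)
print("KSD standard thinning:", ksd(standard_thinned))
print("KSD Stein thinning:   ", ksd(stein_thinned))
```

In this toy setup, standard thinning keeps burn-in points in proportion to their share of the chain, while the greedy Stein criterion is free to skip them because it scores candidates against the target's score function rather than against the chain itself.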

These tools are especially well-suited for tasks that incur substantial downstream computation costs per summary point, such as organ and tissue modeling, in which each simulation consumes thousands of CPU hours.

Based on joint work with Raaz Dwivedi, Marina Riabiz, Wilson Ye Chen, Jon Cockayne, Pawel Swietach, Steven A. Niederer, Chris J. Oates, Abhishek Shetty, and Carles Domingo-Enrich:

  • Kernel Thinning (arXiv:2105.05842)
  • Optimal Thinning of MCMC Output (arXiv:2005.03952)
  • Generalized Kernel Thinning (arXiv:2110.01593)
  • Distribution Compression in Near-linear Time (arXiv:2111.07941)
  • Compress Then Test: Powerful Kernel Testing in Near-linear Time (arXiv:2301.05974)

Bio

Lester Mackey is a Principal Researcher at Microsoft Research, where he develops machine learning methods, models, and theory for large-scale learning tasks driven by applications from climate forecasting, healthcare, and the social good. Lester moved to Microsoft from Stanford University, where he was an assistant professor of Statistics and, by courtesy, of Computer Science. He earned his PhD in Computer Science and MA in Statistics from UC Berkeley and his BSE in Computer Science from Princeton University. He co-organized the second place team in the Netflix Prize competition for collaborative filtering; won the Prize4Life ALS disease progression prediction challenge; won prizes for temperature and precipitation forecasting in the yearlong real-time Subseasonal Climate Forecast Rodeo; received best paper, outstanding paper, and best student paper awards from the ACM Conference on Programming Language Design and Implementation, the Conference on Neural Information Processing Systems, and the International Conference on Machine Learning; and was elected to the COPSS Leadership Academy.