Rahul Mazumder | ISL Colloquium

Abstract

Modern machine learning models excel in prediction accuracy and model utility but are often difficult to interpret. Additionally, their large sizes lead to high storage, inference, and evaluation costs. First, using decision tree ensembles (e.g., random forests and gradient boosting) as an example, we discuss extracting simple tree substructures (rule collections) from large ensembles. Our estimator is defined as the solution to a large-scale discrete optimization problem (an integer program) that can balance multiple considerations, including prediction accuracy, number of rules, rule depth, feature usage, and stability constraints. Second, we will explore how related optimization problems arise, and can be highly effective, in compressing large neural networks and LLMs, where the main goal is improving model efficiency. We will discuss different algorithms, some amenable to GPU acceleration, for addressing these large-scale mathematical optimization tasks. Some of these algorithmic tools may be of independent interest in other areas of statistics, machine learning, and operations research.

Bio

Rahul Mazumder is the NTU Associate Professor of Operations Research and Statistics at the MIT Sloan School of Management. He is affiliated with the MIT Operations Research Center, the MIT Center for Statistics and Data Science, and the Laboratory for Information and Decision Systems. His research interests are at the intersection of statistics, machine learning, and mathematical programming (large-scale convex and mixed integer optimization), and their applications to industry and the sciences. He is a recipient of the Leo Breiman Junior Award from the American Statistical Association, the International Indian Statistical Association Early Career Award in Statistics and Data Science, the INFORMS Donald P. Gaver, Jr. Early Career Award for Excellence in Operations Research, the INFORMS Optimization Society Young Researchers Prize, the Office of Naval Research Young Investigator Award, and the INFORMS ICS Prize (Honorable Mention). He currently serves or has recently served as an associate editor of the Annals of Statistics, Operations Research, the Journal of Machine Learning Research, and Bernoulli. He is a series editor of the Cambridge University Press textbook series (Algorithms area).

Simplifying Complex Machine Learning Models with Mathematical Optimization
