Speaker: Professor Babak Hassibi
(Caltech)

Title: Stochastic Descent Algorithms: Minimax Optimality, Implicit
Regularization, and Deep Networks

Abstract: Stochastic descent methods have had a long history in optimization,
adaptive filtering, and online learning and have recently gained tremendous
popularity as the workhorse for deep learning. So much so that, it is now
widely recognized that the success of deep networks is not only due to their
special deep architecture, but also due to the behavior of the stochastic
descent methods used, which plays a key role in reaching "good"
solutions that generalize well to unseen data. In an attempt to shed some light
on why this is the case, we revisit some minimax properties of stochastic
gradient descent (SGD)---originally developed for quadratic loss and linear
models in the context of H-infinity control in the 1990's---and extend them to
general stochastic mirror descent (SMD) algorithms for general loss functions
and nonlinear models. These minimax properties can be used to explain the
convergence and implicit-regularization of the algorithms when the linear
regression problem is over-parametrized (in what is now being called the
"interpolating regime"). In the nonlinear setting, exemplified by
training a deep neural network, we show that when the setup is "highly
over-parametrized", stochastic descent methods enjoy similar convergence
and implicit-regularization properties. This observation gives some insight
into why deep networks exhibit such powerful generalization abilities. It is
also a further example of what is increasingly referred to as the
"blessing of dimensionality".

Bio: Babak Hassibi is the inaugural Mose and Lillian S. Bohn Professor of Electrical
Engineering at the California Institute of Technology, where he has been since
2001, From 2011 to 2016 he was the Gordon M Binder/Amgen Professor of
Electrical Engineering and during 2008-2015 he was Executive Officer of
Electrical Engineering, as well as Associate Director of Information Science
and Technology. Prior to Caltech, he was a Member of the Technical Staff in the
Mathematical Sciences Research Center at Bell Laboratories, Murray Hill, NJ. He
obtained his PhD degree from Stanford University in 1996 and his BS degree from
the University of Tehran in 1989. His research interests span various aspects
of information theory, communications, signal processing, control, and machine
learning. He is an ISI highly cited author in Computer Science and, among other
awards, is the recipient of the US Presidential Early Career Award for
Scientists and Engineers (PECASE) and the David and Lucille Packard Fellowship
in Science and Engineering. He is General co-Chair of the 2020 IEEE
International Symposium on Information Theory (ISIT 2020).