Science-advisor
REGISTER info/FAQ
Login
username
password
     
forgot password?
register here
 
Research articles
  search articles
  reviews guidelines
  reviews
  articles index
My Pages
my alerts
  my messages
  my reviews
  my favorites
 
 
Stat
Members: 3657
Articles: 2'599'751
Articles rated: 2609

14 October 2024
 
  » arxiv » 2206.00207

 Article overview



Statistical and Computational Complexities of BFGS Quasi-Newton Method for Generalized Linear Models
Qiujiang Jin ; Tongzheng Ren ; Nhat Ho ; Aryan Mokhtari ;
Date 1 Jun 2022
AbstractThe gradient descent (GD) method has been used widely to solve parameter estimation in generalized linear models (GLMs), a generalization of linear models when the link function can be non-linear. While GD has optimal statistical and computational complexities for estimating the true parameter under the high signal-to-noise ratio (SNR) regime of the GLMs, it has sub-optimal complexities when the SNR is low, namely, the iterates of GD require polynomial number of iterations to reach the final statistical radius. The slow convergence of GD for the low SNR case is mainly due to the local convexity of the least-square loss functions of the GLMs. Even though Newton’s method can be used to resolve the flat curvature of the loss functions, its computational cost is prohibitive in high-dimensional settings due to its per iteration cost of $mathcal{O}(d^3)$. To address the shortcomings of GD and Newton’s method, we propose to use Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton method to solve parameter estimation of the GLMs, which has a per iteration cost of $mathcal{O}(d^2)$. On the optimization side, when the SNR is low, we demonstrate that iterates of BFGS converge linearly to the optimal solution of the population least-square loss function, and the contraction coefficient of the BFGS algorithm is comparable to that of Newton’s method. On the statistical side, we prove that the iterates of BFGS reach the final statistical radius of the low SNR GLMs after a logarithmic number of iterations, which is much lower than the polynomial number of iterations of GD. We also present numerical experiments that match our theoretical findings.
Source arXiv, 2206.00207
Services Forum | Review | PDF | Favorites   
 
Visitor rating: did you like this article? no 1   2   3   4   5   yes

No review found.
 Did you like this article?

This article or document is ...
important:
of broad interest:
readable:
new:
correct:
Global appreciation:

  Note: answers to reviews or questions about the article must be posted in the forum section.
Authors are not allowed to review their own article. They can use the forum section.






ScienXe.org
» my Online CV
» Free

home  |  contact  |  terms of use  |  sitemap
Copyright © 2005-2024 - Scimetrica