Science-advisor
REGISTER info/FAQ
Login
username
password
     
forgot password?
register here
 
Research articles
  search articles
  reviews guidelines
  reviews
  articles index
My Pages
my alerts
  my messages
  my reviews
  my favorites
 
 
Stat
Members: 3667
Articles: 2'599'751
Articles rated: 2609

07 February 2025
 
  » arxiv » 2301.01651

 Article overview



On the Convergence of Stochastic Gradient Descent in Low-precision Number Formats
Matteo Cacciola ; Antonio Frangioni ; Masoud Asgharian ; Alireza Ghaffari ; Vahid Partovi Nia ;
Date 4 Jan 2023
AbstractDeep learning models are dominating almost all artificial intelligence tasks such as vision, text, and speech processing. Stochastic Gradient Descent (SGD) is the main tool for training such models, where the computations are usually performed in single-precision floating-point number format. The convergence of single-precision SGD is normally aligned with the theoretical results of real numbers since they exhibit negligible error. However, the numerical error increases when the computations are performed in low-precision number formats. This provides compelling reasons to study the SGD convergence adapted for low-precision computations. We present both deterministic and stochastic analysis of the SGD algorithm, obtaining bounds that show the effect of number format. Such bounds can provide guidelines as to how SGD convergence is affected when constraints render the possibility of performing high-precision computations remote.
Source arXiv, 2301.01651
Services Forum | Review | PDF | Favorites   
 
Visitor rating: did you like this article? no 1   2   3   4   5   yes

No review found.
 Did you like this article?

This article or document is ...
important:
of broad interest:
readable:
new:
correct:
Global appreciation:

  Note: answers to reviews or questions about the article must be posted in the forum section.
Authors are not allowed to review their own article. They can use the forum section.






ScienXe.org
» my Online CV
» Free

home  |  contact  |  terms of use  |  sitemap
Copyright © 2005-2025 - Scimetrica