Article overview
Title: On the Convergence of Stochastic Gradient Descent in Low-precision Number Formats
Authors: Matteo Cacciola; Antonio Frangioni; Masoud Asgharian; Alireza Ghaffari; Vahid Partovi Nia
Date: 4 Jan 2023
Abstract: Deep learning models dominate almost all artificial intelligence tasks such as vision, text, and speech processing. Stochastic Gradient Descent (SGD) is the main tool for training such models, and its computations are usually performed in the single-precision floating-point number format. The convergence of single-precision SGD normally matches the theoretical results for real numbers, since single-precision computations exhibit negligible rounding error. The numerical error grows, however, when the computations are performed in low-precision number formats. This provides compelling reasons to study SGD convergence adapted to low-precision computation. We present both deterministic and stochastic analyses of the SGD algorithm, obtaining bounds that show the effect of the number format. Such bounds can provide guidelines on how SGD convergence is affected when constraints make high-precision computation impractical.
Source: arXiv, 2301.01651
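
To make the setting concrete, below is a minimal Python sketch of SGD in which every weight update is rounded into a simulated low-precision fixed-point format, so rounding error accumulates across iterations. The quantize function, the bit widths, and the toy quadratic objective are illustrative assumptions for this sketch, not the paper's analysis or method.

```python
import numpy as np

def quantize(x, bits=8, scale=1.0):
    # Simulate round-to-nearest in a fixed-point format with `bits`
    # fractional bits (illustrative stand-in for a low-precision format).
    step = scale / (2 ** bits)
    return np.round(x / step) * step

def sgd_low_precision(grad_fn, w0, lr=0.1, steps=100, bits=8):
    # Run SGD where each iterate is stored in the low-precision format,
    # so quantization error is injected at every update.
    w = quantize(w0, bits)
    for _ in range(steps):
        g = grad_fn(w)
        w = quantize(w - lr * g, bits)  # rounding after the update
    return w

# Toy quadratic f(w) = 0.5 * ||w||^2 with stochastic gradient g = w + noise.
rng = np.random.default_rng(0)
grad = lambda w: w + 0.01 * rng.standard_normal(w.shape)

w_hi = sgd_low_precision(grad, np.ones(4), bits=23)  # ~single-precision mantissa
w_lo = sgd_low_precision(grad, np.ones(4), bits=4)   # aggressive low precision
print(np.linalg.norm(w_hi), np.linalg.norm(w_lo))    # low precision stalls farther from the optimum
```

Running this, the low-precision run typically stops making progress once the update lr * g falls below the quantization step, which is the kind of format-dependent effect the paper's bounds characterize.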