Article overview
Weight Prediction Boosts the Convergence of AdamW
Author: Lei Guan
Date: 1 Feb 2023
Source: arXiv, 2302.00195

Abstract: In this paper, we introduce weight prediction into the AdamW optimizer to boost its convergence when training deep neural network (DNN) models. In particular, ahead of each mini-batch training step, we predict the future weights according to the update rule of AdamW and then use the predicted weights for both the forward pass and backpropagation. In this way, the AdamW optimizer always uses the gradients w.r.t. the future weights, rather than the current weights, to update the DNN parameters, yielding better convergence. Our proposal is simple and straightforward to implement yet effective in boosting the convergence of DNN training. We performed extensive experimental evaluations on image classification and language modeling tasks to verify its effectiveness. The experimental results confirm that our proposal boosts the convergence of AdamW and achieves better accuracy than AdamW when training DNN models.
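The idea described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the function names (`adamw_step`, `predict_weights`) and the toy quadratic loss are assumptions made here for clarity, and the exact prediction rule used in the paper may differ in details such as the number of lookahead steps.

```python
import numpy as np

def adamw_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8, wd=1e-2):
    """One AdamW update: Adam moment estimates plus decoupled weight decay."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)          # bias-corrected second moment
    w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * w)
    return w, m, v

def predict_weights(w, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8, wd=1e-2):
    """Forecast the post-update weights from the cached optimizer state,
    i.e. replay the AdamW update rule without computing a new gradient."""
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * w)

# Toy quadratic loss L(w) = 0.5 * ||w||^2, so grad L(w) = w.
w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 6):
    w_pred = predict_weights(w, m, v, t)  # predicted future weights
    g = w_pred                            # gradient evaluated at the *predicted* weights
    w, m, v = adamw_step(w, g, m, v, t)   # actual AdamW update of the current weights
```

The key point the sketch shows is the decoupling: the gradient `g` is computed at `w_pred` (standing in for the forward/backward pass on the predicted weights), while the AdamW step is still applied to the current weights `w`.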