Article overview
Early Stopping in Deep Networks: Double Descent and How to Eliminate it
Authors: Reinhard Heckel; Fatih Furkan Yilmaz
Date: 20 Jul 2020

Abstract: Over-parameterized models, in particular deep networks, often exhibit a
double descent phenomenon: as a function of model size, the error first
decreases, then increases, and finally decreases again. This intriguing double descent
behavior also occurs as a function of training epochs, and has been conjectured
to arise because training epochs control the model complexity. In this paper,
we show that such epoch-wise double descent arises for a different reason: it is
caused by a superposition of two or more bias-variance tradeoffs, which arise
because different parts of the network are learned at different times;
eliminating this superposition by properly scaling the stepsizes can significantly improve
early stopping performance. We show this analytically for i) linear regression,
where differently scaled features give rise to a superposition of bias-variance
tradeoffs, and for ii) a two-layer neural network, where the first and second
layers each govern a bias-variance tradeoff. Inspired by this theory, we study
a five-layer convolutional network empirically and show that eliminating
epoch-wise double descent by adjusting the stepsizes of different layers
significantly improves early stopping performance.

Source: arXiv, 2007.10099
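The linear-regression example from the abstract is easy to sketch in code. Below is a minimal illustration in NumPy (an assumed setup, not the authors' implementation; all names and constants are illustrative): gradient descent on a least-squares problem whose features come in two groups with different scales, so the groups are fitted at different speeds and their bias-variance tradeoffs superpose over training time. Rescaling the stepsize per group, here via a simple diagonal preconditioner, aligns the timescales, which is the fix the abstract describes.

# Minimal sketch (assumed setup, not the authors' code) of the paper's
# linear-regression example: two feature groups with different scales are
# learned at different speeds under gradient descent, superposing two
# bias-variance tradeoffs over training time. A per-group stepsize
# (a diagonal preconditioner) aligns the timescales.
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 40                                          # samples, features
scales = np.r_[np.ones(d // 2), 0.1 * np.ones(d // 2)]  # two feature scales
theta_star = rng.standard_normal(d)                     # ground-truth weights

X = rng.standard_normal((n, d)) * scales                # training data
y = X @ theta_star + 0.5 * rng.standard_normal(n)       # noisy labels
X_te = rng.standard_normal((10 * n, d)) * scales        # test data
y_te = X_te @ theta_star                                # noiseless test labels

def test_risk_curve(stepsizes, iters=3000, lr=1e-2):
    # Gradient descent with per-coordinate stepsizes; returns test risk per iteration.
    theta = np.zeros(d)
    risks = np.empty(iters)
    for t in range(iters):
        grad = X.T @ (X @ theta - y) / n
        theta -= lr * stepsizes * grad
        risks[t] = np.mean((X_te @ theta - y_te) ** 2)
    return risks

plain = test_risk_curve(np.ones(d))           # single stepsize: tradeoffs superpose
aligned = test_risk_curve(1.0 / scales ** 2)  # per-group rescaling: one tradeoff

print("best early-stopping test risk, plain GD:   ", plain.min())
print("best early-stopping test risk, rescaled GD:", aligned.min())

In a deep network the analogous fix is a per-layer learning rate, which in PyTorch, for example, can be expressed with optimizer parameter groups, one per layer.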