Article overview
Smaller generalization error derived for deep compared to shallow residual neural networks | Aku Kammonen; Jonas Kiessling; Petr Plecháč; Mattias Sandberg; Anders Szepessy; Raúl Tempone | Date: |
5 Oct 2020 | Abstract: | Estimates of the generalization error are proved for a residual neural
network with $L$ random Fourier features layers
$\bar z_{\ell+1}=\bar z_\ell + \text{Re}\sum_{k=1}^K\bar b_{\ell k}e^{\mathrm{i}\omega_{\ell k}\bar z_\ell}+ \text{Re}\sum_{k=1}^K\bar c_{\ell k}e^{\mathrm{i}\omega'_{\ell k}\cdot x}$.
An optimal distribution for the frequencies
$(\omega_{\ell k},\omega'_{\ell k})$ of the random Fourier features
$e^{\mathrm{i}\omega_{\ell k}\bar z_\ell}$ and $e^{\mathrm{i}\omega'_{\ell k}\cdot x}$ is
derived. The derivation is based on the corresponding generalization error to
approximate function values $f(x)$. The generalization error turns out to be
smaller than the estimate $\|\hat f\|^2_{L^1(\mathbb{R}^d)}/(LK)$ of the
generalization error for random Fourier features with one hidden layer and the
same total number of nodes $LK$, in the case the $L^\infty$-norm of $f$ is much
less than the $L^1$-norm of its Fourier transform $\hat f$. This understanding
of an optimal distribution for random features is used to construct a new
training method for a deep residual network that shows promising results. | Source: | arXiv, 2010.01887 |
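
The layer update in the abstract can be illustrated with a minimal NumPy sketch of the forward pass. It only evaluates the recursion for given parameters. It does not implement the optimal frequency distribution or the training method, neither of which is described in the abstract. The function name `residual_rff_forward`, the array shapes, the scalar state, and the initial state `z0 = 0` are assumptions made for illustration.

```python
import numpy as np


def residual_rff_forward(x, omega, omega_prime, b, c, z0=0.0):
    """Evaluate L residual random Fourier feature layers (illustrative sketch).

    x:           (n, d) real inputs
    omega:       (L, K) real frequencies acting on the scalar state z
    omega_prime: (L, K, d) real frequencies acting on the input x
    b, c:        (L, K) complex amplitudes
    z0:          initial state; assumed here, not specified in the abstract
    """
    x = np.asarray(x, dtype=float)
    z = np.full(x.shape[0], z0, dtype=float)
    for l in range(omega.shape[0]):
        # Re sum_k b_{lk} exp(i * omega_{lk} * z_l)
        state_term = np.real(np.exp(1j * np.outer(z, omega[l])) @ b[l])
        # Re sum_k c_{lk} exp(i * omega'_{lk} . x)
        input_term = np.real(np.exp(1j * (x @ omega_prime[l].T)) @ c[l])
        # Residual update: z_{l+1} = z_l + state_term + input_term
        z = z + state_term + input_term
    return z


# Example with placeholder Gaussian frequencies and amplitudes
# (not the optimal distribution derived in the paper).
rng = np.random.default_rng(0)
L, K, d, n = 4, 8, 2, 5
omega = rng.standard_normal((L, K))
omega_prime = rng.standard_normal((L, K, d))
b = (rng.standard_normal((L, K)) + 1j * rng.standard_normal((L, K))) / K
c = (rng.standard_normal((L, K)) + 1j * rng.standard_normal((L, K))) / K
z = residual_rff_forward(rng.standard_normal((n, d)), omega, omega_prime, b, c)
```

With this parameterization the network has $LK$ frequency pairs in total, the same node count that the abstract uses when comparing against a single hidden layer of random Fourier features.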