Science-advisor
REGISTER info/FAQ
Login
username
password
     
forgot password?
register here
 
Research articles
  search articles
  reviews guidelines
  reviews
  articles index
My Pages
my alerts
  my messages
  my reviews
  my favorites
 
 
Stat
Members: 3652
Articles: 2'545'386
Articles rated: 2609

24 June 2024
 
  » arxiv » 2302.00294

 Article overview



The geometry of hidden representations of large transformer models
Lucrezia Valeriani ; Diego Doimo ; Francesca Cuturello ; Alessandro Laio ; Alessio Ansuini ; Alberto Cazzaniga ;
Date 1 Feb 2023
AbstractLarge transformers are powerful architectures for self-supervised analysis of data of various nature, ranging from protein sequences to text to images. In these models, the data representation in the hidden layers live in the same space, and the semantic structure of the dataset emerges by a sequence of functionally identical transformations between one representation and the next. We here characterize the geometric and statistical properties of these representations, focusing on the evolution of such proprieties across the layers. By analyzing geometric properties such as the intrinsic dimension (ID) and the neighbor composition we find that the representations evolve in a strikingly similar manner in transformers trained on protein language tasks and image reconstruction tasks. In the first layers, the data manifold expands, becoming high-dimensional, and then it contracts significantly in the intermediate layers. In the last part of the model, the ID remains approximately constant or forms a second shallow peak. We show that the semantic complexity of the dataset emerges at the end of the first peak. This phenomenon can be observed across many models trained on diverse datasets. Based on these observations, we suggest using the ID profile as an unsupervised proxy to identify the layers which are more suitable for downstream learning tasks.
Source arXiv, 2302.00294
Services Forum | Review | PDF | Favorites   
 
Visitor rating: did you like this article? no 1   2   3   4   5   yes

No review found.
 Did you like this article?

This article or document is ...
important:
of broad interest:
readable:
new:
correct:
Global appreciation:

  Note: answers to reviews or questions about the article must be posted in the forum section.
Authors are not allowed to review their own article. They can use the forum section.






ScienXe.org
» my Online CV
» Free

home  |  contact  |  terms of use  |  sitemap
Copyright © 2005-2024 - Scimetrica