An Empirical Study on the Transferability of Transformer Modules in Parameter-Efficient Fine-Tuning
Mohammad AkbarTajari; Sara Rajaee; Mohammad Taher Pilehvar
Date 1 Feb 2023
Abstract
Parameter-efficient fine-tuning approaches have recently garnered a lot of attention. With a considerably lower number of trainable weights, these methods can bring about scalability and computational effectiveness. In this paper, we look for optimal sub-networks and investigate the capability of different transformer modules in transferring knowledge from a pre-trained model to a downstream task. Our empirical results suggest that every transformer module in BERT can act as a winning ticket: fine-tuning each specific module while keeping the rest of the network frozen can lead to performance comparable to full fine-tuning. Among different modules, LayerNorms exhibit the best capacity for knowledge transfer with limited trainable weights, to the extent that, with only 0.003% of all parameters in the layer-wise analysis, they show acceptable performance on various target tasks. As for the reasons behind their effectiveness, we argue that their notable performance could be attributed to their high-magnitude weights compared to those of the other modules in the pre-trained BERT.
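To make the idea of tuning one module while freezing the rest more concrete, here is a minimal sketch in plain Python. It is not the authors' code. The parameter names, the hidden size of 768, and the total of roughly 110 million parameters are assumptions chosen to resemble a BERT-base-like model, and `layer_param_sizes` and `select_trainable` are hypothetical helpers. Under those assumptions, the LayerNorm weights of a single layer come to a fraction of the model in the same range as the 0.003% quoted in the abstract.

```python
from typing import Dict

# Assumptions for illustration only (not taken from the paper):
HIDDEN = 768                 # hidden size of a BERT-base-like model
TOTAL_PARAMS = 110_000_000   # rough total parameter count of such a model


def layer_param_sizes(layer: int) -> Dict[str, int]:
    """Return hypothetical parameter names and sizes for one encoder layer.

    Only a few representative parameters are listed; a real layer has more.
    """
    prefix = f"encoder.layer.{layer}"
    return {
        f"{prefix}.attention.self.query.weight": HIDDEN * HIDDEN,
        f"{prefix}.attention.self.query.bias": HIDDEN,
        f"{prefix}.attention.output.LayerNorm.weight": HIDDEN,
        f"{prefix}.attention.output.LayerNorm.bias": HIDDEN,
        f"{prefix}.output.LayerNorm.weight": HIDDEN,
        f"{prefix}.output.LayerNorm.bias": HIDDEN,
    }


def select_trainable(sizes: Dict[str, int], keyword: str) -> Dict[str, int]:
    """Keep parameters whose name contains `keyword`; all others stay frozen."""
    return {name: n for name, n in sizes.items() if keyword in name}


sizes = layer_param_sizes(0)
trainable = select_trainable(sizes, "LayerNorm")
trainable_count = sum(trainable.values())
percent = 100 * trainable_count / TOTAL_PARAMS

print(f"trainable parameters: {trainable_count}")
print(f"share of the assumed total: {percent:.4f}%")
```

In a deep learning framework, the same name-based selection would typically be applied by disabling gradient updates for every parameter outside the selected set before training.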
Source arXiv, 2302.00378