Article overview
Title: Connecting the Dots Between MLE and RL for Sequence Generation
Authors: Bowen Tan; Zhiting Hu; Zichao Yang; Ruslan Salakhutdinov; Eric Xing
Date: 24 Nov 2018
Abstract: Sequence generation models such as recurrent networks can be trained with a diverse set of learning algorithms. For example, maximum likelihood learning is simple and efficient, yet suffers from the exposure bias problem. Reinforcement learning methods such as policy gradient address this problem but can have prohibitively poor exploration efficiency. A variety of other algorithms, such as RAML, SPG, and data noising, have been developed from different perspectives. This paper establishes a formal connection between these algorithms. We present a generalized entropy-regularized policy optimization formulation and show that the apparently divergent algorithms can all be reformulated as special instances of this framework, differing only in the configuration of the reward function and a couple of hyperparameters. The unified interpretation offers a systematic view of the varying properties of exploration and learning efficiency. Based on the framework, we further present a new algorithm that dynamically interpolates among the existing algorithms for improved learning. Experiments on machine translation and text summarization demonstrate the superiority of the proposed algorithm.
Source: arXiv, 1811.09740
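The abstract's unifying claim is that MLE, RAML, and policy-gradient-style training differ only in the reward function and a couple of hyperparameters. A minimal numerical sketch of that idea follows, assuming the framework's target distribution has the exponentiated-score form q(y) ∝ exp((α·log p_θ(y) + R(y)) / (α + β)); the candidate sequences, reward values, and function names here are illustrative assumptions, not taken from the paper itself.

```python
import math

def erpo_q(candidates, log_p, reward, alpha, beta):
    """Target distribution q(y) ∝ exp((alpha * log p_theta(y) + R(y)) / (alpha + beta)).

    This is a sketch of the entropy-regularized policy optimization view
    described in the abstract: different (reward, alpha, beta) settings
    recover different training algorithms as special cases.
    """
    scores = [(alpha * log_p[y] + reward[y]) / (alpha + beta) for y in candidates]
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Toy candidate space: three hypothetical output sequences.
cands = ["the cat sat", "a cat sat", "the dog ran"]
log_p = {c: math.log(1 / 3) for c in cands}   # uniform model, for illustration
target = "the cat sat"

# MLE-like special case: delta reward on the ground truth, alpha -> 0, beta = 1.
# q collapses onto the target, i.e. the usual maximum-likelihood training signal.
delta_R = {c: (0.0 if c == target else -1e9) for c in cands}
q_mle = erpo_q(cands, log_p, delta_R, alpha=1e-8, beta=1.0)

# RAML-like special case: a smooth task reward (e.g. token overlap), alpha -> 0,
# beta = 1. q becomes an exponentiated-reward distribution over candidates.
smooth_R = {"the cat sat": 0.0, "a cat sat": -1.0, "the dog ran": -2.0}
q_raml = erpo_q(cands, log_p, smooth_R, alpha=1e-8, beta=1.0)
```

With the delta reward, almost all of q's mass sits on the ground-truth sequence; with the smooth reward, q spreads over candidates in proportion to exp(R), which is the exploration/efficiency trade-off the abstract attributes to the choice of reward and hyperparameters.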