| | |
| | |
Stat |
Members: 3645 Articles: 2'500'096 Articles rated: 2609
18 April 2024 |
|
| | | |
|
Article overview
| |
|
Be Aware of Non-Stationarity: Nearly Optimal Algorithms for Piecewise-Stationary Cascading Bandits | Lingda Wang
; Huozhi Zhou
; Bingcong Li
; Lav R. Varshney
; Zhizhen Zhao
; | Date: |
12 Sep 2019 | Abstract: | Cascading bandit (CB) is a variant of both the multi-armed bandit (MAB) and
the cascade model (CM), where a learning agent aims to maximize the total
reward by recommending $K$ out of $L$ items to a user. We focus on a common
real-world scenario where the user’s preference can change in a
piecewise-stationary manner. Two efficient algorithms, exttt{GLRT-CascadeUCB}
and exttt{GLRT-CascadeKL-UCB}, are developed. The key idea behind the
proposed algorithms is incorporating an almost parameter-free change-point
detector, the Generalized Likelihood Ratio Test (GLRT), within classical upper
confidence bound (UCB) based algorithms. Gap-dependent regret upper bounds of
the proposed algorithms are derived and both match the lower bound
$Omega(sqrt{T})$ up to a poly-logarithmic factor $sqrt{log{T}}$ in the
number of time steps $T$. We also present numerical experiments on both
synthetic and real-world datasets to show that exttt{GLRT-CascadeUCB} and
exttt{GLRT-CascadeKL-UCB} outperform state-of-the-art algorithms in the
literature. | Source: | arXiv, 1909.5886 | Services: | Forum | Review | PDF | Favorites |
|
|
No review found.
Did you like this article?
Note: answers to reviews or questions about the article must be posted in the forum section.
Authors are not allowed to review their own article. They can use the forum section.
browser claudebot
|
| |
|
|
|
| News, job offers and information for researchers and scientists:
| |