Article overview
Delayed Feedback in Kernel Bandits
Authors: Sattar Vakili; Danyal Ahmed; Alberto Bernacchia; Ciara Pike-Burke
Date: 1 Feb 2023
Abstract: Black box optimisation of an unknown function from expensive and noisy
evaluations is a ubiquitous problem in machine learning, academic research and
industrial production. An abstraction of the problem can be formulated as a
kernel based bandit problem (also known as Bayesian optimisation), where a
learner aims at optimising a kernelized function through sequential noisy
observations. The existing work predominantly assumes feedback is immediately
available; an assumption which fails in many real world situations, including
recommendation systems, clinical trials and hyperparameter tuning. We consider
a kernel bandit problem under stochastically delayed feedback, and propose an
algorithm with $\tilde{\mathcal{O}}(\sqrt{\Gamma_k(T)T}+\mathbb{E}[\tau])$
regret, where $T$ is the number of time steps, $\Gamma_k(T)$ is the maximum
information gain of the kernel with $T$ observations, and $\tau$ is the delay
random variable. This represents a significant improvement over the state-of-the-art
regret bound of $\tilde{\mathcal{O}}(\Gamma_k(T)\sqrt{T}+\mathbb{E}[\tau]\Gamma_k(T))$
reported in Verma et al. (2022). In particular, for very non-smooth kernels, the
information gain grows almost linearly in time, trivializing the existing
results. We also validate our theoretical results with simulations.
Source: arXiv:2302.00392
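
To see concretely why the new bound is stronger, here is a rough back-of-the-envelope comparison; the scaling $\Gamma_k(T) \approx T^{c}$ is an illustrative assumption motivated by the abstract's remark on near-linear information gain, not a computation taken from the paper. With $c$ close to $1$,

$$
\underbrace{\tilde{\mathcal{O}}\big(\Gamma_k(T)\sqrt{T}\big)}_{\text{Verma et al. (2022)}} \approx T^{\,c+1/2},
\qquad
\underbrace{\tilde{\mathcal{O}}\big(\sqrt{\Gamma_k(T)\,T}\big)}_{\text{this paper}} \approx T^{\,(c+1)/2}.
$$

For $c > 1/2$ the former already exceeds the trivial linear-in-$T$ regret and is therefore vacuous, whereas the latter remains sublinear for every $c < 1$; this is the sense in which almost-linear information gain trivializes the existing result.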