On Gap-dependent Bounds for Offline Reinforcement Learning
Xinqi Wang; Qiwen Cui; Simon S. Du
Date: 1 Jun 2022
Abstract: This paper presents a systematic study on gap-dependent sample complexity in offline reinforcement learning. Prior work showed when the density ratio between an optimal policy and the behavior policy is upper bounded (the optimal policy coverage assumption), then the agent can achieve an $O\left(\frac{1}{\epsilon^2}\right)$ rate, which is also minimax optimal. We show under the optimal policy coverage assumption, the rate can be improved to $O\left(\frac{1}{\epsilon}\right)$ when there is a positive sub-optimality gap in the optimal $Q$-function. Furthermore, we show when the visitation probabilities of the behavior policy are uniformly lower bounded for states where an optimal policy's visitation probabilities are positive (the uniform optimal policy coverage assumption), the sample complexity of identifying an optimal policy is independent of $\frac{1}{\epsilon}$. Lastly, we present nearly-matching lower bounds to complement our gap-dependent upper bounds.
Source: arXiv:2206.00177
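
For readers unfamiliar with the terms in the abstract, the three conditions it names can be sketched in common tabular-RL notation. This is only an illustrative sketch: the symbols $d^{\pi}$, $\mu$, $C^*$, $P$ and $\Delta_{\min}$ are assumptions introduced here, not taken from the abstract, and the paper's own definitions (for example, indexing by time step, or stating coverage over states rather than state-action pairs) may differ.

```latex
% Assumed notation (hypothetical, not from the abstract):
%   d^{\pi}(s,a)  visitation probability of (s,a) under policy \pi
%   \mu           behavior policy that generated the offline data
%   \pi^*         an optimal policy; V^*, Q^* the optimal value functions
%   C^*, P        positive constants

% Sub-optimality gap of action a in state s, and the minimum positive gap:
\Delta(s,a) = V^*(s) - Q^*(s,a), \qquad
\Delta_{\min} = \min_{(s,a)\,:\,\Delta(s,a) > 0} \Delta(s,a)

% Optimal policy coverage: the density ratio is upper bounded
% (with the convention 0/0 = 0):
\frac{d^{\pi^*}(s,a)}{d^{\mu}(s,a)} \le C^* \quad \text{for all } (s,a)

% Uniform optimal policy coverage: the behavior policy's visitation
% probability is bounded below wherever the optimal policy visits:
d^{\mu}(s,a) \ge P \quad \text{whenever } d^{\pi^*}(s,a) > 0
```

Under this reading, "a positive sub-optimality gap" means $\Delta_{\min} > 0$, which is the condition the abstract ties to the improvement from the $O\left(\frac{1}{\epsilon^2}\right)$ rate to the $O\left(\frac{1}{\epsilon}\right)$ rate.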