Science-advisor
REGISTER info/FAQ
Login
username
password
     
forgot password?
register here
 
Research articles
  search articles
  reviews guidelines
  reviews
  articles index
My Pages
my alerts
  my messages
  my reviews
  my favorites
 
 
Stat
Members: 3652
Articles: 2'545'386
Articles rated: 2609

24 June 2024
 
  » arxiv » 2302.00284

 Article overview



Selective Uncertainty Propagation in Offline RL
Sanath Kumar Krishnamurthy ; Tanmay Gangwani ; Sumeet Katariya ; Branislav Kveton ; Anshuka Rangi ;
Date 1 Feb 2023
AbstractWe study the finite-horizon offline reinforcement learning (RL) problem. Since actions at any state can affect next-state distributions, the related distributional shift challenges can make this problem far more statistically complex than offline policy learning for a finite sequence of stochastic contextual bandit environments. We formalize this insight by showing that the statistical hardness of offline RL instances can be measured by estimating the size of actions’ impact on next-state distributions. Furthermore, this estimated impact allows us to propagate just enough value function uncertainty from future steps to avoid model exploitation, enabling us to develop algorithms that improve upon traditional pessimistic approaches for offline RL on statistically simple instances. Our approach is supported by theory and simulations.
Source arXiv, 2302.00284
Services Forum | Review | PDF | Favorites   
 
Visitor rating: did you like this article? no 1   2   3   4   5   yes

No review found.
 Did you like this article?

This article or document is ...
important:
of broad interest:
readable:
new:
correct:
Global appreciation:

  Note: answers to reviews or questions about the article must be posted in the forum section.
Authors are not allowed to review their own article. They can use the forum section.






ScienXe.org
» my Online CV
» Free

home  |  contact  |  terms of use  |  sitemap
Copyright © 2005-2024 - Scimetrica