 Article overview



Learning to Collaborate in Markov Decision Processes
Goran Radanovic; Rati Devidze; David Parkes; Adish Singla
Date: 23 Jan 2019
Abstract: We consider a two-agent MDP framework where agents repeatedly solve a task in a collaborative setting. We study the problem of designing a learning algorithm for the first agent (A1) that facilitates a successful collaboration even in cases when the second agent (A2) is adapting its policy in an unknown way. The key challenge in our setting is that the presence of the second agent leads to non-stationarity and non-obliviousness of rewards and transitions for the first agent.
We design novel online learning algorithms for agent A1 whose regret decays as $O(T^{1-\frac{3}{7} \cdot \alpha})$ with $T$ learning episodes, provided that the magnitude of agent A2’s policy changes between any two consecutive episodes is upper bounded by $O(T^{-\alpha})$. Here, the parameter $\alpha$ is assumed to be strictly greater than $0$, and we show that this assumption is necessary provided that the learning parity with noise problem is computationally hard. We show that sub-linear regret of agent A1 further implies near-optimality of the agents’ joint return for MDPs that manifest the properties of a smooth game.
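To make the stated rate concrete, the short sketch below evaluates the exponent $1 - \frac{3}{7} \cdot \alpha$ from the abstract for a few sample values of $\alpha$. The function name `regret_exponent` and the sample values are illustrative assumptions, not taken from the paper, and the abstract does not say over which range of $\alpha$ the bound holds, so larger values should not be read into it.

```python
from fractions import Fraction


def regret_exponent(alpha: Fraction) -> Fraction:
    """Return e such that the abstract's regret bound reads O(T^e).

    Implements e = 1 - (3/7) * alpha, the exponent quoted in the abstract.
    The helper name is hypothetical and used only for illustration.
    """
    if alpha <= 0:
        # The abstract assumes alpha is strictly greater than 0.
        raise ValueError("alpha must be strictly greater than 0")
    return 1 - Fraction(3, 7) * alpha


# Sample values of alpha chosen for illustration only.
for alpha in (Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    e = regret_exponent(alpha)
    # Any exponent below 1 means the bound grows sub-linearly in T.
    print(f"alpha = {alpha}: regret bound O(T^({e})), sub-linear: {e < 1}")
```

For any $\alpha > 0$ the exponent is strictly below 1, which matches the abstract's claim of sub-linear regret; as $\alpha$ approaches $0$ the exponent approaches 1 and the bound becomes linear in $T$.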
Source: arXiv, 1901.8029