Article overview

Learning to Collaborate in Markov Decision Processes
Goran Radanovic; Rati Devidze; David Parkes; Adish Singla
Date:	23 Jan 2019
Abstract:	We consider a two-agent MDP framework where agents repeatedly solve a task in a collaborative setting. We study the problem of designing a learning algorithm for the first agent (A1) that facilitates a successful collaboration even in cases when the second agent (A2) is adapting its policy in an unknown way. The key challenge in our setting is that the presence of the second agent leads to non-stationarity and non-obliviousness of rewards and transitions for the first agent. We design novel online learning algorithms for agent A1 whose regret decays as $O(T^{1-\frac{3}{7} \cdot \alpha})$ with $T$ learning episodes provided that the magnitude of agent A2’s policy changes between any two consecutive episodes is upper bounded by $O(T^{-\alpha})$. Here, the parameter $\alpha$ is assumed to be strictly greater than $0$, and we show that this assumption is necessary provided that the {\em learning parity with noise} problem is computationally hard. We show that sub-linear regret of agent A1 further implies near-optimality of the agents’ joint return for MDPs that manifest the properties of a {\em smooth} game.
Source:	arXiv, 1901.8029
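
As a rough illustration of the regret bound quoted in the abstract, the sketch below evaluates the exponent $1 - \frac{3}{7} \cdot \alpha$ for a few values of $\alpha$. The function names and the sample values of $\alpha$ are assumptions made for this example only; they do not come from the paper, and the range of $\alpha$ over which the bound applies is whatever the paper states beyond $\alpha > 0$.

```python
from fractions import Fraction


def regret_exponent(alpha):
    """Exponent e in the quoted bound O(T^e), where e = 1 - (3/7) * alpha.

    Hypothetical helper for illustration; it only evaluates the formula
    given in the abstract.
    """
    alpha = Fraction(alpha)
    if alpha <= 0:
        # The abstract assumes alpha is strictly greater than 0.
        raise ValueError("alpha must be strictly greater than 0")
    return 1 - Fraction(3, 7) * alpha


def is_sublinear(alpha):
    """True when the exponent is below 1, i.e. the bound is sub-linear in T."""
    return regret_exponent(alpha) < 1


# Sample values of alpha chosen arbitrarily for the illustration.
for alpha in (Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    print(f"alpha = {alpha}: regret bound O(T^({regret_exponent(alpha)}))")
```

Any $\alpha > 0$ gives an exponent strictly below 1, so the bound is sub-linear in $T$, and a larger $\alpha$ (slower policy change by A2) gives a smaller exponent.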