  
  

   

Article overview
 

Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus

Authors: Qiwen Cui; Simon S. Du
Date: 1 Jun 2022

Abstract: This paper considers offline multi-agent reinforcement learning. We propose the strategy-wise concentration principle, which directly builds a confidence interval for the joint strategy, in contrast to the point-wise concentration principle that builds a confidence interval for each point in the joint action space. For two-player zero-sum Markov games, by exploiting the convexity of the strategy-wise bonus, we propose a computationally efficient algorithm whose sample complexity enjoys a better dependency on the number of actions than prior methods based on the point-wise bonus. Furthermore, for offline multi-agent general-sum Markov games, based on the strategy-wise bonus and a novel surrogate function, we give the first algorithm whose sample complexity only scales with $\sum_{i=1}^m A_i$, where $A_i$ is the action size of the $i$-th player and $m$ is the number of players. In sharp contrast, the sample complexity of methods based on the point-wise bonus would scale with the size of the joint action space $\prod_{i=1}^m A_i$ due to the curse of multiagents. Lastly, all of our algorithms can naturally take a pre-specified strategy class $\Pi$ as input and output a strategy that is close to the best strategy in $\Pi$. In this setting, the sample complexity only scales with $\log|\Pi|$ instead of $\sum_{i=1}^m A_i$.

Source: arXiv, 2206.00159



 


