Article overview
AIXIjs: A Software Demo for General Reinforcement Learning
John Aslanides
Date: 22 May 2017
Abstract: Reinforcement learning is a general and powerful framework with which to
study and implement artificial intelligence. Recent advances in deep learning
have enabled RL algorithms to achieve impressive performance in restricted
domains such as playing Atari video games (Mnih et al., 2015) and, recently,
the board game Go (Silver et al., 2016). However, we are still far from
constructing a generally intelligent agent. Many of the obstacles and open
questions are conceptual: What does it mean to be intelligent? How does one
explore and learn optimally in general, unknown environments? What, in fact,
does it mean to be optimal in the general sense? The universal Bayesian agent
AIXI (Hutter, 2005) is a model of a maximally intelligent agent, and plays a
central role in the sub-field of general reinforcement learning (GRL).
Recently, AIXI has been shown to be flawed in important ways; it doesn’t
explore enough to be asymptotically optimal (Orseau, 2010), and it can perform
poorly with certain priors (Leike and Hutter, 2015). Several variants of AIXI
have been proposed to attempt to address these shortfalls: among them are
entropy-seeking agents (Orseau, 2011), knowledge-seeking agents (Orseau et al.,
2013), Bayes with bursts of exploration (Lattimore, 2013), MDL agents (Leike,
2016a), Thompson sampling (Leike et al., 2016), and optimism (Sunehag and
Hutter, 2015). We present AIXIjs, a JavaScript implementation of these GRL
agents. This implementation is accompanied by a framework for running
experiments against various environments, similar to OpenAI Gym (Brockman et
al., 2016), and a suite of interactive demos that explore different properties
of the agents, similar to REINFORCEjs (Karpathy, 2015). We use AIXIjs to
present numerous experiments illustrating fundamental properties of, and
differences between, these agents.
Source: arXiv, 1705.7615