The Amazing Race: Repeated Update Q-Learning vs. Q-Learning
Keywords:
Q-learning, Repeated Update Q-learning algorithm, Markovian decision processes, simulation
Abstract
This paper compares the performance of two reinforcement learning algorithms: the Repeated Update Q-learning algorithm (RUQL) [1] and the Q-learning algorithm (QL) [5]. The experiment uses a simulated version of a robot crawler developed by [6], shown in Figure 1. Several trials and tests were conducted to estimate the difference in the crawler's movement under the two algorithms. The paper also gives a detailed description of the elements of the Markovian decision process (MDP) [2] for the task at hand, namely its states, actions, and rewards. The parameters used and tuned in the experiment are reported, along with the reasons for choosing their values. Finally, the source code of the crawler robot was modified to implement both RUQL and QL, using Eclipse [3] and Java SE Development Kit 8 (JDK) [4]. The results of the crawler simulation showed that RUQL significantly outperforms traditional QL.
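For readers unfamiliar with the two algorithms, the difference between them can be sketched as two one-step update rules. The sketch below is illustrative only and is not the crawler code used in the experiment: the class and method names are hypothetical, and the RUQL rule is written in the closed form we understand [1] to propose, in which the learning rate is corrected by the probability pi of the action that was taken.

```java
// Illustrative sketch only; names are hypothetical and not taken from the crawler source.
final class UpdateRules {
    private UpdateRules() {}

    /** Learning target r + gamma * max over next-state Q-values (nextQ assumed non-empty). */
    static double target(double reward, double gamma, double[] nextQ) {
        double best = Double.NEGATIVE_INFINITY;
        for (double q : nextQ) {
            best = Math.max(best, q);
        }
        return reward + gamma * best;
    }

    /** Q-learning: move Q(s,a) a fraction alpha of the way toward the target. */
    static double qlUpdate(double q, double alpha, double target) {
        return (1.0 - alpha) * q + alpha * target;
    }

    /**
     * RUQL, assumed closed form: behaves as if the QL update were repeated
     * 1/pi times, where pi is the probability of the chosen action (0 < pi <= 1).
     */
    static double ruqlUpdate(double q, double alpha, double pi, double target) {
        double keep = Math.pow(1.0 - alpha, 1.0 / pi);
        return keep * q + (1.0 - keep) * target;
    }

    public static void main(String[] args) {
        double t = target(1.0, 0.9, new double[] {0.0, 1.0});
        System.out.println("QL:   " + qlUpdate(0.0, 0.1, t));
        System.out.println("RUQL: " + ruqlUpdate(0.0, 0.1, 0.5, t));
    }
}
```

Under this form, an action chosen with probability 1 is updated exactly as in QL, while a rarely chosen action takes a larger step toward the target.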
References
1. Abdallah, S. and Kaisers, M. (2013). Addressing the Policy-bias of Q-learning by Repeating Updates. pp.1045--1052.
2. Bellman, R. (1957). A Markovian decision process.
3. Guindon, C. (2016). [online] Eclipse.org. Available at: http://eclipse.org [Accessed 01 June 2016].
4. Oracle.com, (2016). Oracle | Hardware and Software, Engineered to Work Together. [online] Available at: http://oracle.com [Accessed 01 June 2016].
5. Watkins, C. and Dayan, P. (1992). Q-learning. Machine learning, 8(3-4), pp.279--292.
6. Berghen, F. (2016). Kranf site: research. [online] Applied-mathematics.net. Available at: http://www.applied-mathematics.net [Accessed 01 June 2016].
7. Applied-mathematics.net, (2016). [online] Available at: http://www.applied-mathematics.net/qlearning/BotQLearning.java [Accessed 01 June 2016].
8. Tokic, M., Ertel, W. and Fessler, J. (2009). The Crawler, A Class Room Demonstrator for Reinforcement Learning.
9. Wikipedia.com, (2016). [online] Available at: http://en.wikipedia.org/wiki/Eclipse_(software) [Accessed 09 June 2016].
10. Youtube.com, (2016). YouTube. [online] Available at: http://youtube.com [Accessed 01 June 2016].
11. Botvinick, M. (2012). Hierarchical reinforcement learning and decision making. Current Opinion In Neurobiology, 22(6), pp.956--962. http://dx.doi.org/10.1016/j.conb.2012.05.008.