Matthew E. Taylor, Shimon Whiteson, and Peter Stone. Comparing Evolutionary and Temporal Difference Methods for Reinforcement Learning. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), pp. 1321–28, July 2006. 46% acceptance rate, Best Paper Award in GA track (of 85 submissions).
Best Paper Award (Genetic Algorithms Track) at GECCO-2006.
Both genetic algorithms (GAs) and temporal difference (TD) methods have proven effective at solving reinforcement learning (RL) problems. However, since few rigorous empirical comparisons have been conducted, there are no general guidelines describing the methods' relative strengths and weaknesses. This paper presents the results of a detailed empirical comparison between a GA and a TD method in Keepaway, a standard RL benchmark domain based on robot soccer. In particular, we compare the performance of NEAT \cite{stanley:ec02evolving}, a GA that evolves neural networks, with Sarsa \cite{Rummery94,Singh96}, a popular TD method. The results demonstrate that NEAT can learn better policies in this task, though it requires more evaluations to do so. Additional experiments in two variations of Keepaway demonstrate that Sarsa learns better policies when the task is fully observable and NEAT learns faster when the task is deterministic. Together, these results help isolate the factors critical to the performance of each method and yield insights into their general strengths and weaknesses.
@InProceedings{GECCO06-taylor,
  author="Matthew E. Taylor and Shimon Whiteson and Peter Stone",
  title="Comparing Evolutionary and Temporal Difference Methods for Reinforcement Learning",
  booktitle="Proceedings of the Genetic and Evolutionary Computation Conference ({GECCO})",
  month="July",
  year="2006",
  pages="1321--28",
  abstract={ Both genetic algorithms (GAs) and temporal difference (TD) methods have proven effective at solving reinforcement learning (RL) problems. However, since few rigorous empirical comparisons have been conducted, there are no general guidelines describing the methods' relative strengths and weaknesses. This paper presents the results of a detailed empirical comparison between a GA and a TD method in Keepaway, a standard RL benchmark domain based on robot soccer. In particular, we compare the performance of NEAT~\cite{stanley:ec02evolving}, a GA that evolves neural networks, with Sarsa~\cite{Rummery94,Singh96}, a popular TD method. The results demonstrate that NEAT can learn better policies in this task, though it requires more evaluations to do so. Additional experiments in two variations of Keepaway demonstrate that Sarsa learns better policies when the task is fully observable and NEAT learns faster when the task is deterministic. Together, these results help isolate the factors critical to the performance of each method and yield insights into their general strengths and weaknesses. },
  note = {46% acceptance rate, {\textbf{Best Paper Award}} in GA track (of 85 submissions)},
  wwwnote={<span align="left" style="color: red; font-weight: bold">Best Paper Award</span> (Genetic Algorithms Track) at <a href="http://www.sigevo.org/gecco-2006/">GECCO-2006</a>.},
}