Reinforcement Learning and Beyond

A 1-Day Tutorial at AAMAS-10

Tuesday, May 11, 2010

Overview:

Reinforcement Learning (RL) addresses the problem of an agent that must learn to complete a given task optimally. In its most general formulation, the agent is assumed to have no prior knowledge of the environment or the task. It must therefore discover the optimal course of action through direct interaction with the environment, building useful experience for the task at hand while receiving evaluative feedback that assesses how well it is currently doing. Reinforcement learning lies at the intersection of artificial intelligence, machine learning and control theory, relying on and binding together ideas from these fields in an elegant yet rigorous way.
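The interaction loop described above can be made concrete with a small sketch. The following is not part of the tutorial materials, just a minimal tabular Q-learning example on an invented five-state chain task (states, actions, rewards and parameters are all assumptions chosen for illustration): the agent repeatedly acts, observes the next state and a scalar reward, and updates its value estimates.

```python
import random

# A hypothetical 5-state chain: the agent starts in state 0 and
# receives reward 1 only upon reaching state 4. Actions: 0 = left, 1 = right.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

def step(state, action):
    """Environment dynamics: move one step left or right, clipped to the chain."""
    next_state = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def train(episodes=500, seed=0):
    random.seed(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Epsilon-greedy exploration: mostly exploit, occasionally explore.
            if random.random() < EPSILON:
                action = random.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward = step(state, action)
            # Q-learning update: bootstrap from the best next-state value.
            q[state][action] += ALPHA * (
                reward + GAMMA * max(q[next_state]) - q[state][action]
            )
            state = next_state
    return q

q = train()
# After training, the greedy policy points "right" in every non-goal state.
policy = [0 if q[s][0] > q[s][1] else 1 for s in range(GOAL)]
print(policy)  # [1, 1, 1, 1]
```

Note that the agent never sees the `step` function itself, only its outputs; this is the sense in which RL assumes no prior knowledge of the environment.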

In this one-day tutorial, we first provide a brief overview of the fundamental concepts of RL. Second, we discuss current speed-up techniques, such as reward shaping and transfer learning. Third, we move beyond "classical" RL and explore the close relation between RL and learning automata (LA) in the context of multiagent systems and evolutionary game theory, discussing several applications of this framework, including social learning, coordination, and fairness.

Detailed Tutorial Structure:
  • 8:30-10:00: Introduction and Overview
  • 10:00-10:30: Coffee Break
  • 10:30-12:00: Francisco: Fundamental Concepts of RL
  • 12:00-1:30: Lunch
  • 1:30-3:00: Alessandro and Matt: Current RL speed-up techniques
  • 3:00-3:30: Coffee Break
  • 3:30-5:00: Katja, Peter, and Steven: Learning automata, multi-agent systems, and evolutionary game theory
Presenters:
Picture of F. Melo

Francisco S. Melo is currently a Senior Researcher at the Intelligent Agents and Synthetic Characters Group (GAIPS) of INESC-ID, in Portugal. He completed his PhD in Electrical and Computer Engineering in 2007 at the Instituto Superior Tecnico (IST), in Lisbon, Portugal. In his thesis, he developed and analyzed RL algorithms for cooperative navigation tasks. Prior to his current position with GAIPS/INESC-ID, he held appointments as a Post-doctoral Fellow at the School of Computer Science, Carnegie Mellon University (CMU), and as a short-term researcher at the Vision Lab (VisLab), IST, where he worked on the application of machine learning in general (and RL in particular) to developmental robotics.

His current research focuses on theoretical aspects of RL, multiagent systems and developmental robotics. He has published several papers on general aspects of RL (AAMAS, ICML, COLT, ECC), planning and learning in multiagent systems (AAMAS, ICRA) and developmental robotics (ICRA, IROS, AISB). Details about his research interests and publications can be found on his webpage:

http://gaips.inesc-id.pt/~fmelo/

Picture of A. Lazaric

Alessandro Lazaric is a postdoc at the INRIA Lille-Nord Europe research center, working in the SequeL group with Remi Munos. He graduated magna cum laude in computer engineering at the Politecnico di Milano, Italy, in 2004. In 2005, he received a Master of Science from the College of Engineering at the University of Illinois at Chicago (UIC). In the same year, he began a Ph.D. with support from MIURST (The Italian Ministry for University and Research). He received his doctorate from the Department of Electronics and Information at the Politecnico di Milano in May 2008, with a thesis on Knowledge Transfer in Reinforcement Learning.

Current research interests include transfer learning, reinforcement learning, and multi-armed bandit problems. Details about his research interests and publications can be found on his webpage:

http://home.dei.polimi.it/lazaric/

Picture of M. Taylor

Matthew E. Taylor is a postdoctoral research associate at the University of Southern California working with Milind Tambe. He graduated magna cum laude with a double major in computer science and physics from Amherst College in 2001. After working for two years as a software developer, he began his Ph.D. with support from the College of Natural Sciences' MCD fellowship. He received his doctorate from the Department of Computer Sciences at the University of Texas at Austin in the summer of 2008 after working under Prof. Peter Stone.

Current research interests include transfer learning, reinforcement learning, human-agent interaction, and multi-agent systems. Details about his research interests and publications can be found on his webpage:

http://teamcore.usc.edu/taylorm

Picture of K. Verbeeck

Katja Verbeeck is a lecturer at the Information Technology Group of the Katholieke Hogeschool Sint-Lieven, Ghent, and an associated researcher of the computer science department at the Katholieke Universiteit Leuven (Belgium). She teaches courses such as Intelligent Agents, Distributed Computing and Machine Learning to master-level students. Previously, she was a post-doctoral researcher at the Institute for Knowledge and Agent Technology at the University of Maastricht, the Netherlands. She received M.S. degrees in mathematics (1995) and computer science (1997), and a PhD degree (2004), all from the Vrije Universiteit Brussel (VUB, Belgium). In her dissertation, under the supervision of Prof. dr. Ann Nowe from the Computational Modeling Lab at the VUB, she studied the role of exploration in multi-agent reinforcement learning.

Her research interests include Reinforcement Learning in non-stationary environments, Multi-Agent Reinforcement Learning, Learning Automata, (Evolutionary) Game Theory, Multiagent Systems and Ant Algorithms. Details about her research interests and publications can be found on her webpage:

http://allserv.kahosl.be/~katja/

Picture of P. Vrancx

Peter Vrancx received a M.S. degree in computer science from the Vrije Universiteit Brussel (VUB), Brussels, Belgium, in 2004. He is currently pursuing a Ph.D. degree, funded by a Ph.D. grant of the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT Vlaanderen). He is a member of the Computational Modeling Lab, COMO, at VUB.

His research interests are multi-agent learning, learning automata, Markov games and stigmergy. Details about his research interests and publications can be found on his webpage:

http://como.vub.ac.be/~pvrancx

Picture of S. de Jong

Steven de Jong received his M.S. degree in artificial intelligence (AI) from Maastricht University, Maastricht, the Netherlands, in 2004. One week after AAMAS 2009, he will defend his Ph.D. thesis entitled “Fairness in Multi-Agent Systems”, which explores why and how multi-agent systems may be enriched with human-inspired fairness mechanisms. He currently teaches Knowledge Engineering at Maastricht University and performs research as a (future) member of the Robotics Institute of Carnegie Mellon University, Pittsburgh, United States of America.

His research interests include multi-agent systems, nature-inspired approaches, evolutionary game theory and behavioral economics. Details about his research interests and publications can be found on his webpage:

http://www.cs.unimaas.nl/steven.dejong