The Mario Project

Created by Matthew E. Taylor and published as a Model Assignment by the 2011 EAAI symposium.
You may wish to check Matt's webpage for the latest version of the project.
This is version 1.0.

I have tried to make everything self-explanatory, but I am happy to answer questions. Also, if you use this project, I would be very interested in any feedback you're able to provide!

Summary This assignment is appropriate for a course with a reinforcement learning component (e.g., Introductory AI or Machine Learning). In it, students are asked to implement, test, and evaluate multiple reinforcement learning algorithms within the Generalized Mario domain. An analysis of the student outcomes from this assignment is available as an EAAI-11 paper.
Topics Reinforcement Learning, Machine Learning
Audience Third and fourth year undergraduates, or beginning graduate students
Difficulty Upper-level undergraduates who are majoring in computer science should be able to do well on this assignment given sufficient time. None of the concepts used are particularly difficult, assuming they are also discussed during class lecture, but they take time to master and implement. I suggest that the project be scheduled for 3-4 weeks.
Strengths Students found the project quite engaging; they clearly had fun working in the video game task. Steps 7 and up may be completed in any order, or omitted entirely.
Weaknesses Students may have trouble getting the initial implementation of Sarsa working (step 5). Further steps will be more difficult for the students without a working Sarsa implementation for comparison.
Dependencies Students need the ability to write "non-trivial" programs. This project can be used early in a semester-long machine learning course if few of the extensions are used, or later in the course if the extensions are emphasized. Java is recommended, but C, C++, Lisp, Matlab, and Python are also supported by the framework.
Variants Others could make this assignment easier by providing more of the framework to students, e.g., defining a state representation. There are many possible extensions to this work. For instance, my students also examined 1) Meta Learning, 2) Function Approximation, 3) Hierarchical Learning, and 4) User Shaping Reward as part of a follow-up project in which students picked their own topic.
The Mario domain is discussed in more detail here. The code to run Mario can be downloaded here; that page also has instructions for installing the software and running a demo agent. The code should work on Linux, Mac, and Windows. (Local backups of the domain description and installation software are included in case the website becomes unavailable.)
The handout, with a full description of the project, can be found here.