A simple metric learning algorithm for reinforcement learning domains
Let's suppose that you have an agent acting in a 2D environment (e.g., Mountain Car). That agent can print out data while it's acting in the environment in (state, action, next state) form.
Given this data, we can reformat it a bit with some scripting and then run it through MATLAB with an algorithm called HOLLER. The result is a learned Mahalanobis distance that can significantly improve an agent's ability to learn without relying on manually scaled state variables.
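The reformatting step above can be sketched as a small parsing script. This is only an illustration, not part of the HOLLER release: the field layout shown (position, velocity, action, next position, next velocity, one transition per line) is an assumption, so adapt the parsing to whatever your agent actually prints.

```python
# Hypothetical log format: "pos vel action next_pos next_vel", one transition
# per line. The real layout depends on how your agent prints its data.
def parse_transitions(lines):
    """Turn whitespace-separated log rows into (state, action, next state) tuples."""
    transitions = []
    for line in lines:
        vals = [float(x) for x in line.split()]
        s, a, s_next = vals[:2], int(vals[2]), vals[3:5]
        transitions.append((s, a, s_next))
    return transitions

log = ["-0.5 0.0 2 -0.499 0.001"]
print(parse_transitions(log))
```

From here, the tuples can be written out in whatever layout the MATLAB code expects.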
Here is an example with the HOLLER code. After extraction, running the code will parse and analyze the test.dat data file generated by Mountain Car and return a 2x2 metric. This distance metric can be used directly to calculate similarities between states (i.e., to scale the state space).
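Using the learned metric is straightforward: the distance between two states s1 and s2 under a Mahalanobis metric M is sqrt((s1 - s2)^T M (s1 - s2)). A minimal sketch in Python (the matrix values below are made up for illustration; the real M is whatever HOLLER returns from your data):

```python
# Hypothetical learned 2x2 metric for Mountain Car's (position, velocity) state.
# The actual values come from running HOLLER on your own transition data.
M = [[1.0, 0.2],
     [0.2, 25.0]]

def mahalanobis_distance(s1, s2, M):
    """sqrt((s1 - s2)^T M (s1 - s2)) for 2D states s1, s2."""
    d = [a - b for a, b in zip(s1, s2)]
    quad = sum(d[i] * M[i][j] * d[j] for i in range(2) for j in range(2))
    return quad ** 0.5

s_a = [-0.5, 0.00]
s_b = [-0.4, 0.01]
print(mahalanobis_distance(s_a, s_b, M))
```

Note how an identity M recovers plain Euclidean distance, while the off-diagonal and diagonal terms learned by HOLLER rescale and correlate the state variables, which is exactly what replaces manual scaling.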
The code is not particularly well documented -- after reading the paper, if you'd like more details about the code or have trouble getting it to work on higher-dimensional problems, feel free to write Matt.