Machine Learning

Homework 2

Due: February 17, 2004 (midnight)

For this assignment you will use WEKA to learn about and compare decision-tree induction and neural network classifiers.

  1. Run the J48 decision-tree classifier on the weather.arff dataset. Use the default parameter settings for J48, and select "Use training set" as the test option. Include in your report the printed results (tree and statistics) from WEKA, and draw graphically the decision tree classifier learned by J48.
  2. WEKA's default parameter settings for J48 are -C 0.25 -M 2. Explain in your own words what these mean.
  3. Run the NeuralNetwork classifier on the weather.arff dataset. Use the default parameter settings, and select "Use training set" as the test option. Include in your report the printed results (weights and statistics) from WEKA, and draw graphically the neural network topology (input nodes, hidden nodes, output nodes, connections) used by the classifier. Do not show the weights on your drawing.
  4. WEKA's default parameter settings (among others) for NeuralNetwork are -L 0.3 -M 0.2 -N 500 -H a. Explain in your own words what these mean.
  5. Run the ConjunctiveRule, J48, and NeuralNetwork classifiers on the 7 databases supplied with WEKA and on the 8th database you collected for Homework 1. For each run, use the Percentage Split test option with 66% training. Include in your report a table giving the error rate on the test set for each classifier on each database. Note that J48 will not run on databases with a numeric class value; for the ConjunctiveRule and NeuralNetwork classifiers on those databases, report the Relative Absolute Error statistic as the error rate.
  6. Compare the performance of the three classifiers. Specifically, discuss which classifiers perform better on which databases, and why.
  7. Email me (holder@cse.uta.edu) your nicely formatted report (MS Word, PDF, or PostScript) containing the information referred to above.
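
Hint for item 2: J48's -C flag is the confidence factor used in C4.5-style pessimistic error pruning (smaller values give a more pessimistic error estimate and hence prune more aggressively), and -M is the minimum number of instances allowed at a leaf. A minimal Python sketch of the pessimistic error bound, using the standard formula from the C4.5 literature (the function name here is our own, not part of WEKA):

```python
from math import sqrt
from statistics import NormalDist

def pessimistic_error(f, n, cf=0.25):
    """Upper confidence bound on the true error rate at a tree node.

    f  -- observed error rate at the node (errors / n)
    n  -- number of training instances reaching the node
    cf -- confidence factor (WEKA's -C option; default 0.25)
    """
    z = NormalDist().inv_cdf(1.0 - cf)  # one-tailed z for the confidence level
    num = f + z * z / (2 * n) + z * sqrt(f / n - f * f / n + z * z / (4 * n * n))
    return num / (1 + z * z / n)

# The estimate always exceeds the observed rate, and the gap shrinks as n
# grows -- which is why subtrees supported by few instances get pruned first.
print(pessimistic_error(0.2, 10))   # small node: large pessimistic penalty
print(pessimistic_error(0.2, 100))  # larger node: smaller penalty
```

Lowering -C (say, to 0.1) raises the pessimistic estimate at every node and so prunes more of the tree.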
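
Hint for item 4: -L is the backpropagation learning rate, -M the momentum, -N the number of training epochs, and -H a sets the hidden-layer size to (attributes + classes) / 2. The role of learning rate and momentum shows up in the weight update delta = -L * gradient + M * previous_delta. A toy sketch of that update (our own illustration, not WEKA code), minimizing f(w) = w^2:

```python
def update(w, grad, prev_delta, lr=0.3, momentum=0.2):
    """One backprop-style weight update: gradient step plus momentum term."""
    delta = -lr * grad + momentum * prev_delta
    return w + delta, delta

w, delta = 1.0, 0.0
for _ in range(50):                     # 50 "epochs" on a 1-D toy objective
    w, delta = update(w, 2 * w, delta)  # gradient of w**2 is 2*w
print(w)  # converges toward the minimum at 0
```

The momentum term reuses a fraction of the previous step, which smooths oscillations; a larger -L takes bigger steps per epoch at the risk of overshooting.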
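
Hint for item 5: the Relative Absolute Error statistic WEKA reports is the total absolute prediction error divided by the error of always predicting the mean of the actual values, expressed as a percentage. A minimal sketch (the function name is our own):

```python
def relative_absolute_error(predicted, actual):
    """RAE = sum|p_i - a_i| / sum|a_i - mean(a)| * 100%."""
    mean_a = sum(actual) / len(actual)
    num = sum(abs(p - a) for p, a in zip(predicted, actual))
    den = sum(abs(a - mean_a) for a in actual)
    return 100.0 * num / den

# Half the error of the mean-predictor baseline:
print(relative_absolute_error([1.0, 2.0, 4.0], [1.0, 2.0, 3.0]))  # -> 50.0
```

A value under 100% means the classifier beats the trivial predict-the-mean baseline, which makes RAE a reasonable stand-in for an error rate on numeric-class databases.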