Due: September 14, 2007 (midnight)
No late homework will be accepted.
For this assignment you will learn about the decision-tree induction
classifier and compare it to the ConjunctiveRule classifier.
- Exercise 3.2, page 77 of Mitchell's book. Also compute the information
gain of a1 relative to the training examples, and indicate which of the
two attributes is the better choice to split on in a decision-tree
learning algorithm such as ID3.
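For reference, the information gain of an attribute A over a sample S is
Gain(S, A) = Entropy(S) - sum over values v of (|S_v|/|S|) * Entropy(S_v).
A minimal Python sketch of that computation (run on a made-up toy dataset,
not the actual examples from Exercise 3.2) might look like:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a collection of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(examples, attr, target):
    """Gain(S, A) = Entropy(S) - sum_v |S_v|/|S| * Entropy(S_v)."""
    n = len(examples)
    remainder = 0.0
    for value in {e[attr] for e in examples}:
        subset = [e[target] for e in examples if e[attr] == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy([e[target] for e in examples]) - remainder

# Hypothetical toy dataset, purely for illustration:
toy = [
    {"a1": "T", "a2": "T", "class": "+"},
    {"a1": "T", "a2": "F", "class": "+"},
    {"a1": "F", "a2": "T", "class": "-"},
    {"a1": "F", "a2": "F", "class": "-"},
]
print(info_gain(toy, "a1", "class"))  # a1 predicts the class perfectly -> 1.0
print(info_gain(toy, "a2", "class"))  # a2 is uninformative -> 0.0
```

In the toy table, a1 determines the class exactly, so its gain equals the full
entropy of the sample (1 bit), while a2 carries no information and gains 0; the
same comparison decides which attribute ID3 would split on.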
- Run the J48 decision-tree classifier on the weather.arff dataset. Use
the default parameter settings for J48, and select "Use training set" as
the test option. Include in your report the printed results (tree and
statistics) from WEKA, and draw a graphical rendering of the decision
tree learned by J48.
- WEKA's default parameter settings for J48 are -C 0.25 -M
2. Explain in your own words what these mean.
- Run the ConjunctiveRule and J48 classifiers on the 10
databases supplied with WEKA and the extra database you collected for
Homework 1. For each run, use the Percentage Split test option with 66%
training. Include in your report a table giving the mean absolute error
on the test set for both classifiers on each database. Note that for
databases with numeric class values, J48 will not run.
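As a reminder, the mean absolute error is the average of the absolute
differences between predicted and actual values (for nominal classes, WEKA
averages over the predicted class probabilities instead). A minimal sketch of
the numeric case, using hypothetical prediction values:

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    assert len(actual) == len(predicted), "need one prediction per instance"
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical predictions on a three-instance test split:
print(mean_absolute_error([1.0, 2.0, 3.0], [1.0, 3.0, 5.0]))  # prints 1.0
```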
- Compare the performance of the two classifiers. Specifically, explain
which classifier performs better on which databases, and why.
- Email your nicely formatted report (MS Word, PDF, or PostScript)
containing the information above to me (firstname.lastname@example.org).