Due: October 2, 2009 (midnight)
No late homeworks will be accepted.
For this assignment you will learn how to perform a statistical comparison
of learning algorithms both by hand and using WEKA. You will also use WEKA
to learn about the neural network classifier (called MultilayerPerceptron).
- Consider the following error rates made by the hypotheses learned by two different
learning algorithms L1 and L2 using six trials, where each trial used the same training
and testing sets for both learners (i.e., a paired test). Determine the level of
confidence (as in Table 5.6) we have that L1 outperforms L2. Show all your work.
| Trial | L1    | L2    |
|-------|-------|-------|
| 1     | 0.363 | 0.300 |
| 2     | 0.363 | 0.300 |
| 3     | 0.263 | 0.200 |
| 4     | 0.137 | 0.400 |
| 5     | 0.037 | 0.300 |
| 6     | 0.037 | 0.300 |
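The mechanics of a paired test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `paired_t_statistic` and the error rates below are hypothetical and are not the assignment's data. The sketch computes the per-trial differences, their mean, the standard error of that mean, and the resulting t statistic, which would then be compared against a t table with k - 1 degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(errors_a, errors_b):
    """Return (mean difference, t statistic, degrees of freedom) for paired trials."""
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two equal-length samples with at least two trials")
    # Difference in error on each trial (same train/test split for both learners).
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    k = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Sample standard deviation of the differences (divides by k - 1).
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    # Standard error of the mean difference.
    se = sd / math.sqrt(k)
    return mean_diff, mean_diff / se, k - 1


# Hypothetical error rates for two learners over four paired trials.
a = [0.20, 0.25, 0.30, 0.22]
b = [0.18, 0.20, 0.27, 0.21]
mean_diff, t, dof = paired_t_statistic(a, b)
print(f"mean diff = {mean_diff:.4f}, t = {t:.3f}, dof = {dof}")
```

A positive mean difference here means the first learner had the higher error on average, so check the sign before deciding which learner the test favors.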
- Run the MultilayerPerceptron classifier on the weather.arff dataset. Use
the default parameter settings, and use the training set as the test
option. Include in your submission the printed results (weights and
statistics) from WEKA. In your report, draw the neural network
topology (input nodes, hidden nodes, output nodes, connections) used by the
classifier. Do not show the weights on your drawing.
- WEKA's default parameter settings (among others) for MultilayerPerceptron
are -L 0.3 -M 0.2 -N 500 -H a. Explain in your own words what these
parameters mean.
- Using WEKA's Experimenter application, perform the following
experiment.
- Choose a "New" experiment.
- For the Results Destination section, select ARFF file and provide a
file name in which to store the experimental results.
- For Experiment Type, choose the default settings: cross-validation
with 10 folds and classification.
- For Iteration Control, choose the default settings: 10 iterations and
data sets first.
- Select the following four datasets that come with WEKA:
contact-lenses, iris, labor, and weather.
- Select the following classifiers with default parameter settings:
ConjunctiveRule, J48 and MultilayerPerceptron.
- Run the experiment.
- Analyze the results by loading the ARFF results file, selecting the
following configuration, and performing the test.
- Testing with: Paired T-Tester (corrected)
- Comparison field: Percent_incorrect (NOTE: "incorrect", not "correct")
- Significance: 0.05
- Test base: rules.ConjunctiveRule
- Show std. deviations: (checked)
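As background for the "corrected" tester selected above, here is a minimal sketch of a corrected resampled t statistic, assuming the tester follows the usual variance correction for repeated cross-validation (the standard 1/k term is inflated by the test/train size ratio). The function name `corrected_paired_t` and the differences are hypothetical; WEKA's actual implementation should be checked against its own documentation.

```python
import math
import statistics


def corrected_paired_t(diffs, test_train_ratio):
    """Corrected resampled t statistic for per-fold performance differences.

    diffs: difference in the comparison field for each run/fold pair.
    test_train_ratio: n_test / n_train (1/9 for 10-fold cross-validation).
    """
    k = len(diffs)
    if k < 2:
        raise ValueError("need at least two differences")
    mean_diff = statistics.mean(diffs)
    var = statistics.variance(diffs)  # sample variance, divides by k - 1
    if var == 0:
        raise ValueError("zero variance; t is undefined")
    # The uncorrected test would use var / k; the correction adds the
    # test/train ratio to account for overlapping training sets.
    return mean_diff / math.sqrt((1.0 / k + test_train_ratio) * var)


# Hypothetical per-fold differences, not results from the experiment.
diffs = [0.02, 0.05, 0.03, 0.01]
t_plain = corrected_paired_t(diffs, 0.0)          # no correction
t_corrected = corrected_paired_t(diffs, 1.0 / 9)  # 10-fold CV ratio
print(f"uncorrected t = {t_plain:.3f}, corrected t = {t_corrected:.3f}")
```

The corrected statistic is always smaller in magnitude than the uncorrected one, which makes the test more conservative about declaring a significant difference.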
- Construct a table of classifiers vs. datasets, and in each entry,
enter the error and standard deviation of that classifier on that dataset
from the above experiment. Also, add an asterisk to the end of the entry
for each dataset/classifier pair for which the classifier outperforms
ConjunctiveRule at the 0.05 level.
- Next, we will use WEKA to generate ROC curves for J48 and
MultilayerPerceptron on the labor dataset. First, we need to
generate and save ROC curve data.
- Using the WEKA Explorer, open the labor dataset under the Preprocess tab.
- Under the Classify tab, choose the J48 classifier and click
Start to perform a 10-fold cross-validation test.
- In the Result list window, right-click on the J48 entry and
choose Visualize Threshold Curve and class "good". The visualization window
will open.
- Verify that the X axis is False Positive Rate and the Y axis is
True Positive Rate. You should now see the ROC curve.
- Click Save and store the results to a file in ARFF format.
- Exit the visualization window and repeat the above for the
MultilayerPerceptron classifier with default settings.
Now, we need to load the data into Excel (or some other charting software)
to visualize the ROC curves for both classifiers at once. Here's an
outline of the process for Excel.
- Edit the two ARFF files containing the threshold curve results saved
above and remove everything above and including the "@data" line. Note that
the False Positive Rate and True Positive Rate values are the sixth and
seventh entries, respectively, in each line.
- Open Excel and choose Data -> Import External Data -> Import
Data. Browse to the first ARFF file and load it as a comma-delimited
file. Do the same for the second ARFF file.
- Insert a chart of type scatter line plot and put two lines on the
plot: one is TP vs. FP for J48, and one is TP vs. FP for MultilayerPerceptron.
- This chart will now show the two ROC curves for J48 and
MultilayerPerceptron on the labor dataset.
- Nicely format your chart with a title, correct axis titles, correct
legend titles, and proper ranges on X and Y axes.
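If you use a scripting language instead of Excel, the header-stripping and column-extraction steps above might look like the following sketch. The function name `read_roc_points` and the sample rows are hypothetical. The column positions (sixth and seventh entries, zero-based indices 5 and 6) are taken from the description above and should be verified against the @attribute lines of your own files. The sketch also assumes every data row is numeric with no missing values.

```python
def read_roc_points(lines, fpr_index=5, tpr_index=6):
    """Return a list of (fpr, tpr) pairs from ARFF lines, with or without a header."""
    lines = [ln.strip() for ln in lines]
    # If the header is still present, keep only what follows the "@data" line.
    for i, ln in enumerate(lines):
        if ln.lower() == "@data":
            lines = lines[i + 1:]
            break
    points = []
    for ln in lines:
        if not ln or ln.startswith("%"):
            continue  # skip blank lines and ARFF comments
        fields = ln.split(",")
        points.append((float(fields[fpr_index]), float(fields[tpr_index])))
    return points


# Hypothetical file contents; real threshold-curve files have more columns and rows.
sample = """@relation hypothetical
@attribute c1 numeric
@data
0,0,0,0,0,0.0,0.0,0
0,0,0,0,0,0.5,0.8,0
0,0,0,0,0,1.0,1.0,0
""".splitlines()

points = read_roc_points(sample)
print(points)
```

For a real file, pass `open(path).read().splitlines()` in place of `sample`; the resulting pairs can be handed to any charting tool as the X and Y values of one ROC curve.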
- Discuss your conclusions about the performance of J48 vs.
MultilayerPerceptron on the labor dataset based on the appearance
of the ROC curves.
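One optional way to back up a visual comparison is the area under each ROC curve, approximated with the trapezoid rule. The sketch below is an illustration only: the function name `roc_auc` and both curves are hypothetical, and the assignment itself asks only for a discussion based on the appearance of the curves.

```python
def roc_auc(points):
    """Approximate the area under a ROC curve given (fpr, tpr) points."""
    pts = sorted(points)  # order by false positive rate, then true positive rate
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Area of the trapezoid between two consecutive points.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical curves, not results from the labor dataset.
curve_a = [(0.0, 0.0), (0.5, 0.8), (1.0, 1.0)]
curve_b = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]  # the diagonal (chance level)
auc_a = roc_auc(curve_a)
auc_b = roc_auc(curve_b)
print(f"AUC a = {auc_a:.3f}, AUC b = {auc_b:.3f}")
```

A curve that lies closer to the top-left corner yields a larger area, while the diagonal corresponds to an area of one half.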
- Email to me (email@example.com)
a zip file containing the following:
- Text file containing the raw output of the MultilayerPerceptron
run on the weather dataset.
- Text file containing the raw output of the first experiment above
(result from 4h).
- Raw threshold curve data for J48 and MultilayerPerceptron on the
labor dataset (the two files you saved in step 6e above).
- Nicely-formatted report (MSWord or PDF) containing:
- Analysis from question 1.
- Drawing of network used in question 2.
- Description of parameters from question 3.
- Table summarizing results of first experiment (question 5).
- Nicely-formatted plot of the two ROC curves (question 6).
- Discussion of performance comparison based on the ROC curves (question 7).