Machine Learning

Homework 3

Due: March 2, 2004 (midnight)

For this assignment you will use WEKA to perform a statistical comparison of several learning algorithms using several datasets.

  1. Using WEKA's Experimenter environment, perform the following experiment.
    1. For the Results Destination section, select ARFF file and provide a file name in which to store the experimental results.
    2. For Experiment Type, choose 10-fold cross-validation and classification.
    3. For Iteration Control, choose the default settings: 10 iterations and data sets first.
    4. Select the following five datasets that come with WEKA: contact-lenses, iris, labor, soybean and weather.
    5. Select the following classifiers with default parameter settings: ConjunctiveRule, NaiveBayes and J48.
    6. Run the experiment.
    7. Analyze the results: load the ARFF results file, select "Percent_incorrect" as the comparison field, set the significance level to 0.05, select ConjunctiveRule as the test base, check the option to show standard deviations, and perform the test.
  2. Construct a table of classifiers vs. datasets, and in each entry, enter the error and standard deviation of that classifier on that dataset from the above experiment. Also, add an asterisk to the end of the entry for each dataset/classifier pair for which the classifier outperforms ConjunctiveRule at the 0.05 level.
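One way to assemble an entry for the table in item 2 is sketched below in Python. This is only an illustration: the helper name `table_entry` and the numbers are hypothetical, and the sample standard deviation used here is an assumption that may differ slightly from what WEKA reports, so prefer the values in WEKA's own output.

```python
from statistics import mean, stdev

def table_entry(errors, significantly_better=False):
    """Format one table cell as 'mean (std)', with '*' if flagged."""
    cell = f"{mean(errors):.2f} ({stdev(errors):.2f})"
    return cell + "*" if significantly_better else cell

# Hypothetical percent-incorrect values, NOT real WEKA output.
runs = [10.0, 20.0, 30.0]
print(table_entry(runs))
print(table_entry(runs, significantly_better=True))
```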
  3. For the above experiment, what is the lowest significance level (to the nearest tenth) at which the NaiveBayes classifier significantly outperforms the J48 classifier on the soybean dataset?
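For intuition about the significance test used in items 1 through 3, the Python sketch below computes a corrected resampled paired t statistic from per-run error rates. It assumes the tester is the corrected paired variant (check which tester your WEKA version actually applies); the function name and the numbers are hypothetical. For 10-fold cross-validation, the test-to-train size ratio is 1/9.

```python
import math
from statistics import mean, variance

def corrected_paired_t(errors_a, errors_b, test_train_ratio):
    """Corrected resampled paired t statistic for two classifiers.

    errors_a and errors_b hold the error of each classifier on the same
    runs. A positive result means classifier A had the higher error.
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    k = len(diffs)
    var = variance(diffs)  # sample variance of the paired differences
    if var == 0:
        raise ValueError("differences have zero variance")
    return mean(diffs) / math.sqrt((1.0 / k + test_train_ratio) * var)

# Hypothetical percent-incorrect values, NOT real WEKA output.
t = corrected_paired_t([30, 32, 28, 31], [20, 24, 18, 22], 1 / 9)
print(round(t, 3))
```

The lowest significance level at which a difference counts as significant is the p-value of this statistic. The sketch stops at the statistic itself, so WEKA's analysis panel remains the tool for answering item 3.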
  4. Next, we will use WEKA to generate ROC curves for NaiveBayes and J48 on the labor dataset. First, we need to generate and save ROC curve data.
    1. Using the WEKA Explorer, open the labor dataset under the Preprocess tab.
    2. Under the Classify tab, choose the NaiveBayes classifier and click Start to perform a 10-fold cross-validation test.
    3. In the Result list window, right-click on the NaiveBayes entry, choose Visualize Threshold Curve, and select the class "good". The visualization window will appear.
    4. Change the X axis to be False Positive Rate, and change the Y axis to be True Positive Rate. You should now see the ROC curve.
    5. Click Save and store the results to a file in ARFF format.
    6. Exit the visualization window and repeat the above for the J48 classifier with default settings.
    Now, we need to load the data into Excel (or some other charting software) to visualize the ROC curves for both classifiers at once. Here's an outline of the process for Excel.
    1. Edit the two ARFF files containing the threshold curve results saved above and remove everything up to and including the "@data" line. Note that the False Positive Rate and True Positive Rate values are the sixth and seventh entries, respectively, in each line.
    2. Open Excel and choose Data -> Import External Data -> Import Data. Browse to the first ARFF file and load it as a comma-delimited file. Do the same for the second ARFF file.
    3. Insert a scatter chart with connecting lines and plot two series on it: True Positive Rate vs. False Positive Rate for NaiveBayes, and True Positive Rate vs. False Positive Rate for J48.
    4. This chart will now show the two ROC curves for NaiveBayes and J48 on the labor dataset.
    5. Nicely format your chart with a title, correct axis titles, correct legend titles, and proper ranges on X and Y axes.
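If you use other software instead of Excel, the header-stripping and column-extraction steps above can be sketched in Python as follows. This is a minimal sketch: the function name `read_roc_points` is hypothetical, the column positions (sixth and seventh entries) come from the note above, and the sample text is a made-up stand-in for a saved threshold curve file, not real WEKA output.

```python
import csv

def read_roc_points(arff_text, fpr_col=5, tpr_col=6):
    """Return (FPR, TPR) pairs from the data section of an ARFF file."""
    lines = arff_text.splitlines()
    # Skip everything up to and including the "@data" line.
    start = next(i for i, line in enumerate(lines)
                 if line.strip().lower() == "@data") + 1
    # Ignore blank lines and ARFF comment lines (which start with '%').
    data = [line for line in lines[start:]
            if line.strip() and not line.startswith("%")]
    return [(float(row[fpr_col]), float(row[tpr_col]))
            for row in csv.reader(data)]

# Made-up placeholder content; a real file has a full attribute header.
sample = """@relation example
@data
1,2,3,4,5,0.0,0.0,8
1,2,3,4,5,0.5,1.0,8
1,2,3,4,5,1.0,1.0,8
"""
print(read_roc_points(sample))
```

To process a saved file, read it with `open(path).read()` and pass the text to the function.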
  5. Discuss your conclusions about the performance of NaiveBayes and J48 on the labor dataset based on the appearance of the ROC curves.
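A common numeric summary to support a visual comparison of ROC curves is the area under the curve (AUC), where a larger area generally indicates better ranking performance. The Python sketch below estimates it with the trapezoidal rule. The function name `auc` is hypothetical, and the sketch assumes the points are (False Positive Rate, True Positive Rate) pairs covering the range from 0 to 1.

```python
def auc(points):
    """Trapezoidal area under a ROC curve given (FPR, TPR) points."""
    pts = sorted(points)  # order by increasing FPR, then TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# The diagonal (random guessing) and a perfect classifier.
print(auc([(0.0, 0.0), (1.0, 1.0)]))
print(auc([(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]))
```

The AUC does not replace the discussion asked for here, since two curves with similar areas can still cross and favor different classifiers at different false positive rates.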
  6. Email me (holder@cse.uta.edu) your nicely formatted report (MS Word, PDF or PostScript) containing the information requested above. Specifically, include:
    1. Raw output of the first experiment above (the result of the final analysis step in item 1).
    2. Table summarizing results of first experiment.
    3. Minimum significance level at which NaiveBayes outperforms J48 on the soybean dataset.
    4. Raw threshold curve data for NaiveBayes and J48 on the labor dataset (the two ARFF files saved from the threshold curve visualization in item 4 above).
    5. Nicely-formatted plot of the two ROC curves.
    6. Discussion of performance comparison based on the ROC curves.