Due: September 4, 2009 (midnight)
For this assignment you will familiarize yourself with the WEKA Machine
Learning Software, which we will use throughout the course for testing
various learning algorithms.
- Download and install WEKA on your preferred platform. WEKA is
available here. Be
sure to get the latest stable version (3.6.1). WEKA is already installed
on the machines in Sloan 353.
- Run the ConjunctiveRule classifier on each of the 10 datasets supplied
with WEKA and collect the output of the runs.
  - Run WEKA and choose Applications->Explorer.
  - Under the Preprocess tab, click "Open file..." to select a dataset from
    the data directory.
  - Under the Classify tab, click "Choose" and select the ConjunctiveRule
    classifier.
  - Select "Use training set" under "Test options".
  - Click "Start" to run the classifier, and retain the output for use below.
- Find a dataset of interest to you (other than those that come with
WEKA), convert it to WEKA's ARFF format, and run the ConjunctiveRule classifier
on it as described above. See the data repository links under Course Resources
on the main course web page for some sources of data.
- Prepare one table showing the following information for each of the 11
datasets:
  - Number of training instances
  - Number of attributes
  - Root mean squared error on the training set
- Email me (email@example.com) a ZIP file containing the following:
  - A nicely formatted document (MS Word or PDF) showing the raw
    output from each of the 11 runs, the table, and a brief description (in your
    own words) of the dataset you obtained, including where I can find it.
  - A file containing your dataset in ARFF format.
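
For the ARFF conversion step above, the following is a minimal sketch in Python, assuming your dataset starts out as a comma-separated file with a header row of attribute names. The function names (`csv_to_arff`, `quote`, `is_number`) are hypothetical helpers, not part of WEKA. The sketch declares a column as numeric when every non-missing value parses as a number and as nominal otherwise, and it writes empty cells as `?`, ARFF's missing-value marker. It also assumes every row has the same number of cells as the header.

```python
import csv
import io


def is_number(text):
    """Return True if text parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def quote(token):
    """Single-quote a name or value unless it is plain alphanumeric text."""
    if token and all(ch.isalnum() or ch in "_-." for ch in token):
        return token
    return "'" + token.replace("\\", "\\\\").replace("'", "\\'") + "'"


def csv_to_arff(csv_text, relation):
    """Convert CSV text (first row = attribute names) to ARFF text."""
    rows = [[cell.strip() for cell in row]
            for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]
    missing = ("", "?")  # cells treated as missing values

    lines = ["@relation " + quote(relation), ""]
    numeric = []  # numeric[i] is True when column i is declared numeric
    for i, name in enumerate(header):
        present = [row[i] for row in data if row[i] not in missing]
        numeric.append(all(is_number(v) for v in present))
        if numeric[i]:
            kind = "numeric"
        else:
            # Nominal attribute: list the distinct values seen in the data.
            kind = "{" + ",".join(quote(v) for v in sorted(set(present))) + "}"
        lines.append("@attribute " + quote(name) + " " + kind)

    lines += ["", "@data"]
    for row in data:
        cells = []
        for i, cell in enumerate(row):
            if cell in missing:
                cells.append("?")
            elif numeric[i]:
                cells.append(cell)
            else:
                cells.append(quote(cell))
        lines.append(",".join(cells))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "sepal length,species\n5.1,setosa\n,versicolor\n"
    print(csv_to_arff(sample, "flowers"))
```

Whatever conversion route you take, it is worth opening the resulting file in the Explorer's Preprocess tab to confirm that WEKA reads the attribute types as you intended before running the classifier.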