CSE 6363 Fall 2001
Due: October 9, 2001 (midnight)
- In the directory 6363-501/data/images are 30x30 greyscale (one
byte per pixel) face images of myself and the students (3 images each).
Each file is in pnm format and named after the individual's last name
(e.g., holder1.pnm). Your job is to use the BP program to train a
network to recognize the faces in the class. The BP program, documentation
and examples are in the directory 6363-501/code/bp.
- Your network will have 900 inputs (one per pixel), some number of hidden
units, and 9 outputs, one per person. Each target output is either 0 or 1 according to which person is
recognized. Use the order: Chousein, Gee, Holder, Islam, Kukluk, Papudesi,
Rao, Roof, Thompson. For example, the target output pattern for Holder
would be (0,0,1,0,0,0,0,0,0). Or, if you decide to follow the advice of
the book (see Section 4.7), you may want to use output patterns like
(0.1,0.1,0.9,0.1,0.1,0.1,0.1,0.1,0.1) to avoid arbitrarily large weights
since the output units may be unable to attain exactly 0 or 1.
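The target encoding described above can be sketched as follows. This is only an illustration: the function name `target_pattern` and its `low`/`high` parameters are hypothetical, and how the BP program expects targets to be written to its input file is not covered here (see the BP documentation).

```python
# Order of the output units, as given in the assignment.
NAMES = ["Chousein", "Gee", "Holder", "Islam", "Kukluk",
         "Papudesi", "Rao", "Roof", "Thompson"]


def target_pattern(last_name, low=0.0, high=1.0):
    """Return the 9-element target vector for one person.

    low/high are hypothetical parameters: use (0.0, 1.0) for the plain
    encoding, or (0.1, 0.9) for the softened targets of Section 4.7.
    """
    lowered = [name.lower() for name in NAMES]
    index = lowered.index(last_name.lower())  # raises ValueError if unknown
    return [high if i == index else low for i in range(len(NAMES))]
```

For example, `target_pattern("holder")` gives the pattern with a 1 in the third position, and `target_pattern("holder", 0.1, 0.9)` gives the softened version.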
- You will need to convert the pnm files into input files for the BP
program. The pnm file is in the following format:
    P2
    # ... (comment)
    30 30
    255
    ... (pixel values)
The first line "P2" specifies that the file is in pnm ASCII format. Any line
beginning with a "#" is a comment. The second uncommented line contains
the dimensions of the image, and the third line specifies the maximum of
the pixel value range (0-255). The remaining lines contain the pixel
values. The first value is the pixel in the top-left corner of the image.
The next value is the next pixel in the top row of the image, and so on.
Per the book's suggestion (see Section 4.7) you may want to normalize
the input values to between 0 and 1.
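A minimal sketch of reading such a file and normalizing the pixels is given below. It assumes the ASCII layout described above; the function name `read_ascii_pnm` is hypothetical, and writing the values out in the format the BP program expects is left to you, since that format is defined in the BP documentation rather than here.

```python
def read_ascii_pnm(text):
    """Parse the text of a P2 (ASCII greyscale) pnm file.

    Returns (width, height, pixels), where pixels is a flat list in
    row-major order (top-left pixel first) scaled to the range [0, 1].
    """
    tokens = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        tokens.extend(line.split())
    if not tokens or tokens[0] != "P2":
        raise ValueError("not an ASCII (P2) pnm file")
    if len(tokens) < 4:
        raise ValueError("incomplete pnm header")
    width, height, max_value = int(tokens[1]), int(tokens[2]), int(tokens[3])
    values = [int(token) for token in tokens[4:]]
    if len(values) != width * height:
        raise ValueError("expected %d pixels, found %d"
                         % (width * height, len(values)))
    # Normalize to [0, 1] using the maximum declared in the header.
    return width, height, [value / max_value for value in values]
```

To use it on one of the class images, read the file contents first, e.g. `read_ascii_pnm(open("holder1.pnm").read())`; a 30x30 image yields the 900 network inputs.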
- Train your network on the 18 images whose names end in 1 or 2 (e.g.,
holder1.pnm) and test the network on ALL 27 images. Try different
parameters and topologies to minimize testing error.
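The split described above can be sketched as follows. The helper names are hypothetical, and the error measure shown (fraction of images whose largest output is not the correct person) is an assumption; the BP program may report error differently.

```python
import re


def split_train_test(filenames):
    """Training set: files ending in 1.pnm or 2.pnm. Test set: all files."""
    train = [name for name in filenames if re.search(r"[12]\.pnm$", name)]
    test = list(filenames)
    return train, test


def error_rate(outputs, correct_indices):
    """Fraction of test images misclassified.

    outputs is a list of 9-element network output vectors, and
    correct_indices gives the index of the true person for each one.
    The predicted person is taken to be the unit with the largest output.
    """
    wrong = 0
    for vector, correct in zip(outputs, correct_indices):
        predicted = max(range(len(vector)), key=lambda i: vector[i])
        if predicted != correct:
            wrong += 1
    return wrong / len(correct_indices)
```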
- Turn in the network file of your best network, and any other
information necessary for me to reproduce your best result. Also, discuss
your experience with this task (e.g., what worked, what didn't work, and
the effectiveness of neural nets on this task).
- For each of the six datasets in the 6363-501/data directory,
use the ml program to run a 10-fold cross validation on the Bayes,
C4.5 and BP algorithms. A file-based interface to the BP program has been
provided in 6363-501/code/ml2.0/bp.c. You are encouraged to modify
the interface to BP in order to improve performance. Code for a naive
Bayes classifier is provided in 6363-501/code/ml2.0/bayes.c. Compile
your results into three tables, each in the form shown below. For example,
the BP-C4.5 column is the average and standard deviation of the difference
between the two algorithms over the 10 runs. Also include the significance
levels of the differences and the overall ANOVA significance for all three
algorithms.
    BP - C4.5
    0.33 +/- 0.05
    0.22 +/- 0.07
    0.11 +/- 0.06
    . . .
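One way to compute an entry of a difference column is sketched below. It assumes you have the 10 per-fold results of each algorithm on the same folds, that the standard deviation wanted is the sample standard deviation, and that a paired t statistic is an acceptable basis for the significance level; the function name is hypothetical.

```python
import math
import statistics


def paired_difference_summary(results_a, results_b):
    """Summarize the per-fold differences between two algorithms.

    Returns (mean, std, t), where std is the sample standard deviation
    of the differences and t is the paired t statistic with n - 1
    degrees of freedom (t is None when all differences are identical).
    """
    if len(results_a) != len(results_b) or len(results_a) < 2:
        raise ValueError("need two equally long lists with at least 2 runs")
    diffs = [a - b for a, b in zip(results_a, results_b)]
    mean = statistics.mean(diffs)
    std = statistics.stdev(diffs)
    t = mean / (std / math.sqrt(len(diffs))) if std > 0 else None
    return mean, std, t
```

The t statistic still has to be converted to a significance level with a t table for n - 1 degrees of freedom. If SciPy is available, `scipy.stats.ttest_rel` performs the paired test directly and `scipy.stats.f_oneway` gives a one-way ANOVA across the three algorithms, though whether one-way ANOVA is the intended form of the overall test is an assumption.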
- Compare the different algorithms based on your tabulated results
(i.e., which algorithm seems best). Describe any modifications to the BP