COSC-288: Machine Learning

Project 4
Fall 2022

Due: W 11/23 @ 11:59 P.M.
9 points

Building upon your previous projects, naturally, implement Flach's primal version of the perceptron as Perceptron and Zurada's multi-layer feed-forward neural network trained with the backpropagation algorithm as BP. For both learning methods, the train method should terminate if it has not converged after 50,000 epochs, where an epoch is one complete pass through all of the training examples. It should do so by throwing FailedToConvergeException, which you should derive from RuntimeException. For both Perceptron and BP, use a learning rate of 0.9.
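
The epoch limit and the exception can be sketched as follows. This is a minimal illustration, not the required design: it assumes the usual primal perceptron update (add η·y·x to the weights whenever y(w·x) ≤ 0), examples that are already in homogeneous coordinates, and labels of -1 or +1. Only FailedToConvergeException, its RuntimeException parent, the 50,000-epoch limit, and the 0.9 learning rate come from this handout. PerceptronSketch, its train signature, and the array representation are hypothetical and will not match your classes from earlier projects.

```java
// FailedToConvergeException is named in the handout; its constructor
// signature here is an assumption.
class FailedToConvergeException extends RuntimeException {
    FailedToConvergeException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for Perceptron, using plain arrays instead of
// the DataSet and Example classes from earlier projects.
final class PerceptronSketch {
    static final int MAX_EPOCHS = 50_000;
    static final double LEARNING_RATE = 0.9;

    // xs[i] is an example in homogeneous coordinates; ys[i] is -1 or +1.
    static double[] train(double[][] xs, int[] ys) {
        double[] w = new double[xs[0].length]; // weights start at zero
        for (int epoch = 0; epoch < MAX_EPOCHS; epoch++) {
            boolean converged = true;
            for (int i = 0; i < xs.length; i++) {
                double dot = 0.0;
                for (int j = 0; j < w.length; j++) {
                    dot += w[j] * xs[i][j];
                }
                // Misclassified (or on the boundary): move w toward y * x.
                if (ys[i] * dot <= 0.0) {
                    for (int j = 0; j < w.length; j++) {
                        w[j] += LEARNING_RATE * ys[i] * xs[i][j];
                    }
                    converged = false;
                }
            }
            if (converged) {
                return w; // one full pass with no mistakes
            }
        }
        throw new FailedToConvergeException(
            "no convergence after " + MAX_EPOCHS + " epochs");
    }
}
```

On a linearly separable problem such as a bipolar AND, train returns a weight vector; on a bipolar xor, it runs all 50,000 epochs and then throws.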

For BP, initialize the weights to small random values. Make sure BP uses a different random seed each time it is constructed so there are random restarts. Use a minimum error of 0.1 and the option -J to specify the number of units in the hidden layer including the bias unit.
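
One way to get a different seed per construction is sketched below. It relies on the documented behavior of the no-argument java.util.Random constructor, which sets the seed to a value very likely to be distinct from any other invocation. The class name WeightInitSketch, the seeded constructor, and the range parameter are hypothetical; the handout says only "small random values", so the caller chooses the range.

```java
import java.util.Random;

// Hypothetical helper; not part of the required BP interface.
final class WeightInitSketch {
    private final Random random;

    // new Random() picks a seed that is very likely to differ on every
    // call, so each instance starts from different weights.
    WeightInitSketch() {
        this.random = new Random();
    }

    // Fixed seed (an assumption, not required by the handout); useful
    // only for reproducing a run while debugging.
    WeightInitSketch(long seed) {
        this.random = new Random(seed);
    }

    // Returns a rows-by-cols matrix with entries uniform in [-range, range).
    double[][] randomWeights(int rows, int cols, double range) {
        double[][] w = new double[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                w[r][c] = (2.0 * random.nextDouble() - 1.0) * range;
            }
        }
        return w;
    }
}
```

If -J counts the bias unit, one plausible layout is a (J-1)-by-I matrix for the input-to-hidden weights and a K-by-J matrix for the hidden-to-output weights, where I is the number of augmented inputs and K is the number of outputs. Treat that layout as an assumption and check it against the presentation of Zurada's algorithm from class.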

Implement routines to map the examples of a data set into a homogeneous coordinate system. For data sets consisting of only nominal attributes, use either a binary or a bipolar encoding for the attributes. The perceptron should use a bipolar encoding for the nominal attributes and for the class label. The multi-layer neural network should use a binary encoding for the attributes and a linear encoding of the class attribute. You can assume that an example passed to classify(Example) is in a homogeneous coordinate system (or is augmented). You do not need to worry about data sets with both numeric and nominal attributes.
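
The mapping can be sketched as below, under several assumptions that you should check against the class notes: each nominal value gets its own unit (a one-of-n scheme), the "off" value is 0 for binary and -1 for bipolar, and the homogeneous coordinate is a constant 1 appended as the last component. EncodingSketch and its encode method are hypothetical names, and the sketch does not cover the encoding of the class label.

```java
import java.util.Arrays;

// Hypothetical helper operating on value indices rather than on the
// Example and Attribute classes from earlier projects.
final class EncodingSketch {
    // values[i] is the index of the value of nominal attribute i;
    // domainSizes[i] is the number of values attribute i can take.
    static double[] encode(int[] values, int[] domainSizes, boolean bipolar) {
        int n = 1; // one extra slot for the homogeneous coordinate
        for (int size : domainSizes) {
            n += size;
        }
        double[] x = new double[n];
        Arrays.fill(x, bipolar ? -1.0 : 0.0); // "off" value for every unit
        int offset = 0;
        for (int i = 0; i < values.length; i++) {
            x[offset + values[i]] = 1.0; // turn on the unit for this value
            offset += domainSizes[i];
        }
        x[n - 1] = 1.0; // assumption: constant 1 in the last position
        return x;
    }
}
```

For two attributes with three and two values, the example with value indices (1, 0) becomes a six-component vector: three units for the first attribute, two for the second, and the trailing constant.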

Here are some new data sets:

As we have discussed in class, training neural networks can be computationally expensive, and training may not converge. Develop your implementation by training and testing on the small data sets, such as bikes and xor, until you are confident that everything is working. I recommend using the hold-out method for larger data sets, such as votes and mushroom. I do not recommend using k-fold cross-validation, although if you're using your own laptop, and you want to convert electricity to heat, then go ahead.

If you're using cs-class-1 for experimentation, please be mindful of other users on the system. If you want to kick off a big training job in the background and go to the Tombs, please be nice and use nice. For example:

cs-class-1$ nice java BP -t cats-and-dogs.mff < /dev/null >| output &
This command runs BP with a nice priority. The fancy redirects write the output of BP to the file named output and prevent ssh from hanging when you log out. The final ampersand puts the job into the background, where it will run for a long time. At this point, you can log out and head over to the Tombs.

When you reconnect to cs-class-1, you can check to see if the job is still running by looking for the name of your executable (in this case, java BP) in the list of active processes:

cs-class-1$ ps -ef | grep BP
maloofm  16205     1 98 15:37 ?        00:00:13 java BP -t cats-and-dogs.mff
maloofm  17920 16238  0 15:45 pts/5    00:00:00 grep BP
You can examine the contents of the output file by typing:
cs-class-1$ more output
If for some reason your implementation of BP seems like it will never terminate, please do not leave it running. To kill a job, look in the list of active processes for the job's ID:
cs-class-1$ ps -ef | grep BP
maloofm  16205     1 98 15:37 ?        00:00:13 java BP -t cats-and-dogs.mff
maloofm  17920 16238  0 15:45 pts/5    00:00:00 grep BP
In this case, it is 16205. Use the kill command to kill the process. It should no longer appear in the process list.
cs-class-1$ kill 16205
cs-class-1$ ps -ef | grep BP
maloofm  19129 16238  0 15:55 pts/5    00:00:00 grep BP

Instructions for Submission

In a file named HONOR, please include the statement:
In accordance with the class policies and Georgetown's Honor System,
I certify that, with the exceptions of the class resources and those
items noted below, I have neither given nor received any assistance
on this project.

Name
NetID
Include this file in your zip file submit.zip.

Submit p4 exactly like you submitted p3. Make sure you remove all debugging output before submitting.

Plan B

If Autolab is down, upload your zip file to Canvas.

Copyright © 2022 Mark Maloof. All Rights Reserved. This material may not be published, broadcast, rewritten, or redistributed.