Project 2
Fall 1998
Due: November 24
5 points
Using the programming language of your choice, implement the nearest neighbor and naive Bayes learning algorithms. Each implementation should construct a classifier from training data and then test the classifier on testing data, printing the classification accuracy and the learning and testing times. For example, if we implemented nearest neighbor in Java and wanted to run it on a poisonous mushroom data set, we might have the following sample run:
% java nn mushroom.train mushroom.test
Nearest neighbor
Learning time: 0.1 seconds
Testing time: 3.0 seconds
Classification accuracy: 78.4%
%
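A run like the one above could be produced by a small harness that times the learning and testing phases separately. The following is only a minimal sketch in Python, not a reference solution. It assumes nominal attributes, a Hamming distance (count of mismatched attributes), and a single nearest neighbor. The function names and the toy data are hypothetical, and reading the actual data files is left out because their format is not described here.

```python
import time

def hamming(a, b):
    """Count the attribute positions where two examples disagree."""
    return sum(x != y for x, y in zip(a, b))

def nn_learn(examples):
    """'Learning' for 1-nearest neighbor just stores the training examples."""
    return list(examples)

def nn_classify(model, attributes):
    """Return the label of the stored example closest to `attributes`."""
    _, label = min(model, key=lambda ex: hamming(ex[0], attributes))
    return label

def evaluate(learn, classify, train, test):
    """Train, then test; return (learning_time, testing_time, accuracy)."""
    start = time.perf_counter()
    model = learn(train)
    learning_time = time.perf_counter() - start

    start = time.perf_counter()
    correct = sum(classify(model, attrs) == label for attrs, label in test)
    testing_time = time.perf_counter() - start
    return learning_time, testing_time, correct / len(test)

# Hypothetical toy data: (attribute values, class label) pairs.
train = [
    (("x", "s", "n"), "e"),
    (("x", "y", "w"), "p"),
    (("b", "s", "n"), "e"),
    (("f", "y", "w"), "p"),
]
test = [
    (("b", "s", "w"), "e"),
    (("f", "y", "n"), "p"),
    (("f", "s", "w"), "e"),
]

learning_time, testing_time, accuracy = evaluate(nn_learn, nn_classify, train, test)
print("Nearest neighbor")
print(f"Learning time: {learning_time:.1f} seconds")
print(f"Testing time: {testing_time:.1f} seconds")
print(f"Classification accuracy: {accuracy:.1%}")
```

Because `evaluate` takes the learning and classification functions as arguments, the same harness could be reused for a second algorithm without changing the timing code.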
When you complete the implementations, run each using the following data:
The file mushroom.names provides information about the data set.
Write a brief report that compares and contrasts these two algorithms in terms of the complexity of the implementation, memory requirements to store the learned concepts, learning time, testing time, and classification accuracy.
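When comparing memory requirements, it may help to look at what a naive Bayes learner actually stores. The sketch below is one possible Python formulation, not a prescribed one: it assumes nominal attributes and add-one (Laplace) smoothing, and the function names and toy data are hypothetical. The learned concept is a set of counts whose size depends on the number of classes and attribute values, whereas a nearest neighbor classifier keeps every training example.

```python
import math
from collections import Counter, defaultdict

def nb_learn(examples):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter()
    value_counts = defaultdict(Counter)  # (attribute index, label) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values seen
    for attributes, label in examples:
        class_counts[label] += 1
        for i, value in enumerate(attributes):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen

def nb_classify(model, attributes):
    """Return the class with the highest log posterior score."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(attributes):
            # Add-one (Laplace) smoothing is an assumption of this sketch;
            # it keeps unseen attribute values from producing log(0).
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(values_seen[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: (attribute values, class label) pairs.
train = [
    (("x", "s", "n"), "e"),
    (("x", "y", "w"), "p"),
    (("b", "s", "n"), "e"),
    (("f", "y", "w"), "p"),
]

model = nb_learn(train)
print(nb_classify(model, ("b", "s", "w")))
print(nb_classify(model, ("f", "y", "w")))
```

Summing log probabilities instead of multiplying raw probabilities is a common design choice that avoids numeric underflow when there are many attributes.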
Turn in a copy of your code, sample runs, and the brief report (~1 page) by the start of class on the due date. You may submit a hard copy or an electronic copy, either on a disk or via email (maloof@cs). If you submit electronically, please use plain ASCII text files for the report and other files.
Note: You may work in groups of two. If you use outside material (i.e., material other than the book and lecture notes), you must cite that material. This applies to other books, papers, web sites, and other people's code.