COSC 388: Machine Learning

Project 3
Fall 2003

Due: Oct 27 @ 5 P.M.
10 points

Implement ID3 (i.e., symbolic attributes, information gain for attribute selection, no pruning). Modify your implementation of k-NN or naive Bayes from Project 2 so that it works with symbolic attributes. Conduct an evaluation of these two methods using the 1984 Congressional Voting Record: votes.names and votes.data. Compute average measures and measures of dispersion, but don't worry about running a statistical test.
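
As a reminder of what information gain measures, here is a minimal Python sketch of the attribute-selection step only, not a full ID3 implementation. It assumes each example is a dict mapping attribute names to symbolic values; the function names (entropy, information_gain, best_attribute) and that representation are illustrative choices, not requirements of the assignment.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(examples, labels, attribute):
    """Entropy reduction obtained by partitioning on one symbolic attribute.

    Assumption: each example is a dict of attribute name -> symbolic value.
    """
    # Group the class labels by the value the attribute takes.
    partitions = {}
    for example, label in zip(examples, labels):
        partitions.setdefault(example[attribute], []).append(label)
    # Weighted average entropy of the partitions.
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in partitions.values()
    )
    return entropy(labels) - remainder


def best_attribute(examples, labels, attributes):
    """Return the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(examples, labels, a))
```

ID3 would call something like best_attribute at each node, split the examples on the chosen attribute, and recurse on each partition until the labels are pure or no attributes remain.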

All implementations must be general, meaning they must work with any data set with symbolic attributes.

Each algorithm must be applied to the same training and testing sets, the examples of which must be selected randomly.
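
One way to meet this requirement is to draw each random split once and hand the same split to every learner. The Python sketch below illustrates that idea under stated assumptions: examples are (attributes, label) pairs, and each learner is a function that takes a training set and returns a predict function. The names random_split and evaluate, the default of 10 runs, and the 70/30 split are all hypothetical choices, not part of the assignment.

```python
import random
import statistics


def random_split(examples, train_fraction, rng):
    """Shuffle a copy of the examples and cut it into (train, test)."""
    shuffled = list(examples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def evaluate(learners, examples, runs=10, train_fraction=0.7, seed=0):
    """Return {name: (mean accuracy, sample standard deviation)}.

    Assumptions: `learners` maps a name to a function learn(train) that
    returns predict(attributes); `examples` is a list of
    (attributes, label) pairs; runs >= 2 so the deviation is defined.
    """
    rng = random.Random(seed)  # fixed seed makes the results reproducible
    accuracies = {name: [] for name in learners}
    for _ in range(runs):
        # One split per run, shared by every learner.
        train, test = random_split(examples, train_fraction, rng)
        for name, learn in learners.items():
            predict = learn(train)
            correct = sum(predict(x) == y for x, y in test)
            accuracies[name].append(correct / len(test))
    return {
        name: (statistics.mean(acc), statistics.stdev(acc))
        for name, acc in accuracies.items()
    }
```

Recording the seed in the README is one simple way to let someone else reproduce the reported numbers.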

Everything must compile and run on gusun, cssun, or daruma.

Include in the archive everything needed to compile, run the programs, and reproduce the results.

In a text file named README, include the results of the evaluation and instructions about how to execute your program and reproduce the results.

Instructions for Submission: In the header comments, provide the following information:

//
// Name
// E-mail Address
// Platform: Windows, OS X, Redhat, Solaris (cssun/gusun/daruma), etc.
// Development Environment: gcc, g++, java, g77, etc.
// Mail Client: mailx, pine, GUMail, Netscape, Yahoo!, etc.
//
When you are ready to submit your project for grading, create a compressed archive of a directory containing the files of your project and send it to me by e-mail as an attachment, as you did for Project 1.

Submit your project before 5:00 P.M. on the due date.

After you submit, keep an electronic copy of your project on either cssun or gusun. These systems are backed up regularly, and if we lose your project or the e-mail system breaks, we will need to check the modification date and time of your project to confirm that you submitted it before the deadline. If you developed your code on a Windows machine, use a secure FTP client to transfer your files or the archive to cssun or gusun.

Finally, when storing source code on university machines, set file permissions so that others cannot read or modify your files. To turn off read and write permissions for group and others, type chmod og-rw <file> at the UNIX prompt, where <file> is the name of your source file.